Detect EOS faster.
Try to reuse one of the input buffer as the output buffer. This usually works
and avoids an allocation and a memcpy.
Be smarter with GAP buffers so that they don't get mixed or cleared at all. Also
try to use a GAP buffer as the output buffer when all input buffers are GAP
buffers.