We are in fact given the size of the header (including stuffing bytes), so use that instead of trying to skip 0xff bytes ourselves, since some media streams do start with 0xff (like MPEG audio's initial 0xfff sync word).
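
A minimal sketch of the difference, assuming the header in question is an MPEG-2 PES packet header, where the PES_header_data_length byte at offset 8 counts the optional fields and the stuffing bytes that follow it. Both function names are hypothetical and are not taken from the actual change:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: return the offset of the first payload byte of an
 * MPEG-2 PES packet, or 0 if the buffer is too short or malformed.
 * Assumes a stream_id that carries the optional PES header. */
static size_t
pes_payload_offset (const uint8_t * data, size_t size)
{
  size_t offset;

  /* start code prefix (3) + stream_id (1) + PES_packet_length (2)
   * + flags (2) + PES_header_data_length (1) = 9 bytes */
  if (size < 9)
    return 0;
  if (data[0] != 0x00 || data[1] != 0x00 || data[2] != 0x01)
    return 0;
  /* the two top bits of the first flag byte are '10' in MPEG-2 PES */
  if ((data[6] & 0xc0) != 0x80)
    return 0;

  /* trust the declared header size; it already covers stuffing bytes */
  offset = 9 + (size_t) data[8];
  if (offset > size)
    return 0;

  return offset;
}

/* The approach being replaced: skip every 0xff byte found after the fixed
 * part of the header. This also eats payload bytes that happen to be 0xff. */
static size_t
naive_payload_offset (const uint8_t * data, size_t size, size_t offset)
{
  while (offset < size && data[offset] == 0xff)
    offset++;
  return offset;
}
```

With two stuffing bytes followed by a payload that begins 0xff 0xf3, the first function stops at the payload's leading 0xff, while the naive loop skips it and leaves the payload starting one byte late.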