The RTP payload seems to be required as it carries the frame count
information. Also, gst_rtp_base_payload_allocate_output_buffer had
the second argument incorrect.
Strangely some devices like Shanling MP4 and Sony XM3 would still
work without this while some like the Sony XM4 do not.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/1797>