Make it work a little more like RTP. Have the source interact with the
clock and set the capture time on each packet. Then the other elements
can use that to do adjustments. Since AVTP is always very low latency,
it can be assumed that the gPTP clock at the packet reception is very
close to the sending time, never more than 2 seconds off, so the
timestamps can be compared directly.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/9412>