Prevent race condition where gst_play_sink_do_reconfigure() could be called
from a pad probe while stream synchronizer pads are being released during
GST_STATE_CHANGE_PAUSED_TO_READY transition.
The race occurred when:
1. State change starts releasing stream synchronizer pads
2. Pads are unblocked earlier in the state change, allowing events to flow
3. A streaming thread triggers sinkpad_blocked_cb -> gst_play_sink_do_reconfigure
4. Reconfiguration tries to use already-released pad pointers
5. New pad creation fails with assertion in gst_pad_iterate_internal_links
The fix adds GST_PLAY_SINK_LOCK around the pad cleanup to ensure atomic
cleanup and prevent concurrent access during state transitions.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/9233>
When using the custom WebKitMediaSrc element (used by WebKit and able to
perform an initial seek in playbin), a stall caused by streamsynchronizer
was detected during an initial seek. The flow of events revealed that the
intertwining of the initial configuration of the streams with the reset
caused by the flush events from the seek left streamsynchronizer in an
inconsistent state:
streamsynchronizer0:sink_0 (video) events, starting before the seek:
stream-start --> Sets the stream to wait
flush-stop --> Clears the stream wait flag
caps
tag
segment
stream-collection
(buffers start to come and flow properly)
streamsynchronizer0:sink_1 (audio) events, happening after seek:
(no flush events, because the stream hadn't been initialized when the seek happened)
stream-start --> Sets the stream to wait
caps
segment
(stalled because the stream is in wait mode!)
The code in streamsynchronizer expects that all the streams are in wait
state before releasing all of them at once. The flush on the video stream
broke that assumption and that's why the audio stream is never released in
that scenario.
Avoiding the clearing of the wait flag on flush-stop isn't an actual solution
to the problem, as it creates other side effects and at least makes the
gst-editing-services/seek_with_stop test to timeout. The alternate solution
implemented in this patch consists on analyzing if the other streams different
from the one newly added (after the flush) aren't waiting (which would mean
that they've all been unlocked after all of them were waiting before), and,
in that case, mark the new stream as also not waiting.
A new test_stream_start_wait test case has been added to demonstrate this
problem. The test case creates a video stream, pushes a buffer, then
simulates a seek by pushing flush-start, flush-stop, stream-start and segment
events. Note that the flush-stop clears the video stream waiting flag.
After that, a new audio stream is created and stream-start and new segment
events are sent. Note that stream-start will set the audio stream to wait.
Then a buffer is pushed on each stream. In the failing case, the test hangs.
In the working case (after this fix), the test runs properly because the
fact of having seen a stream-start also helps to clear the wait flag.
A second new test_stream_start_wait_sparse test has also been added to prove
that this mechanism can also work with sparse streams (a special case of the
current stream-start handling code). This test behaves like the previous one,
but there's no video buffer after the seek (it'll come in the future, as the
stream is sparse, but actually never comes). The buffer after the seek in the
audio stream starts at its due time. Streamsynchronizer is able to ignore
the wait for the video stream and produce audio buffers on time.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/4544>
I didn't find the behavior and purpose of streamsynchronizer documented
or intuitive. Eventually I got Edward to explain it to me, which was
very helpful. Now I'm contributing some docs so that the next person
doesn't have to figure it out by asking around and hoping for an answer.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/7084>
To simplify the description, I'm assuming we only have two streams: video and audio.
For the video stream, we have the following events :
- STREAM_START => stream->wait set to true
- NEW_SEGMENT(1) => blocked waiting in gst_stream_synchronizer_wait
- FLUSH_START => unblocked
- FLUSH_STOP => stream->wait reset to false
- NEW_SEGMENT(2) => not waiting, since stream->wait is false
Then for the audio stream, we have the following events :
- STREAM_START => stream->wait set to true
- NEW_SEGMENT(2) => blocked waiting in gst_stream_synchronizer_wait for ever.
Note: The first NEW_SEGMENT event and the FLUSH_START, FLUSH_STOP events of the audio stream
are dropped before being received by the streamsynchronizer element, because the decodebin audio pad src
is not yet linked to the playsink audio pad sink.
To fix this deadlock, we don't reset stream->wait to false in the FLUSH_STOP event when it is not
waiting for the EOS of the other streams.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/6763>
self->eos was never reset after streamsynchronizer has sent EOS
(except on explicit flush or switching back to PAUSED).
As a result, synchronization was broken if new streams were pushed later
as gst_stream_synchronizer_wait() does not wait if self->eos is set.
Fix this by reseting self->eos on STREAM_START as that means a new
stream is being sent upstream and so a new EOS will follow later on.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/4749>