Otherwise we might start the scheduled playback before the audio or video streams are
actually enabled, and then error out later because they are enabled to late.
We enable the streams when getting the caps, which might be *after* we were
set to PLAYING state.