The current way only selects the best video format from the first
structure of the caps. The caps like:
video/x-raw(memory:VAMemory),drm-format=(string)NV12; \
video/x-raw(memory:VAMemory),format=(string){ NV12, Y210 }
Will just choose NV12 as the result, even the bitstream is 10 bits.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/4928>