Allows reducing the initial stack size of GPU threads. Cuda should
automatically increase this value if a kernel requires a larger stack.
Can save roughly 40MB of GPU memory for a single nvh264enc instance.
Part-of: <https://gitlab.freedesktop.org/gstreamer/gstreamer/-/merge_requests/8158>