ffmpeg/fftools
Niklas Haas 03dfac5630 fftools/ffmpeg_sched: allow throttling decoder outputs
This is a departure from the conventional idea of decoders always outputting
data as fast as possible. Instead, this allows decoders to be throttled in the
same way filter graphs can be.

This comes into play when e.g. a demuxer is feeding into two decoders, but
only one of the two decoders is actually currently needed (e.g. due to
A/V misalignment). In that case, what typically happens is that the unneeded
decoder also decodes all frames, and then piles them up on the "buffersrc"
filter's downstream link (growing indefinitely).

Another issue this solves manifests when e.g. a single demuxer is feeding many
decoders that all try to feed frames to the same filter graph. In this case,
all decoders run as fast as possible, leading to lock contention on the
filter graph input queue, resulting in (again) many frames piling up on the
buffersrc (or downstream filters) for the unneeded inputs that are not actually
the bottleneck, while the input that's actually undersatisfied can end up
starved for CPU time, possibly for long enough to exhaust memory limits. The
normal rate limiting fails to apply in this scenario because all decoders share
a single demuxer, and are hence rate-limited only by the demuxer speed; whereas
the demuxer is not choked because from the PoV of the scheduler, the filter
graph is simply not getting enough frames.

In a more general sense, there's a philosophical argument to be made here.
Since a decoder is typically also a decompressor, it produces more data than
it consumes. So, in a sense, it also acts as a kind of producer, in
the same way that a filter graph can produce more output than input.

Solve all of these issues by allowing decoders to be output-choked, which
gives the scheduler control over when decoders are allowed to output frames.
This does mean we have to add some sort of internal packet queue, because the
decoder thread may need to continue *accepting* upstream packets from the
demuxer (or else we risk stalling the demuxer), but defer the actual decoding
by placing them inside an internal "overflow" queue.
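
The deferral described above can be illustrated with a minimal, single-threaded
sketch. This is not the code in `sch_dec_receive`; `ToyDecoder`, `toy_receive`,
`toy_set_choked` and the fixed `OVERFLOW_CAP` are hypothetical names made up
for illustration, and the fixed capacity is only a simplification to avoid
allocation. The idea shown is that a choked decoder keeps accepting packets but
parks them, and decodes them in arrival order once it is unchoked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define OVERFLOW_CAP 64 /* hypothetical bound, chosen only for this sketch */

/* Hypothetical stand-in for a decoder task; not the real ffmpeg_sched types. */
typedef struct ToyDecoder {
    bool   choked;                 /* set by the (imaginary) scheduler      */
    int    overflow[OVERFLOW_CAP]; /* deferred packet ids, FIFO ring buffer */
    size_t head, count;            /* ring buffer read position and fill    */
    int    decoded;                /* number of packets decoded so far      */
    int    last_pkt;               /* id of the most recently decoded packet */
} ToyDecoder;

/* Pretend to decode one packet. */
static void toy_decode(ToyDecoder *d, int pkt)
{
    d->decoded++;
    d->last_pkt = pkt;
}

/* Decode deferred packets in arrival order for as long as we are unchoked. */
static void toy_drain(ToyDecoder *d)
{
    while (!d->choked && d->count > 0) {
        toy_decode(d, d->overflow[d->head]);
        d->head = (d->head + 1) % OVERFLOW_CAP;
        d->count--;
    }
}

/* Accept one packet from upstream. While choked, the packet is accepted but
 * only stored. Returns false only if the sketch's fixed-size queue is full. */
static bool toy_receive(ToyDecoder *d, int pkt)
{
    assert(d->count <= OVERFLOW_CAP);
    if (d->choked) {
        if (d->count == OVERFLOW_CAP)
            return false;
        d->overflow[(d->head + d->count) % OVERFLOW_CAP] = pkt;
        d->count++;
        return true;
    }
    toy_drain(d);       /* older deferred packets go first to keep ordering */
    toy_decode(d, pkt);
    return true;
}

/* Scheduler-side switch: unchoking immediately drains the overflow queue. */
static void toy_set_choked(ToyDecoder *d, bool choked)
{
    d->choked = choked;
    toy_drain(d);
}
```

In this sketch, two packets received while choked leave `decoded` at 0 and
`count` at 2; calling `toy_set_choked(&d, false)` then decodes both in order.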

This effectively gives the decoder queue "filter graph"-style semantics.

This overflow logic is fairly self-contained inside `sch_dec_receive`, though
it is quite nontrivial. I have added as much documentation as is hopefully
needed to understand the logic.

Importantly, we cannot simply unlimit the decoder input thread queue because
the demuxer relies on backpressure from the decoder to rate limit itself. (Note
that demuxers may only be active if there is at least one downstream decoder
that is also active, so we always have at least one decoder providing
backpressure.)
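
To make the backpressure argument concrete, here is a small sketch of a bounded
queue throttling its producer. All names (`BoundedQueue`, `bq_try_send`,
`bq_try_recv`, `demux_until_blocked`, `QUEUE_CAP`) are hypothetical, and the
sketch uses a non-blocking try-send to stay single-threaded, whereas a real
demuxer thread would presumably block on a full queue instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define QUEUE_CAP 4 /* hypothetical queue depth, for illustration only */

/* Fixed-size FIFO standing in for a decoder's input queue. */
typedef struct BoundedQueue {
    int    items[QUEUE_CAP];
    size_t head, count;
} BoundedQueue;

/* Enqueue a packet id; fails (instead of growing) when the queue is full. */
static bool bq_try_send(BoundedQueue *q, int pkt)
{
    assert(q->count <= QUEUE_CAP);
    if (q->count == QUEUE_CAP)
        return false;
    q->items[(q->head + q->count) % QUEUE_CAP] = pkt;
    q->count++;
    return true;
}

/* Dequeue the oldest packet id; fails when the queue is empty. */
static bool bq_try_recv(BoundedQueue *q, int *pkt)
{
    if (q->count == 0)
        return false;
    *pkt = q->items[q->head];
    q->head = (q->head + 1) % QUEUE_CAP;
    q->count--;
    return true;
}

/* Toy demuxer: hands over up to `available` packets, numbered from
 * `next_pkt`, and stops as soon as the queue pushes back.
 * Returns how many packets were actually handed over. */
static int demux_until_blocked(BoundedQueue *q, int next_pkt, int available)
{
    int sent = 0;
    while (sent < available && bq_try_send(q, next_pkt + sent))
        sent++;
    return sent;
}
```

With ten packets available, the toy demuxer stops after `QUEUE_CAP` packets and
can hand over exactly one more for each packet the consumer removes. With an
unbounded queue it would never stop, which is the rate limiting the paragraph
above says must be preserved.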

Sponsored-by: nxtedition AB
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-05-23 08:41:12 +00:00