124952 Commits
Author SHA1 Message Date
Timo Rothenpieler 5f998e304d avcodec/nvenc: fix b_ref_mode capability check
Turns out it's a bitfield, not straight values.

Fixes #23061
2026-06-10 20:17:44 +02:00
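A minimal sketch of the value-versus-bitfield distinction behind the b_ref_mode fix above. All names are hypothetical, and the bit layout (the bit for a mode equals its enum value) is an assumption for illustration, not taken from the NVENC headers:

```c
#include <stdbool.h>

/* Hypothetical mode numbering, for illustration only. */
enum ref_mode { REF_MODE_DISABLED = 0, REF_MODE_EACH = 1, REF_MODE_MIDDLE = 2 };

/* Treats the reported capability as a single value. This rejects hardware
 * that reports several supported modes at once (e.g. caps == 3). */
static bool supported_as_value(unsigned caps, enum ref_mode mode)
{
    return mode == REF_MODE_DISABLED || caps == (unsigned)mode;
}

/* Treats the reported capability as a bitfield: a mode is supported
 * whenever its bit is set, regardless of the other bits. */
static bool supported_as_bitfield(unsigned caps, enum ref_mode mode)
{
    return mode == REF_MODE_DISABLED || (caps & (unsigned)mode) != 0;
}
```

With caps == 3, the value-style check refuses REF_MODE_EACH while the bitfield-style check accepts both modes.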
James Almer cf0244aa38 avformat/movenc: avoid negative cts offsets when using an edit list with CMAF output
Fixes issue #23417.

Signed-off-by: James Almer <jamrial@gmail.com>
2026-06-10 18:14:42 +00:00
Martin Storsjö d05786cf23 aarch64: vp9lpf: Fix GCS violations
The aarch64 VP9 loopfilters actually violate aarch64 GCS
(Guarded Control Stack), even though we marked the code as GCS
compliant in 846746be4b.

This means that builds with GCS enabled, after that commit,
will crash when decoding VP9, on future hardware (or current
QEMU) that supports GCS. This also goes for ffmpeg version 8.1.1
where the GCS enabling was backported.

This matches the fix that was done for hevcdsp in
1f7ed8a78d.

This issue wasn't observed when running checkasm in QEMU - therefore,
I thought all GCS issues had been fixed by
846746be4b. (If I had
tested the full "make fate" with QEMU, the issue would
have appeared though.)

However, with the new checkasm, some of the GCS violations
do appear even in checkasm.

The reason is that the checkasm vp9 test intentionally crafts
input pixels that attempt to trigger all the individual
separate cases in each input buffer (in
randomize_loopfilter_buffers). This means that the checkasm
tests actually never test or exercise the early exit cases,
which are the ones that violate GCS.

With the new checkasm, the call to "bench_new" always runs
the code at least once, even when not benchmarking.

As the input buffers weren't reinitialized between the test
and "bench_new", the pixel differences now differ from the
initial setup, so the code now sometimes (often)
ends up hitting the early exit cases.

Ideally, the vp9 checkasm test would be repeated to cover all
cases of input buffers that allow early exits, in addition to
covering the case with all different cases in one block.
2026-06-10 18:03:01 +00:00
DROOdotFOO and Ramiro Polla cc7c567920 swscale/aarch64/yuv2rgb_neon: add BE 16bpp output formats
BE counterparts to the LE paths in 2e142e52ae; pack adds rev16 before
store. nv12/nv21 paths are added but bench-only (no C ref, same as
2e142e52ae).

Test Name                              A55-gcc           M1-clang             A76-gcc
-------------------------------------------------------------------------------------
yuv420p_rgb565be_1920_neon    15086.1 ( 3.91x)    5507.0 ( 4.34x)    19229.1 ( 2.02x)
yuv420p_bgr565be_1920_neon    15291.7 ( 3.84x)    5476.9 ( 4.37x)    19229.4 ( 2.02x)
yuv420p_rgb555be_1920_neon    15091.5 ( 3.67x)    5569.0 ( 3.97x)    19229.3 ( 1.90x)
yuv420p_bgr555be_1920_neon    15298.6 ( 3.62x)    5600.6 ( 3.98x)    19228.8 ( 1.90x)
yuv422p_rgb565be_1920_neon    16862.3 ( 4.00x)    6378.8 ( 4.64x)    22110.3 ( 2.07x)
yuv422p_bgr565be_1920_neon    17139.3 ( 3.93x)    6448.1 ( 4.50x)    22104.1 ( 2.07x)
yuv422p_rgb555be_1920_neon    16853.3 ( 3.98x)    6468.8 ( 4.12x)    22106.4 ( 1.98x)
yuv422p_bgr555be_1920_neon    17202.2 ( 3.89x)    6467.0 ( 4.12x)    22110.2 ( 1.98x)
yuva420p_rgb565be_1920_neon   15050.2 ( 3.92x)    5452.5 ( 4.39x)    19229.5 ( 2.02x)
yuva420p_bgr565be_1920_neon   15346.6 ( 3.84x)    5462.4 ( 4.36x)    19228.9 ( 2.02x)
yuva420p_rgb555be_1920_neon   15050.8 ( 3.69x)    5463.3 ( 3.95x)    19228.6 ( 1.90x)
yuva420p_bgr555be_1920_neon   15352.8 ( 3.61x)    5543.6 ( 3.89x)    19228.6 ( 1.90x)

Co-authored-by: Ramiro Polla <ramiro.polla@gmail.com>
Signed-off-by: DROOdotFOO <drew@axol.io>
2026-06-10 17:54:20 +00:00
DROOdotFOO and Ramiro Polla 7ab5aebc08 swscale/yuv2rgb: add explicit BE/LE 565/555 cases
ff_yuv2rgb_get_func_ptr() now returns the C reference for explicit
BE/LE 16bpp formats, not only the NE alias.

Signed-off-by: DROOdotFOO <drew@axol.io>
2026-06-10 17:54:20 +00:00
Romain Beauxis 590d775a66 avcodec/codec_id: add .props to AV_CODEC_ID_APPLE_APAC
Signed-off-by: Romain Beauxis <romain.beauxis@gmail.com>
2026-06-10 10:08:39 -05:00
wangbin and michaelni c73476e107 avfilter/vf_scale_d3d11: ensure guids are defined
Fixes: vf_scale_d3d11.o : error LNK2001: unresolved external symbol IID_ID3D11VideoContext
2026-06-10 15:08:12 +00:00
Niklas Haas and Niklas Haas 976e18fdef swscale/x86: use correct HOSTCC_E flag instead of CC_E
HOSTCC and CC might be completely different compilers.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-06-10 15:04:50 +00:00
Niklas Haas 7b59a86633 swscale/uops_tmpl: move attributes before static keyword
This fails to compile with C23 standard attributes otherwise.

Technically only av_unused requires this, but move the other attributes
as well for consistency / future proofing.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-06-10 16:27:58 +02:00
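A small sketch of the placement rule described in the commit above. The macro name my_unused is a hypothetical stand-in (not FFmpeg's actual definition of av_unused), and the fallback chain is an assumption made so the snippet builds in older language modes too:

```c
/* my_unused is a hypothetical stand-in for a macro such as av_unused. */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 202311L
#  define my_unused [[maybe_unused]]          /* C23 standard attribute */
#elif defined(__GNUC__)
#  define my_unused __attribute__((unused))   /* GNU-style fallback */
#else
#  define my_unused
#endif

/* Accepted in every mode: the attribute leads the declaration. */
my_unused static int add_one(int x)
{
    return x + 1;
}

/*
 * Not portable: `static my_unused int add_one(int x)` places the attribute
 * in the middle of the declaration specifiers, which the C23 grammar
 * does not allow for the [[...]] spelling.
 */
```

The GNU spelling tolerates either position, which is presumably why the issue only shows up once the standard spelling is used.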
Frank Plowman b899f7e8b5 lavc/vvc: Fix num_entry_points derivation when using RPR
Context:
1. In the case sps_subpic_info_present=0, there is a single subpicture
   which includes the entire picture.
2. When sps_subpic_info_present=0, we might be using Reference Picture
   Resampling (RPR), in which picture sizes might differ in the PPS,
   rather than in the SPS.

Because of 2., we can't rely on the sequence-level variables
sps_subpic_width_minus1 and sps_subpic_height_minus1 to derive the
picture-level variable num_entry_points, as the picture might have a
different size to the picture used when deriving those sequence-level
variables.
2026-06-10 14:02:29 +00:00
Hunter Kvalevog and michaelni 54749da98a avdevice/gdigrab: make overlay window layered
WS_EX_LAYERED allows input events to pass through to windows beneath.

WS_EX_NOACTIVATE prevents the window from stealing focus when created.
2026-06-10 12:56:20 +00:00
Hunter Kvalevog and michaelni dc1128e475 avdevice/gdigrab: process window in a separate thread
Move window creation and event processing to a dedicated thread.

GetMessage only processes events from the calling thread's message
queue. Because gdigrab_read_header and gdigrab_read_packet don't run on
the same thread, the message queue was not being drained.

Fixes: #11539
2026-06-10 12:56:20 +00:00
Robert Nagy and michaelni 97491ce0d5 libavfilter/libplacebo: gamma22 and gamma28 aliases 2026-06-10 12:34:38 +00:00
Robert Nagy and michaelni 06e11c87c6 libavcodec/options_table: gamma22 and gamma28 aliases 2026-06-10 12:34:38 +00:00
Romain Beauxis and toots c19949ae0f avformat/isom_tags: Add support for detecting apple_apac
Signed-off-by: Romain Beauxis <romain.beauxis@gmail.com>
2026-06-10 11:59:35 +00:00
Romain Beauxis and toots f98eaa3ea9 avcodec/codec_id: Add Apple Positional Audio Codec.
Signed-off-by: Romain Beauxis <romain.beauxis@gmail.com>
2026-06-10 11:59:35 +00:00
Lynne f1b4b5b5f6 aacdec_usac: apply volume normalization settings 2026-06-10 18:04:22 +09:00
Lynne 71b59582e2 aacdec_usac: implement basic DRC parameter decoding 2026-06-10 18:04:22 +09:00
Michael Niedermayer and Niklas Haas d3a56ed37b swscale/tests/sws_ops: fix uops leak on translate success path
Fall through to the existing cleanup so uops is freed on both the success
and failure paths.

Signed-off-by: Michael Niedermayer <michael@niedermayer.cc>
2026-06-10 07:48:37 +00:00
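The fall-through-to-cleanup pattern named in the commit above can be sketched as follows. The function, the buffer, and the live_allocs counter are hypothetical and exist only to make the leak observable; this is not the code from the test itself:

```c
#include <stdlib.h>

static int live_allocs; /* outstanding buffers, tracked for the demo only */

/* Hypothetical stand-in for a "translate, then check" test step. */
static int translate_and_check(int fail)
{
    int ret = 0;
    int *uops = malloc(16 * sizeof(*uops));
    if (!uops)
        return -1;
    live_allocs++;

    if (fail) {
        ret = -1;
        goto done;          /* failure path jumps to the shared cleanup */
    }

    /* Success path: no early `return 0;` here, so execution falls
     * through to the same cleanup instead of leaking the buffer. */
done:
    free(uops);
    live_allocs--;
    return ret;
}
```

Calling it once with each argument leaves live_allocs at zero, because both paths reach the single cleanup label.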
Bejoy and Gyan Doshi a5ee6ff720 docs: refine issue tracker transition references 2026-06-10 06:35:21 +00:00
David Korczynski and michaelni 331b3e9dea avcodec/on2avc: reject subframe count whose * SUBFRAME_SIZE product overflows 32-bit
Found-by: Anthropic agents; validated and reported by Ada Logics.
Signed-off-by: David Korczynski <david@adalogics.com>
2026-06-10 02:15:53 +00:00
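A hedged sketch of the kind of check the on2avc commit above describes: validate the count by division before multiplying, so the product is never computed when it would overflow. The function name and the value of SUBFRAME_SIZE are assumptions for illustration, and rejecting non-positive counts is a choice made here, not something stated in the commit:

```c
#include <limits.h>
#include <stdbool.h>

#define SUBFRAME_SIZE 1024 /* assumed value, for illustration only */

/* Returns true if num_subframes * SUBFRAME_SIZE fits in a signed
 * 32-bit int (assuming int is 32 bits wide, as on common platforms). */
static bool subframe_count_ok(int num_subframes)
{
    return num_subframes > 0 && num_subframes <= INT_MAX / SUBFRAME_SIZE;
}
```

A caller would test subframe_count_ok() first and only then compute the product.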
Ramiro Polla 2576e09434 swscale/aarch64/ops: simplify process function generation
There was no good reason to have it as an SwsAArch64OpType.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Ramiro Polla <ramiro.polla@gmail.com>
2026-06-10 01:47:11 +02:00
Ramiro Polla 19250a1846 swscale/aarch64/ops: use plain ret instruction
Use a call/ret pair instead of awkwardly exporting and then jumping
back to the return label.

This is similar to c29465bcb6, but for aarch64.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Ramiro Polla <ramiro.polla@gmail.com>
2026-06-10 01:47:10 +02:00
Ramiro Polla 061dc9ab6d swscale/aarch64/rasm: add blr instruction
And a64op_lr() helper for LR register.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Ramiro Polla <ramiro.polla@gmail.com>
2026-06-10 01:46:29 +02:00
Ramiro Polla ecba7e1d42 swscale/aarch64/rasm: split conditional and unconditional branch instructions
Sponsored-by: Sovereign Tech Fund
Signed-off-by: Ramiro Polla <ramiro.polla@gmail.com>
2026-06-10 01:46:29 +02:00
Ramiro Polla 692951d1c2 swscale/tests/sws_ops_aarch64: fix skipping of scaling ops
Scaling ops were added to ff_sws_enum_op_lists() in 1d841635. But the
code that skipped scaling ops in convert_to_aarch64_impl() didn't
take into account that, in sws_ops_aarch64, the scaling ops
aren't folded into read ops.

Also updates libswscale/aarch64/ops_entries.c with the new entries.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Ramiro Polla <ramiro.polla@gmail.com>
2026-06-10 01:46:29 +02:00
Diego de Souza and Timo Rothenpieler 0a7c5e507b avcodec/nvenc: fix compatibility with Video Codec SDK 13.1
NV_ENC_CLOCK_TIMESTAMP_SET was changed in SDK 13.1: countingType was
replaced by countingTypeLSB and countingTypeMSB.

Signed-off-by: Diego de Souza <ddesouza@nvidia.com>
Signed-off-by: Timo Rothenpieler <timo@rothenpieler.org>
2026-06-10 01:28:40 +02:00
Frank Plowman and James Almer 6bfc7214d1 configure: Declare MPEG decoder/SEI dependencies on ITU-T35
Forgotten in 070bd112be
2026-06-09 19:07:14 +00:00
James Almer 9eb6f2f450 avcodec/aacenc: fix PCE layouts for 7.1 and 7.1(wide)
Signed-off-by: James Almer <jamrial@gmail.com>
2026-06-09 15:33:20 -03:00
Martin Storsjö b20c4c6f98 checkasm: Update to the latest upstream version
This update was done by running this command:

    $ git subtree pull --squash --prefix=tests/checkasm/ext \
      https://code.ffmpeg.org/FFmpeg/checkasm.git master

This includes fixes for a couple regressions noted after integrating
the new external checkasm into ffmpeg:

- Fixes spurious errors about missing vzeroupper in C code generated
  by MSVC, fixing https://code.ffmpeg.org/FFmpeg/FFmpeg/issues/23360
- Fixes building for WINAPI_FAMILY_PHONE_APP, and for UWP with older
  Windows SDKs, https://code.videolan.org/videolan/checkasm/-/work_items/37
- Fixes building in x86_32 mode for Windows with --disable-asm,
  https://code.videolan.org/videolan/checkasm/-/work_items/36
2026-06-09 20:57:59 +03:00
Martin Storsjö b3bcd320f5 Squashed 'tests/checkasm/ext/' changes from 0df02535c7..e13b0bb3ff
e13b0bb3ff x86: Skip the vzeroupper checks when built with MSVC
3cbf066c51 longjmp: Use raw arch defines for checking for x86_32
54817bd68f include: Fix a mismatched include guard comment
162f15c861 github: Build the UWP job as WINAPI_FAMILY_PHONE_APP
34d920e8bb arm/cpu: Avoid the Windows registry API in UWP builds
9a0cb83b69 utils: Avoid the GetStdHandle and GetConsoleScreenBufferInfo APIs in UWP builds
8d1609d583 ci: Test building ffmpeg with the latest checkasm
d93232845f ci: Test building latest dav1d and dav2d with the current checkasm
e15a8efbfc readme: Add me to the list of maintainers
01b4334a95 meson: Bump version to v1.3.0

git-subtree-dir: tests/checkasm/ext
git-subtree-split: e13b0bb3ff0935b7d2a1c2cc91163370f2cc8f40
2026-06-09 20:57:50 +03:00
Kacper Michajłow f43f609bb8 avutil/x86/tx_float: add missing vzeroupper to 15xM PFA FFT
The AVX2 15xM PFA FFT calls its second-dimension subtransform with dirty
YMM registers. That subtransform may be a legacy-SSE codelet (fft4 is SSE2
only), causing AVX<->SSE transition penalties. Clear the registers after the
first dimension, before the calls.

Detected with `sde64 -ast` FATE job.

Fixes: ace42cf581
2026-06-09 17:54:21 +00:00
Lynne 4406f5ba5b prores_raw: document vendor-specific metadata location 2026-06-10 02:38:36 +09:00
Lynne 4cf96187e4 prores_raw: set frame crop fields
Some sensors or cameras put junk in the frame boundaries. We should
crop it out.
2026-06-10 02:38:35 +09:00
Lynne 0def4ceb18 prores_raw: export raw camera color data values 2026-06-10 02:38:35 +09:00
Lynne 12dc67b6fe lavu/frame: add camera raw codec side data
Required to correctly present raw video.
Codec-specific since I'd like to support ARRIRAW in the future, which
has a different format.
2026-06-10 02:38:35 +09:00
Niklas Haas 941a35149b swscale/x86/ops_int: switch to SWS_UOP_MOVE
Instead of SWS_UOP_PERMUTE/SWS_UOP_COPY.

No real measurable difference in performance (it just eliminates a few
practically free register renames), but definitely simpler.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-06-09 18:27:20 +02:00
Niklas Haas and Ramiro Polla 36004d681f swscale/uops: add SWS_UOP_MOVE for optimal register-register swizzles
This decomposes a swizzle mask into a series of optimal register-register
moves, using at most two temporary scratch registers.

This is a better match for ASM-style backends than the existing PERMUTE/COPY
uops that are designed for the needs of the C backend (or other backends which
either apply the swizzle mask directly or permute pointers).

I originally had logic equivalent to this written in NASM macros, but it was
just such a complicated mess that I think it's better to rewrite it in C and
have the resulting metadata be an explicit part of the uop definition.

This commit only adds the uop; I'll update the x86 implementation in the
next step.

Co-authored-by: Ramiro Polla <ramiro.polla@gmail.com>
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-06-09 18:27:20 +02:00
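To illustrate what decomposing a swizzle mask into register-register moves involves, here is a sketch of the textbook parallel-move approach. It is not the algorithm from the commit above: it uses a single scratch register, and it assumes that output register i reads input register mask[i]. All names are hypothetical:

```c
#define NREGS 4
#define TMP   NREGS /* index of the single scratch register */

typedef struct Move { int dst, src; } Move;

/* Writes a sequence of moves equivalent to "reg[i] = old_reg[mask[i]]"
 * for all i at once. Returns the number of moves; `out` needs room for
 * NREGS + NREGS / 2 entries (one extra save per two-register cycle). */
static int decompose_swizzle(const int mask[NREGS], Move *out)
{
    int src[NREGS], pending[NREGS], n = 0, left = 0;

    for (int i = 0; i < NREGS; i++) {
        src[i]     = mask[i];
        pending[i] = mask[i] != i;   /* identity moves need no code */
        left      += pending[i];
    }

    while (left > 0) {
        int progress = 0;
        for (int i = 0; i < NREGS; i++) {
            int read_later = 0;
            if (!pending[i])
                continue;
            /* A register may only be overwritten once no pending
             * move still needs its old value. */
            for (int j = 0; j < NREGS; j++)
                if (j != i && pending[j] && src[j] == i)
                    read_later = 1;
            if (read_later)
                continue;
            out[n++] = (Move){ i, src[i] };
            pending[i] = 0;
            left--;
            progress = 1;
        }
        if (!progress) {
            /* Only cycles remain: save one register to the scratch
             * register and redirect its readers there. */
            int i = 0;
            while (!pending[i])
                i++;
            out[n++] = (Move){ TMP, i };
            for (int j = 0; j < NREGS; j++)
                if (pending[j] && src[j] == i)
                    src[j] = TMP;
        }
    }
    return n;
}

/* Executes the moves in order; regs has NREGS + 1 entries (last is TMP). */
static void apply_moves(int *regs, const Move *moves, int n)
{
    for (int k = 0; k < n; k++)
        regs[moves[k].dst] = regs[moves[k].src];
}
```

For a mask made of two swaps, this emits three moves per swap (save, move, restore); an identity mask emits nothing.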
Niklas Haas 228ef8d97b swscale/ops: make compile() take const SwsOpList *
The old x86 backend was the only backend that actually mutated the ops list.
With this gone, we can constify this parameter.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-06-09 18:27:20 +02:00
Niklas Haas a7c6a5f74e swscale/ops_chain: remove dead code
This is no longer needed now that both C and x86 are ported to uops.
The other ff_sws_setup_*() functions are still used by the aarch64 backend.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-06-09 18:27:20 +02:00
Niklas Haas 43a8e2da01 swscale/x86/ops: rewrite based on uops_macros.h
This is a ground-up refactor of the existing x86 ops code, using the new
uops macros to auto-generate every single kernel instance without guesswork.

While I was at it, I also cleaned up the file a bit and made sure we have only
a single, consistent way of writing/defining the kernels. This also gets rid
of some of the old boilerplate like decl_pattern.

Most kernels are trivial ports, but a few deserve attention or note:

- SWS_UOP_LINEAR is now generated more efficiently, thanks to the distinction
  between 0/1/arbitrary components. I also rewrote the code to keep track of
  whether the output was initialized yet or not, which lets us skip the
  initial `xorps` and `addps` for the first component.

- SWS_UOP_PERMUTE is generated automatically by using some NASM logic to
  detect permutation cycles and emit the minimal sequence of `mova`
  instructions. SWS_UOP_COPY, on the other hand, is implemented naively. I
  originally had a more complex implementation that could handle both, but
  I decided it really isn't worth the complication just to save 2-3 cycles.

- SWS_UOP_SCALE now has a native 8-bit implementation, which is faster than
  falling back to C code.

- SWS_UOP_SWAP_BYTES is no longer compiled as a type-agnostic pshufb; instead,
  we hard-code the shuffle mask

- SWS_UOP_DITHER is now much simpler and avoids branching etc. entirely

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-06-09 18:27:20 +02:00
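The SWS_UOP_LINEAR point in the commit above (distinguishing 0/1/arbitrary coefficients and tracking whether the output was initialized) can be loosely modelled in scalar C. This is a sketch under assumptions, not the generated x86 code; the names linear_row and RowCost are hypothetical, and the counters merely stand in for emitted multiply and add instructions:

```c
typedef struct RowCost { int muls, adds; } RowCost;

/* Computes one output component as sum(coef[i] * in[i]) while skipping
 * work that the coefficient value makes unnecessary. */
static float linear_row(const float coef[4], const float in[4], RowCost *cost)
{
    float acc = 0.0f;       /* result if every coefficient is zero */
    int initialized = 0;

    for (int i = 0; i < 4; i++) {
        float term;
        if (coef[i] == 0.0f)
            continue;                   /* zero: emit nothing */
        if (coef[i] == 1.0f) {
            term = in[i];               /* one: no multiply needed */
        } else {
            term = coef[i] * in[i];     /* arbitrary: one multiply */
            cost->muls++;
        }
        if (!initialized) {
            acc = term;                 /* first term initializes the output, */
            initialized = 1;            /* so no zeroing and no add is needed */
        } else {
            acc += term;
            cost->adds++;
        }
    }
    return acc;
}
```

Note that skipping zero terms changes the result for NaN or infinite inputs, which this sketch ignores.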
Niklas Haas 257f1438a5 swscale/x86/ops: simplify mmsize determination
There is no reason for this to be a separate function; it just obscures
the error path.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-06-09 18:27:20 +02:00
Niklas Haas 2a09d0346e swscale/x86/ops_include: clarify/fix some comments
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-06-09 18:27:20 +02:00
Niklas Haas 6deae052a2 swscale/x86/uops: generate NASM macros using uops_macros.h
Rather than hard-coding a separate set of NASM macros, or generating them
with a separate function, we can just leverage the C preprocessor to generate
a NASM source file *from* the existing ops macros.

This is maybe a bit unorthodox, but it avoids unnecessary overhead from
re-generating the macros twice, avoids manual updating of the NASM macros,
and generally does not come with any real downside except being a bit ugly.

The main source of ugliness is the fact that the C preprocessor expands
everything into a single line, whereas NASM expects separate statements to
be on separate lines. Very fortunately, we can work around this by writing
another NASM macro to take its arguments and dump them onto multiple lines.

It may seem premature, but I went ahead and defined all the macros, since
it was easy enough to do.

I added the %include in this commit so that any build errors caused by
introducing this file show up in the same commit that introduces it.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-06-09 18:27:20 +02:00
Niklas Haas 6057759ffc swscale/uops: parametrize filter op result type
The ops.h infrastructure currently hard-codes this as SWS_PIXEL_F32,
but I want to at least properly parametrize this in case we ever
decide to revisit this decision in the future. In particular, it
may become relevant for trivial kernels or kernels whose intermediates
are bounded, exact integers (which could possibly be output directly
as e.g. U16 or U32).

The FATE change is just because the filter op names gained a suffix.

Sponsored-by: Sovereign Tech Fund
Signed-off-by: Niklas Haas <git@haasn.dev>
2026-06-09 18:27:20 +02:00
Niklas Haas 4a8a1f5b8b swscale/uops: add SWS_UOP_READ_PLANAR_FV_FMA
Analog of SWS_UOP_READ_PLANAR_FV for FMA-enabled backends.
The logic for determining when we can safely use FMA is maybe a bit
obtuse, given that a `return type == SWS_PIXEL_U8` would have just done
the trick as well, but better to be safe than sorry, if we ever decide to
tune this constant in the future.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-06-09 18:27:20 +02:00
Niklas Haas dbe961b4cd swscale/uops: add SWS_UOP_LINEAR_FMA and SWS_UOP_FLAG_FMA
This is like SWS_UOP_LINEAR but parametrized by which matrix entries can use
FMA instead of bitexact IEEE mul/add instructions.

I decided to make these a separate uop to avoid bogging down the reference
backend with arch-specific details like FMA. However, I think FMA ops are quite
common/universal so I pre-emptively split it into its own separate flag rather
than defining something like SWS_UOP_FLAG_X86.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-06-09 18:27:20 +02:00
Niklas Haas 4e18068165 swscale/uops: also generate macros under SWS_BITEXACT
And SWS_BITEXACT|SWS_ACCURATE_RND, for completeness. This roughly doubles
the runtime of the uops macros generation. Let's hope it doesn't explode
further.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-06-09 18:27:20 +02:00
Niklas Haas 157f586e5c swscale/uops: thread SwsContext through ff_sws_ops_translate()
Needed to access ctx->flags, in particular SWS_BITEXACT.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-06-09 18:27:20 +02:00
Niklas Haas f97ba8cbe7 swscale/uops: loop over all flags when generating macros
This list is currently empty but will be expanded by the following commit.

I briefly tested whether it would be worth avoiding the free/realloc on
the uops array, but found the performance difference to be negligible.

Signed-off-by: Niklas Haas <git@haasn.dev>
2026-06-09 18:27:20 +02:00