2799 Commits
Author SHA1 Message Date
Cameron Cawley and Ronald S. Bultje 6d681d5144 Replace platform-specific APIs for getting the program name in getopt fallback 2026-06-10 10:59:12 +00:00
Matthias Dressel 46e9017355 subprojects: Update checkasm to v1.2.0
Among various fixes it no longer installs the checkasm library, header
files and pkgconfig when installing dav1d.
2026-06-07 23:15:02 +02:00
Martin Storsjö and Jean-Baptiste Kempf 720adf9b5b ci: Add -Dtrim_dsp=false in a couple of aarch64/arm configurations
For the "release" build configurations, trim_dsp defaults to true,
while it defaults to false for "debugoptimized". This means that
the configurations with release mode, without -Dtrim_dsp=false,
didn't actually run checkasm before.

In practice, checkasm is covered by later, full-test configurations,
but this ensures that we do test it at this stage as well, as
intended.
2026-06-07 22:36:10 +02:00
Nathan E. Egge beda1b3cda riscv64/itx: Match stack allocation of 16x16 itx 2026-06-07 02:52:33 -04:00
Arpad Panyik and Martin Storsjö 62501cc7db AArch64: Optimize ipred_smooth_8bpc_neon
Optimize ipred_smooth_8bpc_neon using simpler arithmetic operations and
the removal of the jump table.

Relative runtime after this patch on some Cortex CPUs:

ipred_smooth:   w4      w8      w16     w32     w64
Cortex-A55:   1.041x  0.839x  0.705x  0.765x  0.802x
Cortex-A510:  1.055x  0.880x  0.669x  0.694x  0.729x
Cortex-A520:  1.113x  0.922x  0.659x  0.737x  0.783x
Cortex-A76:   0.763x  0.733x  0.608x  0.707x  0.791x
Cortex-A78:   0.840x  0.712x  0.704x  0.748x  0.786x
Cortex-A715:  0.814x  0.655x  0.798x  0.837x  0.858x
Cortex-A725:  0.813x  0.653x  0.791x  0.830x  0.854x
Cortex-X1:    0.825x  0.686x  0.667x  0.729x  0.756x
Cortex-X3:    0.865x  0.617x  0.649x  0.674x  0.688x
Cortex-X925:  0.825x  0.677x  0.641x  0.686x  0.700x
2026-05-26 12:30:26 +00:00
Arpad Panyik dbed372b70 AArch64: Optimize ipred_smooth_v_8bpc_neon further
Optimize ipred_smooth_v_8bpc_neon even further using a vertical inner
loop for w >= 16 cases.

Relative runtime after this patch on some Cortex CPUs:

ipred_smooth_v:    w4      w8      w16     w32     w64
Cortex-A55:      0.985x  0.981x  0.810x  0.873x  0.907x
Cortex-A510:     0.966x  0.951x  0.950x  1.013x  1.047x
Cortex-A520:     0.924x  0.924x  0.890x  0.984x  1.030x
Cortex-A76:      0.978x  1.036x  0.899x  0.919x  0.918x
Cortex-A78:      0.997x  0.993x  0.986x  0.972x  0.983x
Cortex-A710:     1.002x  0.973x  0.984x  0.958x  1.002x
Cortex-A715:     1.073x  1.049x  1.005x  1.018x  1.012x
Cortex-A720:     1.001x  1.004x  0.990x  1.007x  1.008x
Cortex-A725:     1.002x  1.001x  0.985x  1.007x  1.006x
Cortex-X1:       0.996x  1.077x  0.927x  0.962x  0.970x
Cortex-X2:       1.012x  0.989x  0.881x  0.971x  0.981x
Cortex-X3:       1.006x  1.034x  0.841x  0.966x  0.962x
Cortex-X4:       1.020x  1.022x  0.915x  0.964x  0.985x
Cortex-X925:     1.000x  0.947x  0.936x  0.982x  0.996x
2026-05-20 13:48:18 +02:00
Arpad Panyik a38236491a AArch64: Optimize ipred_smooth_h_8bpc_neon further
Optimize ipred_smooth_h_8bpc_neon even further using a vertical inner
loop for w >= 16 cases. Reorder instructions in the w = 4 handler for
small CPUs.

Relative runtime after this patch on some Cortex CPUs:

ipred_smooth_h:    w4      w8      w16     w32     w64
Cortex-A55:      0.964x  1.003x  0.891x  0.979x  1.030x
Cortex-A510:     0.952x  0.936x  0.928x  1.004x  1.050x
Cortex-A520:     0.921x  0.925x  0.921x  0.995x  1.032x
Cortex-A76:      0.993x  1.005x  0.977x  0.995x  0.996x
Cortex-A78:      0.991x  0.998x  1.042x  0.978x  1.015x
Cortex-A710:     1.020x  0.966x  1.015x  1.015x  1.008x
Cortex-A715:     1.026x  1.051x  1.039x  1.007x  1.024x
Cortex-A720:     0.954x  0.999x  1.018x  0.999x  1.020x
Cortex-A725:     0.962x  1.000x  1.018x  1.000x  1.021x
Cortex-X1:       1.019x  0.993x  0.924x  0.983x  0.989x
Cortex-X2:       1.013x  0.991x  0.872x  0.964x  1.023x
Cortex-X3:       1.030x  0.996x  0.840x  0.953x  1.024x
Cortex-X4:       1.026x  1.005x  0.952x  0.970x  0.986x
Cortex-X925:     1.000x  0.980x  0.865x  0.899x  0.892x
2026-05-20 13:44:22 +02:00
Najmus Sakib Afsan and Ronald S. Bultje 1718ff9ade riscv64/ipred_h: Implement ipred_h in RISC-V asm 2026-05-15 15:40:14 +00:00
Martin Storsjö c85856e360 aarch64: Fix a name mismatch in a macro error message
For the 64 bit assembly, the macro is just named "sub_sp", while it
was named "sub_sp_align" in the 32 bit form.
2026-05-15 14:24:57 +03:00
Najmus Sakib Afsan 1cfad6dbca riscv64/ipred_v: Remove redundant vxrm set instr
In function ipred_v_8bpc_rvv, the RVV instructions vsetvli, vle8.v and
vse8.v do not use vxrm.

Kendryte K230                     Before            After         Delta

intra_pred_v_w4_8bpc_c:       419.2 ( 1.00x)    405.2 ( 1.00x)   -3.34%
intra_pred_v_w4_8bpc_rvv:      56.7 ( 6.88x)     48.5 ( 7.73x)  -14.46%
intra_pred_v_w8_8bpc_c:       772.9 ( 1.00x)    753.3 ( 1.00x)   -2.54%
intra_pred_v_w8_8bpc_rvv:      69.9 (10.54x)     61.5 (11.67x)  -12.02%
intra_pred_v_w16_8bpc_c:     1209.7 ( 1.00x)   1221.9 ( 1.00x)    1.01%
intra_pred_v_w16_8bpc_rvv:     88.5 (13.25x)     79.4 (14.93x)  -10.28%
intra_pred_v_w32_8bpc_c:     1898.5 ( 1.00x)   1888.9 ( 1.00x)   -0.51%
intra_pred_v_w32_8bpc_rvv:    104.9 (17.49x)     95.3 (19.18x)   -9.15%
intra_pred_v_w64_8bpc_c:     3266.0 ( 1.00x)   3138.6 ( 1.00x)   -3.90%
intra_pred_v_w64_8bpc_rvv:    196.1 (16.24x)    184.6 (16.59x)   -5.86%

SpacemiT K1                       Before            After         Delta

intra_pred_v_w4_8bpc_c:       419.2 ( 1.00x)    403.5 ( 1.00x)   -3.75%
intra_pred_v_w4_8bpc_rvv:      56.7 ( 6.88x)     31.9 (11.57x)  -43.74%
intra_pred_v_w8_8bpc_c:       772.9 ( 1.00x)    756.8 ( 1.00x)   -2.08%
intra_pred_v_w8_8bpc_rvv:      69.9 (10.54x)     43.9 (16.39x)  -37.20%
intra_pred_v_w16_8bpc_c:     1209.7 ( 1.00x)   1136.5 ( 1.00x)   -6.05%
intra_pred_v_w16_8bpc_rvv:     88.5 (13.25x)     61.1 (18.00x)  -30.96%
intra_pred_v_w32_8bpc_c:     1898.5 ( 1.00x)   1837.0 ( 1.00x)   -3.24%
intra_pred_v_w32_8bpc_rvv:    104.9 (17.49x)     77.5 (22.93x)  -26.12%
intra_pred_v_w64_8bpc_c:     3266.0 ( 1.00x)   3110.6 ( 1.00x)   -4.76%
intra_pred_v_w64_8bpc_rvv:    196.1 (16.24x)    166.2 (18.28x)  -15.25%

Blackhole p100a                  Before             After         Delta

intra_pred_v_w4_8bpc_c:       368.5 ( 1.00x)    370.1 ( 1.00x)    0.43%
intra_pred_v_w4_8bpc_rvv:      36.7 ( 9.37x)     23.7 (13.99x)  -35.42%
intra_pred_v_w8_8bpc_c:       666.6 ( 1.00x)    670.2 ( 1.00x)    0.54%
intra_pred_v_w8_8bpc_rvv:      44.4 (14.34x)     33.2 (18.92x)  -25.23%
intra_pred_v_w16_8bpc_c:      970.4 ( 1.00x)    971.9 ( 1.00x)    0.15%
intra_pred_v_w16_8bpc_rvv:     58.5 (16.07x)     48.5 (19.28x)  -17.09%
intra_pred_v_w32_8bpc_c:     1577.3 ( 1.00x)   1575.8 ( 1.00x)   -0.10%
intra_pred_v_w32_8bpc_rvv:     81.5 (18.79x)     65.9 (23.11x)  -19.14%
intra_pred_v_w64_8bpc_c:     2720.1 ( 1.00x)   2724.9 ( 1.00x)    0.18%
intra_pred_v_w64_8bpc_rvv:    134.9 (19.65x)     91.6 (28.67x)  -32.10%

Benchmark results provided by Sungjoon Moon.
2026-05-12 11:26:56 +00:00
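The Delta column in the tables above appears to be the relative change in runtime from Before to After, with negative values meaning the new code is faster. A minimal sketch of that arithmetic, assuming this interpretation; the function name `runtime_delta_percent` is hypothetical and not part of dav1d or checkasm:

```c
#include <assert.h>

/* Hypothetical helper, not taken from dav1d or checkasm.
 * Returns the percentage change in runtime from `before` to `after`.
 * A negative result means the new code needs less time. */
static double runtime_delta_percent(double before, double after)
{
    assert(before > 0.0); /* a zero baseline has no meaningful delta */
    return (after - before) / before * 100.0;
}
```

For example, the Kendryte K230 row for intra_pred_v_w4_8bpc_rvv goes from 56.7 to 48.5, which this formula evaluates to roughly -14.46%, matching the table.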
Najmus Sakib Afsan de223ad6ab riscv64/cdef: Fix up code style
The functions of cdef_filter did not use the conventional names and
the macros for declarations.

This commit matches the style used for other archs and adjusts the
following:

 - decl_cdef_fn() macro for declaration
 - dav1d_cdef_filter_wxh as the name
2026-05-10 19:58:55 +06:00
Arpad Panyik and Martin Storsjö 51b67010e2 AArch64: Optimize ipred_smooth_v_8bpc_neon
Optimize ipred_smooth_v_8bpc_neon using simpler arithmetic operations
and the removal of jump table.

Relative runtime after this patch on some Cortex CPUs:

ipred_smooth_v:    w4      w8     w16     w32     w64
Cortex-A55:     1.025x  0.847x  0.821x  0.830x  0.852x
Cortex-A510:    1.017x  0.923x  0.915x  0.883x  0.840x
Cortex-A520:    1.080x  0.972x  0.999x  0.934x  0.876x
Cortex-A76:     0.818x  0.575x  0.599x  0.723x  0.744x
Cortex-A78:     0.782x  0.571x  0.595x  0.641x  0.685x
Cortex-A715:    0.801x  0.586x  0.593x  0.651x  0.694x
Cortex-A725:    0.801x  0.579x  0.596x  0.649x  0.692x
Cortex-X1:      0.782x  0.560x  0.553x  0.623x  0.682x
Cortex-X3:      0.792x  0.594x  0.526x  0.526x  0.604x
Cortex-X925:    0.757x  0.678x  0.525x  0.554x  0.577x
2026-05-06 20:18:03 +00:00
Arpad Panyik and Martin Storsjö 4db1a05aad AArch64: Optimize ipred_smooth_h_8bpc_neon
Optimize ipred_smooth_h_8bpc_neon using simpler arithmetic operations.

Relative runtime after this patch on some Cortex CPUs:

ipred_smooth_h:    w4      w8     w16     w32     w64
Cortex-A55:     1.015x  0.857x  0.819x  0.835x  0.862x
Cortex-A510:    0.988x  0.860x  0.915x  0.879x  0.837x
Cortex-A520:    0.999x  0.883x  0.967x  0.929x  0.873x
Cortex-A76:     0.804x  0.637x  0.517x  0.573x  0.613x
Cortex-A78:     0.800x  0.586x  0.548x  0.639x  0.640x
Cortex-A715:    0.722x  0.642x  0.563x  0.627x  0.646x
Cortex-A725:    0.710x  0.639x  0.567x  0.622x  0.645x
Cortex-X1:      0.758x  0.570x  0.565x  0.548x  0.557x
Cortex-X3:      0.789x  0.589x  0.528x  0.563x  0.571x
Cortex-X925:    0.855x  0.739x  0.541x  0.551x  0.567x
2026-05-06 20:18:03 +00:00
Martin Storsjö 037430193a arm: Fix up code style slightly
The existing code has been written striving to align columns so
that the largest register names can be typed, e.g. r10 on ARM
(and similarly for x10 or q10 on AArch64), or v31.16b for AArch64
vectors.

Fix some cases, where the current forms were clearly
inconsistent/wrong. Not all cases have been fixed up to match this
norm, but some individual ones that were clearly wrong have been
fixed.
2026-05-06 15:32:26 +00:00
Martin Storsjö 7b9ab8373e ci: Update the main CI image
This version includes llvm-symbolizer, which should improve
backtraces in sanitizer builds with Clang.
2026-05-06 15:11:16 +00:00
Martin Storsjö ac5dfb0a85 examples: Treat SDL2 headers as system headers
This makes those headers included with -isystem rather than -I,
which makes the compiler skip producing any warnings about them
(as they're expected to be out of the user code's control).

This avoids warnings with newer versions of the
dav1d-debian-unstable CI image, warnings (treated as errors in CI)
like this:

    In file included from /usr/include/SDL2/SDL_config.h:51,
                     from /usr/include/SDL2/SDL_stdinc.h:33,
                     from /usr/include/SDL2/SDL_main.h:25,
                     from /usr/include/SDL2/SDL.h:31,
                     from ../examples/dav1dplay.c:33:
    /usr/include/SDL2/SDL_config_unix.h:186:9: error: 'HAVE_GETAUXVAL' redefined [-Werror]
      186 | #define HAVE_GETAUXVAL 1
          |         ^~~~~~~~~~~~~~
    In file included from ../examples/dav1dplay.c:27:
    ./config.h:66:9: note: this is the location of the previous definition
       66 | #define HAVE_GETAUXVAL 0
          |         ^~~~~~~~~~~~~~

Recently, Debian Unstable has switched from providing the
actual SDL 2 to providing the SDL 2 API through the sdl2-compat
package on top of SDL 3.

The SDL 2 headers expose their full config.h as part of their
installed headers (that the user code ends up including). This
includes unnamespaced defines, such as "#define HAVE_GETAUXVAL 1".

This issue hasn't shown up with the original SDL 2 package in
Debian, due to a Debian packaging detail. While most SDL 2
headers are installed in /usr/include/SDL2 (and user code
includes it as <SDL.h>, requiring the build system to include
/usr/include/SDL2), the Debian packaging has replaced
/usr/include/SDL2/SDL_config.h with a header that includes
<SDL2/_real_SDL_config.h>, which then gets resolved in
/usr/include/x86_64-linux-gnu/SDL2. Due to this being included
from a compiler default system include path
(/usr/include/x86_64-linux-gnu), no warnings about the header
were printed, even though that one also produced the same kind
of conflicting redefinitions. (We could also avoid the same issue
by attempting to include <SDL2/SDL.h> instead of <SDL.h>,
avoiding the use of the build system provided include directory,
resolving that from /usr/include, and having the compiler consider
it a system header.)

The sdl2-compat package in Debian doesn't redirect that header
in the same way, but includes SDL_config_unix.h in the same
directory in /usr/include/SDL2. Due to this being included
from a user specified -I (as long as it is included as <SDL.h>,
not <SDL2/SDL.h>), it's considered a user header, and warnings
are printed for it.

It seems like SDL 3 no longer exposes their config.h headers as
part of the installed headers.

The conflict between SDL 2's config.h's HAVE_GETAUXVAL and
ours stems from the fact that we only try to detect getauxval
on architectures where we want to use it (arm/aarch64, loongarch,
ppc or riscv). On x86, where we don't need it, we don't try
to detect it, and set "#define HAVE_GETAUXVAL 0" in our
config.h.

To avoid warnings due to the conflict, we can declare the
SDL 2 dependency with the argument "include_type: 'system'",
which should silence any warnings in the SDL headers. This
Meson feature is available since Meson 0.52.0 (and we currently
require Meson 0.54.0).

An alternative way to avoid the redefinition conflict would be
to always try to detect getauxval on all architectures, to make
our config.h agree with SDL 2's config headers.

A third (and much more hacky) way around the conflict would be
to avoid the public SDL headers including the SDL_config header
by defining "SDL_config_h_" before including SDL.h. Doing this
also requires manually including a couple more standard headers
before SDL.h (stdint.h, stdio.h, stddef.h).
2026-05-06 14:00:31 +03:00
Martin Storsjö 556c5202b4 ci: Add testing on macOS on Apple Silicon too 2026-05-03 22:39:01 +03:00
Martin Storsjö e1bd6f76c2 checkasm: Readd a dependency on threads
3a2a874994, which switched to using
the checkasm core from the separate checkasm project, removed the
thread dependency from the checkasm executable, as the checkasm
library itself has a thread dependency.

However, checkasm doesn't always include that thread dependency;
it only does that when pthread_setaffinity_np is detected.

The dav1d object files themselves use pthreads as well, causing
undefined symbols if checkasm doesn't link in pthreads.

This should fix linking on OpenBSD after
3a2a874994, fixing issue #467.
2026-05-03 12:42:39 +03:00
Martin Storsjö 5cfc383268 arm: mc: Optimize prep_neon for the w4/w8 cases
Use alternating registers for immediately sequential loads/stores,
pack two 4 pixel rows into one register.

Before:                           Cortex A7      A8     A53     A55     A72     A73     A76
mct_8tap_regular_w4_0_8bpc_neon:      112.0    68.6    79.7    82.9    45.3    39.4    24.4
mct_8tap_regular_w8_0_8bpc_neon:      158.2    89.5   108.4   113.4    55.4    53.0    30.0
After:
mct_8tap_regular_w4_0_8bpc_neon:       89.7    69.9    76.3    85.1    36.2    35.2    25.0
mct_8tap_regular_w8_0_8bpc_neon:      149.0    92.7   102.6   115.8    56.6    52.8    31.4

The numbers aren't entirely consistent, but this is mostly favourable.
2026-04-29 15:56:01 +03:00
Martin Storsjö 727d0f984b arm: mc: Fix a comment typo
This seems to be right in all the other similar places
(arm/64/mc.S, arm/32/mc16.S and arm/64/mc16.S).
2026-04-29 15:56:01 +03:00
Victorien Le Couviour--Tuffet f995e1fbf9 threading: Schedule TILE tasks for all passes at once
Closes #465.
2026-04-27 21:09:28 +02:00
Henrik Gramner c0f2fe3135 build: Update meson version requirement to 0.54.0
Use of the meson 'fallback arg in dependency' feature was introduced
by the switch to external checkasm in 3a2a874.
2026-04-22 21:02:05 +02:00
Arpad Panyik c5726277ff AArch64: Optimize ipred_h_8bpc_neon
Optimize ipred_h_8bpc_neon using simpler stores and simpler indexing.

Relative runtime after this patch on some Cortex CPUs:

ipred_h:        w4      w8      w16     w32     w64
Cortex-A55:   1.054x  1.054x  0.978x  1.149x  1.097x
Cortex-A510:  0.455x  0.970x  0.973x  1.010x  1.002x
Cortex-A520:  0.973x  0.975x  0.979x  1.002x  1.000x
Cortex-A76:   0.791x  0.934x  0.912x  1.010x  0.999x
Cortex-A78:   0.771x  0.933x  0.957x  0.519x  0.510x
Cortex-A715:  0.838x  0.860x  0.893x  0.585x  0.661x
Cortex-A720:  0.839x  0.860x  0.892x  0.580x  0.659x
Cortex-A725:  0.809x  0.837x  0.871x  0.580x  0.660x
Cortex-X1:    0.973x  0.982x  0.989x  0.498x  0.660x
Cortex-X3:    0.971x  0.992x  0.987x  0.495x  0.661x
Cortex-X925:  0.950x  1.000x  1.000x  0.474x  0.655x
2026-04-16 16:02:28 +02:00
Arpad Panyik 47e2607e6c AArch64: Optimize ipred_v_8bpc_neon
Optimize the width = 4 case of ipred_v_8bpc_neon by using simple stores
instead of the lane stores which can improve performance on some CPUs.

Relative runtime after this patch on some Cortex CPUs:

 ipred_v:       w4
Cortex-A55:   1.041x
Cortex-A510:  0.297x
Cortex-A520:  0.748x
Cortex-A76:   0.866x
Cortex-A78:   0.856x
Cortex-A715:  0.874x
Cortex-A720:  0.875x
Cortex-A725:  0.868x
Cortex-X1:    1.013x
Cortex-X3:    1.000x
Cortex-X925:  1.000x
2026-04-15 17:37:46 +02:00
Martin Storsjö aa4504729c arm: Fix a typo in a URL
This was added in a00289b6d8.
2026-03-31 13:48:43 +03:00
Matthias Dressel d69235dd80 CI: Use shortform QEMU_CPU for loongarch64
Since qemu commit 979bf44af8483cedc00c63b3e79407de08e75a30 the cpu
argument accepts just 'max' as a shorthand.
2026-03-17 22:00:03 +01:00
Matthias Dressel bfbd7d4677 CI: loongarch64: Move QEMU_LD_PREFIX to crossfile
Simplifies development builds on local machines.
2026-03-17 22:00:03 +01:00
Matthias Dressel afcdb781cb CI: riscv64: Move QEMU_LD_PREFIX to crossfile
Simplifies development builds on local machines.
2026-03-17 22:00:03 +01:00
Matthias Dressel 42ac98706a CI: aarch64: Move QEMU_LD_PREFIX to crossfile
Simplifies development builds on local machines.
2026-03-17 22:00:03 +01:00
Matthias Dressel 8feb8526bb CI: Remove outdated version suffix from job name 2026-03-17 22:00:03 +01:00
Martin Storsjö 594d1601ff arm: Add Armv9.3-A GCS (Guarded Control Stack) support
Signal that our assembly is compliant with the GCS feature, if
the GCS feature is enabled in the compiler (available since Clang
18 and GCC 15) - this is enabled by -mbranch-protection=standard
with a new enough compiler.

GCS doesn't require any specific modifications to the assembly
code, but requires that all functions return to the expected call
address (checked through a shadow stack).
2026-03-17 20:40:05 +00:00
Henrik Gramner 6894b7f2d0 Improve the memory pool API
Return a void pointer directly to the usable memory region,
abstracting away implementation details.
2026-03-17 18:28:57 +01:00
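As an illustration of the idea described in this commit, the sketch below shows one way a pool can hand out a plain `void *` to the usable region while keeping its bookkeeping header hidden in front of it. All names here (`Pool`, `PoolHeader`, `pool_pop`, `pool_push`, `pool_end`) are hypothetical and are not dav1d's actual API; the sketch is also not thread-safe.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical bookkeeping header stored directly in front of the
 * memory handed to the caller. The union with max_align_t pads the
 * header so that the user region stays suitably aligned. */
typedef union PoolHeader {
    struct {
        union PoolHeader *next; /* next free buffer in the pool */
        size_t size;            /* usable size of this buffer */
    } h;
    max_align_t align;
} PoolHeader;

typedef struct Pool {
    PoolHeader *free_list; /* buffers available for reuse */
} Pool;

/* Returns a pointer to at least `size` usable bytes, or NULL on
 * allocation failure. The caller never sees the header. */
static void *pool_pop(Pool *pool, size_t size)
{
    PoolHeader *hdr = pool->free_list;
    if (hdr && hdr->h.size >= size) {
        pool->free_list = hdr->h.next; /* reuse the cached buffer */
    } else {
        hdr = malloc(sizeof(*hdr) + size);
        if (!hdr) return NULL;
        hdr->h.size = size;
    }
    hdr->h.next = NULL;
    return hdr + 1; /* usable region starts right after the header */
}

/* Gives a buffer obtained from pool_pop() back to the pool. */
static void pool_push(Pool *pool, void *data)
{
    if (!data) return;
    PoolHeader *hdr = (PoolHeader *)data - 1; /* step back to the header */
    hdr->h.next = pool->free_list;
    pool->free_list = hdr;
}

/* Frees every buffer currently cached in the pool. */
static void pool_end(Pool *pool)
{
    while (pool->free_list) {
        PoolHeader *hdr = pool->free_list;
        pool->free_list = hdr->h.next;
        free(hdr);
    }
}
```

With this shape, callers only pass the `void *` back and forth, so the header layout can change without touching any call site.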
Henrik Gramner 241a6b236a x86: Fix warp8x8 gamma/delta naming mixup
For whatever reason the names of the gamma and delta parameters
have been switched in a few of the warp8x8 asm implementations.

This is a bit confusing, so fix things by switching them back.

This change is purely cosmetic; the output binary is identical.
2026-03-05 15:50:40 +01:00
Martin Storsjö 4fd22e97d8 arm: Switch to a more correct Windows flag for detecting I8MM
Newer revisions of WinSDK 10.0.26100.0 have exposed more flags for
IsProcessorFeaturePresent; now there is a separate one for
detecting specifically I8MM and not just SVE-I8MM. Switch to using
this flag instead.
2026-03-04 15:16:37 +02:00
Matthias Dressel 1dcfc90757 CI: Update images 2026-02-28 16:27:35 +01:00
Matthias Dressel daef396277 CI: Switch to loongarch64 Debian toolchain
loong64 was recently promoted to an official Debian architecture. [0]

[0] https://lists.debian.org/debian-devel-announce/2025/12/msg00004.html
2026-02-09 02:33:29 +01:00
Martin Storsjö de4ce4f32d arm: mc: Add missing # for some immediate constants, for consistency
The assembler doesn't require the # here, but we use that everywhere
else, so add it here as well for consistency.
2026-02-06 16:01:57 +02:00
Martin Storsjö 9c13b5fbd0 subprojects: Update checkasm to v1.1.0
This version, together with the previous commit
574e7f4727, fixes issue #460.

Due to checkasm internal restructuring, one may run into build
issues if rebuilding in an old build directory after updating
the checkasm subproject, without getting rid of older meson
generated headers in the build directory.
2026-02-03 09:27:22 +00:00
Cameron Cawley 60507bffc0 Fix compiler warning on platforms without cfi-icall 2026-01-30 23:46:42 +00:00
Cameron Cawley 4264096b72 Allow falling back to standard C signal on non-POSIX systems 2026-01-30 19:59:24 +00:00
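A fallback of this kind could look roughly like the sketch below, which uses sigaction() where a build-time check found it and standard C signal() otherwise. `HAVE_SIGACTION`, `install_stop_handler` and `stop_requested` are hypothetical names chosen for illustration; the commit's actual code is not shown in this log.

```c
#include <signal.h>
#include <stddef.h>

/* Written by the handler. volatile sig_atomic_t is the object type
 * that standard C guarantees can be safely set from a signal handler. */
static volatile sig_atomic_t stop_requested;

static void handle_signal(int sig)
{
    (void)sig;
    stop_requested = 1;
}

/* Installs handle_signal for `sig`. Returns 0 on success, -1 on error.
 * HAVE_SIGACTION is an assumed build-system macro, not a real dav1d one. */
static int install_stop_handler(int sig)
{
#if defined(HAVE_SIGACTION) && HAVE_SIGACTION
    struct sigaction sa = { 0 };
    sa.sa_handler = handle_signal;
    sigemptyset(&sa.sa_mask);
    return sigaction(sig, &sa, NULL);
#else
    /* Standard C fallback for systems without POSIX sigaction(). */
    return signal(sig, handle_signal) == SIG_ERR ? -1 : 0;
#endif
}
```

A caller would typically run `install_stop_handler(SIGINT)` once at startup and then poll `stop_requested` in its main loop.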
Henrik Gramner 2272a19ab0 x86: Update x86inc.asm 2026-01-26 23:32:17 +01:00
KO Myung-Hun and Jean-Baptiste Kempf b29c5782e7 Export DAV1D APIs correctly on OS/2 2026-01-22 00:58:15 +01:00
KO Myung-Hun and Jean-Baptiste Kempf 8674770e4b Add asm support on OS/2 2026-01-22 00:58:15 +01:00
Cameron Cawley and Henrik Gramner f0b233fd09 Replace use of sprintf with snprintf 2026-01-21 14:41:57 +00:00
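The general pattern behind such a replacement is to pass the destination size and check the return value for truncation. A minimal sketch, assuming a hypothetical helper `format_label` that is not taken from dav1d:

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes "<name>_<index>" into buf.
 * Returns 0 on success, -1 on an encoding error or if the output
 * did not fit. Unlike sprintf(), snprintf() never writes more than
 * buf_size bytes, including the terminating NUL. */
static int format_label(char *buf, size_t buf_size,
                        const char *name, int index)
{
    const int n = snprintf(buf, buf_size, "%s_%d", name, index);
    if (n < 0 || (size_t)n >= buf_size)
        return -1; /* error, or the result was truncated */
    return 0;
}
```

The return value of snprintf() is the length the full output would have had, which is why comparing it against the buffer size detects truncation.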
Steve Lhomme and Martin Storsjö 50015c2ec9 CI: switch debian-llvm-mingw to UCRT
The msvcrt and ucrt variants are almost identical, except:
- C runtime is either msvcrt or UCRT (Universal C Runtime from Vista+ [^1])
- The default target OS is Windows 10 instead of Windows 7 (0x601)

[^1]: https://support.microsoft.com/en-us/topic/update-for-universal-c-runtime-in-windows-c0514201-7fe6-95a3-b0a5-287930f3560c
2026-01-21 13:43:37 +00:00
Cameron Cawley and Martin Storsjö 2bb7e63266 Fix reversed parameters when TRACK_HEAP_ALLOCATIONS is enabled
This fixes a regression from commit d268788467.
2026-01-21 13:25:49 +00:00
Martin Storsjö a44b589872 Silence a new MSVC warning
This silences the following warnings in MSVC 2026 18.0 (and
2022 17.14):

    ../tools/dav1d_cli_parse.c(213): warning C5287: operands are different enum types 'CpuFlags' and 'CpuMask'; use an explicit cast to silence this warning
    ../tools/dav1d_cli_parse.c(214): warning C5287: operands are different enum types 'CpuFlags' and 'CpuMask'; use an explicit cast to silence this warning
    ../tools/dav1d_cli_parse.c(215): warning C5287: operands are different enum types 'CpuFlags' and 'CpuMask'; use an explicit cast to silence this warning
    ../tools/dav1d_cli_parse.c(216): warning C5287: operands are different enum types 'CpuFlags' and 'CpuMask'; use an explicit cast to silence this warning

This warning flag was new in MSVC 2022 17.14, but it was buggy
in that version - it produced spurious warnings for other cases
as well (and using an explicit cast to silence it didn't work
as advertised), see [1] and [2].

The bugs were fixed in 18.0, and the remaining construct that it
warns about is something that is somewhat reasonable to warn about:

    enum CpuFlags {
        DAV1D_X86_CPU_FLAG_SSE2        = 1 << 0,
        DAV1D_X86_CPU_FLAG_SSSE3       = 1 << 1,
    };
    enum CpuMask {
        X86_CPU_MASK_SSE2      = DAV1D_X86_CPU_FLAG_SSE2,
        X86_CPU_MASK_SSSE3     = DAV1D_X86_CPU_FLAG_SSSE3     | X86_CPU_MASK_SSE2,
    };

Instead of adding explicit casts on the constants from the foreign
enum, just disable this warning.

[1] https://developercommunity.visualstudio.com/t/False-positive-C5287:-operands-are-diff/10915265
[2] https://developercommunity.visualstudio.com/t/warning-C5287:-operands-are-different-e/10877942
2026-01-20 12:00:29 +02:00
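For illustration, the construct that triggers C5287 can be reduced to the sketch below, with the warning switched off by a source-level pragma. This is only one possible way to disable it; the commit may equally do so through a compiler flag in the build system. The enumerator names are shortened, hypothetical versions of the ones quoted in the commit message.

```c
#ifdef _MSC_VER
/* C5287: operands are different enum types. Disabled here because
 * combining flags from a related enum is intentional. */
#pragma warning(disable: 5287)
#endif

/* Hypothetical reduced flag enum. */
enum CpuFlags {
    CPU_FLAG_SSE2  = 1 << 0,
    CPU_FLAG_SSSE3 = 1 << 1,
};

/* Each mask includes its own flag plus all lower levels. The second
 * initializer mixes enumerators of two enum types, which is the case
 * the warning reports. */
enum CpuMask {
    CPU_MASK_SSE2  = CPU_FLAG_SSE2,
    CPU_MASK_SSSE3 = CPU_FLAG_SSSE3 | CPU_MASK_SSE2,
};
```

On compilers other than MSVC the pragma is skipped entirely, so the enum definitions are unaffected.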
Martin Storsjö 04b69f93e5 checkasm: Reinstate check for TRIM_DSP_FUNCTIONS
This was lost in 3a2a874994.

Without this, checkasm ends up printing a quite confusing output
consisting only of the functions that have two or more assembly
implementations, if trim_dsp happens to be enabled.
2026-01-14 22:29:38 +02:00
Martin Storsjö afd13d8906 arm: Fix a few misindented lines 2026-01-09 14:00:39 +02:00
Martin Storsjö 574e7f4727 checkasm: Pass HAVE_C11_GENERIC to checkasm as -DCHECKASM_HAVE_GENERIC=1/0
For this to have an effect, it requires using a newer version of
the wrapped checkasm subproject; including checkasm commit
be05a7972e47c658a7c5c186294d27caa5735db2 or newer.
2026-01-07 16:22:18 +02:00