For the "release" build configurations, trim_dsp defaults to true,
while it defaults to false for "debugoptimized". This means that
the release-mode configurations without -Dtrim_dsp=false didn't
actually run checkasm as intended before.
In practice, checkasm is covered by later, full-test configurations,
but this ensures that we do test it at this stage as well, as
intended.
The functions of cdef_filter did not use the conventional names and
the macros for declarations.
This commit matches the style used for other archs and adjusts the
following:
- decl_cdef_fn() macro for declaration
- dav1d_cdef_filter_wxh as the name
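A rough sketch of the declaration pattern follows. It assumes a
simplified, hypothetical signature (the real prototype takes more
arguments), and the "_sketch" suffix and the add_strength() helper
are placeholders invented for illustration only:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified signature; not the real cdef prototype. */
#define decl_cdef_fn(name) \
    void (name)(uint8_t *dst, ptrdiff_t stride, int strength)

/* Function pointer type derived from the same macro. */
typedef decl_cdef_fn(*cdef_fn);

/* Declarations following the <prefix>_<w>x<h> naming scheme. */
decl_cdef_fn(dav1d_cdef_filter_4x4_sketch);
decl_cdef_fn(dav1d_cdef_filter_8x8_sketch);

/* Placeholder body: add a constant to a w x h block, clamped to 8 bits. */
static void add_strength(uint8_t *dst, ptrdiff_t stride, int strength,
                         int w, int h)
{
    assert(w > 0 && h > 0);
    for (int y = 0; y < h; y++, dst += stride) {
        for (int x = 0; x < w; x++) {
            const int v = dst[x] + strength;
            dst[x] = (uint8_t)(v > 255 ? 255 : v < 0 ? 0 : v);
        }
    }
}

/* The macro can also be used for the definitions themselves. */
decl_cdef_fn(dav1d_cdef_filter_4x4_sketch)
{
    add_strength(dst, stride, strength, 4, 4);
}

decl_cdef_fn(dav1d_cdef_filter_8x8_sketch)
{
    add_strength(dst, stride, strength, 8, 8);
}
```

Keeping the prototype in one macro means the declarations, the
function pointer type and the definitions cannot drift apart.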
The existing code has been written striving to align columns so
that the longest register names fit, e.g. r10 on ARM (and
similarly for x10 or q10 on AArch64), or v31.16b for AArch64
vectors.
Fix some cases where the existing alignment was clearly
inconsistent or wrong. Not all cases have been fixed up to match
this norm; only some individual ones that were clearly wrong have
been fixed.
This makes those headers included with -isystem rather than -I,
which makes the compiler skip producing any warnings about them
(as they're expected to be out of the user code's control).
This avoids warnings (treated as errors in CI) with newer versions
of the dav1d-debian-unstable CI image, such as these:
In file included from /usr/include/SDL2/SDL_config.h:51,
                 from /usr/include/SDL2/SDL_stdinc.h:33,
                 from /usr/include/SDL2/SDL_main.h:25,
                 from /usr/include/SDL2/SDL.h:31,
                 from ../examples/dav1dplay.c:33:
/usr/include/SDL2/SDL_config_unix.h:186:9: error: 'HAVE_GETAUXVAL' redefined [-Werror]
  186 | #define HAVE_GETAUXVAL 1
      |         ^~~~~~~~~~~~~~
In file included from ../examples/dav1dplay.c:27:
./config.h:66:9: note: this is the location of the previous definition
   66 | #define HAVE_GETAUXVAL 0
      |         ^~~~~~~~~~~~~~
Recently, Debian Unstable has switched from providing the
actual SDL 2 to providing the SDL 2 API through the sdl2-compat
package on top of SDL 3.
The SDL 2 headers expose their full config.h as part of their
installed headers (that the user code ends up including). This
includes unnamespaced defines, such as "#define HAVE_GETAUXVAL 1".
This issue hasn't shown up with the original SDL 2 package in
Debian, due to a Debian packaging detail. While most SDL 2
headers are installed in /usr/include/SDL2 (and user code
includes it as <SDL.h>, requiring the build system to include
/usr/include/SDL2), the Debian packaging has replaced
/usr/include/SDL2/SDL_config.h with a header that includes
<SDL2/_real_SDL_config.h>, which then gets resolved in
/usr/include/x86_64-linux-gnu/SDL2. Due to this being included
from a compiler default system include path
(/usr/include/x86_64-linux-gnu), no warnings about the header
were printed, even though that one also produced the same kind
of conflicting redefinitions. (We could also avoid the same issue
by attempting to include <SDL2/SDL.h> instead of <SDL.h>,
avoiding the use of the build system provided include directory,
resolving that from /usr/include, and having the compiler consider
it a system header.)
The sdl2-compat package in Debian doesn't redirect that header
in the same way, but includes SDL_config_unix.h in the same
directory in /usr/include/SDL2. Due to this being included
from a user specified -I (as long as it is included as <SDL.h>,
not <SDL2/SDL.h>), it's considered a user header, and warnings
are printed for it.
It seems like SDL 3 no longer exposes its config.h headers as
part of the installed headers.
The conflict between SDL 2's config.h's HAVE_GETAUXVAL and
ours stems from the fact that we only try to detect getauxval
on architectures where we want to use it (arm/aarch64, loongarch,
ppc or riscv). On x86, where we don't need it, we don't try
to detect it, and set "#define HAVE_GETAUXVAL 0" in our
config.h.
To avoid warnings due to the conflict, we can declare the
SDL 2 dependency with the argument "include_type: 'system'",
which should silence any warnings in the SDL headers. This
Meson feature is available since Meson 0.52.0 (and we currently
require Meson 0.54.0).
An alternative way to avoid the redefinition conflict would be
to always try to detect getauxval on all architectures, to make
our config.h agree with SDL 2's config headers.
A third (and much more hacky) way around the conflict would be
to avoid the public SDL headers including the SDL_config header
by defining "SDL_config_h_" before including SDL.h. Doing this
also requires manually including a couple more standard headers
before SDL.h (stdint.h, stdio.h, stddef.h).
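As a minimal single-file illustration of that third approach: the
sketch below inlines stand-ins for both config headers instead of
including real ones, so everything except the HAVE_GETAUXVAL and
SDL_config_h_ names is an assumption made for the example
(including the hypothetical have_getauxval() helper):

```c
#include <assert.h>

/* Stand-in for our config.h: getauxval is not probed on x86,
 * so the macro is defined to 0. */
#define HAVE_GETAUXVAL 0

/* Pretend the SDL config header was already included by defining
 * its include guard up front. */
#define SDL_config_h_

/* Stand-in for the body of SDL 2's config header, inlined here so
 * that the example stays in one file. Because the guard is already
 * defined, the conflicting definition is never seen. */
#ifndef SDL_config_h_
#define SDL_config_h_
#define HAVE_GETAUXVAL 1 /* would otherwise be a redefinition */
#endif

/* Hypothetical helper reporting which definition is in effect. */
int have_getauxval(void)
{
    return HAVE_GETAUXVAL;
}
```

With the guard predefined, only our own definition survives and
have_getauxval() returns 0. In the real headers, skipping the
config header also skips everything else it provides, which is why
the extra standard headers need to be included manually.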
3a2a874994, which switched to using
the checkasm core from the separate checkasm project, removed the
thread dependency from the checkasm executable, as the checkasm
library itself has a thread dependency.
However, checkasm doesn't always include that thread dependency;
it only does so when pthread_setaffinity_np is detected.
The dav1d object files themselves use pthreads as well, causing
undefined symbols if checkasm doesn't link in pthreads.
This should fix linking on OpenBSD after
3a2a874994, fixing issue #467.
Optimize the width = 4 case of ipred_v_8bpc_neon by using simple stores
instead of lane stores, which can improve performance on some CPUs.
Relative runtime after this patch on some Cortex CPUs:
ipred_v: w4
Cortex-A55: 1.041x
Cortex-A510: 0.297x
Cortex-A520: 0.748x
Cortex-A76: 0.866x
Cortex-A78: 0.856x
Cortex-A715: 0.874x
Cortex-A720: 0.875x
Cortex-A725: 0.868x
Cortex-X1: 1.013x
Cortex-X3: 1.000x
Cortex-X925: 1.000x
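For reference, a plain C model of what the width = 4 case of
ipred_v computes at 8 bpc. This is only a sketch, not the NEON
code: the function name and the simplified "top" parameter are
hypothetical, and the single 4-byte copy per row merely mirrors the
idea of one simple store per row:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Vertical prediction for a 4 pixel wide block: every output row is
 * a copy of the 4 pixels directly above the block. */
void ipred_v_w4_sketch(uint8_t *dst, ptrdiff_t stride,
                       const uint8_t *top, int height)
{
    uint32_t row;

    assert(height > 0);
    memcpy(&row, top, sizeof(row));     /* load the 4 top pixels once */
    for (int y = 0; y < height; y++) {
        memcpy(dst, &row, sizeof(row)); /* one plain 4-byte store per row */
        dst += stride;
    }
}
```

The computation is the same regardless of which store form is
used; the patch only changes how each row is written out.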
Signal that our assembly is compliant with the GCS feature, if
the GCS feature is enabled in the compiler (available since Clang
18 and GCC 15) - this is enabled by -mbranch-protection=standard
with a new enough compiler.
GCS doesn't require any specific modifications to the assembly
code, but requires that all functions return to the expected call
address (checked through a shadow stack).
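A small sketch of how the compiler setting could be checked from C.
The __ARM_FEATURE_GCS_DEFAULT macro and the (1 << 2) property bit
are written from memory of the ACLE and the AArch64 ELF ABI and
should be treated as assumptions; the helper and constant names are
hypothetical:

```c
#include <assert.h>

/* Assumed value of the GCS bit in the AArch64 feature property note. */
#define FEATURE_1_GCS_SKETCH (1u << 2)

/* Returns the GCS property bit if the compiler reports that GCS is
 * enabled by default for this translation unit, otherwise 0. */
unsigned gcs_feature_bits(void)
{
    unsigned bits = 0;
#if defined(__ARM_FEATURE_GCS_DEFAULT) && __ARM_FEATURE_GCS_DEFAULT == 1
    bits |= FEATURE_1_GCS_SKETCH;
#endif
    return bits;
}
```

On compilers or targets without GCS support this simply returns 0,
matching the "only if enabled in the compiler" condition above.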
For whatever reason, the names of the gamma and delta parameters
have been switched in a few of the warp8x8 asm implementations.
This is a bit confusing, so fix things by switching them back.
This change is purely cosmetic; the output binary is identical.
Newer revisions of WinSDK 10.0.26100.0 have exposed more flags for
IsProcessorFeaturePresent; now there is a separate one for
detecting specifically I8MM and not just SVE-I8MM. Switch to using
this flag instead.
This version, together with the previous commit
574e7f4727, fixes issue #460.
Due to checkasm internal restructuring, one may run into build
issues if rebuilding in an old build directory after updating
the checkasm subproject, without getting rid of older meson
generated headers in the build directory.
This silences the following warnings in MSVC 2026 18.0 (and
2022 17.14):
../tools/dav1d_cli_parse.c(213): warning C5287: operands are different enum types 'CpuFlags' and 'CpuMask'; use an explicit cast to silence this warning
../tools/dav1d_cli_parse.c(214): warning C5287: operands are different enum types 'CpuFlags' and 'CpuMask'; use an explicit cast to silence this warning
../tools/dav1d_cli_parse.c(215): warning C5287: operands are different enum types 'CpuFlags' and 'CpuMask'; use an explicit cast to silence this warning
../tools/dav1d_cli_parse.c(216): warning C5287: operands are different enum types 'CpuFlags' and 'CpuMask'; use an explicit cast to silence this warning
This warning flag was new in MSVC 2022 17.14, but it was buggy
in that version - it produced spurious warnings for other cases
as well (and using an explicit cast to silence it didn't work
as advertised), see [1] and [2].
The bugs were fixed in 18.0, and the remaining construct that it
warns about is something that is somewhat reasonable to warn about:
    enum CpuFlags {
        DAV1D_X86_CPU_FLAG_SSE2 = 1 << 0,
        DAV1D_X86_CPU_FLAG_SSSE3 = 1 << 1,
    };
    enum CpuMask {
        X86_CPU_MASK_SSE2 = DAV1D_X86_CPU_FLAG_SSE2,
        X86_CPU_MASK_SSSE3 = DAV1D_X86_CPU_FLAG_SSSE3 | X86_CPU_MASK_SSE2,
    };
Instead of adding explicit casts on the constants from the foreign
enum, just disable this warning.
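For comparison, the rejected alternative with explicit casts might
look roughly like the sketch below. Whether a cast to int is what
MSVC expects in order to silence C5287 has not been verified here
and is an assumption; the cpu_mask_covers() helper is hypothetical:

```c
#include <assert.h>

enum CpuFlags {
    DAV1D_X86_CPU_FLAG_SSE2  = 1 << 0,
    DAV1D_X86_CPU_FLAG_SSSE3 = 1 << 1,
};

enum CpuMask {
    X86_CPU_MASK_SSE2  = DAV1D_X86_CPU_FLAG_SSE2,
    /* Cast the constant from the foreign enum before combining it
     * with a constant from this enum. */
    X86_CPU_MASK_SSSE3 = (int)DAV1D_X86_CPU_FLAG_SSSE3 | X86_CPU_MASK_SSE2,
};

/* Returns nonzero if the given mask includes the given flag. */
int cpu_mask_covers(enum CpuMask mask, enum CpuFlags flag)
{
    return ((unsigned)mask & (unsigned)flag) != 0;
}
```

The resulting values are unchanged by the cast (the SSSE3 mask is
still 3); it only adds noise to every such constant, which is why
disabling the warning is preferred.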
[1] https://developercommunity.visualstudio.com/t/False-positive-C5287:-operands-are-diff/10915265
[2] https://developercommunity.visualstudio.com/t/warning-C5287:-operands-are-different-e/10877942
This was lost in 3a2a874994.
Without this, checkasm ends up printing a quite confusing output
consisting only of the functions that have two or more assembly
implementations, if trim_dsp happens to be enabled.
For this to have an effect, it requires using a newer version of
the wrapped checkasm subproject, one that includes checkasm commit
be05a7972e47c658a7c5c186294d27caa5735db2 or newer.