The aarch64 VP9 loopfilters actually violate aarch64 GCS
(Guarded Control Stack), even though we marked the code as GCS
compliant in 846746be4b.
This means that builds with GCS enabled, after that commit,
will crash when decoding VP9, on future hardware (or current
QEMU) that supports GCS. This also goes for ffmpeg version 8.1.1
where the GCS enabling was backported.
This matches the fix that was done for hevcdsp in
1f7ed8a78d.
This issue wasn't observed if running checkasm in QEMU - therefore,
I thought all GCS issues had been fixed by
846746be4b. (If I had tested the full "make fate" with QEMU,
the issue would have appeared, though.)
However with the new checkasm, some of the GCS violations
do appear even in checkasm.
The reason is that the checkasm vp9 test intentionally crafts
input pixels that attempt to trigger all the individual
separate cases in each input buffer (in
randomize_loopfilter_buffers). This means that the checkasm
tests actually never test or exercise the early exit cases,
which are the ones that violate GCS.
With the new checkasm, the call to "bench_new" always tests
running the code at least once, even if not benchmarking.
As the input buffers weren't reinitialized between the test
and "bench_new", the pixel differences now differ from the
initial setup, so that the code now sometimes (often)
ends up hitting the early exit cases.
Ideally, the vp9 checkasm test would be repeated to cover all
cases of input buffers that allow early exits, in addition to
covering the case with all different cases in one block.
e13b0bb3ff x86: Skip the vzeroupper checks when built with MSVC
3cbf066c51 longjmp: Use raw arch defines for checking for x86_32
54817bd68f include: Fix a mismatched include guard comment
162f15c861 github: Build the UWP job as WINAPI_FAMILY_PHONE_APP
34d920e8bb arm/cpu: Avoid the Windows registry API in UWP builds
9a0cb83b69 utils: Avoid the GetStdHandle and GetConsoleScreenBufferInfo APIs in UWP builds
8d1609d583 ci: Test building ffmpeg with the latest checkasm
d93232845f ci: Test building latest dav1d and dav2d with the current checkasm
e15a8efbfc readme: Add me to the list of maintainers
01b4334a95 meson: Bump version to v1.3.0
git-subtree-dir: tests/checkasm/ext
git-subtree-split: e13b0bb3ff0935b7d2a1c2cc91163370f2cc8f40
Commit 4569ab7eaa tried to set this
only on the object files for the checkasm library itself, but
missed that EXT_CHECKASMOBJS lacks the path prefix, thus this
wasn't set at all.
Alternatively, for simplicity, we could keep passing this for
all checkasm object files, not only the checkasm library objects;
the other object files don't use it in any case.
This is required for overriding defines that exist in the public
headers of checkasm, when e.g. building with assembly disabled
for an architecture where we normally would use the checked_call
wrapper.
This fixes a leftover in how checkasm is integrated into the
ffmpeg build system; there were many different approaches
considered for fixing --disable-asm, and the ffmpeg configure
integration didn't end up matching the final solution.
This fixes building with --disable-asm.
Some of these files aligned instructions to 4/24 columns, while
we commonly indent arm/aarch64 assembly to 8/24 columns.
Some of these files also used a different alignment for the
operands.
When we try to lowercase register names (e.g. Q0 -> q0) we avoid
doing that for parts of the code that are comments, as comments
occasionally contain pseudocode with such mentions that aren't
register names, but pseudocode/reference code variables.
See 7ebb6c54eb for more details
about that.
In addition to recognizing comments starting with //, also
recognize /* and @ (which is a comment char in arm assembly, but
not in aarch64).
We currently don't have any cases where this is needed, but include
it for completeness and clarity.
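
As a rough illustration of the idea (written in C here, not the actual
tool's code), the following sketch finds where a comment starts on a
line and only touches the part before it. The names comment_start and
lowercase_code_part and the "arm32" parameter are assumptions made for
this example. For simplicity it lowercases the whole code part, rather
than only register names.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: return the index where a comment starts on the
 * line, or the length of the line if there is no comment. "//" and
 * "/*" are always recognized; '@' only if arm32 is nonzero. */
static size_t comment_start(const char *line, int arm32)
{
    size_t i;

    assert(line);
    for (i = 0; line[i]; i++) {
        if (line[i] == '/' && (line[i + 1] == '/' || line[i + 1] == '*'))
            break;
        if (arm32 && line[i] == '@')
            break;
    }
    return i;
}

/* Simplification for this sketch: lowercase everything before the
 * comment, leaving the comment itself untouched. */
static void lowercase_code_part(char *line, int arm32)
{
    size_t end = comment_start(line, arm32);

    for (size_t i = 0; i < end; i++)
        line[i] = (char)tolower((unsigned char)line[i]);
}
```

With this, a mention such as "Q0" inside a trailing comment is left
as is, while the instruction and operands before it are lowercased.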
These macros for BTI were added in
08b4716a9e.
A later comment in this file, added in
248986a0db, referenced the macro
AARCH64_VALID_JUMP_CALL_TARGET which never was added here before.
Whenever the link register is stored on the stack, sign it
before storing it and validate at a symmetrical point (with the
stack at the same level as when it was signed).
These macros only have an effect if built with PAC enabled (e.g.
through -mbranch-protection=standard), otherwise they don't
generate any extra instructions.
None of these cases were present when PAC support was added
in 248986a0db in 2022.
Without these changes, PAC still had an effect in the compiler
generated code and in the existing cases where these macros were
used; this change makes it apply to the remaining cases where the
link register is stored on the stack.
The sme_entry/sme_exit macros already take care of backing up/restoring
these registers. Additionally, as long as no function calls are
made within the function, x30 doesn't need to be backed up at all.
Signal that our assembly is compliant with the GCS feature, if
the GCS feature is enabled in the compiler (available since Clang
18 and GCC 15) - this is enabled by -mbranch-protection=standard
with a new enough compiler.
GCS doesn't require any specific modifications to the assembly
code, but requires that all functions return to the expected call
address (checked through a shadow stack).
For cases when returning early without updating any pixels, we
previously returned to the return address in the caller's scope,
bypassing one function entirely. While this may seem like a neat
optimization, it makes the return stack predictor mispredict
the returns - which potentially can cost more performance than
it gains.
Secondly, if the armv9.3 feature GCS (Guarded Control Stack) is
enabled, then returns _must_ match the expected value; this feature
is being enabled across linux distributions, and by fixing the
hevc assembly, we can enable the security feature on ffmpeg as well.
Passing a struct/union by value can generally be inefficient.
Additionally, when the struct/union is declared to be aligned,
whether it really stays aligned when passed as a parameter by
value is unclear.
This fixes build errors like this, with MSVC targeting 32 bit ARM:
libswscale/ops_chain.h(91): error C2719: 'unnamed-parameter': formal parameter with requested alignment of 16 won't be aligned
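
A minimal sketch of the pattern, assuming a made-up union ExamplePriv
and function sum_priv (these are not the actual libswscale types):
the aligned union is passed by pointer, so no by-value copy with a
requested alignment is needed.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a union declared with 16 byte alignment. */
typedef union ExamplePriv {
    _Alignas(16) uint8_t u8[16];
    uint32_t u32[4];
} ExamplePriv;

/* Taking a pointer avoids copying the union as a parameter; the
 * callee reads the caller's (properly aligned) object directly. */
static uint32_t sum_priv(const ExamplePriv *priv)
{
    assert(priv);
    return priv->u32[0] + priv->u32[1] + priv->u32[2] + priv->u32[3];
}
```

A caller would declare an ExamplePriv object and call
sum_priv(&object) rather than passing the object itself.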
This fixes compiling with MSVC for aarch64 after
510999f6b0.
While MSVC does do dead code elimination for function references
within e.g. "if (0)", it doesn't do that for functions referenced
within a static function, even if that static function itself ends
up not used.
A reproduction example:
    void missing(void);
    void (*func_ptr)(void);
    static void wrapper(void) {
        missing();
    }
    void init(int cpu_flags) {
        if (0) {
            func_ptr = wrapper;
        }
    }
If "wrapper" is entirely unreferenced, then MSVC doesn't produce
any reference to the symbol "missing". Also, if we do
"func_ptr = missing;" then the reference to missing also is
eliminated. But when the function is referenced from within a
static function, MSVC keeps the reference to the symbol, even if
the reference to the static function itself can be eliminated.
Accept up to 15 ULP difference.
This fixes running "checkasm --test=ac3dsp <seed>" for the seeds
2043066705, 24168 and 111972 on ARM, and the seeds 40552 and
209754 on aarch64.
This is the same change as 8e4c904c8e,
increasing the tolerance further.
With this change, checkasm passes for over 500 000 seeds on both
ARM and aarch64.
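
For reference, a ULP based comparison can be sketched roughly like
this. The helper name floats_within_ulp is made up for this example
and is not checkasm's actual API; it assumes IEEE 754 single precision
floats and does not treat NaN or infinities specially.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: return 1 if a and b are at most max_ulp
 * representable float values apart, 0 otherwise. */
static int floats_within_ulp(float a, float b, unsigned max_ulp)
{
    int32_t ia, ib;

    assert(sizeof(float) == sizeof(int32_t));

    /* Reinterpret the float bit patterns as integers. */
    memcpy(&ia, &a, sizeof(ia));
    memcpy(&ib, &b, sizeof(ib));

    /* Values of opposite sign only match if they compare equal,
     * which covers +0.0 vs -0.0. */
    if ((ia < 0) != (ib < 0))
        return a == b;

    /* For finite floats of the same sign, adjacent values have
     * adjacent bit patterns, so the integer distance is the distance
     * in ULPs. */
    return llabs((long long)ia - ib) <= (long long)max_ulp;
}
```

With a tolerance of 15, a result that is 15 representable values away
from the reference is accepted, while one that is 16 away is not.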
Newer revisions of WinSDK 10.0.26100.0 have exposed more flags for
IsProcessorFeaturePresent; now there is a separate one for
detecting specifically I8MM and not just SVE-I8MM. Switch to using
this flag instead.
Match more SVE/SME specific details.
Also lowercase all register names. As this matches many cases
of code comments that refer to variables elsewhere, not specific
registers, we only apply this transformation on the part of lines
before a potential comment.
This file is exempt from the indent checker script, as there
are a few other bits in it that the script wants to reformat
into slightly worse form, or which might not warrant being
reformatted.
But these instructions should indeed be indented this way.
Name the feature "arm_crc" rather than plain "crc", to make it
clear that this is about a CPU feature extension, not CRC
implementations in general.
This requires dealing with the extension slightly differently
than other extensions, as the name of the feature and the
".arch_extension" extension name differ.
Naming it with an "arm" prefix rather than "aarch64", as the
CPU extension also is available in 32 bit ARM form, even though
we don't intend to use it there.
This allows naming the ffmpeg wide feature with a different (more
elaborate) name than the raw cpu extension as it is spelled in
the ".arch_extension" directives.
We use a dummy aarch64 feature to work around an issue in older
Clang, where an .arch line such as ".arch armv8.2-a" doesn't take
effect immediately, while one like ".arch armv8.2-a+feature" works.
Previously, we used "crc" as this dummy feature (as an old
feature that is widely supported by old toolchains).
But as we may want to actually use crc features and detect whether
they are supported, we may want to switch to another feature.
Use the "fp" feature instead, for the purposes of this extra
feature in the .arch lines. (The "fp" feature indicates floating
point support, which is implicitly part of the baseline feature
set anyway.)
The doxygen comments were missed as these functions were updated
during review; they don't take separate pointer/length parameters
but use an AVBPrint struct now instead.
Also clarify that ff_make_codec_str doesn't log if the logctx
parameter is NULL.
This avoids needing to use the extra_conf variable. That variable
is problematic for setting a value that contains spaces.
This adds options for another tool in the same fashion as other
tools were added in 523d688c2b.
Busybox-w32 uses regular Windows style paths with drive letters,
but with forward slashes; thus an absolute path starts with "c:/".
Make the target_path() function in fate-run.sh (which converts a
potentially relative path to an absolute one, under the target_path
prefix) handle this case.
With this in place, running fate tests almost works in
busybox-w32 - only one issue remains. A patch [1] has been sent to
upstream busybox for fixing that issue (which also is present if
running fate tests on busybox on Linux), but it hasn't been
responded to yet.
[1] https://lists.busybox.net/pipermail/busybox/2025-December/091851.html
Busybox-w32 [1] works for building ffmpeg on Windows (as an
alternative to msys2, cygwin or WSL).
On busybox-w32, "uname" returns "Windows_NT"; recognize this
in exesuf() as having an .exe suffix.
If building in this environment with a mingw toolchain, one has
to explicitly set --target-os=mingw32. (We probably don't
want to imply that this uname, set as target_os_default, would
default to mingw?) But despite what is set with --target-os,
one can't override the configure variable "host_os", which
exesuf() has to recognize.
[1] https://github.com/rmyorston/busybox-w32
Explicitly spell it out that we are not going to modify the
individual libraries for the purposes of improving conformance
to ARM64EC.
We may (or may not) accept build system patches for making such
a build succeed, provided that it does not require changes to
the actual library code.
Add a public API for producing RFC 4281/6381 codec strings for
MIME types.
This can be required for providing alternative video files to
a web browser, letting the browser pick the best file it supports.
Such strings also allow querying a browser whether it supports
a certain codec combination.
Finally, if implementing a DASH/HLS segmenter outside of libavformat,
one also has to generate such strings.
Generating such strings for H264/AAC is very simple, but for
more modern codecs, it can require a lot of nontrivial codec
specific parsing of extradata.
As libavformat already implements this, expose it for users as well.
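
To illustrate the simple H264 case, here is a minimal sketch, assuming
the extradata is in avcC form (an AVCDecoderConfigurationRecord, where
byte 0 is the version and bytes 1-3 are the profile, the constraint
flags and the level). The function name make_avc1_codec_str is made up
for this example and is not the libavformat API; Annex B extradata is
not handled here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: write "avc1.PPCCLL" into out, based on avcC
 * extradata. Returns 0 on success, -1 on failure. */
static int make_avc1_codec_str(const uint8_t *extradata, size_t size,
                               char *out, size_t out_size)
{
    int n;

    assert(extradata && out);

    /* Only the avcC form (configurationVersion == 1) is handled. */
    if (size < 4 || extradata[0] != 1)
        return -1;

    n = snprintf(out, out_size, "avc1.%02x%02x%02x",
                 extradata[1], extradata[2], extradata[3]);
    return (n < 0 || (size_t)n >= out_size) ? -1 : 0;
}
```

For extradata starting with the bytes 01 64 00 1f, this produces the
string "avc1.64001f". More modern codecs need considerably more
parsing than this.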
The old, internal function ff_make_codec_str is kept and used by
the HLS and DASH muxers; the old function takes a logging context
which can be used for logging auxiliary info about how the string
generation worked out.
Older versions of Clang (Xcode 14, llvm.org Clang 13 and 14)
do support and recognize SME, but when enabled through
".arch_extension sme" it fails to transitively enable support
for Streaming SVE; this was fixed in [1].
This issue results in those versions currently detecting support
for SME, but later failing to build cpu_sme.s with errors like
"error: instruction requires: sve or sme" or "error: instruction
requires: streaming-sve or sve", on the "cntb x0" instruction.
Extend the check for this instruction set extension, to test
with two instructions, both specifically a SME instruction
(smstart) and an instruction that is available in Streaming SVE
mode (cntb).
For the configure check, add an extra parameter to
check_archext_insn for an optional second instruction to check.
It would be tempting to just pass both instructions through
the same parameter, as "smstart; cntb x0". However, Darwin
targets use a different token (%%) for starting a new
instruction on the same line - those targets interpret ";"
as the start of a comment. Due to that, such a check would
entirely ignore the second instruction on Darwin targets.
To avoid dealing with the variability in passing multiple
instructions on one line, just pass the optional second
instruction on a separate line.
[1] https://github.com/llvm/llvm-project/commit/ff3f3a54e2d1b05c36943bf88ae0be7475d622ed
This was accidentally removed in
357fc5243c.
This fixes test failures when built with Clang and MSVC;
surprisingly, the checkasm test did seem to pass when built with
GCC. Clang and MSVC also warn about the use of the uninitialized
variable, while GCC didn't.
This allows using the tool for one-off reindentations without needing
the check_arm_indent.sh script (e.g. for use outside of ffmpeg),
without having to pipe the file through stdin/stdout.
This function uses ff_sws_pixel_type_size to switch on the
size of the provided type. However, ff_sws_pixel_type_size returns
a size in bytes (from sizeof()), not a size in bits. Therefore,
this would previously never return the right thing but always
hit the av_unreachable() below.
As the function is entirely unused, just remove it.
This fixes compilation with MSVC 2026 18.0 when targeting ARM64,
which previously hit an internal compiler error [1].
[1] https://developercommunity.visualstudio.com/t/Internal-Compiler-Error-targeting-ARM64-/10962922
This fixes building after commit
1ce88d29d0.
That commit caused the following errors:
src/doc/fate.texi:234: @anchor expected braces.
src/doc/fate.texi:245: @item found outside of an insertion block.
src/doc/fate.texi:249: @item found outside of an insertion block.
src/doc/fate.texi:261: @item found outside of an insertion block.
src/doc/fate.texi:265: @item found outside of an insertion block.
src/doc/fate.texi:268: @item found outside of an insertion block.
src/doc/fate.texi:274: @item found outside of an insertion block.
src/doc/fate.texi:277: @item found outside of an insertion block.
src/doc/fate.texi:281: @item found outside of an insertion block.
src/doc/fate.texi:287: Unmatched `@end'.
./src/doc/fate.texi:65: Cross reference to nonexistent node `makefile variables' (perhaps incorrect sectioning?).
This reverts commit 7b18eafabd.
That commit added tests that don't work on Windows, and which
also fail in setups with cross/remote testing (with --target-exec
and --target-path).
See https://code.ffmpeg.org/FFmpeg/FFmpeg/pulls/20876 for more
discussions about issues with that commit.
The one in dashenc was added in
fe5e6e34c0, while the one in hlsenc
was added later in 0afa171f25. Both
have had various additions on top; merge both implementations
into one shared. (Notable additions in
060e74e2a9,
1cf2f040e3,
a2b1dd0ce3 and
797f0b27c175022d896e46db4ac2873e3e0a70af.)
For H264/avc1, use the implementation from hlsenc (which doesn't
use temporary allocations). For most other codecs, use the
only implementation from whichever had one.
The original dashenc implementation tried to be generic based
on RFC 6381, looking up codec tags in ff_codec_movvideo_tags
or ff_codec_movaudio_tags, and doing specific extra additions
for "mp4a" and "mp4v". In practice, only AV_CODEC_ID_AAC
and AV_CODEC_ID_MPEG4 ever mapped to these; simplify this to
a more straightforward codec id based handling, and merge
with the AAC profile based code from hlsenc.
There's a slight behaviour difference from the old one in
dashenc; if there's no code for a specific codec ID, we previously
just output what we matched from the mov tag tables, but now
we won't output anything. But most commonly used codecs in
DASH should be covered here.
This makes the final file truly hybrid: Externally the file
is a regular, non-fragmented file, but internally, the fragmented
form also exists un-overwritten.
To make any use of that, first, the fragments need to be muxed in
a position independent form, i.e. with empty_moov+default_base_moof
(or the dash or cmaf meta-flags).
Making use of the fragmented form when the file is finalized is
not entirely obvious though. One can dump the contents of the
single mdat box, and get the fragmented form. (This is a neat
trick, but not something that anybody really is expected to
want to do.)
The main expected use case is accessing fragments in the form of
byte range segments, for e.g. HLS.
Previously, the start of the file would look like this:
- ftyp
- free
- moov
  - (moov contents)
After finalizing the file, it would look like this:
- ftyp
- free
- mdat (previously moov)
  - (moov contents)
In this form, the size and type of the original moov box were
overwritten, and the original moov contents is just leftover
as unused data in the mdat box.
To avoid this issue, the start of the file now looks like this:
- ftyp
- free
- free
  - ftyp
  - moov
    - (moov contents)
The second, hidden ftyp box inside mdat, would normally never be
seen.
After finalizing, the difference is that the mdat box now is
extended to cover the ftyp and the whole moov including its header
(and all the following fragments).
I.e., the start of the file looks like this:
- ftyp
- free
- mdat
  - ftyp
  - moov
    - (moov contents)
This allows accessing the "ftyp+moov" pair sequentially as such,
with a byte range - this range is untouched when finalizing,
producing the same ftyp+moov pair both while writing, when the
file is fragmented, and after finalizing, when the file is
transformed to non-fragmented externally.
Note: the two sequential "free+free" boxes may look slightly
silly; it could be tempting to make the second one an mdat
from the get-go. However, some players of fragmented mp4 (in
particular, Apple's HLS player) bail out if the initialization
segment contains an mdat box - therefore, use a free box.
It could also be possible to use just one single free box with
8 bytes of padding at the start - but that would require more
changes to the finalization logic.
For a segmenting user of the muxer, the only unclear part is how
to determine the right byte range for the internal ftyp+moov
pair. Currently, this requires parsing the muxer output and
skipping past anything up to the start of the non-empty free box.
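
Such parsing could be sketched roughly as below, for the not yet
finalized form of the file. The function name find_hidden_init_offset
is made up for this example, and the sketch assumes plain 32 bit box
sizes (it gives up on 64 bit or "extends to end of file" sizes).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

static uint32_t read_be32(const uint8_t *p)
{
    return (uint32_t)p[0] << 24 | (uint32_t)p[1] << 16 |
           (uint32_t)p[2] << 8  | p[3];
}

/* Hypothetical helper: walk the top level boxes in buf and return the
 * offset of the payload of the first non-empty "free" box, i.e. where
 * the internal ftyp+moov pair would start. Returns -1 if not found. */
static long find_hidden_init_offset(const uint8_t *buf, size_t size)
{
    size_t pos = 0;

    assert(buf);
    while (size - pos >= 8) {
        uint32_t box_size = read_be32(buf + pos);

        if (box_size < 8)
            return -1; /* 64 bit or open ended sizes are not handled */
        /* A free box larger than its 8 byte header has a payload. */
        if (!memcmp(buf + pos + 4, "free", 4) && box_size > 8)
            return (long)(pos + 8);
        if (box_size > size - pos)
            return -1; /* truncated input */
        pos += box_size;
    }
    return -1;
}
```

The end of the byte range would still have to be determined
separately, e.g. by parsing the ftyp and moov box sizes from that
offset onwards.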
If using the delay_moov flag in combination with hybrid_fragment
(which is a potentially problematic combination otherwise - the
ftyp box does end up hidden in the end), then we need to flush
twice to get both the moov box and the first fragment, if the
file is finished before the first fragment is completed.
If samples were available when the moov was written, chunking
for those samples has been done already, which has to be reset
here.
This is the case when not using empty_moov, when the moov box
describes the first fragment - this case was accounted for already.
But if using the delay_moov flag, then those samples also were
available when writing the moov, so chunking for them has already
been done in this case as well.
Therefore, always reset chunking here (it should be harmless to
always do it), and update the comment to clarify the cases
involved here.
Write the moov tag at the end first, before overwriting the mdat size
at the start of the file.
In case writing the final moov box fails (e.g. due to being out
of disk), we haven't broken the initial moov box yet.
Thus if writing stops between these steps, we could end up with
a file with two moov boxes - which arguably is more feasible to
recover from, than from a file with no moov boxes at all.
This was missed in 0ce413af9c.
This fixes detection of Objective C APIs (that are missing)
if targeting older macOS versions, such as the check for
AVCaptureSession.
Alternatively, this could be a separate job, potentially keyed
to only run on PRs that touch files matching */aarch64/*. But
as this runs very quickly, it's probably less clutter to just
bundle it here.
The same also applies for arm assembly, but there are more known
deviations within that.
Add a script which checks all files, except for a few known files
that deviate, for various reasons.
This amends 307983b292 to fix
building with older versions of mingw-w64.
The previously checked constant, SP_PROT_DTLS1_X_CLIENT, was
added in mingw-w64 in df36f5deda23192d0ee99ffd661ea36df924e667
in 2020, and is included in released versions since v8.0.0.
The new checked constant SECPKG_ATTR_DTLS_MTU was added in
mingw-w64 in 0792283787cca8fc27dd38671107c791c87f4db3 in 2021,
and first appeared in mingw-w64 v9.0.0.
This fixes building with mingw-w64 v8, which is the version bundled
in Ubuntu 22.04.
This fixes the following compiler error, if compiling with MSVC
for ARM (32 bit):
src/libswscale/ops_chain.c(48): error C2719: 'priv': formal parameter with requested alignment of 16 won't be aligned
This change shouldn't affect the performance of this operation
(which in itself probably isn't relevant); instead of copying the
contents of the SwsOpPriv struct from the stack as parameter,
it gets copied straight from the caller function's stack frame
instead.
Separately from this issue, MSVC 17.8 and 17.9 end up in an
internal compiler error when compiling libswscale/ops.c, but
older and newer versions do compile it successfully.
If we're invoked with range == UINT_MAX, we end up doing
"rnd() % (UINT_MAX + 1)", which is equal to "rnd() % 0". On
arm (on all platforms) and on MSVC i386, this ends up crashing
at runtime.
This fixes the crash.
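
One way to avoid the wraparound is sketched below; this is an
illustration with made-up names (bounded_rnd, example_rnd), not
necessarily the exact change made here. When range is UINT_MAX, every
unsigned value is already within range, so the modulo can be skipped.

```c
#include <assert.h>
#include <limits.h>

/* Stand-in random number generator for this example; it returns a
 * fixed value to keep the sketch deterministic. */
static unsigned example_rnd(void)
{
    return UINT_MAX - 15u;
}

/* Hypothetical helper: return a value in [0, range]. */
static unsigned bounded_rnd(unsigned (*rnd)(void), unsigned range)
{
    assert(rnd);
    /* range + 1 would wrap around to 0 here, giving a division by
     * zero; any unsigned value is within range already. */
    if (range == UINT_MAX)
        return rnd();
    return rnd() % (range + 1);
}
```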
Currently, the aacencdsp checkasm tests fail for many seeds,
if the C code has been built with x87 math. This happens because
the excess precision of x87 math can make it end up rounding
to a different integer, and the checkasm test checks that the
output integers match exactly between C and assembly.
One such failing case is "tests/checkasm/checkasm --test=aacencdsp
41" when compiled with GCC. When compiled with Clang, the test
seed 21 produces a failure.
To avoid the issue, we need to limit the precision of intermediates
to their nominal float range, matching the assembly implementations.
This can be achieved when compiling with GCC, by just adding a single
cast.
To observe the effect of this cast, compile the following
snippet,
    int cast(float a, float b) {
        return (int)
    #ifdef CAST
            (float)
    #endif
            (a + b);
    }
with "gcc -m32 -std=c17 -O2", with/without -DCAST. For x86_64
cases (without the "-m32"), the cast doesn't make any difference
on the generated code.
This cast would seem to not have any effect, as a binary expression
with float inputs also would have the type float.
However, if compiling with GCC with -fexcess-precision=standard,
the cast forces limiting the precision according to the language
standard here - according to the GCC docs [1]:
> When compiling C or C++, if -fexcess-precision=standard is
> specified then excess precision follows the rules specified in
> ISO C99 or C++; in particular, both casts and assignments cause
> values to be rounded to their semantic types (whereas -ffloat-store
> only affects assignments). This option is enabled by default for
> C or C++ if a strict conformance option such as -std=c99 or
> -std=c++17 is used.
FFmpeg's configure script enables -std=c17 by default.
This only helps with GCC though - the cast doesn't make any
difference for Clang. (Although, upstream Clang seems to default
to SSE math, while Ubuntu provided Clang defaults to x87 math.)
Limiting the precision with Clang would require casting to volatile
float for both intermediates here - and that does have a code
generation effect on all architectures.
[1] https://gcc.gnu.org/onlinedocs/gcc/Optimize-Options.html
Previously, these tests failed when running on Windows, if the
system is configured with a time zone east of Greenwich, i.e.
with a positive GMT offset.
The muxer converts the creation_date given by the user using
av_parse_time to unix time, as a time_t. The creation_date is
interpreted as a local time, i.e. according to the current time
zone. (This time_t value is then converted back to a broken out
local time form with localtime_r.)
The given reference date/time, "1970-01-01T00:00:00", is the
origin point for unix time, corresponding to time_t zero. However
when interpreted as local time, this doesn't map to exactly zero.
Time zones east of Greenwich reached this time a number of hours
before the point of zero time_t - so the corresponding time_t
value essentially is minus the GMT offset, in seconds.
Windows mktime returns an error, returning (time_t)-1, when given
such a "struct tm", while e.g. glibc mktime happily returns a
negative time_t. av_parse_time doesn't check the return value of
mktime for potential errors.
This is observable with the following test snippet:
    #include <stdio.h>
    #include <time.h>
    int main(void) {
        struct tm tm = { 0 };
        tm.tm_year = 70;
        tm.tm_isdst = -1;
        tm.tm_mday = 1;
        tm.tm_hour = 0;
        time_t t = mktime(&tm);
        printf("%d-%02d-%02d %02d:%02d:%02d\n", tm.tm_year + 1900,
               tm.tm_mon + 1, tm.tm_mday, tm.tm_hour, tm.tm_min, tm.tm_sec);
        printf("t %d\n", (int)t);
        return 0;
    }
By varying the value of tm_hour and the system time zone, one
can observe that Windows mktime returns -1 for all time_t values
that would have been negative.
This range limit is also documented by Microsoft in detail at
https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/mktime-mktime32-mktime64.
To avoid the issue, pick a different, arbitrary reference time,
which should have a nonnegative time_t for all time zones.
Accept up to 13 ULP difference.
This fixes running "checkasm --test=ac3dsp 3044836819" on ARM.
Depending on how the SIMD implementations aggregate numbers,
larger/smaller values might not end up accumulated in exactly
the same way; the current NEON implementation for ARM aggregates
into vectors of 2 elements. If it would aggregate into vectors
of 4 elements instead, like the AArch64 version does, this particular
case would end up with a smaller difference.
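
The effect of the accumulation width can be illustrated with a plain C
sketch. The function sum_in_lanes is made up for this example and only
mimics how a SIMD loop keeps one partial sum per vector lane before
reducing them; it is not the actual NEON code, and the final reduction
order in real implementations may differ.

```c
#include <assert.h>

/* Hypothetical illustration: sum src[] using "lanes" partial
 * accumulators (1 to 4), then add the partial sums together.
 * len must be a multiple of lanes. */
static float sum_in_lanes(const float *src, int len, int lanes)
{
    float acc[4] = { 0 };
    float total = 0;

    assert(src && lanes >= 1 && lanes <= 4 && len % lanes == 0);

    /* Each lane accumulates every lanes'th element. */
    for (int i = 0; i < len; i += lanes)
        for (int l = 0; l < lanes; l++)
            acc[l] += src[i + l];

    /* Reduce the partial sums into one value. */
    for (int l = 0; l < lanes; l++)
        total += acc[l];
    return total;
}
```

For inputs where all intermediate sums are exactly representable, 2
and 4 lanes give the same result. When large and small magnitudes are
mixed, the rounding of the intermediate sums depends on which elements
end up in the same lane, so the results can differ by a few ULP.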
When running plain "cl", to get the MSVC version, it prints the
version header on stderr, while the usage instructions are printed
on stdout. Usually, the version on stderr gets flushed first,
so "head -n1" gets the line it expects, but sometimes (in particular
when running MSVC wrapped in wine), it can get the usage line
first.
Redirect stdout to /dev/null, so we only grab the version among
the lines printed to stderr. This should make the version number
grabbing more robust.
At least all relevant versions of MSVC seem to print this specifically
to stderr, not stdout (so we don't risk missing it); checked down
to MSVC 2010.
Signed-off-by: Martin Storsjö <martin@martin.st>
This fixes building with Clang in MSVC mode, for x86, which was
broken in 6e49b86996 (in Nov 2024);
previously it failed with undefined symbols for the constants
defined with DECLARE_ASM_CONST, accessed via inline assembly.
Before 57861911a3, there was an
#elif defined(__GNUC__) || defined(__clang__)
case before the
#elif defined(_MSC_VER)
case for defining DECLARE_ASM_CONST, which included av_used.
(This case included the explicit "defined(__clang__)" since
f637046d3134a331e4b5a7243ac3dfb92735b8a5.)
After 57861911a3, it used the
generic definition of DECLARE_ASM_CONST that also included
av_used - which also worked for Clang in MSVC mode. But after
6e49b86996, Clang in MSVC mode
ended up using the MSVC specific variant which lacked the
av_used declaration, causing linker errors due to undefined
symbols.
Signed-off-by: Martin Storsjö <martin@martin.st>
Since GCC 10 and llvm.org Clang 11, -fno-common is the default.
However Apple's Xcode Clang hasn't followed suit yet, and still
defaults to -fcommon.
Compiling with -fcommon causes uninitialized global variables to
be treated as "common" (which allows multiple object files to have
similar definitions).
Common variables seem to have the issue that their intended alignment
isn't signaled, so the linker assumes that they may need alignment
according to their full size.
With large global tables, this can lead to linker warnings like
this, with Xcode 16.3:
ld: warning: reducing alignment of section __DATA,__common from 0x8000 to 0x4000 because it exceeds segment maximum alignment
This can be reproduced with a small snippet like this:
    char table[16385];
    int main(int argc, char* argv[]) { return 0; }
Compiling with -fno-common avoids this issue and warning, and
matches the default behaviour of other compilers. (Compiling with
-fno-common also avoids the risk of accidentally accepting
duplicate definitions of global variables, as long as they are
uninitialized.)
Signed-off-by: Martin Storsjö <martin@martin.st>
This allows catching whether the functions write outside of
the designated rectangle, and if run with "checkasm -v", it also
prints out on which side of the rectangle the overwrite was.
Signed-off-by: Martin Storsjö <martin@martin.st>
This class is unavailable on tvOS before 17.0 (and macOS before 10.7
and iOS before 4.0, but those are fairly ancient). This makes sure
that we don't try to build the avfoundation indevice for such
OSes.
Signed-off-by: Martin Storsjö <martin@martin.st>
E.g. tvOS doesn't have devicesWithMediaType.
In principle, we could probably disable building the whole
input device on such OSes, but that would either require
testing explicitly for the OS type in configure (which we don't
do anywhere so far), or testing for individual Objective C methods.
This approach allows the code to compile, but no input devices
will be found at runtime.
Signed-off-by: Martin Storsjö <martin@martin.st>
This backports similar functionality from dav1d, from commits
35d1d011fda4a92bcaf42d30ed137583b27d7f6d and
d130da9c315d5a1d3968d278bbee2238ad9051e7.
This allows detecting writes out of bounds, on all 4 sides of
the intended destination rectangle.
The bounds checking also can optionally allow small overwrites
(up to a specified alignment), while still checking for larger
overwrites past the intended allowed region.
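
The basic idea can be sketched as follows. This is a simplified
illustration with made-up names (fill_canary, check_padding and the
OVERWRITE_* flags), not the checkasm macros themselves: the
destination rectangle is surrounded by padding filled with a known
byte, and after running the function, the padding is checked on all 4
sides. The optional allowance for small overwrites is left out here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CANARY 0xAA

enum {
    OVERWRITE_TOP    = 1,
    OVERWRITE_BOTTOM = 2,
    OVERWRITE_LEFT   = 4,
    OVERWRITE_RIGHT  = 8,
};

/* buf holds (w + 2*pad) x (h + 2*pad) bytes; the w x h destination
 * rectangle starts at column pad, row pad. */
static void fill_canary(uint8_t *buf, int w, int h, int pad)
{
    assert(buf && w > 0 && h > 0 && pad > 0);
    memset(buf, CANARY, (size_t)(w + 2 * pad) * (h + 2 * pad));
}

/* Return a mask of the sides where the padding was modified. */
static int check_padding(const uint8_t *buf, int w, int h, int pad)
{
    int stride = w + 2 * pad, flags = 0;

    assert(buf && w > 0 && h > 0 && pad > 0);
    for (int y = 0; y < h + 2 * pad; y++) {
        for (int x = 0; x < stride; x++) {
            int inside = x >= pad && x < pad + w &&
                         y >= pad && y < pad + h;
            if (inside || buf[y * stride + x] == CANARY)
                continue;
            if (y < pad)
                flags |= OVERWRITE_TOP;
            else if (y >= pad + h)
                flags |= OVERWRITE_BOTTOM;
            else if (x < pad)
                flags |= OVERWRITE_LEFT;
            else
                flags |= OVERWRITE_RIGHT;
        }
    }
    return flags;
}
```

A test would call fill_canary, run the function under test on the
inner rectangle, and then fail if check_padding returns nonzero.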
Signed-off-by: Martin Storsjö <martin@martin.st>
This makes it easier to implement custom error printouts in tests.
This is a port of dav1d's commit
13a7d78655f8747c2cd01e8a48d44dcc7f60a8e5 into ffmpeg's checkasm.
Signed-off-by: Martin Storsjö <martin@martin.st>
This makes sure to disable VideoToolbox if building with an SDK
that does contain VideoToolbox, but targeting an older version of
the OS where it is unavailable. Previously, we would enable
VideoToolbox as long as the framework itself was found, which only
requires the framework to exist in the SDK.
Signed-off-by: Martin Storsjö <martin@martin.st>
The audiotoolbox outdev uses APIs that only are available on macOS,
not on iOS or tvOS. Check for them in configure, and make sure the
outdev is disabled otherwise.
This allows building for iOS without explicitly having to disable
the audiotoolbox outdev.
Signed-off-by: Martin Storsjö <martin@martin.st>
The kVTVideoDecoderReferenceMissingErr constant was only added
in the macOS 12 and iOS 15 SDKs. Use a hardcoded value instead
of the named constant, to fix building with older SDKs
after c6214b0d69.
Signed-off-by: Martin Storsjö <martin@martin.st>
We normally don't need else statements here; the common pattern
is to assign lower level SIMD implementations first, then
conditionally reassign higher level ones afterwards, if supported.
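
A minimal sketch of this pattern, with made-up names (the
EX_CPU_FLAG_* flags, ExampleDSPContext and the sum_* functions are
hypothetical and only stand in for real implementations):

```c
#include <assert.h>

/* Hypothetical CPU flags and context, for illustration only. */
#define EX_CPU_FLAG_NEON    (1 << 0)
#define EX_CPU_FLAG_DOTPROD (1 << 1)

typedef struct ExampleDSPContext {
    int (*sum)(const int *src, int len);
} ExampleDSPContext;

/* Stand-ins for the C, NEON and dotprod versions; plain C here. */
static int sum_c(const int *src, int len)
{
    int s = 0;
    for (int i = 0; i < len; i++)
        s += src[i];
    return s;
}
static int sum_neon(const int *src, int len)    { return sum_c(src, len); }
static int sum_dotprod(const int *src, int len) { return sum_c(src, len); }

static void example_dsp_init(ExampleDSPContext *ctx, int cpu_flags)
{
    assert(ctx);
    /* Assign the lowest level implementation first... */
    ctx->sum = sum_c;
    if (cpu_flags & EX_CPU_FLAG_NEON)
        ctx->sum = sum_neon;
    /* ...then reassign to a higher level one if supported; no else
     * branches are needed. */
    if (cpu_flags & EX_CPU_FLAG_DOTPROD)
        ctx->sum = sum_dotprod;
}
```

Each later check simply overrides the earlier assignment, so the
function pointer ends up at the highest supported level.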
Signed-off-by: Martin Storsjö <martin@martin.st>
On a Zen 5, on Ubuntu 24.04 (with CLOCKS_PER_SEC 1000000), the
value of clock() in this loop increments by 0 most of the time,
and when it does increment, it usually increments by 1 compared
to the previous round.
Due to the "last_t + 2*last_td + (CLOCKS_PER_SEC > 1000) >= t"
expression, we only manage to take one step forward in this loop
(incrementing i) if clock() increments by 2, while it incremented
by 0 in the previous iteration (last_td).
This is similar to the change done in
c4152fc42e, to speed it up on
systems with very small CLOCKS_PER_SEC. However in this case,
CLOCKS_PER_SEC is still very large, but the machine is fast enough
to hit every clock increment repeatedly.
For this case, use the number of repetitions of each timer value
as entropy source; require a change in the number of repetitions
in order to proceed to the next buffer index.
This helps the fate-random-seed test to actually terminate within
a reasonable time on such a system (where it previously could hang,
running for many minutes).
Signed-off-by: Martin Storsjö <martin@martin.st>
Previously, we read elements from ff_aac_pow34sf_tab; however
that table is initialized to zero; one needs to call
ff_aac_float_common_init() to make sure that the table is
initialized.
However, given the range of the input values, a large number of
entries in ff_aac_pow34sf_tab would give results outside of the
range for signed 32 bit integers. As the largest aac_cb_maxval
entry is 16, it seems more reasonable to produce values within
an order of magnitude of that value.
(When hitting INT_MIN, implementations may end up with different
results depending on whether the value is negated as a float or
as an int. This corner case is irrelevant in practice as this
is way outside of the expected value range here.)
Coincidentally, this fixes linking checkasm with Apple's older
linker. (In Xcode 15, Apple switched to a new linker. The one in
older toolchains seems to have a bug where it won't figure out to
load object files from a static library, if the only symbol
referenced in the object file is a "common" symbol, i.e. one for
a zero-initialized variable. This issue can also be reproduced with
newer Apple toolchains by passing -Wl,-ld_classic to the linker.)
Signed-off-by: Martin Storsjö <martin@martin.st>