Matthias Dressel
46e9017355
subprojects: Update checkasm to v1.2.0
...
Among various fixes, it no longer installs the checkasm library, header
files and pkgconfig when installing dav1d.
2026-06-07 23:15:02 +02:00
Matthias Dressel
d69235dd80
CI: Use shortform QEMU_CPU for loongarch64
...
Since qemu commit 979bf44af8483cedc00c63b3e79407de08e75a30 the cpu
argument accepts just 'max' as a shorthand.
2026-03-17 22:00:03 +01:00
Matthias Dressel
bfbd7d4677
CI: loongarch64: Move QEMU_LD_PREFIX to crossfile
...
Simplifies development builds on local machines.
2026-03-17 22:00:03 +01:00
Matthias Dressel
afcdb781cb
CI: riscv64: Move QEMU_LD_PREFIX to crossfile
...
Simplifies development builds on local machines.
2026-03-17 22:00:03 +01:00
Matthias Dressel
42ac98706a
CI: aarch64: Move QEMU_LD_PREFIX to crossfile
...
Simplifies development builds on local machines.
2026-03-17 22:00:03 +01:00
Matthias Dressel
8feb8526bb
CI: Remove outdated version suffix from job name
2026-03-17 22:00:03 +01:00
Matthias Dressel
1dcfc90757
CI: Update images
2026-02-28 16:27:35 +01:00
Matthias Dressel
daef396277
CI: Switch to loongarch64 Debian toolchain
...
loong64 was recently promoted to an official Debian architecture. [0]
[0] https://lists.debian.org/debian-devel-announce/2025/12/msg00004.html
2026-02-09 02:33:29 +01:00
Matthias Dressel
c3f3a7e567
CI: Check --frametimes with msan
...
This would have caught 583e8e02eb.
2025-07-01 18:35:31 +02:00
Matthias Dressel
8d95618093
CI: Build '-mavx' code as debugoptimized
...
Work around a GCC 14 bug where it does not insert `vzeroupper` in C code
built without at least '-O2'.
2025-03-10 16:40:35 +01:00
Matthias Dressel
edeac873c4
CI: Update images
2025-03-10 16:40:35 +01:00
Matthias Dressel
1d0cda02a6
CI: Update ppc64le image
...
Since there seems to be a problem with gcc-14, stay on gcc-13 for now.
2025-03-05 21:58:24 +01:00
Matthias Dressel and Jean-Baptiste Kempf
37155c1147
CI: Update Android image
...
NDK 26 dropped support for API versions 19 and 20 (KitKat, Android 4.4).
The minimum supported API is now 21 (Lollipop, Android 5.0).
2024-05-18 10:04:31 +00:00
Matthias Dressel and Jean-Baptiste Kempf
c7df9a3e65
CI: Improve coverage for argon samples using different thread counts
...
Similar to 4796b59fc0.
2024-05-01 13:09:09 +00:00
Matthias Dressel and Jean-Baptiste Kempf
0f504bf57c
CI: Add dotprod to argon tests
2024-05-01 13:09:09 +00:00
Matthias Dressel
5851901772
CI: Move llvm crossfiles from image to project
...
Since dav1d was the only user of these crossfiles, it was agreed upon to
remove them from the image [0] and move to dav1d directly. [1]
[0] https://code.videolan.org/videolan/docker-images/-/merge_requests/293
[1] https://code.videolan.org/videolan/docker-images/-/merge_requests/294#note_434720
2024-04-16 11:53:16 +02:00
Matthias Dressel
313af0b6a5
CI: Update images
...
Now with clang 18 and downgraded xz-utils.
2024-04-14 01:57:37 +02:00
Matthias Dressel
aa63a41ccd
cli: Add missing ARM cpumasks help text
...
Forgotten in acc1121d2f.
2024-04-11 23:15:07 +02:00
Matthias Dressel
b9312c8dd8
Update THANKS.md
2024-03-08 23:24:30 +01:00
Matthias Dressel
9d57a654e2
CI: Add riscv64 clang build
2024-02-22 19:13:23 +01:00
Matthias Dressel
bada810c17
CI: Update image
...
Now contains clang 17.
2024-02-22 19:13:23 +01:00
Matthias Dressel
91ddba0b07
gcovr: Fix config file
...
gcovr 7.0 fixed a config file parsing bug [0].
Valid options are 'all', 'negative_hits.warn',
'negative_hits.warn_once_per_file'.
[0] https://github.com/gcovr/gcovr/pull/816
2024-02-22 19:13:23 +01:00
Matthias Dressel
81c0b46375
meson: Test for RISC-V assembler support
...
Support for '.option arch' directive [0] was added to binutils in
d3ffd7f77654adafe5f1989bdfdbe4a337ff2e8b [1] and in llvm in
9e8ed3403c191ab9c4903e8eeb8f732ff8a43cb4 [2].
[0] https://github.com/riscv-non-isa/riscv-asm-manual/pull/67
[1] https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=d3ffd7f77654adafe5f1989bdfdbe4a337ff2e8b
[2] https://github.com/llvm/llvm-project/commit/9e8ed3403c191ab9c4903e8eeb8f732ff8a43cb4
2024-02-20 15:53:04 +01:00
Matthias Dressel and Nathan E. Egge
a7edb02987
CI: Use cross-compiling libc instead of multi-arch
...
See https://code.videolan.org/videolan/docker-images/-/merge_requests/272
for more context.
2024-01-31 06:04:21 -05:00
Matthias Dressel and Nathan E. Egge
ebbddd48e3
CI: Add riscv64 tests
2024-01-31 06:04:21 -05:00
Matthias Dressel
16ed8e8b99
meson: Disable seek-stress tests by default
2024-01-24 16:28:38 +01:00
Matthias Dressel and Ronald S. Bultje
2c9bbb4908
meson: Add 'enable_seek_stress' option
...
Allows explicitly enabling or disabling seek-stress tests.
2024-01-23 17:47:46 +00:00
Matthias Dressel
b084160736
CI: Switch to using 'testdata' suite
...
Simplifies testing and also contains the forgotten 'testdata-multi'
suite which was added later.
2024-01-23 00:26:51 +01:00
Matthias Dressel
7d225bec62
CI: Add loongarch64 tests
2024-01-15 14:54:46 +01:00
Matthias Dressel
655d7ec07d
CI: Add loongarch64 toolchain
2024-01-15 09:35:54 +01:00
Matthias Dressel
48ef395920
CI: Update images
2023-10-24 20:27:33 +02:00
Matthias Dressel
9278a14cf4
checkasm: Always bench C-only functions as well
...
Integrates --bench-c into --bench to simplify benchmarks.
2023-07-12 19:38:06 +02:00
Matthias Dressel
fc40a0db51
checkasm: document '-t' in --help text
2023-07-07 21:21:51 +02:00
Matthias Dressel
f8ae94eca0
CI: Add argon tests
2023-05-14 17:52:59 +02:00
Matthias Dressel and Jean-Baptiste Kempf
6addb1a83c
crossfiles: Streamline and simplify crossfiles
...
* `needs_exe_wrapper` is only needed in specific cases when
`exe_wrapper` is not set.
See https://mesonbuild.com/Cross-compilation.html#properties
* "Before 0.56.0, <lang>_args and <lang>_link_args must be put in the
properties section instead, else they will be ignored."
[https://mesonbuild.com/Machine-files.html#meson-builtin-options]
Our minimum version is 0.49.0. Meson >= 0.56.0 prints a deprecation
warning.
2023-04-23 12:39:02 +00:00
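
The version rule quoted in the crossfiles commit above can be sketched as a small helper. This is only an illustration, not part of dav1d or Meson: the function names `parse_version` and `args_section` are hypothetical, and the section name `built-in options` is taken from the Meson machine-file documentation linked in the commit message.

```python
def parse_version(version):
    """Turn a string such as '0.56.0' into (0, 56, 0) for numeric comparison."""
    return tuple(int(part) for part in version.split("."))


def args_section(meson_version):
    """Return the machine-file section that should hold <lang>_args.

    Hypothetical helper: before Meson 0.56.0 the arguments must live in
    [properties]; from 0.56.0 on they belong in [built-in options].
    """
    if parse_version(meson_version) < (0, 56, 0):
        return "properties"
    return "built-in options"


if __name__ == "__main__":
    for version in ("0.49.0", "0.56.0", "0.64.0"):
        print(version, "->", "[" + args_section(version) + "]")
```

Comparing integer tuples rather than raw strings matters here, since a plain string comparison would order '0.9.0' after '0.56.0'.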
Matthias Dressel and Jean-Baptiste Kempf
380efd764f
CI: Add wasm{32,64} builds
...
Fixes #421
2023-04-06 07:52:12 +00:00
Matthias Dressel
0207e0fe9f
x86/itx: Fix indentation of macro instructions
2023-03-31 18:41:54 +02:00
Matthias Dressel
f6d4c0c473
x86/itx: Add 32x32 12bpc AVX2 idtx
...
inv_txfm_add_32x32_identity_identity_0_12bpc_c: 5785.8 ( 1.00x)
inv_txfm_add_32x32_identity_identity_0_12bpc_avx2: 20.7 (279.65x)
inv_txfm_add_32x32_identity_identity_1_12bpc_c: 5896.9 ( 1.00x)
inv_txfm_add_32x32_identity_identity_1_12bpc_avx2: 20.7 (285.01x)
inv_txfm_add_32x32_identity_identity_2_12bpc_c: 5799.5 ( 1.00x)
inv_txfm_add_32x32_identity_identity_2_12bpc_avx2: 68.9 (84.20x)
inv_txfm_add_32x32_identity_identity_3_12bpc_c: 5798.1 ( 1.00x)
inv_txfm_add_32x32_identity_identity_3_12bpc_avx2: 140.6 (41.25x)
inv_txfm_add_32x32_identity_identity_4_12bpc_c: 5803.3 ( 1.00x)
inv_txfm_add_32x32_identity_identity_4_12bpc_avx2: 308.2 (18.83x)
2023-03-31 18:41:36 +02:00
Matthias Dressel
1e602b8b33
x86/itx: Add 32x16 12bpc AVX2 idtx
...
inv_txfm_add_32x16_identity_identity_0_12bpc_c: 4138.7 ( 1.00x)
inv_txfm_add_32x16_identity_identity_0_12bpc_avx2: 30.4 (136.26x)
inv_txfm_add_32x16_identity_identity_1_12bpc_c: 4147.5 ( 1.00x)
inv_txfm_add_32x16_identity_identity_1_12bpc_avx2: 30.7 (135.25x)
inv_txfm_add_32x16_identity_identity_2_12bpc_c: 4138.2 ( 1.00x)
inv_txfm_add_32x16_identity_identity_2_12bpc_avx2: 98.9 (41.84x)
inv_txfm_add_32x16_identity_identity_3_12bpc_c: 4136.6 ( 1.00x)
inv_txfm_add_32x16_identity_identity_3_12bpc_avx2: 167.7 (24.67x)
inv_txfm_add_32x16_identity_identity_4_12bpc_c: 4156.3 ( 1.00x)
inv_txfm_add_32x16_identity_identity_4_12bpc_avx2: 242.1 (17.17x)
2023-03-31 18:41:19 +02:00
Matthias Dressel
e6b194e7d2
x86/itx: Add 16x32 12bpc AVX2 idtx
...
inv_txfm_add_16x32_identity_identity_0_12bpc_c: 4287.9 ( 1.00x)
inv_txfm_add_16x32_identity_identity_0_12bpc_avx2: 31.4 (136.66x)
inv_txfm_add_16x32_identity_identity_1_12bpc_c: 4293.7 ( 1.00x)
inv_txfm_add_16x32_identity_identity_1_12bpc_avx2: 30.9 (139.07x)
inv_txfm_add_16x32_identity_identity_2_12bpc_c: 4273.8 ( 1.00x)
inv_txfm_add_16x32_identity_identity_2_12bpc_avx2: 97.3 (43.92x)
inv_txfm_add_16x32_identity_identity_3_12bpc_c: 4269.0 ( 1.00x)
inv_txfm_add_16x32_identity_identity_3_12bpc_avx2: 165.2 (25.83x)
inv_txfm_add_16x32_identity_identity_4_12bpc_c: 4284.4 ( 1.00x)
inv_txfm_add_16x32_identity_identity_4_12bpc_avx2: 235.2 (18.22x)
2023-03-31 18:40:35 +02:00
Matthias Dressel
d426d1c910
.gitignore: Add tests/argon
2023-03-01 19:59:10 +01:00
Matthias Dressel and Henrik Gramner
e43904ca48
Add script to test against argon samples
...
Co-authored-by: Henrik Gramner <gramner@twoorioles.com>
2023-03-01 19:59:10 +01:00
Matthias Dressel
b8a43e2225
CI: Replace only/except with rules
...
"only and except are not being actively developed. rules is the
preferred keyword to control when to add jobs to pipelines." [0]
[0] https://docs.gitlab.com/ee/ci/yaml/index.html#only--except
2023-02-13 21:10:44 +01:00
Matthias Dressel
616dad2b43
CI: Unambiguously call meson setup
...
Calling meson with no command is deprecated since 0.64.0
2023-02-13 21:10:44 +01:00
Matthias Dressel
899d6c9fd3
CI: Update images
2023-02-13 21:10:44 +01:00
Matthias Dressel
934713e4a6
CI: Disable trimming on some tests
...
Allow checkasm to run.
2022-09-09 09:21:25 +02:00
Matthias Dressel
3920bd9d9d
CI: Remove git 'safe.directory' config
...
It is now handled by the gitlab runner.
Ref: 7d859f9c72
2022-09-09 09:21:25 +02:00
Matthias Dressel
ddb3189c25
gcovr: Ignore parsing errors
2022-09-09 09:21:25 +02:00
Matthias Dressel
aa3fda7800
crossfiles: Update Android toolchains
...
* Android armv7: target API 19 since it's the lowest directly provided
by the new NDK.
* Newer NDK has generic tools for ar, strip, etc.
* Remove windres as it's only relevant for Windows targets.
2022-09-09 09:20:52 +02:00
Matthias Dressel
d92594bd5d
CI: Update images
...
Remove experimental since gcc12, clang14, mold are now in unstable.
2022-09-09 09:20:52 +02:00
Matthias Dressel
8c079f784a
CI: Update coverage collecting
...
artifacts:reports:cobertura was deprecated in GitLab 14.9
2022-05-25 19:41:34 +02:00
Matthias Dressel
0770d98d93
CI: Add a build with the minimum requirements
...
* meson 0.49.0
* nasm 2.14
2022-05-25 19:41:34 +02:00
Matthias Dressel
7d859f9c72
CI: Deactivate git 'safe.directory'
...
An attacker already has arbitrary code execution inside the container.
Ref: CVE-2022-24765
2022-05-25 19:41:34 +02:00
Matthias Dressel
c1264cd27e
CI: Update images
2022-05-25 19:41:34 +02:00
Matthias Dressel
9833c92807
CI: Add gcc12 and clang14 builds with mold linker
2022-05-07 16:51:25 +02:00
Matthias Dressel
1bd91c3e67
CI: Trigger documentation rebuild if configuration changes
...
Additionally, switch from 'only'/'except' to 'rules' which is
more flexible.
2022-05-06 01:52:36 +02:00
Matthias Dressel
9c69574d0f
meson/doc: Fix doxygen config
...
* Doxygen had a longstanding bug [0] where it would use `dot` even if
not configured to do so. Due to this behaviour our config magically
worked.
This bug is fixed in 1.9.2 therefore we need to explicitly enable
`dot` support in order to keep existing functionality.
* Enables WARN_AS_ERROR to catch mistakes.
* Adds a version string to the header to easily identify which commit
the docs are built from.
[0] https://github.com/doxygen/doxygen/issues/7273
2022-05-06 01:52:36 +02:00
Matthias Dressel
ffb5968035
x86/itx: Add 32x8 12bpc AVX2 transforms
...
inv_txfm_add_32x8_dct_dct_0_12bpc_c: 286.7
inv_txfm_add_32x8_dct_dct_0_12bpc_avx2: 20.1
inv_txfm_add_32x8_dct_dct_1_12bpc_c: 7832.7
inv_txfm_add_32x8_dct_dct_1_12bpc_avx2: 710.6
inv_txfm_add_32x8_dct_dct_2_12bpc_c: 7838.1
inv_txfm_add_32x8_dct_dct_2_12bpc_avx2: 711.6
inv_txfm_add_32x8_dct_dct_3_12bpc_c: 7818.3
inv_txfm_add_32x8_dct_dct_3_12bpc_avx2: 710.9
inv_txfm_add_32x8_dct_dct_4_12bpc_c: 7820.6
inv_txfm_add_32x8_dct_dct_4_12bpc_avx2: 710.5
inv_txfm_add_32x8_identity_identity_0_12bpc_c: 1526.6
inv_txfm_add_32x8_identity_identity_0_12bpc_avx2: 19.3
inv_txfm_add_32x8_identity_identity_1_12bpc_c: 1519.4
inv_txfm_add_32x8_identity_identity_1_12bpc_avx2: 19.9
inv_txfm_add_32x8_identity_identity_2_12bpc_c: 1519.9
inv_txfm_add_32x8_identity_identity_2_12bpc_avx2: 43.6
inv_txfm_add_32x8_identity_identity_3_12bpc_c: 1519.4
inv_txfm_add_32x8_identity_identity_3_12bpc_avx2: 67.8
inv_txfm_add_32x8_identity_identity_4_12bpc_c: 1523.2
inv_txfm_add_32x8_identity_identity_4_12bpc_avx2: 91.6
2022-04-24 20:58:00 +02:00
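
The 32x8 figures above are listed without the relative speedup that some other entries in this log show in parentheses. A minimal sketch of how such ratios could be derived, assuming each line has the form `<function>_<variant>: <time>` with the variant being `c` or `avx2`; the helper names `parse_bench` and `speedups` are illustrative and not part of checkasm. Because the listed times are rounded, ratios computed this way may differ slightly from the ones checkasm itself prints.

```python
import re

# Matches lines such as "inv_txfm_add_32x8_dct_dct_0_12bpc_avx2: 20.1".
LINE_RE = re.compile(r"\s*(\w+)_(c|avx2):\s+([\d.]+)")


def parse_bench(text):
    """Collect timings as {function_name: {variant: time}}."""
    results = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip anything that is not a timing line
        name, variant, value = match.group(1), match.group(2), float(match.group(3))
        results.setdefault(name, {})[variant] = value
    return results


def speedups(results):
    """Return {function_name: c_time / avx2_time} where both timings exist."""
    return {
        name: timings["c"] / timings["avx2"]
        for name, timings in results.items()
        if "c" in timings and "avx2" in timings
    }


if __name__ == "__main__":
    sample = """
    inv_txfm_add_32x8_dct_dct_0_12bpc_c: 286.7
    inv_txfm_add_32x8_dct_dct_0_12bpc_avx2: 20.1
    inv_txfm_add_32x8_identity_identity_4_12bpc_c: 1523.2
    inv_txfm_add_32x8_identity_identity_4_12bpc_avx2: 91.6
    """
    for name, ratio in speedups(parse_bench(sample)).items():
        print(f"{name}: {ratio:.2f}x")
```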
Matthias Dressel
e67a500054
x86/itx: Add 8x32 12bpc AVX2 transforms
...
inv_txfm_add_8x32_dct_dct_0_12bpc_c: 334.6
inv_txfm_add_8x32_dct_dct_0_12bpc_avx2: 66.0
inv_txfm_add_8x32_dct_dct_1_12bpc_c: 7929.7
inv_txfm_add_8x32_dct_dct_1_12bpc_avx2: 489.3
inv_txfm_add_8x32_dct_dct_2_12bpc_c: 7925.8
inv_txfm_add_8x32_dct_dct_2_12bpc_avx2: 547.1
inv_txfm_add_8x32_dct_dct_3_12bpc_c: 7928.9
inv_txfm_add_8x32_dct_dct_3_12bpc_avx2: 647.8
inv_txfm_add_8x32_dct_dct_4_12bpc_c: 7916.1
inv_txfm_add_8x32_dct_dct_4_12bpc_avx2: 701.0
inv_txfm_add_8x32_identity_identity_0_12bpc_c: 2413.1
inv_txfm_add_8x32_identity_identity_0_12bpc_avx2: 28.6
inv_txfm_add_8x32_identity_identity_1_12bpc_c: 2415.2
inv_txfm_add_8x32_identity_identity_1_12bpc_avx2: 28.6
inv_txfm_add_8x32_identity_identity_2_12bpc_c: 2413.7
inv_txfm_add_8x32_identity_identity_2_12bpc_avx2: 55.1
inv_txfm_add_8x32_identity_identity_3_12bpc_c: 2415.4
inv_txfm_add_8x32_identity_identity_3_12bpc_avx2: 85.3
inv_txfm_add_8x32_identity_identity_4_12bpc_c: 2401.8
inv_txfm_add_8x32_identity_identity_4_12bpc_avx2: 116.8
2022-04-24 20:56:32 +02:00
Matthias Dressel
0c1fbdefdc
x86/itx: Deduplicate dconly code
2022-04-24 17:59:04 +02:00
Matthias Dressel
11aa919a2f
lib: Fix typo in documentation
2022-04-23 23:38:20 +02:00
Matthias Dressel
d821d88035
Update THANKS.md
2022-02-19 18:01:31 +01:00
Matthias Dressel
94b1bf456e
meson: Use native check of return value
2022-02-09 15:35:09 +01:00
Matthias Dressel
8e8148c16d
x86/itx: Add 16x16 12bpc AVX2 transforms
...
inv_txfm_add_16x16_adst_adst_0_12bpc_c: 8990.0
inv_txfm_add_16x16_adst_adst_0_12bpc_avx2: 646.1
inv_txfm_add_16x16_adst_adst_1_12bpc_c: 8965.3
inv_txfm_add_16x16_adst_adst_1_12bpc_avx2: 646.9
inv_txfm_add_16x16_adst_adst_2_12bpc_c: 8983.2
inv_txfm_add_16x16_adst_adst_2_12bpc_avx2: 870.1
inv_txfm_add_16x16_adst_dct_0_12bpc_c: 9058.2
inv_txfm_add_16x16_adst_dct_0_12bpc_avx2: 548.8
inv_txfm_add_16x16_adst_dct_1_12bpc_c: 9092.7
inv_txfm_add_16x16_adst_dct_1_12bpc_avx2: 549.3
inv_txfm_add_16x16_adst_dct_2_12bpc_c: 9086.7
inv_txfm_add_16x16_adst_dct_2_12bpc_avx2: 775.5
inv_txfm_add_16x16_adst_flipadst_0_12bpc_c: 9083.4
inv_txfm_add_16x16_adst_flipadst_0_12bpc_avx2: 645.6
inv_txfm_add_16x16_adst_flipadst_1_12bpc_c: 8998.3
inv_txfm_add_16x16_adst_flipadst_1_12bpc_avx2: 646.2
inv_txfm_add_16x16_adst_flipadst_2_12bpc_c: 9014.7
inv_txfm_add_16x16_adst_flipadst_2_12bpc_avx2: 873.8
inv_txfm_add_16x16_dct_adst_0_12bpc_c: 9080.1
inv_txfm_add_16x16_dct_adst_0_12bpc_avx2: 598.2
inv_txfm_add_16x16_dct_adst_1_12bpc_c: 9103.3
inv_txfm_add_16x16_dct_adst_1_12bpc_avx2: 598.1
inv_txfm_add_16x16_dct_adst_2_12bpc_c: 9089.5
inv_txfm_add_16x16_dct_adst_2_12bpc_avx2: 764.4
inv_txfm_add_16x16_dct_dct_0_12bpc_c: 1042.1
inv_txfm_add_16x16_dct_dct_0_12bpc_avx2: 28.6
inv_txfm_add_16x16_dct_dct_1_12bpc_c: 9164.6
inv_txfm_add_16x16_dct_dct_1_12bpc_avx2: 500.8
inv_txfm_add_16x16_dct_dct_2_12bpc_c: 9161.9
inv_txfm_add_16x16_dct_dct_2_12bpc_avx2: 678.2
inv_txfm_add_16x16_dct_flipadst_0_12bpc_c: 9104.9
inv_txfm_add_16x16_dct_flipadst_0_12bpc_avx2: 601.8
inv_txfm_add_16x16_dct_flipadst_1_12bpc_c: 9248.6
inv_txfm_add_16x16_dct_flipadst_1_12bpc_avx2: 599.2
inv_txfm_add_16x16_dct_flipadst_2_12bpc_c: 9087.4
inv_txfm_add_16x16_dct_flipadst_2_12bpc_avx2: 770.1
inv_txfm_add_16x16_dct_identity_0_12bpc_c: 6570.4
inv_txfm_add_16x16_dct_identity_0_12bpc_avx2: 243.9
inv_txfm_add_16x16_dct_identity_1_12bpc_c: 6615.4
inv_txfm_add_16x16_dct_identity_1_12bpc_avx2: 246.0
inv_txfm_add_16x16_dct_identity_2_12bpc_c: 6553.4
inv_txfm_add_16x16_dct_identity_2_12bpc_avx2: 435.0
inv_txfm_add_16x16_flipadst_adst_0_12bpc_c: 8982.1
inv_txfm_add_16x16_flipadst_adst_0_12bpc_avx2: 647.2
inv_txfm_add_16x16_flipadst_adst_1_12bpc_c: 8978.9
inv_txfm_add_16x16_flipadst_adst_1_12bpc_avx2: 647.2
inv_txfm_add_16x16_flipadst_adst_2_12bpc_c: 8964.0
inv_txfm_add_16x16_flipadst_adst_2_12bpc_avx2: 868.4
inv_txfm_add_16x16_flipadst_dct_0_12bpc_c: 9083.5
inv_txfm_add_16x16_flipadst_dct_0_12bpc_avx2: 550.0
inv_txfm_add_16x16_flipadst_dct_1_12bpc_c: 9070.4
inv_txfm_add_16x16_flipadst_dct_1_12bpc_avx2: 550.2
inv_txfm_add_16x16_flipadst_dct_2_12bpc_c: 9085.8
inv_txfm_add_16x16_flipadst_dct_2_12bpc_avx2: 779.7
inv_txfm_add_16x16_flipadst_flipadst_0_12bpc_c: 8977.1
inv_txfm_add_16x16_flipadst_flipadst_0_12bpc_avx2: 657.3
inv_txfm_add_16x16_flipadst_flipadst_1_12bpc_c: 9002.0
inv_txfm_add_16x16_flipadst_flipadst_1_12bpc_avx2: 657.3
inv_txfm_add_16x16_flipadst_flipadst_2_12bpc_c: 9008.4
inv_txfm_add_16x16_flipadst_flipadst_2_12bpc_avx2: 872.0
inv_txfm_add_16x16_identity_dct_0_12bpc_c: 6504.7
inv_txfm_add_16x16_identity_dct_0_12bpc_avx2: 387.5
inv_txfm_add_16x16_identity_dct_1_12bpc_c: 6548.3
inv_txfm_add_16x16_identity_dct_1_12bpc_avx2: 387.5
inv_txfm_add_16x16_identity_dct_2_12bpc_c: 6512.4
inv_txfm_add_16x16_identity_dct_2_12bpc_avx2: 387.5
inv_txfm_add_16x16_identity_identity_0_12bpc_c: 3926.2
inv_txfm_add_16x16_identity_identity_0_12bpc_avx2: 135.0
inv_txfm_add_16x16_identity_identity_1_12bpc_c: 3896.7
inv_txfm_add_16x16_identity_identity_1_12bpc_avx2: 134.5
inv_txfm_add_16x16_identity_identity_2_12bpc_c: 3888.0
inv_txfm_add_16x16_identity_identity_2_12bpc_avx2: 230.3
2022-01-24 18:11:46 +01:00
Matthias Dressel
0a596b6fa1
x86/filmgrain: Don't use AVX2 for fgy, fguv on CPUs with slow gather
...
Filmgrain is using a lot of `vpgatherdd` instructions which are rather
slow on certain chips, making the SSSE3 version faster.
Fixes #377
2022-01-11 23:24:41 +01:00
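
The selection rule described in the filmgrain commit above can be illustrated with a short sketch. It is written in Python for brevity and does not reproduce dav1d's actual C dispatch code: the `CpuFlag` members and the function `pick_filmgrain_impl` are hypothetical names chosen for this example.

```python
from enum import Flag, auto


class CpuFlag(Flag):
    """Hypothetical CPU feature flags, named for illustration only."""
    NONE = 0
    SSSE3 = auto()
    AVX2 = auto()
    SLOW_GATHER = auto()


def pick_filmgrain_impl(flags):
    """Choose an implementation for the gather-heavy filmgrain functions.

    AVX2 is used only when the CPU is not marked as having slow gather
    instructions; otherwise the SSSE3 version is preferred, with plain C
    as the last resort.
    """
    if CpuFlag.AVX2 in flags and CpuFlag.SLOW_GATHER not in flags:
        return "avx2"
    if CpuFlag.SSSE3 in flags:
        return "ssse3"
    return "c"


if __name__ == "__main__":
    fast = CpuFlag.SSSE3 | CpuFlag.AVX2
    slow = fast | CpuFlag.SLOW_GATHER
    print(pick_filmgrain_impl(fast))
    print(pick_filmgrain_impl(slow))
```

Treating slow gather as a separate flag, rather than clearing the AVX2 flag outright, keeps AVX2 available to functions that do not rely on `vpgatherdd`.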
Matthias Dressel and Henrik Gramner
e663897a94
x86: Detect CPUs with slow AVX2 gather
...
`vpgather*` instructions seem to be relatively slow on current AMD
chips. Intel Haswell is slow as well, but just (barely) fast enough to
not cause regressions in our current use cases.
Co-authored-by: Henrik Gramner <gramner@twoorioles.com>
2022-01-11 23:24:41 +01:00
Matthias Dressel
633c63ed51
README: Add the new documentation option
2022-01-04 15:26:01 +01:00
Matthias Dressel
37881b8278
ppc: Rename types.h to dav1d_types.h
...
Avoid collision with system header using gcc7.
Fixes #363
2022-01-03 22:39:52 +01:00
Matthias Dressel
3e5b7d3770
CI: Add enable_docs option
2021-12-29 17:25:37 +01:00
Matthias Dressel and Rudi Heitbaum
5e67cfd806
meson: Add explicit option to build documentation
...
Co-authored-by: Rudi Heitbaum <rudi@heitbaum.com>
2021-12-29 17:25:37 +01:00
Matthias Dressel
f266b3b295
README: Update minimum meson version
...
Changed in d85fdf52
2021-12-28 21:15:50 +01:00
Matthias Dressel
e8a3f99d90
x86/itx: Add 16x8 12bpc AVX2 transforms
...
inv_txfm_add_16x8_adst_adst_0_12bpc_c: 4517.9
inv_txfm_add_16x8_adst_adst_0_12bpc_avx2: 432.4
inv_txfm_add_16x8_adst_adst_1_12bpc_c: 4510.9
inv_txfm_add_16x8_adst_adst_1_12bpc_avx2: 432.4
inv_txfm_add_16x8_adst_adst_2_12bpc_c: 4498.6
inv_txfm_add_16x8_adst_adst_2_12bpc_avx2: 432.4
inv_txfm_add_16x8_adst_dct_0_12bpc_c: 4553.8
inv_txfm_add_16x8_adst_dct_0_12bpc_avx2: 389.1
inv_txfm_add_16x8_adst_dct_1_12bpc_c: 4543.3
inv_txfm_add_16x8_adst_dct_1_12bpc_avx2: 389.1
inv_txfm_add_16x8_adst_dct_2_12bpc_c: 4538.4
inv_txfm_add_16x8_adst_dct_2_12bpc_avx2: 389.1
inv_txfm_add_16x8_adst_flipadst_0_12bpc_c: 4532.6
inv_txfm_add_16x8_adst_flipadst_0_12bpc_avx2: 435.4
inv_txfm_add_16x8_adst_flipadst_1_12bpc_c: 4520.4
inv_txfm_add_16x8_adst_flipadst_1_12bpc_avx2: 435.4
inv_txfm_add_16x8_adst_flipadst_2_12bpc_c: 4516.2
inv_txfm_add_16x8_adst_flipadst_2_12bpc_avx2: 435.4
inv_txfm_add_16x8_adst_identity_0_12bpc_c: 3502.3
inv_txfm_add_16x8_adst_identity_0_12bpc_avx2: 255.9
inv_txfm_add_16x8_adst_identity_1_12bpc_c: 3492.9
inv_txfm_add_16x8_adst_identity_1_12bpc_avx2: 256.3
inv_txfm_add_16x8_adst_identity_2_12bpc_c: 3471.4
inv_txfm_add_16x8_adst_identity_2_12bpc_avx2: 256.7
inv_txfm_add_16x8_dct_adst_0_12bpc_c: 4563.2
inv_txfm_add_16x8_dct_adst_0_12bpc_avx2: 383.6
inv_txfm_add_16x8_dct_adst_1_12bpc_c: 4573.1
inv_txfm_add_16x8_dct_adst_1_12bpc_avx2: 383.9
inv_txfm_add_16x8_dct_adst_2_12bpc_c: 4562.2
inv_txfm_add_16x8_dct_adst_2_12bpc_avx2: 383.7
inv_txfm_add_16x8_dct_dct_0_12bpc_c: 514.0
inv_txfm_add_16x8_dct_dct_0_12bpc_avx2: 25.0
inv_txfm_add_16x8_dct_dct_1_12bpc_c: 4540.5
inv_txfm_add_16x8_dct_dct_1_12bpc_avx2: 340.4
inv_txfm_add_16x8_dct_dct_2_12bpc_c: 4563.0
inv_txfm_add_16x8_dct_dct_2_12bpc_avx2: 339.3
inv_txfm_add_16x8_dct_flipadst_0_12bpc_c: 4568.0
inv_txfm_add_16x8_dct_flipadst_0_12bpc_avx2: 385.9
inv_txfm_add_16x8_dct_flipadst_1_12bpc_c: 4577.5
inv_txfm_add_16x8_dct_flipadst_1_12bpc_avx2: 385.8
inv_txfm_add_16x8_dct_flipadst_2_12bpc_c: 4573.8
inv_txfm_add_16x8_dct_flipadst_2_12bpc_avx2: 385.8
inv_txfm_add_16x8_dct_identity_0_12bpc_c: 3549.9
inv_txfm_add_16x8_dct_identity_0_12bpc_avx2: 212.1
inv_txfm_add_16x8_dct_identity_1_12bpc_c: 3538.7
inv_txfm_add_16x8_dct_identity_1_12bpc_avx2: 212.1
inv_txfm_add_16x8_dct_identity_2_12bpc_c: 3539.7
inv_txfm_add_16x8_dct_identity_2_12bpc_avx2: 212.1
inv_txfm_add_16x8_flipadst_adst_0_12bpc_c: 4495.3
inv_txfm_add_16x8_flipadst_adst_0_12bpc_avx2: 431.4
inv_txfm_add_16x8_flipadst_adst_1_12bpc_c: 4496.3
inv_txfm_add_16x8_flipadst_adst_1_12bpc_avx2: 431.4
inv_txfm_add_16x8_flipadst_adst_2_12bpc_c: 4499.2
inv_txfm_add_16x8_flipadst_adst_2_12bpc_avx2: 431.3
inv_txfm_add_16x8_flipadst_dct_0_12bpc_c: 4506.9
inv_txfm_add_16x8_flipadst_dct_0_12bpc_avx2: 386.3
inv_txfm_add_16x8_flipadst_dct_1_12bpc_c: 4512.9
inv_txfm_add_16x8_flipadst_dct_1_12bpc_avx2: 386.0
inv_txfm_add_16x8_flipadst_dct_2_12bpc_c: 4503.2
inv_txfm_add_16x8_flipadst_dct_2_12bpc_avx2: 386.0
inv_txfm_add_16x8_flipadst_flipadst_0_12bpc_c: 4509.1
inv_txfm_add_16x8_flipadst_flipadst_0_12bpc_avx2: 432.2
inv_txfm_add_16x8_flipadst_flipadst_1_12bpc_c: 4519.0
inv_txfm_add_16x8_flipadst_flipadst_1_12bpc_avx2: 432.1
inv_txfm_add_16x8_flipadst_flipadst_2_12bpc_c: 4518.3
inv_txfm_add_16x8_flipadst_flipadst_2_12bpc_avx2: 432.1
inv_txfm_add_16x8_flipadst_identity_0_12bpc_c: 3511.0
inv_txfm_add_16x8_flipadst_identity_0_12bpc_avx2: 257.1
inv_txfm_add_16x8_flipadst_identity_1_12bpc_c: 3518.5
inv_txfm_add_16x8_flipadst_identity_1_12bpc_avx2: 257.2
inv_txfm_add_16x8_flipadst_identity_2_12bpc_c: 3521.7
inv_txfm_add_16x8_flipadst_identity_2_12bpc_avx2: 257.1
inv_txfm_add_16x8_identity_adst_0_12bpc_c: 3166.8
inv_txfm_add_16x8_identity_adst_0_12bpc_avx2: 268.6
inv_txfm_add_16x8_identity_adst_1_12bpc_c: 3157.9
inv_txfm_add_16x8_identity_adst_1_12bpc_avx2: 268.6
inv_txfm_add_16x8_identity_adst_2_12bpc_c: 3156.5
inv_txfm_add_16x8_identity_adst_2_12bpc_avx2: 268.6
inv_txfm_add_16x8_identity_dct_0_12bpc_c: 3187.4
inv_txfm_add_16x8_identity_dct_0_12bpc_avx2: 224.4
inv_txfm_add_16x8_identity_dct_1_12bpc_c: 3185.8
inv_txfm_add_16x8_identity_dct_1_12bpc_avx2: 224.4
inv_txfm_add_16x8_identity_dct_2_12bpc_c: 3190.8
inv_txfm_add_16x8_identity_dct_2_12bpc_avx2: 224.4
inv_txfm_add_16x8_identity_flipadst_0_12bpc_c: 3167.7
inv_txfm_add_16x8_identity_flipadst_0_12bpc_avx2: 269.7
inv_txfm_add_16x8_identity_flipadst_1_12bpc_c: 3174.1
inv_txfm_add_16x8_identity_flipadst_1_12bpc_avx2: 269.8
inv_txfm_add_16x8_identity_flipadst_2_12bpc_c: 3174.7
inv_txfm_add_16x8_identity_flipadst_2_12bpc_avx2: 269.7
inv_txfm_add_16x8_identity_identity_0_12bpc_c: 2153.3
inv_txfm_add_16x8_identity_identity_0_12bpc_avx2: 99.1
inv_txfm_add_16x8_identity_identity_1_12bpc_c: 2143.6
inv_txfm_add_16x8_identity_identity_1_12bpc_avx2: 99.3
inv_txfm_add_16x8_identity_identity_2_12bpc_c: 2145.9
inv_txfm_add_16x8_identity_identity_2_12bpc_avx2: 98.6
2021-12-04 05:04:37 +01:00
Matthias Dressel
23e8405c2e
x86/itx: Add 8x16 12bpc AVX2 transforms
...
inv_txfm_add_8x16_adst_adst_0_12bpc_c: 4440.4
inv_txfm_add_8x16_adst_adst_0_12bpc_avx2: 354.3
inv_txfm_add_8x16_adst_adst_1_12bpc_c: 4437.3
inv_txfm_add_8x16_adst_adst_1_12bpc_avx2: 354.3
inv_txfm_add_8x16_adst_adst_2_12bpc_c: 4438.8
inv_txfm_add_8x16_adst_adst_2_12bpc_avx2: 442.6
inv_txfm_add_8x16_adst_dct_0_12bpc_c: 4507.3
inv_txfm_add_8x16_adst_dct_0_12bpc_avx2: 310.0
inv_txfm_add_8x16_adst_dct_1_12bpc_c: 4500.3
inv_txfm_add_8x16_adst_dct_1_12bpc_avx2: 310.0
inv_txfm_add_8x16_adst_dct_2_12bpc_c: 4516.1
inv_txfm_add_8x16_adst_dct_2_12bpc_avx2: 399.5
inv_txfm_add_8x16_adst_flipadst_0_12bpc_c: 4457.3
inv_txfm_add_8x16_adst_flipadst_0_12bpc_avx2: 355.6
inv_txfm_add_8x16_adst_flipadst_1_12bpc_c: 4441.3
inv_txfm_add_8x16_adst_flipadst_1_12bpc_avx2: 355.6
inv_txfm_add_8x16_adst_flipadst_2_12bpc_c: 4448.9
inv_txfm_add_8x16_adst_flipadst_2_12bpc_avx2: 445.5
inv_txfm_add_8x16_adst_identity_0_12bpc_c: 3204.0
inv_txfm_add_8x16_adst_identity_0_12bpc_avx2: 173.1
inv_txfm_add_8x16_adst_identity_1_12bpc_c: 3207.1
inv_txfm_add_8x16_adst_identity_1_12bpc_avx2: 173.6
inv_txfm_add_8x16_adst_identity_2_12bpc_c: 3210.4
inv_txfm_add_8x16_adst_identity_2_12bpc_avx2: 261.2
inv_txfm_add_8x16_dct_adst_0_12bpc_c: 4484.2
inv_txfm_add_8x16_dct_adst_0_12bpc_avx2: 334.0
inv_txfm_add_8x16_dct_adst_1_12bpc_c: 4503.8
inv_txfm_add_8x16_dct_adst_1_12bpc_avx2: 334.6
inv_txfm_add_8x16_dct_adst_2_12bpc_c: 4490.7
inv_txfm_add_8x16_dct_adst_2_12bpc_avx2: 395.6
inv_txfm_add_8x16_dct_dct_0_12bpc_c: 419.9
inv_txfm_add_8x16_dct_dct_0_12bpc_avx2: 37.6
inv_txfm_add_8x16_dct_dct_1_12bpc_c: 4482.6
inv_txfm_add_8x16_dct_dct_1_12bpc_avx2: 284.6
inv_txfm_add_8x16_dct_dct_2_12bpc_c: 4468.7
inv_txfm_add_8x16_dct_dct_2_12bpc_avx2: 348.3
inv_txfm_add_8x16_dct_flipadst_0_12bpc_c: 4468.4
inv_txfm_add_8x16_dct_flipadst_0_12bpc_avx2: 333.6
inv_txfm_add_8x16_dct_flipadst_1_12bpc_c: 4463.5
inv_txfm_add_8x16_dct_flipadst_1_12bpc_avx2: 333.5
inv_txfm_add_8x16_dct_flipadst_2_12bpc_c: 4459.4
inv_txfm_add_8x16_dct_flipadst_2_12bpc_avx2: 397.4
inv_txfm_add_8x16_dct_identity_0_12bpc_c: 3237.1
inv_txfm_add_8x16_dct_identity_0_12bpc_avx2: 149.6
inv_txfm_add_8x16_dct_identity_1_12bpc_c: 3229.9
inv_txfm_add_8x16_dct_identity_1_12bpc_avx2: 148.6
inv_txfm_add_8x16_dct_identity_2_12bpc_c: 3225.6
inv_txfm_add_8x16_dct_identity_2_12bpc_avx2: 211.3
inv_txfm_add_8x16_flipadst_adst_0_12bpc_c: 4532.1
inv_txfm_add_8x16_flipadst_adst_0_12bpc_avx2: 356.2
inv_txfm_add_8x16_flipadst_adst_1_12bpc_c: 4527.6
inv_txfm_add_8x16_flipadst_adst_1_12bpc_avx2: 356.1
inv_txfm_add_8x16_flipadst_adst_2_12bpc_c: 4532.5
inv_txfm_add_8x16_flipadst_adst_2_12bpc_avx2: 440.0
inv_txfm_add_8x16_flipadst_dct_0_12bpc_c: 4571.6
inv_txfm_add_8x16_flipadst_dct_0_12bpc_avx2: 310.3
inv_txfm_add_8x16_flipadst_dct_1_12bpc_c: 4554.5
inv_txfm_add_8x16_flipadst_dct_1_12bpc_avx2: 309.7
inv_txfm_add_8x16_flipadst_dct_2_12bpc_c: 4554.3
inv_txfm_add_8x16_flipadst_dct_2_12bpc_avx2: 399.9
inv_txfm_add_8x16_flipadst_flipadst_0_12bpc_c: 4497.2
inv_txfm_add_8x16_flipadst_flipadst_0_12bpc_avx2: 355.9
inv_txfm_add_8x16_flipadst_flipadst_1_12bpc_c: 4486.2
inv_txfm_add_8x16_flipadst_flipadst_1_12bpc_avx2: 355.6
inv_txfm_add_8x16_flipadst_flipadst_2_12bpc_c: 4493.4
inv_txfm_add_8x16_flipadst_flipadst_2_12bpc_avx2: 446.0
inv_txfm_add_8x16_flipadst_identity_0_12bpc_c: 3265.7
inv_txfm_add_8x16_flipadst_identity_0_12bpc_avx2: 173.8
inv_txfm_add_8x16_flipadst_identity_1_12bpc_c: 3270.8
inv_txfm_add_8x16_flipadst_identity_1_12bpc_avx2: 173.5
inv_txfm_add_8x16_flipadst_identity_2_12bpc_c: 3271.8
inv_txfm_add_8x16_flipadst_identity_2_12bpc_avx2: 261.6
inv_txfm_add_8x16_identity_adst_0_12bpc_c: 3295.3
inv_txfm_add_8x16_identity_adst_0_12bpc_avx2: 302.5
inv_txfm_add_8x16_identity_adst_1_12bpc_c: 3303.1
inv_txfm_add_8x16_identity_adst_1_12bpc_avx2: 303.0
inv_txfm_add_8x16_identity_adst_2_12bpc_c: 3304.6
inv_txfm_add_8x16_identity_adst_2_12bpc_avx2: 303.1
inv_txfm_add_8x16_identity_dct_0_12bpc_c: 3298.9
inv_txfm_add_8x16_identity_dct_0_12bpc_avx2: 257.8
inv_txfm_add_8x16_identity_dct_1_12bpc_c: 3308.1
inv_txfm_add_8x16_identity_dct_1_12bpc_avx2: 259.2
inv_txfm_add_8x16_identity_dct_2_12bpc_c: 3306.6
inv_txfm_add_8x16_identity_dct_2_12bpc_avx2: 259.2
inv_txfm_add_8x16_identity_flipadst_0_12bpc_c: 3294.7
inv_txfm_add_8x16_identity_flipadst_0_12bpc_avx2: 302.2
inv_txfm_add_8x16_identity_flipadst_1_12bpc_c: 3292.5
inv_txfm_add_8x16_identity_flipadst_1_12bpc_avx2: 302.2
inv_txfm_add_8x16_identity_flipadst_2_12bpc_c: 3275.4
inv_txfm_add_8x16_identity_flipadst_2_12bpc_avx2: 303.3
inv_txfm_add_8x16_identity_identity_0_12bpc_c: 2044.6
inv_txfm_add_8x16_identity_identity_0_12bpc_avx2: 116.2
inv_txfm_add_8x16_identity_identity_1_12bpc_c: 2059.9
inv_txfm_add_8x16_identity_identity_1_12bpc_avx2: 117.0
inv_txfm_add_8x16_identity_identity_2_12bpc_c: 2048.4
inv_txfm_add_8x16_identity_identity_2_12bpc_avx2: 116.2
2021-12-04 05:04:37 +01:00
Matthias Dressel
7be128579e
x86/itx: Add 16x4 12bpc AVX2 transforms
...
inv_txfm_add_16x4_adst_adst_0_12bpc_c: 1756.6
inv_txfm_add_16x4_adst_adst_0_12bpc_avx2: 182.4
inv_txfm_add_16x4_adst_adst_1_12bpc_c: 1756.0
inv_txfm_add_16x4_adst_adst_1_12bpc_avx2: 182.5
inv_txfm_add_16x4_adst_adst_2_12bpc_c: 1763.2
inv_txfm_add_16x4_adst_adst_2_12bpc_avx2: 182.4
inv_txfm_add_16x4_adst_dct_0_12bpc_c: 1863.6
inv_txfm_add_16x4_adst_dct_0_12bpc_avx2: 176.0
inv_txfm_add_16x4_adst_dct_1_12bpc_c: 1864.1
inv_txfm_add_16x4_adst_dct_1_12bpc_avx2: 176.0
inv_txfm_add_16x4_adst_dct_2_12bpc_c: 1861.3
inv_txfm_add_16x4_adst_dct_2_12bpc_avx2: 176.0
inv_txfm_add_16x4_adst_flipadst_0_12bpc_c: 1768.6
inv_txfm_add_16x4_adst_flipadst_0_12bpc_avx2: 184.1
inv_txfm_add_16x4_adst_flipadst_1_12bpc_c: 1768.8
inv_txfm_add_16x4_adst_flipadst_1_12bpc_avx2: 184.5
inv_txfm_add_16x4_adst_flipadst_2_12bpc_c: 1769.3
inv_txfm_add_16x4_adst_flipadst_2_12bpc_avx2: 184.7
inv_txfm_add_16x4_adst_identity_0_12bpc_c: 1686.6
inv_txfm_add_16x4_adst_identity_0_12bpc_avx2: 145.4
inv_txfm_add_16x4_adst_identity_1_12bpc_c: 1685.8
inv_txfm_add_16x4_adst_identity_1_12bpc_avx2: 145.8
inv_txfm_add_16x4_adst_identity_2_12bpc_c: 1681.7
inv_txfm_add_16x4_adst_identity_2_12bpc_avx2: 145.8
inv_txfm_add_16x4_dct_adst_0_12bpc_c: 1783.4
inv_txfm_add_16x4_dct_adst_0_12bpc_avx2: 167.7
inv_txfm_add_16x4_dct_adst_1_12bpc_c: 1789.1
inv_txfm_add_16x4_dct_adst_1_12bpc_avx2: 167.9
inv_txfm_add_16x4_dct_adst_2_12bpc_c: 1788.0
inv_txfm_add_16x4_dct_adst_2_12bpc_avx2: 169.8
inv_txfm_add_16x4_dct_dct_0_12bpc_c: 209.5
inv_txfm_add_16x4_dct_dct_0_12bpc_avx2: 21.6
inv_txfm_add_16x4_dct_dct_1_12bpc_c: 1894.3
inv_txfm_add_16x4_dct_dct_1_12bpc_avx2: 156.8
inv_txfm_add_16x4_dct_dct_2_12bpc_c: 1892.0
inv_txfm_add_16x4_dct_dct_2_12bpc_avx2: 156.8
inv_txfm_add_16x4_dct_flipadst_0_12bpc_c: 1784.7
inv_txfm_add_16x4_dct_flipadst_0_12bpc_avx2: 167.2
inv_txfm_add_16x4_dct_flipadst_1_12bpc_c: 1796.7
inv_txfm_add_16x4_dct_flipadst_1_12bpc_avx2: 168.6
inv_txfm_add_16x4_dct_flipadst_2_12bpc_c: 1788.9
inv_txfm_add_16x4_dct_flipadst_2_12bpc_avx2: 168.9
inv_txfm_add_16x4_dct_identity_0_12bpc_c: 1712.7
inv_txfm_add_16x4_dct_identity_0_12bpc_avx2: 128.8
inv_txfm_add_16x4_dct_identity_1_12bpc_c: 1714.8
inv_txfm_add_16x4_dct_identity_1_12bpc_avx2: 128.8
inv_txfm_add_16x4_dct_identity_2_12bpc_c: 1710.2
inv_txfm_add_16x4_dct_identity_2_12bpc_avx2: 128.8
inv_txfm_add_16x4_flipadst_adst_0_12bpc_c: 1763.6
inv_txfm_add_16x4_flipadst_adst_0_12bpc_avx2: 186.6
inv_txfm_add_16x4_flipadst_adst_1_12bpc_c: 1761.1
inv_txfm_add_16x4_flipadst_adst_1_12bpc_avx2: 185.6
inv_txfm_add_16x4_flipadst_adst_2_12bpc_c: 1761.8
inv_txfm_add_16x4_flipadst_adst_2_12bpc_avx2: 187.0
inv_txfm_add_16x4_flipadst_dct_0_12bpc_c: 1864.4
inv_txfm_add_16x4_flipadst_dct_0_12bpc_avx2: 176.8
inv_txfm_add_16x4_flipadst_dct_1_12bpc_c: 1862.7
inv_txfm_add_16x4_flipadst_dct_1_12bpc_avx2: 176.8
inv_txfm_add_16x4_flipadst_dct_2_12bpc_c: 1860.2
inv_txfm_add_16x4_flipadst_dct_2_12bpc_avx2: 176.8
inv_txfm_add_16x4_flipadst_flipadst_0_12bpc_c: 1760.4
inv_txfm_add_16x4_flipadst_flipadst_0_12bpc_avx2: 185.3
inv_txfm_add_16x4_flipadst_flipadst_1_12bpc_c: 1761.8
inv_txfm_add_16x4_flipadst_flipadst_1_12bpc_avx2: 185.3
inv_txfm_add_16x4_flipadst_flipadst_2_12bpc_c: 1766.5
inv_txfm_add_16x4_flipadst_flipadst_2_12bpc_avx2: 184.9
inv_txfm_add_16x4_flipadst_identity_0_12bpc_c: 1673.0
inv_txfm_add_16x4_flipadst_identity_0_12bpc_avx2: 143.1
inv_txfm_add_16x4_flipadst_identity_1_12bpc_c: 1673.2
inv_txfm_add_16x4_flipadst_identity_1_12bpc_avx2: 143.1
inv_txfm_add_16x4_flipadst_identity_2_12bpc_c: 1681.6
inv_txfm_add_16x4_flipadst_identity_2_12bpc_avx2: 143.2
inv_txfm_add_16x4_identity_adst_0_12bpc_c: 1128.7
inv_txfm_add_16x4_identity_adst_0_12bpc_avx2: 102.8
inv_txfm_add_16x4_identity_adst_1_12bpc_c: 1131.3
inv_txfm_add_16x4_identity_adst_1_12bpc_avx2: 101.3
inv_txfm_add_16x4_identity_adst_2_12bpc_c: 1127.5
inv_txfm_add_16x4_identity_adst_2_12bpc_avx2: 99.1
inv_txfm_add_16x4_identity_dct_0_12bpc_c: 1228.3
inv_txfm_add_16x4_identity_dct_0_12bpc_avx2: 88.3
inv_txfm_add_16x4_identity_dct_1_12bpc_c: 1220.5
inv_txfm_add_16x4_identity_dct_1_12bpc_avx2: 88.0
inv_txfm_add_16x4_identity_dct_2_12bpc_c: 1227.3
inv_txfm_add_16x4_identity_dct_2_12bpc_avx2: 88.1
inv_txfm_add_16x4_identity_flipadst_0_12bpc_c: 1142.4
inv_txfm_add_16x4_identity_flipadst_0_12bpc_avx2: 100.3
inv_txfm_add_16x4_identity_flipadst_1_12bpc_c: 1134.1
inv_txfm_add_16x4_identity_flipadst_1_12bpc_avx2: 100.3
inv_txfm_add_16x4_identity_flipadst_2_12bpc_c: 1136.4
inv_txfm_add_16x4_identity_flipadst_2_12bpc_avx2: 100.3
inv_txfm_add_16x4_identity_identity_0_12bpc_c: 1056.1
inv_txfm_add_16x4_identity_identity_0_12bpc_avx2: 61.6
inv_txfm_add_16x4_identity_identity_1_12bpc_c: 1064.6
inv_txfm_add_16x4_identity_identity_1_12bpc_avx2: 62.9
inv_txfm_add_16x4_identity_identity_2_12bpc_c: 1067.5
inv_txfm_add_16x4_identity_identity_2_12bpc_avx2: 63.5
2021-11-29 15:30:38 +01:00
Matthias Dressel
f64b2c2256
x86/itx: Add 4x16 12bpc AVX2 transforms
...
inv_txfm_add_4x16_adst_adst_0_12bpc_c: 1799.1
inv_txfm_add_4x16_adst_adst_0_12bpc_avx2: 178.8
inv_txfm_add_4x16_adst_adst_1_12bpc_c: 1795.0
inv_txfm_add_4x16_adst_adst_1_12bpc_avx2: 179.1
inv_txfm_add_4x16_adst_adst_2_12bpc_c: 1806.6
inv_txfm_add_4x16_adst_adst_2_12bpc_avx2: 179.3
inv_txfm_add_4x16_adst_dct_0_12bpc_c: 1824.8
inv_txfm_add_4x16_adst_dct_0_12bpc_avx2: 166.8
inv_txfm_add_4x16_adst_dct_1_12bpc_c: 1828.2
inv_txfm_add_4x16_adst_dct_1_12bpc_avx2: 166.7
inv_txfm_add_4x16_adst_dct_2_12bpc_c: 1830.9
inv_txfm_add_4x16_adst_dct_2_12bpc_avx2: 165.6
inv_txfm_add_4x16_adst_flipadst_0_12bpc_c: 1797.9
inv_txfm_add_4x16_adst_flipadst_0_12bpc_avx2: 179.6
inv_txfm_add_4x16_adst_flipadst_1_12bpc_c: 1795.9
inv_txfm_add_4x16_adst_flipadst_1_12bpc_avx2: 180.6
inv_txfm_add_4x16_adst_flipadst_2_12bpc_c: 1791.6
inv_txfm_add_4x16_adst_flipadst_2_12bpc_avx2: 180.1
inv_txfm_add_4x16_adst_identity_0_12bpc_c: 1163.7
inv_txfm_add_4x16_adst_identity_0_12bpc_avx2: 78.6
inv_txfm_add_4x16_adst_identity_1_12bpc_c: 1163.4
inv_txfm_add_4x16_adst_identity_1_12bpc_avx2: 78.9
inv_txfm_add_4x16_adst_identity_2_12bpc_c: 1164.3
inv_txfm_add_4x16_adst_identity_2_12bpc_avx2: 78.8
inv_txfm_add_4x16_dct_adst_0_12bpc_c: 1914.8
inv_txfm_add_4x16_dct_adst_0_12bpc_avx2: 177.0
inv_txfm_add_4x16_dct_adst_1_12bpc_c: 1904.8
inv_txfm_add_4x16_dct_adst_1_12bpc_avx2: 177.3
inv_txfm_add_4x16_dct_adst_2_12bpc_c: 1905.4
inv_txfm_add_4x16_dct_adst_2_12bpc_avx2: 176.4
inv_txfm_add_4x16_dct_dct_0_12bpc_c: 217.1
inv_txfm_add_4x16_dct_dct_0_12bpc_avx2: 26.6
inv_txfm_add_4x16_dct_dct_1_12bpc_c: 1955.1
inv_txfm_add_4x16_dct_dct_1_12bpc_avx2: 162.3
inv_txfm_add_4x16_dct_dct_2_12bpc_c: 1948.9
inv_txfm_add_4x16_dct_dct_2_12bpc_avx2: 162.2
inv_txfm_add_4x16_dct_flipadst_0_12bpc_c: 1922.8
inv_txfm_add_4x16_dct_flipadst_0_12bpc_avx2: 180.6
inv_txfm_add_4x16_dct_flipadst_1_12bpc_c: 1919.7
inv_txfm_add_4x16_dct_flipadst_1_12bpc_avx2: 180.1
inv_txfm_add_4x16_dct_flipadst_2_12bpc_c: 1912.0
inv_txfm_add_4x16_dct_flipadst_2_12bpc_avx2: 180.1
inv_txfm_add_4x16_dct_identity_0_12bpc_c: 1276.4
inv_txfm_add_4x16_dct_identity_0_12bpc_avx2: 75.4
inv_txfm_add_4x16_dct_identity_1_12bpc_c: 1277.5
inv_txfm_add_4x16_dct_identity_1_12bpc_avx2: 75.4
inv_txfm_add_4x16_dct_identity_2_12bpc_c: 1270.1
inv_txfm_add_4x16_dct_identity_2_12bpc_avx2: 75.3
inv_txfm_add_4x16_flipadst_adst_0_12bpc_c: 1802.8
inv_txfm_add_4x16_flipadst_adst_0_12bpc_avx2: 180.8
inv_txfm_add_4x16_flipadst_adst_1_12bpc_c: 1804.8
inv_txfm_add_4x16_flipadst_adst_1_12bpc_avx2: 180.7
inv_txfm_add_4x16_flipadst_adst_2_12bpc_c: 1800.6
inv_txfm_add_4x16_flipadst_adst_2_12bpc_avx2: 181.2
inv_txfm_add_4x16_flipadst_dct_0_12bpc_c: 1842.5
inv_txfm_add_4x16_flipadst_dct_0_12bpc_avx2: 165.1
inv_txfm_add_4x16_flipadst_dct_1_12bpc_c: 1837.8
inv_txfm_add_4x16_flipadst_dct_1_12bpc_avx2: 164.4
inv_txfm_add_4x16_flipadst_dct_2_12bpc_c: 1841.6
inv_txfm_add_4x16_flipadst_dct_2_12bpc_avx2: 166.1
inv_txfm_add_4x16_flipadst_flipadst_0_12bpc_c: 1812.4
inv_txfm_add_4x16_flipadst_flipadst_0_12bpc_avx2: 182.0
inv_txfm_add_4x16_flipadst_flipadst_1_12bpc_c: 1803.9
inv_txfm_add_4x16_flipadst_flipadst_1_12bpc_avx2: 181.2
inv_txfm_add_4x16_flipadst_flipadst_2_12bpc_c: 1809.9
inv_txfm_add_4x16_flipadst_flipadst_2_12bpc_avx2: 183.2
inv_txfm_add_4x16_flipadst_identity_0_12bpc_c: 1170.5
inv_txfm_add_4x16_flipadst_identity_0_12bpc_avx2: 78.4
inv_txfm_add_4x16_flipadst_identity_1_12bpc_c: 1172.1
inv_txfm_add_4x16_flipadst_identity_1_12bpc_avx2: 80.0
inv_txfm_add_4x16_flipadst_identity_2_12bpc_c: 1170.9
inv_txfm_add_4x16_flipadst_identity_2_12bpc_avx2: 78.6
inv_txfm_add_4x16_identity_adst_0_12bpc_c: 1705.4
inv_txfm_add_4x16_identity_adst_0_12bpc_avx2: 162.6
inv_txfm_add_4x16_identity_adst_1_12bpc_c: 1714.5
inv_txfm_add_4x16_identity_adst_1_12bpc_avx2: 162.6
inv_txfm_add_4x16_identity_adst_2_12bpc_c: 1703.1
inv_txfm_add_4x16_identity_adst_2_12bpc_avx2: 162.5
inv_txfm_add_4x16_identity_dct_0_12bpc_c: 1775.0
inv_txfm_add_4x16_identity_dct_0_12bpc_avx2: 150.5
inv_txfm_add_4x16_identity_dct_1_12bpc_c: 1753.0
inv_txfm_add_4x16_identity_dct_1_12bpc_avx2: 150.6
inv_txfm_add_4x16_identity_dct_2_12bpc_c: 1759.6
inv_txfm_add_4x16_identity_dct_2_12bpc_avx2: 149.8
inv_txfm_add_4x16_identity_flipadst_0_12bpc_c: 1727.5
inv_txfm_add_4x16_identity_flipadst_0_12bpc_avx2: 160.3
inv_txfm_add_4x16_identity_flipadst_1_12bpc_c: 1739.8
inv_txfm_add_4x16_identity_flipadst_1_12bpc_avx2: 160.9
inv_txfm_add_4x16_identity_flipadst_2_12bpc_c: 1728.3
inv_txfm_add_4x16_identity_flipadst_2_12bpc_avx2: 159.9
inv_txfm_add_4x16_identity_identity_0_12bpc_c: 1098.6
inv_txfm_add_4x16_identity_identity_0_12bpc_avx2: 60.4
inv_txfm_add_4x16_identity_identity_1_12bpc_c: 1095.4
inv_txfm_add_4x16_identity_identity_1_12bpc_avx2: 61.3
inv_txfm_add_4x16_identity_identity_2_12bpc_c: 1111.6
inv_txfm_add_4x16_identity_identity_2_12bpc_avx2: 60.6
2021-11-29 15:30:38 +01:00
Matthias Dressel
00f92f2ccb
x86/itx: Convert 8bpc WHT to SSE2
...
WHT uses no SSSE3 instructions. The 16bpc variant is already SSE2.
2021-11-29 14:56:25 +01:00
Matthias Dressel
31820a5e6b
x86/itx: Add 8x8 12bpc AVX2 transforms
...
inv_txfm_add_8x8_adst_adst_0_12bpc_c: 1997.9
inv_txfm_add_8x8_adst_adst_0_12bpc_avx2: 185.7
inv_txfm_add_8x8_adst_adst_1_12bpc_c: 2009.8
inv_txfm_add_8x8_adst_adst_1_12bpc_avx2: 185.7
inv_txfm_add_8x8_adst_dct_0_12bpc_c: 1991.0
inv_txfm_add_8x8_adst_dct_0_12bpc_avx2: 161.3
inv_txfm_add_8x8_adst_dct_1_12bpc_c: 1977.0
inv_txfm_add_8x8_adst_dct_1_12bpc_avx2: 161.4
inv_txfm_add_8x8_adst_flipadst_0_12bpc_c: 2017.6
inv_txfm_add_8x8_adst_flipadst_0_12bpc_avx2: 184.2
inv_txfm_add_8x8_adst_flipadst_1_12bpc_c: 2018.9
inv_txfm_add_8x8_adst_flipadst_1_12bpc_avx2: 184.2
inv_txfm_add_8x8_adst_identity_0_12bpc_c: 1407.2
inv_txfm_add_8x8_adst_identity_0_12bpc_avx2: 95.7
inv_txfm_add_8x8_adst_identity_1_12bpc_c: 1405.9
inv_txfm_add_8x8_adst_identity_1_12bpc_avx2: 95.8
inv_txfm_add_8x8_dct_adst_0_12bpc_c: 2024.2
inv_txfm_add_8x8_dct_adst_0_12bpc_avx2: 156.9
inv_txfm_add_8x8_dct_adst_1_12bpc_c: 2018.8
inv_txfm_add_8x8_dct_adst_1_12bpc_avx2: 160.1
inv_txfm_add_8x8_dct_dct_0_12bpc_c: 213.0
inv_txfm_add_8x8_dct_dct_0_12bpc_avx2: 24.8
inv_txfm_add_8x8_dct_dct_1_12bpc_c: 2008.6
inv_txfm_add_8x8_dct_dct_1_12bpc_avx2: 139.0
inv_txfm_add_8x8_dct_flipadst_0_12bpc_c: 2012.3
inv_txfm_add_8x8_dct_flipadst_0_12bpc_avx2: 159.2
inv_txfm_add_8x8_dct_flipadst_1_12bpc_c: 2005.1
inv_txfm_add_8x8_dct_flipadst_1_12bpc_avx2: 158.7
inv_txfm_add_8x8_dct_identity_0_12bpc_c: 1470.4
inv_txfm_add_8x8_dct_identity_0_12bpc_avx2: 71.7
inv_txfm_add_8x8_dct_identity_1_12bpc_c: 1477.8
inv_txfm_add_8x8_dct_identity_1_12bpc_avx2: 70.7
inv_txfm_add_8x8_flipadst_adst_0_12bpc_c: 2006.1
inv_txfm_add_8x8_flipadst_adst_0_12bpc_avx2: 183.6
inv_txfm_add_8x8_flipadst_adst_1_12bpc_c: 1987.6
inv_txfm_add_8x8_flipadst_adst_1_12bpc_avx2: 183.6
inv_txfm_add_8x8_flipadst_dct_0_12bpc_c: 1986.6
inv_txfm_add_8x8_flipadst_dct_0_12bpc_avx2: 163.0
inv_txfm_add_8x8_flipadst_dct_1_12bpc_c: 1979.3
inv_txfm_add_8x8_flipadst_dct_1_12bpc_avx2: 163.1
inv_txfm_add_8x8_flipadst_flipadst_0_12bpc_c: 2004.0
inv_txfm_add_8x8_flipadst_flipadst_0_12bpc_avx2: 184.3
inv_txfm_add_8x8_flipadst_flipadst_1_12bpc_c: 2003.9
inv_txfm_add_8x8_flipadst_flipadst_1_12bpc_avx2: 184.3
inv_txfm_add_8x8_flipadst_identity_0_12bpc_c: 1433.5
inv_txfm_add_8x8_flipadst_identity_0_12bpc_avx2: 95.3
inv_txfm_add_8x8_flipadst_identity_1_12bpc_c: 1425.4
inv_txfm_add_8x8_flipadst_identity_1_12bpc_avx2: 96.3
inv_txfm_add_8x8_identity_adst_0_12bpc_c: 1456.5
inv_txfm_add_8x8_identity_adst_0_12bpc_avx2: 115.8
inv_txfm_add_8x8_identity_adst_1_12bpc_c: 1453.5
inv_txfm_add_8x8_identity_adst_1_12bpc_avx2: 115.8
inv_txfm_add_8x8_identity_dct_0_12bpc_c: 1450.0
inv_txfm_add_8x8_identity_dct_0_12bpc_avx2: 93.5
inv_txfm_add_8x8_identity_dct_1_12bpc_c: 1447.5
inv_txfm_add_8x8_identity_dct_1_12bpc_avx2: 94.3
inv_txfm_add_8x8_identity_flipadst_0_12bpc_c: 1451.7
inv_txfm_add_8x8_identity_flipadst_0_12bpc_avx2: 114.0
inv_txfm_add_8x8_identity_flipadst_1_12bpc_c: 1456.4
inv_txfm_add_8x8_identity_flipadst_1_12bpc_avx2: 114.0
inv_txfm_add_8x8_identity_identity_0_12bpc_c: 892.3
inv_txfm_add_8x8_identity_identity_0_12bpc_avx2: 33.7
inv_txfm_add_8x8_identity_identity_1_12bpc_c: 897.2
inv_txfm_add_8x8_identity_identity_1_12bpc_avx2: 33.1
2021-11-13 15:04:54 +01:00
Matthias Dressel
53cf6a3b65
x86/itx: Add 8x4 12bpc AVX2 transforms
...
inv_txfm_add_8x4_adst_adst_0_12bpc_c: 882.1
inv_txfm_add_8x4_adst_adst_0_12bpc_avx2: 113.7
inv_txfm_add_8x4_adst_adst_1_12bpc_c: 882.5
inv_txfm_add_8x4_adst_adst_1_12bpc_avx2: 113.8
inv_txfm_add_8x4_adst_dct_0_12bpc_c: 928.0
inv_txfm_add_8x4_adst_dct_0_12bpc_avx2: 109.2
inv_txfm_add_8x4_adst_dct_1_12bpc_c: 924.9
inv_txfm_add_8x4_adst_dct_1_12bpc_avx2: 109.2
inv_txfm_add_8x4_adst_flipadst_0_12bpc_c: 889.9
inv_txfm_add_8x4_adst_flipadst_0_12bpc_avx2: 114.3
inv_txfm_add_8x4_adst_flipadst_1_12bpc_c: 886.0
inv_txfm_add_8x4_adst_flipadst_1_12bpc_avx2: 114.8
inv_txfm_add_8x4_adst_identity_0_12bpc_c: 832.2
inv_txfm_add_8x4_adst_identity_0_12bpc_avx2: 88.8
inv_txfm_add_8x4_adst_identity_1_12bpc_c: 834.6
inv_txfm_add_8x4_adst_identity_1_12bpc_avx2: 89.0
inv_txfm_add_8x4_dct_adst_0_12bpc_c: 870.3
inv_txfm_add_8x4_dct_adst_0_12bpc_avx2: 96.3
inv_txfm_add_8x4_dct_adst_1_12bpc_c: 884.6
inv_txfm_add_8x4_dct_adst_1_12bpc_avx2: 96.3
inv_txfm_add_8x4_dct_dct_0_12bpc_c: 116.1
inv_txfm_add_8x4_dct_dct_0_12bpc_avx2: 24.5
inv_txfm_add_8x4_dct_dct_1_12bpc_c: 925.1
inv_txfm_add_8x4_dct_dct_1_12bpc_avx2: 92.3
inv_txfm_add_8x4_dct_flipadst_0_12bpc_c: 882.7
inv_txfm_add_8x4_dct_flipadst_0_12bpc_avx2: 97.0
inv_txfm_add_8x4_dct_flipadst_1_12bpc_c: 882.1
inv_txfm_add_8x4_dct_flipadst_1_12bpc_avx2: 97.0
inv_txfm_add_8x4_dct_identity_0_12bpc_c: 827.5
inv_txfm_add_8x4_dct_identity_0_12bpc_avx2: 72.4
inv_txfm_add_8x4_dct_identity_1_12bpc_c: 827.8
inv_txfm_add_8x4_dct_identity_1_12bpc_avx2: 73.8
inv_txfm_add_8x4_flipadst_adst_0_12bpc_c: 899.5
inv_txfm_add_8x4_flipadst_adst_0_12bpc_avx2: 113.2
inv_txfm_add_8x4_flipadst_adst_1_12bpc_c: 898.8
inv_txfm_add_8x4_flipadst_adst_1_12bpc_avx2: 113.3
inv_txfm_add_8x4_flipadst_dct_0_12bpc_c: 945.7
inv_txfm_add_8x4_flipadst_dct_0_12bpc_avx2: 108.3
inv_txfm_add_8x4_flipadst_dct_1_12bpc_c: 945.6
inv_txfm_add_8x4_flipadst_dct_1_12bpc_avx2: 108.3
inv_txfm_add_8x4_flipadst_flipadst_0_12bpc_c: 903.6
inv_txfm_add_8x4_flipadst_flipadst_0_12bpc_avx2: 113.9
inv_txfm_add_8x4_flipadst_flipadst_1_12bpc_c: 902.8
inv_txfm_add_8x4_flipadst_flipadst_1_12bpc_avx2: 114.2
inv_txfm_add_8x4_flipadst_identity_0_12bpc_c: 856.6
inv_txfm_add_8x4_flipadst_identity_0_12bpc_avx2: 88.3
inv_txfm_add_8x4_flipadst_identity_1_12bpc_c: 848.8
inv_txfm_add_8x4_flipadst_identity_1_12bpc_avx2: 87.4
inv_txfm_add_8x4_identity_adst_0_12bpc_c: 583.2
inv_txfm_add_8x4_identity_adst_0_12bpc_avx2: 69.6
inv_txfm_add_8x4_identity_adst_1_12bpc_c: 584.3
inv_txfm_add_8x4_identity_adst_1_12bpc_avx2: 69.6
inv_txfm_add_8x4_identity_dct_0_12bpc_c: 632.9
inv_txfm_add_8x4_identity_dct_0_12bpc_avx2: 65.3
inv_txfm_add_8x4_identity_dct_1_12bpc_c: 629.6
inv_txfm_add_8x4_identity_dct_1_12bpc_avx2: 65.8
inv_txfm_add_8x4_identity_flipadst_0_12bpc_c: 587.0
inv_txfm_add_8x4_identity_flipadst_0_12bpc_avx2: 71.0
inv_txfm_add_8x4_identity_flipadst_1_12bpc_c: 586.9
inv_txfm_add_8x4_identity_flipadst_1_12bpc_avx2: 71.0
inv_txfm_add_8x4_identity_identity_0_12bpc_c: 533.0
inv_txfm_add_8x4_identity_identity_0_12bpc_avx2: 45.3
inv_txfm_add_8x4_identity_identity_1_12bpc_c: 539.7
inv_txfm_add_8x4_identity_identity_1_12bpc_avx2: 45.9
2021-11-13 15:04:54 +01:00
Matthias Dressel
241753f5be
x86/itx: Add 4x8 12bpc AVX2 transforms
...
inv_txfm_add_4x8_adst_adst_0_12bpc_c: 900.8
inv_txfm_add_4x8_adst_adst_0_12bpc_avx2: 118.8
inv_txfm_add_4x8_adst_adst_1_12bpc_c: 893.7
inv_txfm_add_4x8_adst_adst_1_12bpc_avx2: 118.8
inv_txfm_add_4x8_adst_dct_0_12bpc_c: 890.2
inv_txfm_add_4x8_adst_dct_0_12bpc_avx2: 104.8
inv_txfm_add_4x8_adst_dct_1_12bpc_c: 887.4
inv_txfm_add_4x8_adst_dct_1_12bpc_avx2: 104.8
inv_txfm_add_4x8_adst_flipadst_0_12bpc_c: 919.6
inv_txfm_add_4x8_adst_flipadst_0_12bpc_avx2: 116.6
inv_txfm_add_4x8_adst_flipadst_1_12bpc_c: 912.1
inv_txfm_add_4x8_adst_flipadst_1_12bpc_avx2: 116.6
inv_txfm_add_4x8_adst_identity_0_12bpc_c: 613.5
inv_txfm_add_4x8_adst_identity_0_12bpc_avx2: 42.8
inv_txfm_add_4x8_adst_identity_1_12bpc_c: 608.7
inv_txfm_add_4x8_adst_identity_1_12bpc_avx2: 43.3
inv_txfm_add_4x8_dct_adst_0_12bpc_c: 951.7
inv_txfm_add_4x8_dct_adst_0_12bpc_avx2: 113.8
inv_txfm_add_4x8_dct_adst_1_12bpc_c: 949.0
inv_txfm_add_4x8_dct_adst_1_12bpc_avx2: 113.1
inv_txfm_add_4x8_dct_dct_0_12bpc_c: 118.6
inv_txfm_add_4x8_dct_dct_0_12bpc_avx2: 24.5
inv_txfm_add_4x8_dct_dct_1_12bpc_c: 942.4
inv_txfm_add_4x8_dct_dct_1_12bpc_avx2: 99.2
inv_txfm_add_4x8_dct_flipadst_0_12bpc_c: 959.3
inv_txfm_add_4x8_dct_flipadst_0_12bpc_avx2: 113.9
inv_txfm_add_4x8_dct_flipadst_1_12bpc_c: 964.1
inv_txfm_add_4x8_dct_flipadst_1_12bpc_avx2: 114.3
inv_txfm_add_4x8_dct_identity_0_12bpc_c: 659.9
inv_txfm_add_4x8_dct_identity_0_12bpc_avx2: 41.9
inv_txfm_add_4x8_dct_identity_1_12bpc_c: 658.6
inv_txfm_add_4x8_dct_identity_1_12bpc_avx2: 41.6
inv_txfm_add_4x8_flipadst_adst_0_12bpc_c: 906.6
inv_txfm_add_4x8_flipadst_adst_0_12bpc_avx2: 117.3
inv_txfm_add_4x8_flipadst_adst_1_12bpc_c: 907.7
inv_txfm_add_4x8_flipadst_adst_1_12bpc_avx2: 117.3
inv_txfm_add_4x8_flipadst_dct_0_12bpc_c: 890.3
inv_txfm_add_4x8_flipadst_dct_0_12bpc_avx2: 104.6
inv_txfm_add_4x8_flipadst_dct_1_12bpc_c: 895.6
inv_txfm_add_4x8_flipadst_dct_1_12bpc_avx2: 104.6
inv_txfm_add_4x8_flipadst_flipadst_0_12bpc_c: 902.9
inv_txfm_add_4x8_flipadst_flipadst_0_12bpc_avx2: 116.5
inv_txfm_add_4x8_flipadst_flipadst_1_12bpc_c: 915.0
inv_txfm_add_4x8_flipadst_flipadst_1_12bpc_avx2: 116.4
inv_txfm_add_4x8_flipadst_identity_0_12bpc_c: 618.6
inv_txfm_add_4x8_flipadst_identity_0_12bpc_avx2: 45.3
inv_txfm_add_4x8_flipadst_identity_1_12bpc_c: 618.1
inv_txfm_add_4x8_flipadst_identity_1_12bpc_avx2: 44.0
inv_txfm_add_4x8_identity_adst_0_12bpc_c: 829.7
inv_txfm_add_4x8_identity_adst_0_12bpc_avx2: 107.4
inv_txfm_add_4x8_identity_adst_1_12bpc_c: 831.7
inv_txfm_add_4x8_identity_adst_1_12bpc_avx2: 107.8
inv_txfm_add_4x8_identity_dct_0_12bpc_c: 823.2
inv_txfm_add_4x8_identity_dct_0_12bpc_avx2: 90.7
inv_txfm_add_4x8_identity_dct_1_12bpc_c: 824.1
inv_txfm_add_4x8_identity_dct_1_12bpc_avx2: 90.7
inv_txfm_add_4x8_identity_flipadst_0_12bpc_c: 853.4
inv_txfm_add_4x8_identity_flipadst_0_12bpc_avx2: 106.8
inv_txfm_add_4x8_identity_flipadst_1_12bpc_c: 852.2
inv_txfm_add_4x8_identity_flipadst_1_12bpc_avx2: 106.8
inv_txfm_add_4x8_identity_identity_0_12bpc_c: 543.2
inv_txfm_add_4x8_identity_identity_0_12bpc_avx2: 36.4
inv_txfm_add_4x8_identity_identity_1_12bpc_c: 544.8
inv_txfm_add_4x8_identity_identity_1_12bpc_avx2: 36.6
2021-11-13 13:58:28 +01:00
Matthias Dressel
9727d8579b
CI: Check for potentially dangerous Unicode characters
...
Bidirectional control and invisible characters can be used to hide
malicious code.
Ref: CVE-2021-42574, CVE-2021-42694
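
A minimal sketch of the kind of scan such a check could perform. The actual CI script is not shown here; the function name `find_suspicious` and the exact set of invisible code points are assumptions chosen for illustration.

```python
# Bidirectional control characters (the class abused in CVE-2021-42574).
BIDI_CONTROLS = {
    "\u202a", "\u202b", "\u202c", "\u202d", "\u202e",  # LRE, RLE, PDF, LRO, RLO
    "\u2066", "\u2067", "\u2068", "\u2069",            # LRI, RLI, FSI, PDI
}
# Hypothetical selection of invisible characters; a real check may differ.
INVISIBLE = {"\u200b", "\u200c", "\u200d", "\u2060", "\ufeff"}

SUSPICIOUS = BIDI_CONTROLS | INVISIBLE


def find_suspicious(text):
    """Return (line, column, code point) for every suspicious character."""
    hits = []
    # Split on "\n" only, so other Unicode line separators are not hidden.
    for lineno, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in SUSPICIOUS:
                hits.append((lineno, col, "U+%04X" % ord(ch)))
    return hits
```

A CI job could run something like this over every tracked source file and fail when the returned list is not empty.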
2021-11-05 14:58:25 +01:00
Matthias Dressel
e40cc46c3c
x86/itx: Add clipping to iadst 4x16
...
Values need to be clipped after Hadamard rotations.
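
As an illustration only (the assembly itself is not shown here), a Hadamard rotation can be read as the butterfly (a + b, a - b), with both outputs clamped to a signed intermediate range afterwards. The helper names and the `bits` parameter below are hypothetical; the width the real code clips to is not stated in this message.

```python
def clip_signed(value, bits):
    """Clamp value to the range of a signed integer with `bits` bits."""
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    return max(lo, min(hi, value))


def hadamard_rotate_clipped(a, b, bits):
    """Butterfly (a + b, a - b) with both outputs clamped afterwards."""
    return clip_signed(a + b, bits), clip_signed(a - b, bits)
```

Without the clamp, the sum or difference can grow by one bit per butterfly stage and leave the range that later stages assume.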
2021-11-02 16:29:05 +01:00
Matthias Dressel
eb0308bcdf
x86/itx: Add 12-bit 4x4 transforms in AVX2
...
Refactors itx into separate 10-bit and 12-bit functions to avoid
conditional jumps.
inv_txfm_add_4x4_adst_adst_0_12bpc_c: 370.9
inv_txfm_add_4x4_adst_adst_0_12bpc_avx2: 68.6
inv_txfm_add_4x4_adst_adst_1_12bpc_c: 371.0
inv_txfm_add_4x4_adst_adst_1_12bpc_avx2: 68.7
inv_txfm_add_4x4_adst_dct_0_12bpc_c: 413.1
inv_txfm_add_4x4_adst_dct_0_12bpc_avx2: 69.2
inv_txfm_add_4x4_adst_dct_1_12bpc_c: 412.7
inv_txfm_add_4x4_adst_dct_1_12bpc_avx2: 68.8
inv_txfm_add_4x4_adst_flipadst_0_12bpc_c: 378.5
inv_txfm_add_4x4_adst_flipadst_0_12bpc_avx2: 74.9
inv_txfm_add_4x4_adst_flipadst_1_12bpc_c: 378.1
inv_txfm_add_4x4_adst_flipadst_1_12bpc_avx2: 74.6
inv_txfm_add_4x4_adst_identity_0_12bpc_c: 347.8
inv_txfm_add_4x4_adst_identity_0_12bpc_avx2: 48.8
inv_txfm_add_4x4_adst_identity_1_12bpc_c: 342.7
inv_txfm_add_4x4_adst_identity_1_12bpc_avx2: 49.0
inv_txfm_add_4x4_dct_adst_0_12bpc_c: 399.2
inv_txfm_add_4x4_dct_adst_0_12bpc_avx2: 73.1
inv_txfm_add_4x4_dct_adst_1_12bpc_c: 398.7
inv_txfm_add_4x4_dct_adst_1_12bpc_avx2: 72.2
inv_txfm_add_4x4_dct_dct_0_12bpc_c: 69.6
inv_txfm_add_4x4_dct_dct_0_12bpc_avx2: 32.9
inv_txfm_add_4x4_dct_dct_1_12bpc_c: 420.5
inv_txfm_add_4x4_dct_dct_1_12bpc_avx2: 72.2
inv_txfm_add_4x4_dct_flipadst_0_12bpc_c: 405.5
inv_txfm_add_4x4_dct_flipadst_0_12bpc_avx2: 75.9
inv_txfm_add_4x4_dct_flipadst_1_12bpc_c: 404.2
inv_txfm_add_4x4_dct_flipadst_1_12bpc_avx2: 75.6
inv_txfm_add_4x4_dct_identity_0_12bpc_c: 374.1
inv_txfm_add_4x4_dct_identity_0_12bpc_avx2: 51.6
inv_txfm_add_4x4_dct_identity_1_12bpc_c: 368.0
inv_txfm_add_4x4_dct_identity_1_12bpc_avx2: 51.8
inv_txfm_add_4x4_flipadst_adst_0_12bpc_c: 368.0
inv_txfm_add_4x4_flipadst_adst_0_12bpc_avx2: 69.2
inv_txfm_add_4x4_flipadst_adst_1_12bpc_c: 370.7
inv_txfm_add_4x4_flipadst_adst_1_12bpc_avx2: 70.4
inv_txfm_add_4x4_flipadst_dct_0_12bpc_c: 393.7
inv_txfm_add_4x4_flipadst_dct_0_12bpc_avx2: 70.1
inv_txfm_add_4x4_flipadst_dct_1_12bpc_c: 392.9
inv_txfm_add_4x4_flipadst_dct_1_12bpc_avx2: 69.6
inv_txfm_add_4x4_flipadst_flipadst_0_12bpc_c: 382.2
inv_txfm_add_4x4_flipadst_flipadst_0_12bpc_avx2: 74.6
inv_txfm_add_4x4_flipadst_flipadst_1_12bpc_c: 381.3
inv_txfm_add_4x4_flipadst_flipadst_1_12bpc_avx2: 74.9
inv_txfm_add_4x4_flipadst_identity_0_12bpc_c: 346.7
inv_txfm_add_4x4_flipadst_identity_0_12bpc_avx2: 48.2
inv_txfm_add_4x4_flipadst_identity_1_12bpc_c: 347.9
inv_txfm_add_4x4_flipadst_identity_1_12bpc_avx2: 48.7
inv_txfm_add_4x4_identity_adst_0_12bpc_c: 344.7
inv_txfm_add_4x4_identity_adst_0_12bpc_avx2: 59.8
inv_txfm_add_4x4_identity_adst_1_12bpc_c: 340.5
inv_txfm_add_4x4_identity_adst_1_12bpc_avx2: 59.2
inv_txfm_add_4x4_identity_dct_0_12bpc_c: 369.8
inv_txfm_add_4x4_identity_dct_0_12bpc_avx2: 59.3
inv_txfm_add_4x4_identity_dct_1_12bpc_c: 369.5
inv_txfm_add_4x4_identity_dct_1_12bpc_avx2: 59.2
inv_txfm_add_4x4_identity_flipadst_0_12bpc_c: 353.4
inv_txfm_add_4x4_identity_flipadst_0_12bpc_avx2: 65.6
inv_txfm_add_4x4_identity_flipadst_1_12bpc_c: 350.9
inv_txfm_add_4x4_identity_flipadst_1_12bpc_avx2: 65.9
inv_txfm_add_4x4_identity_identity_0_12bpc_c: 326.1
inv_txfm_add_4x4_identity_identity_0_12bpc_avx2: 39.5
inv_txfm_add_4x4_identity_identity_1_12bpc_c: 321.6
inv_txfm_add_4x4_identity_identity_1_12bpc_avx2: 39.5
2021-10-18 20:45:36 +02:00
Matthias Dressel
4cdfe6919f
x86/itx: Rename rax to r6
...
Use numerical GPR references everywhere for consistency.
2021-10-18 20:20:02 +02:00
Matthias Dressel
1ea40afdbb
x86/itx: Name constants more explicitly
...
Give some constants a more explicit name to avoid confusion when 12bpc
support is added.
2021-10-18 20:20:02 +02:00
Matthias Dressel
cff5ba694c
CI: Update CI images
2021-10-18 16:15:52 +02:00
Matthias Dressel
c6a97f8a3e
CI: Output the dav1d-test-data commit used in the run
...
Having the exact commit hash in the logs helps with debugging.
2021-09-17 16:31:51 +02:00
Matthias Dressel
4533dd8678
CI: snap: Upload releases to stable channel
2021-09-03 13:31:42 +00:00
Matthias Dressel
6ab2b716cc
x86: Simplify loopfilter init
2021-09-03 14:59:59 +02:00
Matthias Dressel
e4812a6ad7
x86: itx4: Inline transpose
...
Saves one move.
2021-06-21 19:30:39 +02:00
Matthias Dressel
89be94d41e
x86: Add bpc suffix to filmgrain functions
2021-06-20 23:02:02 +02:00
Matthias Dressel
c7e0ad4577
x86: Add bpc suffix to loopfilter functions
2021-06-20 23:02:02 +02:00
Matthias Dressel
a6821cee0a
x86: Add bpc suffix to ipred functions
2021-06-20 23:02:02 +02:00
Matthias Dressel
f951165ea6
x86: itx: Port 10-bit 4x4 transforms to SSE4
...
64-bit 32-bit
inv_txfm_add_4x4_adst_adst_0_10bpc_c: 257.0 346.3
inv_txfm_add_4x4_adst_adst_0_10bpc_sse4: 47.1 51.7
inv_txfm_add_4x4_adst_adst_0_10bpc_avx2: 57.4
inv_txfm_add_4x4_adst_adst_1_10bpc_c: 259.8 345.6
inv_txfm_add_4x4_adst_adst_1_10bpc_sse4: 47.1 52.0
inv_txfm_add_4x4_adst_adst_1_10bpc_avx2: 56.9
inv_txfm_add_4x4_adst_dct_0_10bpc_c: 284.6 369.9
inv_txfm_add_4x4_adst_dct_0_10bpc_sse4: 42.2 46.0
inv_txfm_add_4x4_adst_dct_0_10bpc_avx2: 51.9
inv_txfm_add_4x4_adst_dct_1_10bpc_c: 285.2 369.8
inv_txfm_add_4x4_adst_dct_1_10bpc_sse4: 42.4 45.9
inv_txfm_add_4x4_adst_dct_1_10bpc_avx2: 51.9
inv_txfm_add_4x4_adst_flipadst_0_10bpc_c: 262.9 345.0
inv_txfm_add_4x4_adst_flipadst_0_10bpc_sse4: 46.8 50.1
inv_txfm_add_4x4_adst_flipadst_0_10bpc_avx2: 57.0
inv_txfm_add_4x4_adst_flipadst_1_10bpc_c: 262.1 345.6
inv_txfm_add_4x4_adst_flipadst_1_10bpc_sse4: 46.8 50.3
inv_txfm_add_4x4_adst_flipadst_1_10bpc_avx2: 57.1
inv_txfm_add_4x4_adst_identity_0_10bpc_c: 225.6 302.9
inv_txfm_add_4x4_adst_identity_0_10bpc_sse4: 38.0 42.3
inv_txfm_add_4x4_adst_identity_0_10bpc_avx2: 41.4
inv_txfm_add_4x4_adst_identity_1_10bpc_c: 225.7 303.1
inv_txfm_add_4x4_adst_identity_1_10bpc_sse4: 37.8 42.3
inv_txfm_add_4x4_adst_identity_1_10bpc_avx2: 41.4
inv_txfm_add_4x4_dct_adst_0_10bpc_c: 274.6 378.0
inv_txfm_add_4x4_dct_adst_0_10bpc_sse4: 44.8 48.5
inv_txfm_add_4x4_dct_adst_0_10bpc_avx2: 50.7
inv_txfm_add_4x4_dct_adst_1_10bpc_c: 274.0 377.4
inv_txfm_add_4x4_dct_adst_1_10bpc_sse4: 44.6 48.6
inv_txfm_add_4x4_dct_adst_1_10bpc_avx2: 51.0
inv_txfm_add_4x4_dct_dct_0_10bpc_c: 39.2 50.6
inv_txfm_add_4x4_dct_dct_0_10bpc_sse4: 29.1 33.8
inv_txfm_add_4x4_dct_dct_0_10bpc_avx2: 29.3
inv_txfm_add_4x4_dct_dct_1_10bpc_c: 300.6 399.0
inv_txfm_add_4x4_dct_dct_1_10bpc_sse4: 39.7 44.3
inv_txfm_add_4x4_dct_dct_1_10bpc_avx2: 48.6
inv_txfm_add_4x4_dct_flipadst_0_10bpc_c: 278.6 377.8
inv_txfm_add_4x4_dct_flipadst_0_10bpc_sse4: 45.3 49.6
inv_txfm_add_4x4_dct_flipadst_0_10bpc_avx2: 50.2
inv_txfm_add_4x4_dct_flipadst_1_10bpc_c: 277.1 378.3
inv_txfm_add_4x4_dct_flipadst_1_10bpc_sse4: 45.0 49.7
inv_txfm_add_4x4_dct_flipadst_1_10bpc_avx2: 50.2
inv_txfm_add_4x4_dct_identity_0_10bpc_c: 246.9 335.8
inv_txfm_add_4x4_dct_identity_0_10bpc_sse4: 37.1 41.7
inv_txfm_add_4x4_dct_identity_0_10bpc_avx2: 37.4
inv_txfm_add_4x4_dct_identity_1_10bpc_c: 247.2 336.2
inv_txfm_add_4x4_dct_identity_1_10bpc_sse4: 37.1 41.6
inv_txfm_add_4x4_dct_identity_1_10bpc_avx2: 37.3
inv_txfm_add_4x4_flipadst_adst_0_10bpc_c: 259.4 351.7
inv_txfm_add_4x4_flipadst_adst_0_10bpc_sse4: 47.1 51.8
inv_txfm_add_4x4_flipadst_adst_0_10bpc_avx2: 57.9
inv_txfm_add_4x4_flipadst_adst_1_10bpc_c: 258.7 350.8
inv_txfm_add_4x4_flipadst_adst_1_10bpc_sse4: 47.1 51.8
inv_txfm_add_4x4_flipadst_adst_1_10bpc_avx2: 57.4
inv_txfm_add_4x4_flipadst_dct_0_10bpc_c: 282.3 375.4
inv_txfm_add_4x4_flipadst_dct_0_10bpc_sse4: 42.2 45.8
inv_txfm_add_4x4_flipadst_dct_0_10bpc_avx2: 52.5
inv_txfm_add_4x4_flipadst_dct_1_10bpc_c: 283.0 375.8
inv_txfm_add_4x4_flipadst_dct_1_10bpc_sse4: 42.5 45.9
inv_txfm_add_4x4_flipadst_dct_1_10bpc_avx2: 52.4
inv_txfm_add_4x4_flipadst_flipadst_0_10bpc_c: 258.8 356.1
inv_txfm_add_4x4_flipadst_flipadst_0_10bpc_sse4: 47.3 50.1
inv_txfm_add_4x4_flipadst_flipadst_0_10bpc_avx2: 57.4
inv_txfm_add_4x4_flipadst_flipadst_1_10bpc_c: 259.0 355.3
inv_txfm_add_4x4_flipadst_flipadst_1_10bpc_sse4: 47.8 50.2
inv_txfm_add_4x4_flipadst_flipadst_1_10bpc_avx2: 57.4
inv_txfm_add_4x4_flipadst_identity_0_10bpc_c: 228.6 309.4
inv_txfm_add_4x4_flipadst_identity_0_10bpc_sse4: 37.8 42.0
inv_txfm_add_4x4_flipadst_identity_0_10bpc_avx2: 41.4
inv_txfm_add_4x4_flipadst_identity_1_10bpc_c: 229.1 309.6
inv_txfm_add_4x4_flipadst_identity_1_10bpc_sse4: 37.9 42.2
inv_txfm_add_4x4_flipadst_identity_1_10bpc_avx2: 41.3
inv_txfm_add_4x4_identity_adst_0_10bpc_c: 200.8 275.8
inv_txfm_add_4x4_identity_adst_0_10bpc_sse4: 39.0 43.9
inv_txfm_add_4x4_identity_adst_0_10bpc_avx2: 47.4
inv_txfm_add_4x4_identity_adst_1_10bpc_c: 200.8 276.5
inv_txfm_add_4x4_identity_adst_1_10bpc_sse4: 39.0 44.0
inv_txfm_add_4x4_identity_adst_1_10bpc_avx2: 47.2
inv_txfm_add_4x4_identity_dct_0_10bpc_c: 226.4 300.3
inv_txfm_add_4x4_identity_dct_0_10bpc_sse4: 36.9 41.7
inv_txfm_add_4x4_identity_dct_0_10bpc_avx2: 42.8
inv_txfm_add_4x4_identity_dct_1_10bpc_c: 229.0 300.6
inv_txfm_add_4x4_identity_dct_1_10bpc_sse4: 36.8 41.6
inv_txfm_add_4x4_identity_dct_1_10bpc_avx2: 42.7
inv_txfm_add_4x4_identity_flipadst_0_10bpc_c: 202.6 278.9
inv_txfm_add_4x4_identity_flipadst_0_10bpc_sse4: 39.2 43.7
inv_txfm_add_4x4_identity_flipadst_0_10bpc_avx2: 47.1
inv_txfm_add_4x4_identity_flipadst_1_10bpc_c: 202.6 279.3
inv_txfm_add_4x4_identity_flipadst_1_10bpc_sse4: 39.2 43.8
inv_txfm_add_4x4_identity_flipadst_1_10bpc_avx2: 47.0
inv_txfm_add_4x4_identity_identity_0_10bpc_c: 168.7 235.9
inv_txfm_add_4x4_identity_identity_0_10bpc_sse4: 31.7 37.6
inv_txfm_add_4x4_identity_identity_0_10bpc_avx2: 33.9
inv_txfm_add_4x4_identity_identity_1_10bpc_c: 169.1 235.7
inv_txfm_add_4x4_identity_identity_1_10bpc_sse4: 31.7 37.4
inv_txfm_add_4x4_identity_identity_1_10bpc_avx2: 33.8
2021-06-19 20:44:56 +02:00
Matthias Dressel
f4a8f804fd
x86: itx: wht: Minor fixes
...
* Rename macro for consistency. WHT has exactly one line per register.
* Use REPX to make code more readable.
2021-06-18 17:48:56 +02:00
Matthias Dressel
770c9c834d
x86: Add bpc suffix to itx functions
2021-06-18 02:54:34 +02:00
Matthias Dressel
c54add0204
x86: itx: Add 10/12-bit SSE2 WHT
2021-05-18 02:50:02 +02:00
Matthias Dressel
477cc158d1
x86: itx: Add 12-bit wht
2021-05-13 21:29:03 +02:00
Matthias Dressel
ae8958bdc7
CI: Fix asm checks
...
meson 0.57.0 introduced an optimization [0] for `meson test` to only
rebuild test dependencies. This does not cover changing the build
configuration anymore.
[0] https://mesonbuild.com/Release-notes-for-0-57-0.html
2021-04-12 20:18:03 +00:00
Matthias Dressel
2479973cbb
CI: Add check for illegal instructions
...
Some AVX2 instructions cannot be macroed by x86inc.asm.
Some instructions are valid in SSE4 but not in SSSE3, therefore both are
checked.
* Conroe is up to SSSE3
* Penryn is up to SSE4.1
See also: 4dd9431
2021-03-07 17:59:02 +01:00
Matthias Dressel
061ac9aee8
cli: Fix md5 verification for short values
...
Verification should not succeed if the given string is too short to be a
real hash.
Fixes videolan/dav1d#361
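
A minimal sketch of the rule described above, not the dav1d CLI code itself. It assumes the failure mode is a comparison that only looks at as many characters as the user supplied; `md5_matches` is a hypothetical helper name.

```python
import hashlib


def md5_matches(data, expected_hex):
    """Return True only if expected_hex is a full MD5 digest of data."""
    # An MD5 digest is 16 bytes, i.e. exactly 32 hexadecimal characters.
    # Rejecting other lengths first means a truncated string can never
    # pass by matching only a prefix of the real digest.
    if len(expected_hex) != 32:
        return False
    return hashlib.md5(data).hexdigest() == expected_hex.lower()
```

With this check, `md5_matches(b"abc", "900150983cd24fb0d6963f7d28e17f72")` succeeds, while the same call with only the first eight characters of the digest fails.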
2021-02-08 04:48:46 +01:00