Zhao Zhili
eb14d45824
avfilter/vf_colordetect: add aarch64 asm
...
                       | rpi5 gcc 12        | m1 clang -fno-vectorize | m1 clang
---------------------------------------------------------------------------------------
alpha_8_full_c:        | 32159.2 ( 1.00x)   | 135.8 ( 1.00x)          |  26.4 ( 1.00x)
alpha_8_full_neon:     |  1266.0 (25.40x)   |   8.0 (17.03x)          |   8.4 ( 3.15x)
alpha_8_limited_c:     | 37561.9 ( 1.00x)   | 169.1 ( 1.00x)          |  47.7 ( 1.00x)
alpha_8_limited_neon:  |  3967.0 ( 9.47x)   |  12.5 (13.53x)          |  13.3 ( 3.59x)
alpha_16_full_c:       | 15867.9 ( 1.00x)   |  64.5 ( 1.00x)          |  13.7 ( 1.00x)
alpha_16_full_neon:    |  1256.9 (12.62x)   |   7.9 ( 8.15x)          |   8.3 ( 1.64x)
alpha_16_limited_c:    | 16723.7 ( 1.00x)   |  88.7 ( 1.00x)          | 103.3 ( 1.00x)
alpha_16_limited_neon: |  4031.3 ( 4.15x)   |  12.5 ( 7.08x)          |  13.2 ( 7.86x)
range_8_c:             | 21819.7 ( 1.00x)   | 120.0 ( 1.00x)          |   9.4 ( 1.00x)
range_8_neon:          |  1148.3 (19.00x)   |   4.3 (27.60x)          |   4.8 ( 1.97x)
range_16_c:            | 10757.1 ( 1.00x)   |  45.7 ( 1.00x)          |   7.9 ( 1.00x)
range_16_neon:         |  1141.5 ( 9.42x)   |   4.4 (10.38x)          |   4.6 ( 1.72x)
2025-09-01 15:35:16 +00:00
John Cox and Martin Storsjö
5075cfb4e6
avfilter/vf_bwdif: Add neon for filter_intra
...
Adds an outline for aarch64 neon functions
Adds common macros and consts for aarch64 neon
Exports C filter_intra needed for tail fixup of neon code
Adds neon for filter_intra
Signed-off-by: John Cox <jc@kynesim.co.uk>
Signed-off-by: Martin Storsjö <martin@martin.st>
2023-07-06 00:21:05 +03:00
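The "tail fixup" mentioned in the vf_bwdif commit above is a common pattern: the wide kernel handles only whole blocks of pixels, and the exported C function finishes whatever is left at the end of the line. Below is a minimal sketch of that pattern, under stated assumptions. All names (avg_lines_c, avg_lines_blocks16, avg_lines) are hypothetical, the block size of 16 is assumed, the arithmetic is a plain rounded average rather than the real filter_intra, and the "wide" kernel is ordinary C standing in for the assembly.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar reference: rounded average of the lines above and
 * below. This is NOT the real bwdif filter_intra arithmetic. */
static void avg_lines_c(uint8_t *dst, const uint8_t *above,
                        const uint8_t *below, size_t w)
{
    for (size_t x = 0; x < w; x++)
        dst[x] = (uint8_t)((above[x] + below[x] + 1) >> 1);
}

/* Stand-in for an assembly kernel that can only process whole 16-pixel
 * blocks (assumed block size; written in plain C for illustration). */
static void avg_lines_blocks16(uint8_t *dst, const uint8_t *above,
                               const uint8_t *below, size_t blocks)
{
    for (size_t b = 0; b < blocks; b++) {
        for (size_t i = 0; i < 16; i++) {
            size_t x = b * 16 + i;
            dst[x] = (uint8_t)((above[x] + below[x] + 1) >> 1);
        }
    }
}

/* Wrapper: wide kernel for the block-aligned body of the line, then the
 * scalar reference for the remaining 0..15 pixels (the tail fixup). */
static void avg_lines(uint8_t *dst, const uint8_t *above,
                      const uint8_t *below, size_t w)
{
    size_t body = w & ~(size_t)15; /* largest multiple of 16 <= w */

    if (body)
        avg_lines_blocks16(dst, above, below, body / 16);
    if (w > body)
        avg_lines_c(dst + body, above + body, below + body, w - body);
}
```

Calling avg_lines with a width such as 37 runs the block kernel on the first 32 pixels and the scalar function on the last 5, so the result matches the scalar reference over the whole line.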
Clément Bœsch
5a71bce371
lavfi/nlmeans: add AArch64 SIMD for compute_safe_ssd_integral_image
...
ssd_integral_image_c: 49204.6
ssd_integral_image_neon: 28346.8
2018-05-08 10:28:06 +02:00
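The nlmeans commit above lists raw timings without a ratio. Assuming the two figures are in the same unit and lower is better (the same convention as the "(N.NNx)" columns in the vf_colordetect table), the speedup is the C time divided by the NEON time. A small sketch, with the helper name speedup being hypothetical:

```c
#include <stdio.h>

/* Hypothetical helper: speedup of an optimized version over a reference,
 * assuming both timings use the same unit and lower is better. */
static double speedup(double ref_time, double opt_time)
{
    return ref_time / opt_time;
}

/* Prints the ratio for the two nlmeans figures quoted in the commit. */
static void print_nlmeans_speedup(void)
{
    printf("ssd_integral_image_neon: %.2fx\n", speedup(49204.6, 28346.8));
}
```

For the quoted figures the ratio comes out at roughly 1.74x.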