Module avx2

Functions

bernoulli_compare_avx2
Compare 32 random bytes against an unsigned threshold and return a bit mask.
dot_f64_avx2
Dot product of two f64 slices using AVX2 and FMA instructions.
fused_and_popcount_avx2
Fused AND+popcount over packed words using AVX2 for the AND stage.
fused_xor_popcount_avx2
Fused XOR+popcount over packed words using AVX2 for the XOR stage.
hamming_distance_avx2
Hamming distance between two packed bitstream slices using AVX2.
max_f64_avx2
Maximum of an f64 slice using AVX2.
pack_avx2
Pack u8 bits into u64 words using AVX2 movemask.
popcount_avx2
Count set bits in 64-bit words using AVX2.
scale_f64_avx2
Scale an f64 slice in place (y[i] *= alpha) using AVX2.
softmax_inplace_f64_avx2
In-place softmax using AVX2 for the max, sum, and scale steps.
sum_f64_avx2
Sum of an f64 slice using AVX2.
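
The summaries above describe what each function computes but not its signature. As a rough illustration only, the following scalar sketch shows the results that several of these kernels presumably produce. Every name ending in `_ref` is hypothetical and not part of this module, and the argument types, return types, and LSB-first bit order in `pack_ref` are assumptions, not the crate's documented behavior.

```rust
/// Scalar reference (hypothetical): total number of set bits in 64-bit words.
pub fn popcount_ref(words: &[u64]) -> u64 {
    words.iter().map(|w| u64::from(w.count_ones())).sum()
}

/// Scalar reference (hypothetical): popcount of the word-wise XOR.
/// For two packed bitstreams this equals their Hamming distance.
/// Extra words in the longer slice are ignored (an assumption).
pub fn fused_xor_popcount_ref(a: &[u64], b: &[u64]) -> u64 {
    a.iter()
        .zip(b)
        .map(|(x, y)| u64::from((x ^ y).count_ones()))
        .sum()
}

/// Scalar reference (hypothetical): popcount of the word-wise AND.
pub fn fused_and_popcount_ref(a: &[u64], b: &[u64]) -> u64 {
    a.iter()
        .zip(b)
        .map(|(x, y)| u64::from((x & y).count_ones()))
        .sum()
}

/// Scalar reference (hypothetical): pack one bit per input byte into u64 words.
/// Assumes any nonzero byte counts as 1 and that byte i lands in bit i % 64
/// (LSB-first); the last word is zero-padded.
pub fn pack_ref(bits: &[u8]) -> Vec<u64> {
    bits.chunks(64)
        .map(|chunk| {
            chunk
                .iter()
                .enumerate()
                .fold(0u64, |word, (i, &bit)| word | (u64::from(bit != 0) << i))
        })
        .collect()
}

/// Scalar reference (hypothetical): in-place softmax in three passes,
/// mirroring the max, sum, and scale steps named in the summary.
pub fn softmax_inplace_ref(y: &mut [f64]) {
    if y.is_empty() {
        return;
    }
    // Step 1: find the maximum so the exponentials cannot overflow.
    let max = y.iter().copied().fold(f64::NEG_INFINITY, f64::max);
    // Step 2: exponentiate the shifted values and accumulate their sum.
    let mut sum = 0.0;
    for v in y.iter_mut() {
        *v = (*v - max).exp();
        sum += *v;
    }
    // Step 3: scale by the reciprocal of the sum (y[i] *= alpha).
    let alpha = 1.0 / sum;
    for v in y.iter_mut() {
        *v *= alpha;
    }
}
```

Scalar versions like these are mainly useful as a baseline for testing a SIMD implementation: on inputs of varying length, including lengths that are not a multiple of the vector width, the vectorized result should match the scalar one (exactly for the bit kernels, within a small tolerance for the f64 kernels, since summation order may differ).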