audioconvert: avoid even more precision loss in S32 to F32 conversion

There's really no point in doing that s25_32 intermediate step; to be honest, I don't have a clue why the original implementation did that \_(ツ)_/¯. Both `S25_SCALE` and `S32_SCALE` are powers of two, so both are exactly representable as floats, and the reciprocal of a power of two is also exactly representable, so the rescaling itself does not lose precision. Dropping the intermediate step additionally avoids the right shift, which discarded the low 7 bits of every sample, and thus is even faster. As `test_lossless_s32_lossless_subset` shows, if the integer is of the form s25+shift, the maximal absolute error is now even lower, but still not zero, because F32->S32 still goes through the S25 intermediate. I think we could theoretically do better, but then the clamping becomes pretty finicky, so I don't feel like touching that here.
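
The argument above can be sketched as scalar C. This is only an illustration, not the code from the tree: the helper names `s32_to_f32_old` and `s32_to_f32_new` are made up, and the values 2^24 for `S25_SCALE` and 2^31 for `S32_SCALE` are assumptions (the message only says both are powers of two).

```c
#include <stdint.h>

/* Assumed values: the commit message only states both are powers of two. */
#define S25_SCALE 16777216.0f    /* 2^24 (assumption) */
#define S32_SCALE 2147483648.0f  /* 2^31 (assumption) */

/* Old path (hypothetical helper): shift down to 25 bits, then scale.
 * The shift throws away the low 7 bits before the float conversion.
 * Note: >> on a negative int32_t is implementation-defined in C; mainstream
 * compilers do an arithmetic shift, like _mm_srai_epi32 does. */
static inline float s32_to_f32_old(int32_t s)
{
	return (float)(s >> 7) * (1.0f / S25_SCALE);
}

/* New path (hypothetical helper): convert the full 32-bit value and scale.
 * The reciprocal of a power of two is exact, so the only rounding left is
 * the int32 -> float conversion itself. */
static inline float s32_to_f32_new(int32_t s)
{
	return (float)s * (1.0f / S32_SCALE);
}
```

Under those assumptions the two helpers agree for inputs that are multiples of 128 (for example `1 << 30` maps to `0.5f` in both), while a small input such as `1` becomes `0.0f` in the old path but keeps its value (2^-31) in the new one.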
2026-07-10 11:03:57 -04:00 · 2024-06-14 06:05:18 +03:00 · 2024-06-14 06:05:18 +03:00 · f4c89b1b40
commit f4c89b1b40
parent c517865864
4 changed files with 26 additions and 37 deletions
--- a/spa/plugins/audioconvert/fmt-ops-sse2.c
+++ b/spa/plugins/audioconvert/fmt-ops-sse2.c
@@ -335,7 +335,7 @@ conv_s32_to_f32d_1s_sse2(void *data, void * SPA_RESTRICT dst[], const void * SPA
 	float *d0 = dst[0];
 	uint32_t n, unrolled;
 	__m128i in;
-	__m128 out, factor = _mm_set1_ps(1.0f / S25_SCALE);
+	__m128 out, factor = _mm_set1_ps(1.0f / S32_SCALE);

 	if (SPA_IS_ALIGNED(d0, 16))
 		unrolled = n_samples & ~3;
@@ -347,14 +347,13 @@ conv_s32_to_f32d_1s_sse2(void *data, void * SPA_RESTRICT dst[], const void * SPA
 				    s[1*n_channels],
 				    s[2*n_channels],
 				    s[3*n_channels]);
-		in = _mm_srai_epi32(in, 7);
 		out = _mm_cvtepi32_ps(in);
 		out = _mm_mul_ps(out, factor);
 		_mm_store_ps(&d0[n], out);
 		s += 4*n_channels;
 	}
 	for(; n < n_samples; n++) {
-		out = _mm_cvtsi32_ss(factor, s[0]>>7);
+		out = _mm_cvtsi32_ss(factor, s[0]);
 		out = _mm_mul_ss(out, factor);
 		_mm_store_ss(&d0[n], out);
 		s += n_channels;