audioconvert: avoid even more precision loss in F32 to S32 conversion

This is somewhat similar to the S32->F32 conversion improvements,
but here things are a bit more tricky...

The main consideration is that the limits to which we clamp
must be valid 32-bit signed integers, but not all such integers
are exactly losslessly representable in `float32_t`.

For example, if we clamped to `2147483647`,
that would actually be `2147483648.0f` once converted to `float`,
and `2147483648` is not a valid 32-bit signed integer,
so the post-clamp conversion would be UB.
We don't have this problem for the negative bound, though.
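
The representability claim above can be checked with a small helper.
This is only an illustrative sketch: `s32_exact_in_f32` is a hypothetical
name and is not part of this commit.

```c
#include <stdint.h>

/* Hypothetical helper, not from the commit:
 * returns 1 if `i` survives an int32 -> float32 round trip unchanged. */
static int s32_exact_in_f32(int32_t i)
{
	float f = (float)i;
	/* Convert back through int64_t, so that a float that was rounded up
	 * to 2147483648.0f can still be converted without overflow. */
	return (int64_t)f == (int64_t)i;
}
```

With this helper, `INT32_MAX` is reported as not exact (it rounds up to
`2147483648.0f`), while `INT32_MIN` and `2147483520` are exact.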

But as we know, any 25-bit signed integer is losslessly
round-trippable through `float32_t`, and since multiplying by 2
only changes the float's exponent, we can clamp to `2147483520`
(`0x7fffff80`, i.e. `(2^24 - 1) * 2^7`)!
The selection of the pre-clamping scale is unaffected.
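
A minimal sketch of the clamping idea, under stated assumptions: the
names `sketch_f32_to_s32` and `SKETCH_*` are hypothetical, the scale is
assumed to be 2^31, and a plain truncating cast stands in for whatever
rounding the real `F32_TO_S32` macro uses.

```c
#include <stdint.h>

#define SKETCH_S32_SCALE  2147483648.0f   /* 2^31, assumed pre-clamp scale */
#define SKETCH_S32_MIN_F  -2147483648.0f  /* exactly INT32_MIN */
#define SKETCH_S32_MAX_F  2147483520.0f   /* 0x7fffff80, exact in float32 */

/* Hypothetical sketch, not the commit's implementation. */
static int32_t sketch_f32_to_s32(float v)
{
	float scaled = v * SKETCH_S32_SCALE;
	/* Both bounds are valid int32_t values that float32 represents
	 * exactly, so the cast below never sees an out-of-range value.
	 * NaN fails both comparisons and ends up at the lower bound. */
	float clamped = scaled > SKETCH_S32_MAX_F ? SKETCH_S32_MAX_F :
			(scaled > SKETCH_S32_MIN_F ? scaled : SKETCH_S32_MIN_F);
	return (int32_t)clamped;
}
```

Under these assumptions `1.0f` maps to `0x7fffff80` and `-1.0f` maps to
`INT32_MIN`, which matches the clamped values expected by `test_f32_s32`.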

This additionally avoids a right-shift, and thus is even faster.

As `test_lossless_s32_lossless_subset` shows,
if the integer is in the form of s25+shift,
the maximal absolute error is finally zero.

Without going through `float`->`double`->`int`,
I'm not sure if the `float`->`int` conversion
can be improved further.
Roman Lebedev 2024-06-25 19:20:42 +03:00
parent f4c89b1b40
commit 7c40cafa7c
4 changed files with 50 additions and 70 deletions

@@ -299,11 +299,11 @@ static void test_f32_s32(void)
 		1.0f/0x100000000, -1.0f/0x100000000, 1.0f/0x200000000, -1.0f/0x200000000,
 	};
 	static const int32_t out[] = { 0x00000000, 0x7fffff80, 0x80000000,
-		0x40000000, 0xc0000000, 0x7fffff80, 0x80000000, 0x00000100,
-		0xffffff00, 0x00000100, 0xffffff00, 0x00000080, 0xffffff80,
-		0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000,
-		0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000,
-		0x00000000, 0x00000000, 0x00000000, 0x00000000, 0x00000000,
+		0x40000000, 0xc0000000, 0x7fffff80, 0x80000000, 0x000000cd,
+		0xffffff33, 0x00000100, 0xffffff00, 0x00000080, 0xffffff80,
+		0x00000040, 0xffffffc0, 0x00000020, 0xffffffe0, 0x00000010,
+		0xfffffff0, 0x00000008, 0xfffffff8, 0x00000004, 0xfffffffc,
+		0x00000002, 0xfffffffe, 0x00000001, 0xffffffff, 0x00000000,
 		0x00000000, 0x00000000, 0x00000000,
 	};
@@ -687,20 +687,15 @@ static void test_lossless_s32_lossless_subset(void)
 {
 	int32_t i, j;
-	int all_lossless = 1;
-	int32_t max_abs_err = -1;
 	fprintf(stderr, "test %s:\n", __func__);
 	for (i = S25_MIN; i <= S25_MAX; i+=1) {
 		for(j = 0; j < 8; ++j) {
 			int32_t s = i * (1<<j);
 			float v = S32_TO_F32(s);
 			int32_t t = F32_TO_S32(v);
-			all_lossless &= s == t;
-			max_abs_err = SPA_MAX(max_abs_err, SPA_ABS(s - t));
+			spa_assert_se(s == t);
 		}
 	}
-	spa_assert_se(!all_lossless);
-	spa_assert_se(max_abs_err == 64);
 }
 static void test_lossless_u32(void)