building PA with -O0 leads to test failure in mix-test on i386
issue reported by Felipe, see
http://lists.freedesktop.org/archives/pulseaudio-discuss/2014-August/021406.html
the problem is the value 0xbeffbd7f: byte-swapped it becomes 0x7fbdffbe, which according
to IEEE-754 is a signalling NaN (bit pattern starts with s111 1111 10, see http://en.wikipedia.org/wiki/NaN);
when this value is assigned to a floating point register, it becomes 0x7ffdffbe, which is
a quiet NaN (bit pattern starts with s111 1111 11) -- the signalling NaN is turned into a quiet NaN!
so PA_FLOAT32_SWAP(PA_FLOAT32_SWAP(x)) != x for certain values, uhuh!
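the bit patterns above can be checked with integer arithmetic only, so no floating point register is involved; this is a minimal sketch, and all helper names (swap32, f32_bits_is_nan, f32_bits_is_signalling_nan, f32_bits_quieted) are made up for illustration and are not part of PA:

```c
#include <stdbool.h>
#include <stdint.h>

/* IEEE-754 binary32 layout: 1 sign bit, 8 exponent bits, 23 mantissa bits */
#define F32_EXP_MASK  0x7f800000u
#define F32_MANT_MASK 0x007fffffu
#define F32_QUIET_BIT 0x00400000u /* most significant mantissa bit */

/* hypothetical helper: portable 32 bit byte swap */
static inline uint32_t swap32(uint32_t u) {
    return (u >> 24) | ((u >> 8) & 0x0000ff00u) |
           ((u << 8) & 0x00ff0000u) | (u << 24);
}

/* NaN: exponent all ones, mantissa non-zero */
static inline bool f32_bits_is_nan(uint32_t u) {
    return (u & F32_EXP_MASK) == F32_EXP_MASK && (u & F32_MANT_MASK) != 0;
}

/* signalling NaN: a NaN with the quiet bit cleared */
static inline bool f32_bits_is_signalling_nan(uint32_t u) {
    return f32_bits_is_nan(u) && (u & F32_QUIET_BIT) == 0;
}

/* the bit pattern a signalling NaN has after being quieted */
static inline uint32_t f32_bits_quieted(uint32_t u) {
    return u | F32_QUIET_BIT;
}
```

with these helpers, swap32(0xbeffbd7f) gives 0x7fbdffbe, which is classified as a signalling NaN; quieting it gives 0x7ffdffbe, and swapping that back gives 0xbefffd7f instead of the original 0xbeffbd7f.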
the following test code can be used; due to volatile, it will always demonstrate the issue;
without volatile, it depends on the optimization level (i386, 32-bit, gcc 4.9):
// snip
#include <stdio.h>
#include <stdint.h>
#include <byteswap.h>

static inline float PA_FLOAT32_SWAP(float x) {
    union {
        float f;
        uint32_t u;
    } t;
    t.f = x;
    t.u = bswap_32(t.u);
    return t.f; /* the swapped bits are returned as a float */
}

int main(void) {
    unsigned x = 0xbeffbd7f;
    volatile float f = PA_FLOAT32_SWAP(*(float *)&x);
    printf("%08x %08x %08x %f\n", 0xbeffbd7f, *(unsigned *)&f, bswap_32(*(unsigned *)&f), f);
    return 0;
}
// snip
the problem goes away with optimization when no temporary floating point registers are used
the proposed solution is to avoid passing swapped floating point data in a
float; this is done with new functions PA_READ_FLOAT32RE() and PA_WRITE_FLOAT32RE()
which use uint32_t to dereference a pointer and byte-swap the data, hence no temporary
float variable is used
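a rough sketch of what such accessors could look like; the function names are the ones mentioned above, but the signatures and bodies are assumptions and not copied from the PA tree; swap_u32() is a hypothetical helper, and memcpy() is used here in place of a direct uint32_t dereference to sidestep alignment and aliasing concerns:

```c
#include <stdint.h>
#include <string.h>

/* hypothetical helper: portable 32 bit byte swap */
static inline uint32_t swap_u32(uint32_t u) {
    return (u >> 24) | ((u >> 8) & 0x0000ff00u) |
           ((u << 8) & 0x00ff0000u) | (u << 24);
}

/* read a reverse-endian float from memory; the swapped bits are only
 * ever held in a uint32_t, the returned float is in native byte order */
static inline float PA_READ_FLOAT32RE(const void *p) {
    union {
        float f;
        uint32_t u;
    } t;
    memcpy(&t.u, p, sizeof(t.u));
    t.u = swap_u32(t.u);
    return t.f;
}

/* write a native float to memory in reverse byte order; again the
 * swapped bits never pass through a float */
static inline void PA_WRITE_FLOAT32RE(void *p, float x) {
    union {
        float f;
        uint32_t u;
    } t;
    t.f = x;
    t.u = swap_u32(t.u);
    memcpy(p, &t.u, sizeof(t.u));
}
```

the point of the design is that only native-endian values cross the function boundary as float; the byte-swapped pattern, which may happen to be a signalling NaN, stays in integer storage.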
also delete PA_FLOAT32_TO_LE()/_BE(), not used
Signed-off-by: Peter Meerwald <pmeerw@pmeerw.net>
Reported-by: Felipe Sateler <fsateler@debian.org>
I think this makes the code a bit nicer to read and write. This also
reduces the chances of off-by-one errors when checking the bounds of
the sample format value.
move code to function pa_mult_s16_volume() in sample-util.h
use 64 bit integers on 64 bit platforms (it's faster there; on 32-bit x86 the 64 bit
multiplication is considerably slower, see the numbers below)
on i5, 2.5GHz (64-bit)
Running suite(s): Mult-s16
32 bit mult: 1272300 usec (avg: 12723, min = 12533, max = 18749, stddev = 620.48).
64 bit mult: 852241 usec (avg: 8522.41, min = 8420, max = 9148, stddev = 109.388).
100%: Checks: 1, Failures: 0, Errors: 0
on Pentium D, 3.4GHz (32-bit)
Running suite(s): Mult-s16
32 bit mult: 2228504 usec (avg: 22285, min = 18775, max = 29648, stddev = 3865.59).
64 bit mult: 5546861 usec (avg: 55468.6, min = 55028, max = 64924, stddev = 978.981).
100%: Checks: 1, Failures: 0, Errors: 0
on TI DM3730, Cortex-A8, 800MHz (32-bit)
Running suite(s): Mult-s16
32 bit mult: 23708900 usec (avg: 237089, min = 191864, max = 557312, stddev = 77503.6).
64 bit mult: 22190039 usec (avg: 221900, min = 177978, max = 480469, stddev = 68520.5).
100%: Checks: 1, Failures: 0, Errors: 0
there is a test program called mult-s16-test which checks that the 32 bit and 64 bit
variants compute the same results, and compares their runtime
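a minimal sketch of the two variants being compared, assuming a 16.16 fixed point volume factor cv; the names mult_s16_volume_32() and mult_s16_volume_64() are made up for illustration, the actual pa_mult_s16_volume() in sample-util.h may differ:

```c
#include <stdint.h>

/* 32 bit variant: split the volume factor into HI and LO halves so that
 * no intermediate product needs more than 32 bits; relies on arithmetic
 * right shift of negative values (implementation-defined in C, but what
 * gcc and clang do) */
static inline int32_t mult_s16_volume_32(int16_t v, int32_t cv) {
    int32_t hi = cv >> 16;
    int32_t lo = cv & 0xFFFF;
    return ((v * lo) >> 16) + (v * hi);
}

/* 64 bit variant: one widening multiplication, then drop the
 * 16 fractional bits */
static inline int32_t mult_s16_volume_64(int16_t v, int32_t cv) {
    return (int32_t) ((v * (int64_t) cv) >> 16);
}
```

both compute floor(v * cv / 65536): v * hi is an integer multiple of 65536 before the shift, so it can be pulled out of the shift without changing the result.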
Signed-off-by: Peter Meerwald <pmeerw@pmeerw.net>
Move the volume code into a separate file with the reference C implementations.
Add a function to retrieve the volume function and one to install a new one.
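A rough sketch of such a get/install pair, assuming a per-format table of function pointers. All names here (sample_format_t, do_volume_func_t, volume_s16ne_c, get_volume_func, set_volume_func) are hypothetical and chosen for illustration; they are not taken from the PA sources.

```c
#include <stddef.h>
#include <stdint.h>

/* hypothetical: a single sample format is enough for the sketch */
typedef enum { FMT_S16NE, FMT_MAX } sample_format_t;

/* hypothetical signature: scale 'length' bytes of interleaved samples
 * in place, using one 16.16 fixed point volume per channel */
typedef void (*do_volume_func_t)(void *samples, const int32_t *volumes,
                                 unsigned channels, unsigned length);

/* reference C implementation for native-endian signed 16 bit samples */
static void volume_s16ne_c(void *samples, const int32_t *volumes,
                           unsigned channels, unsigned length) {
    int16_t *d = samples;
    int16_t *e = d + length / sizeof(int16_t);
    unsigned channel = 0;

    for (; d < e; d++) {
        int64_t t = ((int64_t) *d * volumes[channel]) >> 16;
        if (t > 32767)
            t = 32767;
        if (t < -32768)
            t = -32768;
        *d = (int16_t) t;
        if (++channel >= channels)
            channel = 0;
    }
}

static do_volume_func_t volume_table[FMT_MAX] = {
    [FMT_S16NE] = volume_s16ne_c
};

/* retrieve the currently installed volume function for a format */
do_volume_func_t get_volume_func(sample_format_t f) {
    return (unsigned) f < FMT_MAX ? volume_table[f] : NULL;
}

/* install a new volume function, e.g. an optimized variant */
void set_volume_func(sample_format_t f, do_volume_func_t func) {
    if ((unsigned) f < FMT_MAX)
        volume_table[f] = func;
}
```

Callers would look up the function once per format and call it through the pointer, so an optimized implementation can replace the reference C one without touching the call sites.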