building PA with -O0 leads to test failure in mix-test on i386
issue reported by Felipe, see
http://lists.freedesktop.org/archives/pulseaudio-discuss/2014-August/021406.html
the problem is the value 0xbeffbd7f: when byte-swapped it becomes 0x7fbdffbe and according
to IEEE-754 represents a signalling NaN (starting with s111 1111 10, see http://en.wikipedia.org/wiki/NaN)
when this value is assigned to a floating point register, it becomes 0x7ffdffbe, representing
a quiet NaN (starting with s111 1111 11) -- a signalling NaN is turned into a quiet NaN!
so PA_FLOAT32_SWAP(PA_FLOAT32_SWAP(x)) != x for certain values, uhuh!
the following test code can be used; due to volatile, it will always demonstrate the issue;
without volatile, it depends on the optimization level (i386, 32-bit, gcc 4.9):
// snip
static inline float PA_FLOAT32_SWAP(float x) {
union {
float f;
uint32_t u;
} t;
t.f = x;
t.u = bswap_32(t.u);
return t.f;
}
int main() {
unsigned x = 0xbeffbd7f;
volatile float f = PA_FLOAT32_SWAP(*(float *)&x);
printf("%08x %08x %08x %f\n", 0xbeffbd7f, *(unsigned *)&f, bswap_32(*(unsigned *)&f), f);
}
// snip
the problem goes away with optimization when no temporary floating point registers are used
the proposed solution is to avoid passing swapped floating point data in a
float; this is done with new functions PA_READ_FLOAT32RE() and PA_WRITE_FLOAT32RE()
which use uint32_t to dereference a pointer and byte-swap the data, hence no temporary
float variable is used
also delete PA_FLOAT32_TO_LE()/_BE(), not used
Signed-off-by: Peter Meerwald <pmeerw@pmeerw.net>
Reported-by: Felipe Sateler <fsateler@debian.org>
use (1<<15) instead of 0x7fff as a factor when converting from s16 to float32
use (1<<31) instead of 0x7fffffff as a factor when converting from s32 to float32
the change is motivated by the following desireable properties:
* s16_from_f32(f32_from_s16(x)) == x for all possible s16 values
* x / (1.0f << 15) == x * (1.0f / (1 << 15)) for all x in s16
above changes enable easier optimization while guaranteeing bit-exact results
further, other audio sample conversion code (libavresample) does it the same way
v3 (comments Tanu):
* fix saturation in pa_sconv_s16le_from_f32ne_neon(), use vqrshrn
v2 (comments Tanu):
* fix comments in ARM NEON code
* use llrintf() in pa_sconv_s32le_from_float32ne()
Signed-off-by: Peter Meerwald <p.meerwald@bct-electronic.com>
Cc: Tanu Kaskinen <tanuk@iki.fi>
Get rid of the liboil dependency and reimplement the liboil functions with an
equivalent C implementation. Note that most of these functions are deprecated in
liboil and that none of them had any optimisations. We can further specialize
our handrolled versions for some extra speedups.