This provides access to GNU C library-style endian and byteswap functions.
Windows doesn't provide pre-processor defines for endianness, but
all current Windows architectures (X32, X64, ARM) are little-endian.
This is somewhat similar to the S32->F32 conversion improvements,
but here things a bit more tricky...
The main consideration is that the limits to which we clamp
must be valid 32-bit signed integers, but not all such integers
are exactly losslessly representable in `float32_t`.
For example it we'd clamp to `2147483647`,
that is actually a `2147483648.0f`,
and `2147483648` is not a valid 32-bit signed integer,
so the post-clamp conversion would basically be UB.
We don't have this problem for negative bound, though.
But as we know, any 25-bit signed integer is losslessly
round-trippable through float32_t, and since multiplying by 2
only changes the float's exponent, we can clamp to `2147483520`!
The algorithm of selection of the pre-clamping scale is unaffected.
This additionally avoids right-shift, and thus is even faster.
As `test_lossless_s32_lossless_subset` shows,
if the integer is in the form of s25+shift,
the maximal absolute error is finally zero.
Without going through `float`->`double`->`int`,
i'm not sure if the `float`->`int` conversion
can be improved further.
There's really no point in doing that s25_32 intermediate step,
to be honest i don't have a clue why the original implementation
did that \_(ツ)_/¯.
Both `S25_SCALE` and `S32_SCALE` are powers of two,
and thus are both exactly representable as floats,
and reprocial of power-of-two is also exactly representable,
so it's not like that rescaling results in precision loss.
This additionally avoids right-shift, and thus is even faster.
As `test_lossless_s32_lossless_subset` shows,
if the integer is in the form of s25+shift,
the maximal absolute error became even lower,
but not zero, because F32->S32 still goes through S25 intermediate.
I think we could theoretically do better,
but then the clamping becomes pretty finicky,
so i don't feel like touching that here.
At the very least, we should go through s25_32 intermediate
instead of s24_32, to avoid needlessly loosing 1 LSB precision bit.
That being said, i suspect it's still not doing the right thing.
Why are we silently dropping those 7 LSB bits?
Is that really the way to do it?
At the very least, we should go through s25_32 intermediate
instead of s24_32, to avoid needlessly loosing 1 LSB precision bit.
FIXME: the noise codepath is not covered with tests.
The largest integer that 32-bit floating point can exactly represent
is actually `(2^24)-1`, not`(2^23)-1` like the code assumes.
This means, whenever we use s24 as an intermediate step
to go between f32 and s32, we lose a bit of precision.
s25_32 is really a i32 with highest byte always being a sign byte.
Printing was done by adding
```
for(int e = 0; e != 13; ++e)
fprintf(stderr, "%16.32e,", ((float*)m1)[e]);
```
to `compare_mem`. I don't like how these tests work.
https://godbolt.org/z/abe94sedT
Can be used to group ports together. Mostly because they are all from
the same stream and split into multiple ports by audioconvert/adapter.
Also useful for the alsa sequence to group client ports together.
Also interesting when pw-filter would be able to handle streams in the
future to find out what ports belong to what streams.
We can remove most of the special async handling in adapter, filter and
stream because this is now handled in the core.
Add a node.data-loop property to assign the node to a named data-loop.
Assign the non-rt stream and filter to the main loop. This means that
the node fd will be added to the main-loop and will be woken up directly
without having to wake up the RT thread and invoke the process callback
in the main-loop first. Because non-RT implies async, we can do all of
this like we do our rt processing because the output will only be used
in the next cycle.
Add a monitor.passthrough option. This will pass all latency information
directly between the port and its monitor ports.
This is interesting when the adapter (and audioconvert) is used with a
null-audio-sink that simply forwards the data to a real sink/souce. In
that case, we want the sink/source latency to be passed unmodified.
Set the monitor.passthrough on the pulseaudio null-sink because
a passthrough virtual sink is the most likely use case for this.
Add some monitor.passthrough default config where it makes sense.
Fixes#3888
Keep track of the valid ports and don't emit port info for
invalid ports. When a listener is added while the ports are being
created, it is possible that the ports are still NULL or invalid.
The rate we get from dlls can have a subsample precision. However,
the check for using process_copy is in sample precision. This means
that an adaptive stream will oscillate rather then lock into the
exact rate.
When starting the converter, calculate the initial size needed by
the resampler to fill one quantum.
This makes it possible to get the requested amount of samples before
the first process call is made.
Update the started and ready state after we suspend/pause the node so
that we don't complain if scheduling happens between setting the fields
and actually stopping the follower.
Also only complain when the scheduling happens when the node is not
ready. It is possible that the node is scheduled before we manage to set
the started field.
Move the driver and warned bits after the int field in the struct so
that they are placed in separate memory.
Otherwise, a write from the data thread might race with a write from the
main thread and leave the bits in the wrong state.
Don't blindly clear the format when EnumFormat changes. This will
just stop the node without renegotiating.
We should probably find a new best format, check if it changed and
then Stop/configure/Resume the follower with the new format.
This fixes a stall when a node is running and you change the allowed
codecs.
Don't try to allocate each time port buffers are set but only once
before we start procesing.
Also allocate enough temp buffers are there are ports. This saves us
quite a bit of memory in the normal case.