Also clamp the number of input samples we push when flushing. Do several
rounds of zero pushing until we have flushed enough.
Handle the cases where no input is needed or no output is generated.
Fixes crashes when downsampling from 96000 to 1000 Hz or so.
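
A minimal sketch of the flush loop described above, not the actual resampler
code. The toy_resampler type, toy_process(), flush_zeros() and MAX_BLOCK are
hypothetical stand-ins: the toy "resampler" simply emits one sample per `ratio`
inputs. The point is that each round pushes at most MAX_BLOCK zeros and that a
round producing no output is handled.

```c
#include <stdint.h>

#define MAX_BLOCK 64u	/* hypothetical upper bound on samples pushed per round */

/* Hypothetical stand-in for a resampler: one output per `ratio` inputs. */
struct toy_resampler {
	uint32_t ratio;	/* input samples per output sample */
	uint32_t phase;	/* inputs consumed since the last output */
};

/* Consume n_in samples, write at most max_out samples, return samples written. */
static uint32_t toy_process(struct toy_resampler *r, const float *in,
		uint32_t n_in, float *out, uint32_t max_out)
{
	uint32_t i, n_out = 0;

	for (i = 0; i < n_in; i++) {
		if (++r->phase == r->ratio) {
			r->phase = 0;
			if (n_out < max_out)
				out[n_out++] = in[i];
		}
	}
	return n_out;
}

/* Push to_flush zero samples in clamped rounds, return total samples written. */
static uint32_t flush_zeros(struct toy_resampler *r, uint32_t to_flush,
		float *out, uint32_t max_out)
{
	static const float zeros[MAX_BLOCK];	/* zero-initialized */
	uint32_t total_out = 0;

	/* Nothing to do when no input is needed (to_flush == 0). */
	while (to_flush > 0) {
		/* Clamp the input of this round to the zero buffer size. */
		uint32_t chunk = to_flush < MAX_BLOCK ? to_flush : MAX_BLOCK;

		/* A round may generate no output at all, e.g. at 96000 -> 1000 Hz. */
		total_out += toy_process(r, zeros, chunk,
				out + total_out, max_out - total_out);
		to_flush -= chunk;
	}
	return total_out;
}
```

With a ratio of 96 (as in 96000 to 1000 Hz), flushing 200 samples takes four
rounds of 64, 64, 64 and 8 zeros, and the first round produces no output.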
The order of attribute changes is random, so it's possible that controlCX
becomes accessible before the other devices. This marks the device as available
even though it actually fails to open. Only consider the device accessible if
both the control and PCM devices can be accessed.
This requires reacting to ATTRIB changes of pcm devices as well now.
Fixes #2534
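
A minimal sketch of the "both nodes must be accessible" rule, assuming the
check is done with access(2) on the device node paths. The function name
both_accessible() and its parameters are hypothetical, and the paths in the
usage note are only examples.

```c
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical helper: a card only counts as accessible when both its
 * control node and its PCM node can be opened for reading and writing. */
static bool both_accessible(const char *ctl_path, const char *pcm_path)
{
	if (access(ctl_path, R_OK | W_OK) < 0)
		return false;
	if (access(pcm_path, R_OK | W_OK) < 0)
		return false;
	return true;
}
```

A caller would re-run this check on every attribute change of either node,
for example with "/dev/snd/controlC0" and "/dev/snd/pcmC0D0p", since either
one can become accessible first.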
We need to check the last offset against the size of the buffer, not the
remaining size in the buffer.
When the writing is split, this could cause the buffer to be reused
wrongly.
See #2536
Add avx mixer to test and benchmark
Rework and unroll the avx mixer some more.
The SSE one is 10 times faster than the C one, the AVX one is 20 times
faster. The SSE2 function is 5 times faster than the C one.
A user changing the volume via headset buttons should be treated the same
as a change from the desktop UI. Also, the initial headset volume should
be considered saved (even though session managers currently ignore the
initial route values on route restore).
Mark route as saved on volume events.
When emitting node, get initial volumes from transport hardware volume,
if available.
The session manager usually overrides these immediately with saved
values, but it's better to show the HW volume when the node first
appears.
The A2DP and HFP profiles may have different volume curves, so trying to
convert volumes between the two can produce undesirable volume spikes.
For example, when one of them is using hardware volume and the other
software.
Fix by separating HFP and A2DP routes.
Let the mixer functions accumulate the intermediate results into a
larger size variable and then clamp to the final precision. This avoids
distortions because of intermediate clamping.
Although the access pattern of the reads is no longer sequential, the
writes are sequential and we don't need to read intermediate values.
Together with the avoided clamping this is probably faster overall.
Add a unit test for the various cases.
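
A minimal sketch of the accumulate-then-clamp idea for 16-bit samples, not the
actual mixer code. The names clamp_s16() and mix_s16() are hypothetical. Each
output sample is summed over all sources in a 32-bit accumulator and clamped
once at the end, so the reads jump between sources while the writes to dst
stay sequential.

```c
#include <stdint.h>

/* Clamp a 32-bit intermediate value to the signed 16-bit range. */
static inline int16_t clamp_s16(int32_t v)
{
	if (v > INT16_MAX)
		return INT16_MAX;
	if (v < INT16_MIN)
		return INT16_MIN;
	return (int16_t)v;
}

/* Hypothetical mixer: dst[n] = clamp(sum over i of src[i][n]). */
static void mix_s16(int16_t *dst, const int16_t *src[], uint32_t n_src,
		uint32_t n_samples)
{
	uint32_t i, n;

	for (n = 0; n < n_samples; n++) {
		int32_t acc = 0;	/* wider than the sample type */

		for (i = 0; i < n_src; i++)
			acc += src[i][n];
		dst[n] = clamp_s16(acc);	/* single clamp per sample */
	}
}
```

Mixing 30000, 30000 and -30000 this way gives 30000. Clamping after every
addition would instead give 32767 - 30000 = 2767, which is the kind of
distortion the change avoids.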