Let the mixer functions accumulate the intermediate results into a
larger size variable and then clamp to the final precission. This avoids
distortions because of intermediate clamping.
Although the access pattern of the reads are no longer sequential, the
writes are sequential and we don't need to read intermediate values.
Together with the avoided clamping this is probably faster overall.
Add a unit test for the various cases.
pipewire will allocate buffers aligned to the max alignment required for
the CPU. Take this into account and don't expect larger alignment.
Fixes a warning in mixer-dsp when the CPU max alignment is 16 but the
plugin requires 32 bytes alignment for the AVX2 path (that would never
be chosen on the CPU).
See #2074
Move floatmix to the audiomixer plugin and change the name to
AUDIO_MIXER_DSP.
Add runtime selectable sse and sse2 optimizations.
Load the port mixer plugin dynamically based on the factory_name.
Add some more debug