Commit graph

1565 commits

Author SHA1 Message Date
Peter Meerwald
1e4e586150 mix: Add optimized mix code path for ARM NEON
Signed-off-by: Peter Meerwald <pmeerw@pmeerw.net>
2013-02-15 23:02:19 +02:00
Peter Meerwald
7758076d9c mix: Change end pointer to length parameter in mixing function
similar to volume functions, simplifies leftover samples handling
for SIMD'd code path

use concrete pointer type (e.g. int16_t*) instead of void*,
saves several casts

Signed-off-by: Peter Meerwald <pmeerw@pmeerw.net>
2013-02-15 21:53:43 +02:00
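The interface change in 7758076d9c can be illustrated with a minimal sketch. This is not the code from the patch: the names `clamp_s16` and `mix_s16_sketch` and the exact signature are assumptions made for illustration. It shows a mixer that takes a concrete `int16_t *` and a sample count instead of a `void *` and an end pointer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: clamp a 32-bit intermediate to the s16 range. */
static int16_t clamp_s16(int32_t v) {
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t) v;
}

/* Hypothetical mixer: adds `length` samples of two s16 streams.
 * The loop is bounded by a count, not by an end pointer, and the
 * typed pointers make casts unnecessary. */
static void mix_s16_sketch(const int16_t *a, const int16_t *b,
                           int16_t *out, size_t length) {
    assert(a && b && out);
    for (size_t i = 0; i < length; i++)
        out[i] = clamp_s16((int32_t) a[i] + (int32_t) b[i]);
}
```

With a count, a SIMD path can process `length / N` full groups and hand the remaining `length % N` samples to a scalar loop.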
Peter Meerwald
c1cac8d82b mix: Add special cases for mixing streams in s16ne format
Signed-off-by: Peter Meerwald <pmeerw@pmeerw.net>
2013-02-15 21:40:07 +02:00
Peter Meerwald
8fa81a93c9 core: Refactor code to multiply s16 by volume
move code to function pa_mult_s16_volume() in sample-util.h
use 64 bit integers on 64 bit platforms (it's faster)

on i5, 2.5GHz (64-bit)

Running suite(s): Mult-s16
32 bit mult: 1272300 usec (avg: 12723, min = 12533, max = 18749, stddev = 620.48).
64 bit mult: 852241 usec (avg: 8522.41, min = 8420, max = 9148, stddev = 109.388).
100%: Checks: 1, Failures: 0, Errors: 0

on Pentium D, 3.4GHz (32-bit)

Running suite(s): Mult-s16
32 bit mult: 2228504 usec (avg: 22285, min = 18775, max = 29648, stddev = 3865.59).
64 bit mult: 5546861 usec (avg: 55468.6, min = 55028, max = 64924, stddev = 978.981).
100%: Checks: 1, Failures: 0, Errors: 0

on TI DM3730, Cortex-A8, 800MHz (32-bit)

Running suite(s): Mult-s16
32 bit mult: 23708900 usec (avg: 237089, min = 191864, max = 557312, stddev = 77503.6).
64 bit mult: 22190039 usec (avg: 221900, min = 177978, max = 480469, stddev = 68520.5).
100%: Checks: 1, Failures: 0, Errors: 0

there is a test program called mult-s16-test which checks that the functions compute the
same results, and compares runtime

Signed-off-by: Peter Meerwald <pmeerw@pmeerw.net>
2013-02-15 21:34:13 +02:00
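The two multiplication strategies compared in 8fa81a93c9 can be sketched as follows. The function names are hypothetical, the volume is assumed to be a non-negative 16.16 fixed-point value (0x10000 = unity), and right shifts of negative values are assumed to be arithmetic, as on common compilers. The helper `variants_agree` mimics the kind of equivalence check the commit attributes to mult-s16-test, without the timing part.

```c
#include <assert.h>
#include <stdint.h>

/* 32-bit variant: split the volume into high and low 16-bit halves so
 * that no intermediate product needs more than 32 bits. */
static int32_t mult_s16_volume_32(int16_t v, int32_t cv) {
    int32_t hi = cv >> 16;
    int32_t lo = cv & 0xFFFF;
    return ((v * lo) >> 16) + (v * hi);
}

/* 64-bit variant: one widening multiply followed by a shift. */
static int32_t mult_s16_volume_64(int16_t v, int32_t cv) {
    return (int32_t) ((v * (int64_t) cv) >> 16);
}

/* Returns 1 if both variants agree for every s16 sample at volume cv. */
static int variants_agree(int32_t cv) {
    assert(cv >= 0);
    for (int32_t v = INT16_MIN; v <= INT16_MAX; v++)
        if (mult_s16_volume_32((int16_t) v, cv) !=
            mult_s16_volume_64((int16_t) v, cv))
            return 0;
    return 1;
}
```

Which variant is faster depends on the platform, as the measurements above show, so a real implementation would select one at compile time.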
Peter Meerwald
b123cfa7c9 mix: Combine loops over streams in pa_mix()
Signed-off-by: Peter Meerwald <pmeerw@pmeerw.net>
2013-02-15 21:34:09 +02:00
Peter Meerwald
9fa000bbfc mix: Export function to get/set mixing implementation for a sample format
Signed-off-by: Peter Meerwald <pmeerw@pmeerw.net>
2013-02-15 21:34:03 +02:00
Peter Meerwald
fe455ae013 mix: Split pa_mix() code using function table
have individual function for mixing stream with different sample format instead
of huge case block in pa_mix()

shorter functions, prepare for optimized code path

Signed-off-by: Peter Meerwald <pmeerw@pmeerw.net>
2013-02-15 21:33:52 +02:00
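A rough sketch of the function-table idea from fe455ae013, under the assumption of a reduced, hypothetical set of formats (`FMT_U8`, `FMT_S16`) and a simplified two-stream signature. None of these names come from the patch. It shows how a table lookup can replace a large case block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced set of sample formats. */
typedef enum { FMT_U8, FMT_S16, FMT_MAX } sample_format_t;

/* Every mixer shares one signature, so one table can hold them all. */
typedef void (*do_mix_func_t)(const void *a, const void *b, void *out, size_t n);

static void mix_u8(const void *a, const void *b, void *out, size_t n) {
    const uint8_t *x = a, *y = b;
    uint8_t *o = out;
    for (size_t i = 0; i < n; i++) {
        /* u8 audio is centered on 0x80. */
        int v = (x[i] - 0x80) + (y[i] - 0x80);
        if (v > 127) v = 127;
        if (v < -128) v = -128;
        o[i] = (uint8_t) (v + 0x80);
    }
}

static void mix_s16(const void *a, const void *b, void *out, size_t n) {
    const int16_t *x = a, *y = b;
    int16_t *o = out;
    for (size_t i = 0; i < n; i++) {
        int32_t v = (int32_t) x[i] + y[i];
        if (v > INT16_MAX) v = INT16_MAX;
        if (v < INT16_MIN) v = INT16_MIN;
        o[i] = (int16_t) v;
    }
}

/* One entry per format; an optimized path could replace an entry at runtime
 * if the table were not const. */
static const do_mix_func_t mix_table[FMT_MAX] = {
    [FMT_U8] = mix_u8,
    [FMT_S16] = mix_s16,
};

static void mix_sketch(sample_format_t f, const void *a, const void *b,
                       void *out, size_t n) {
    assert(f < FMT_MAX);
    mix_table[f](a, b, out, n);
}
```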
Peter Meerwald
c90868f2e0 mix: Use table for calc_stream_volumes()
Signed-off-by: Peter Meerwald <pmeerw@pmeerw.net>
2013-02-15 21:33:42 +02:00
Peter Meerwald
1335914e72 sample-util: Remove duplicate stdio.h #include
Signed-off-by: Peter Meerwald <pmeerw@pmeerw.net>
2013-02-15 21:33:25 +02:00
Peter Meerwald
95b64804ab core: Move pa_mix() into new file mix.c
idea is to allow optimized code path (similar to volume code)
and rework/specialize mixing cases to enable runtime performance improvements

no functionality changes in this patch

Signed-off-by: Peter Meerwald <pmeerw@pmeerw.net>
2013-02-15 21:33:07 +02:00
Peter Meerwald
30ce3a14e5 resampler: Resample first followed by remapping if have more out channels than in channels
The patch intends to reduce computational load when resampling AND remapping. The PA
resampler performs the following steps:

sample format conversion -> remapping -> resampling -> sample format conversion

In case the number of output channels is higher than the number of input channels, the
resampler has to be run more often than necessary. E.g. in case of mono to 4-channel remapping,
the resampler runs on 4 channels separately.

To improve this, the PA resampler pipeline is made adaptive:

if out-channels <= in-channels:
sample format conversion -> remapping -> resampling -> sample format conversion
if out-channels > in-channels:
sample format conversion -> resampling -> remapping -> sample format conversion

Signed-off-by: Peter Meerwald <p.meerwald@bct-electronic.com>
2013-02-15 21:27:07 +02:00
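The ordering rule from 30ce3a14e5 can be sketched as a small decision function. The type and function names below are hypothetical and the sketch only models the choice of stage order, not the resampler itself.

```c
#include <assert.h>

typedef enum { REMAP_THEN_RESAMPLE, RESAMPLE_THEN_REMAP } stage_order_t;

/* Resample first only when remapping would add channels. */
static stage_order_t choose_stage_order(unsigned in_channels,
                                        unsigned out_channels) {
    assert(in_channels > 0 && out_channels > 0);
    return out_channels > in_channels ? RESAMPLE_THEN_REMAP
                                      : REMAP_THEN_RESAMPLE;
}

/* Number of channels the resampling stage has to process under the
 * chosen order: always the smaller of the two channel counts. */
static unsigned resampled_channels(unsigned in_channels,
                                   unsigned out_channels) {
    return choose_stage_order(in_channels, out_channels) == RESAMPLE_THEN_REMAP
               ? in_channels
               : out_channels;
}
```

For the mono to 4-channel case mentioned above, the resampling stage then handles 1 channel instead of 4.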
Stefan Huber
930654a3af resampler: Generate normalized rows in calc_map_table()
Remixing one channel map to another is (except for special cases) done
via a linear mapping between channels, whose corresponding matrix is
computed by calc_map_table(). The k-th row in this matrix corresponds to
the coefficients of the linear combination of the input channels that
result in the k-th output channel. In order to avoid clipping of samples
we require that the sum of these coefficients is (at most) 1. This
commit ensures this.

Prior to this commit tests/remix-test.c gives 52 of 132 matrices that
violate this property. For example:
'front-left,front-right,front-center,lfe' -> 'front-left,front-right'
          prior to this commit                after this commit
         I00   I01   I02   I03              I00   I01   I02   I03
      +------------------------          +------------------------
  O00 | 0.750 0.000 0.375 0.375      O00 | 0.533 0.000 0.267 0.200
  O01 | 0.000 0.750 0.375 0.375      O01 | 0.000 0.533 0.267 0.200

Building the matrix is done in several steps. However, the measures
taken to preserve a row sum of 1.0 (or to leave it at 0.0) after each
step are insufficient. This patch adds a post-processing step that
checks for each row whether the sum exceeds 1.0 and, if necessary,
normalizes that row. This allows for further simplifications:
 - The insufficient normalizations after some steps are removed. Gains
   are adapted to (partially) resemble the old matrices.
 - Handling unconnected input channels becomes a lot simpler.
2013-02-07 16:45:13 +02:00
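The post-processing step described in 930654a3af can be sketched as below. The function name `normalize_rows` and the fixed input channel count `N_IN` are assumptions for illustration, not taken from calc_map_table().

```c
#include <assert.h>
#include <stddef.h>

#define N_IN 4 /* hypothetical fixed number of input channels */

/* For each output channel (row), scale the coefficients down if their
 * sum exceeds 1.0, so that the mixed sample cannot clip. Rows that
 * already sum to 1.0 or less are left untouched. */
static void normalize_rows(float m[][N_IN], size_t n_out) {
    assert(m);
    for (size_t oc = 0; oc < n_out; oc++) {
        float sum = 0.0f;
        for (size_t ic = 0; ic < N_IN; ic++)
            sum += m[oc][ic];
        if (sum > 1.0f)
            for (size_t ic = 0; ic < N_IN; ic++)
                m[oc][ic] /= sum;
    }
}
```

Applied to the "prior" row 0.750 0.000 0.375 0.375, this yields 0.5 0.0 0.25 0.25. That differs from the "after" column in the table above because the commit also adapts the gains before normalizing.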
Stefan Huber
1a40af9c3b resampler: Refactor calc_map_table()
- Separate the cases with PA_RESAMPLER_NO_REMAP or PA_RESAMPLER_NO_REMIX
  set and remove redundant if-conditions.
- Fix C90 compiler warning due to mixing code and variable declaration.
- Do not repeatedly count number of left, right and center channels in
  the input channel map.

The logic of calc_map_table() remains unaltered.
2013-02-07 16:09:33 +02:00
Stefan Huber
8f009c8680 resampler: Replace pa_bool_t by bool
2013-02-07 16:06:30 +02:00
Jarkko Suontausta
7e6e3b7044 core: Assert on memchunk divisibility by sample spec in pa_memblockq_push().
Earlier, -1 was returned if the memchunk size was not a multiple of the frame
size. Now, it is verified unconditionally through an assertion. Error code -1
is still returned when the memblock queue is full.

In those few cases where the return value of pa_memblockq_push() is checked,
an overflow is assumed to be the reason in case an error code is returned.
2013-02-07 11:11:55 +02:00
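The contract change in 7e6e3b7044 can be illustrated with a toy queue. The struct and function below are hypothetical stand-ins, not the pa_memblockq API. They only model the two conditions discussed above: misaligned input is treated as a programming error, and a full queue is the one remaining runtime failure.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, heavily reduced queue state. */
typedef struct {
    size_t frame_size; /* bytes per frame */
    size_t length;     /* bytes currently queued */
    size_t maxlength;  /* capacity in bytes */
} queue_sketch_t;

static int queue_push_sketch(queue_sketch_t *q, size_t chunk_length) {
    assert(q);
    assert(q->frame_size > 0);
    /* A chunk that is not a whole number of frames is a caller bug. */
    assert(chunk_length % q->frame_size == 0);

    if (q->length + chunk_length > q->maxlength)
        return -1; /* the only remaining error: queue is full */

    q->length += chunk_length;
    return 0;
}
```

A caller that sees -1 can therefore treat it as an overflow without checking for other causes.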
Tanu Kaskinen
4ffb6fd617 dbus: Fix cleanup when removing signal listeners
2013-02-06 12:34:06 +02:00
Peter Meerwald
e66e846418 sconv: Change/fix conversion to/from float32
use (1<<15) instead of 0x7fff as a factor when converting from s16 to float32
use (1<<31) instead of 0x7fffffff as a factor when converting from s32 to float32

the change is motivated by the following desirable properties:
* s16_from_f32(f32_from_s16(x)) == x for all possible s16 values
* x / (1.0f << 15) == x * (1.0f / (1 << 15)) for all x in s16

above changes enable easier optimization while guaranteeing bit-exact results

further, other audio sample conversion code (libavresample) does it the same way

v3 (comments Tanu):
* fix saturation in pa_sconv_s16le_from_f32ne_neon(), use vqrshrn
v2 (comments Tanu):
* fix comments in ARM NEON code
* use llrintf() in pa_sconv_s32le_from_float32ne()

Signed-off-by: Peter Meerwald <p.meerwald@bct-electronic.com>
Cc: Tanu Kaskinen <tanuk@iki.fi>
2013-02-04 12:07:14 +02:00
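The round-trip property claimed in e66e846418 can be checked with a small sketch. The function names mirror the notation used in the message but are not the PulseAudio conversion routines, and the rounding (halves away from zero, done without libm) is a simplification chosen for this example.

```c
#include <assert.h>
#include <stdint.h>

/* s16 -> float32 with a power-of-two factor: the scaling is exact. */
static float f32_from_s16(int16_t s) {
    return s * (1.0f / (1 << 15));
}

/* float32 -> s16: scale, saturate, then round to nearest. */
static int16_t s16_from_f32(float f) {
    assert(f == f); /* NaN has no meaningful sample value */
    float v = f * (float) (1 << 15);
    /* Saturate first; +1.0f would otherwise map to 32768. */
    if (v > 32767.0f) v = 32767.0f;
    if (v < -32768.0f) v = -32768.0f;
    /* Simplified rounding for this sketch: halves away from zero. */
    return (int16_t) (v < 0.0f ? v - 0.5f : v + 0.5f);
}

/* Returns 1 if s16 -> float32 -> s16 reproduces every s16 value. */
static int roundtrip_is_exact(void) {
    for (int32_t x = INT16_MIN; x <= INT16_MAX; x++)
        if (s16_from_f32(f32_from_s16((int16_t) x)) != x)
            return 0;
    return 1;
}
```

Because 1 << 15 is a power of two, dividing by it and multiplying by its reciprocal give the same float result, which is the second property listed above.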
Peter Meerwald
3a3c4eb462 resampler: Improve s16<-->s32 conversion, use s16 work format if input or output is s16
Problem: s16 to s32 conversion is performed as s16->float->s32 (via work
format float) for resamplers TRIVIAL, COPY, PEAKS.
Precision and efficiency suffer: e.g. 0x9fff results in 0x9ffe4001 (instead
of 0x9fff0000) and there are two sample format conversions instead of one
conversion.

Solution: If input or output format is s16, then choose the work format
to be s16 as well.

If remapping is to be performed, we could stick to work format float32ne for
precision reasons. This is debatable.

Signed-off-by: Peter Meerwald <p.meerwald@bct-electronic.com>
2013-02-01 10:10:30 +02:00
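The direct conversion that 3a3c4eb462 argues for can be sketched as below. The function name is hypothetical. The point is only that widening s16 to s32 without a detour through float keeps the value exact, so 0x9fff becomes 0x9fff0000.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Direct s16 -> s32 conversion: one exact integer operation per sample.
 * Multiplying by 65536 is used instead of a left shift so that negative
 * samples stay well defined in C. */
static void s32_from_s16_buf(const int16_t *src, int32_t *dst, size_t n) {
    assert(src && dst);
    for (size_t i = 0; i < n; i++)
        dst[i] = (int32_t) src[i] * 65536;
}
```

The sample 0x9fff, read as a signed 16-bit value, is -24577; multiplied by 65536 it has the bit pattern 0x9fff0000.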
Peter Meerwald
4b3de4422e resampler: Drop redundant assignment in convert_from_work_format()
r->from_work_format_buf.length is set twice

Signed-off-by: Peter Meerwald <p.meerwald@bct-electronic.com>
2013-02-01 10:08:57 +02:00
Peter Meerwald
db41a4832d sconv: Check for SSE flag before initializing code
Signed-off-by: Peter Meerwald <p.meerwald@bct-electronic.com>
2013-02-01 09:10:44 +02:00
Peter Meerwald
2c5d3d79ad remap_sse: More specific logging: SSE -> SSE2
Signed-off-by: Peter Meerwald <p.meerwald@bct-electronic.com>
2013-02-01 07:26:16 +02:00
Tanu Kaskinen
dd6c8ae38f card: Remove some unnecessary checks.
2013-01-22 08:54:57 +02:00
Tanu Kaskinen
78df02dba6 device-port: Return early from pa_device_port_set_latency_offset() if the offset doesn't change.
This avoids sending change notifications when nothing changes.
2013-01-22 08:48:02 +02:00
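The early return from 78df02dba6 follows a common setter pattern, sketched here with a hypothetical struct in which a counter stands in for the real change notification.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, reduced port state. */
typedef struct {
    int64_t latency_offset;
    unsigned notifications; /* stands in for the real change notification */
} port_sketch_t;

static void port_set_latency_offset_sketch(port_sketch_t *p, int64_t offset) {
    assert(p);
    if (p->latency_offset == offset)
        return; /* nothing changed, so nobody needs to be notified */

    p->latency_offset = offset;
    p->notifications++;
}
```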
poljar (Damir Jelić)
477d6b71b6 device-port: Fire a hook when the latency offset changes.
This change adds a new hook type: PA_CORE_HOOK_PORT_LATENCY_OFFSET_CHANGED
And it is fired when the port latency offset changes.
2013-01-20 11:51:12 +02:00
poljar (Damir Jelić)
56a3561803 device-port: Cleanup of the sink/source subscription events.
Since it has now been decided to deprecate port info for sinks and
sources, this isn't needed anymore.
2013-01-20 09:37:06 +02:00
poljar (Damir Jelić)
f8f3690ae9 device-port: Access the cards directly.
Since the ports now know which card owns them we don't need to iterate
through all of them anymore.
2013-01-20 09:33:17 +02:00
poljar (Damir Jelić)
9d6eb21c7e device-port: Add a card pointer to the ports.
This way we can directly access the card that owns the port instead of
iterating over all cards.
2013-01-20 09:27:05 +02:00
Tanu Kaskinen
54c9fa97bd shm: Support Solaris shm file paths.
Patch by Brian Cameron <brian.cameron@oracle.com>
2013-01-04 16:31:57 +02:00
Tanu Kaskinen
8d0e9d4662 modargs: Don't fail needlessly in pa_modargs_get_sample_spec_and_channel_map().
BugLink: https://bugs.freedesktop.org/show_bug.cgi?id=49664
2012-12-19 12:31:50 +02:00
Wang Xingchao
953bedc974 sndfile-util: reduce useless loop
it's useless to get the same SF_FORMAT_INFO three times, just compare the
name/extension in the same loop.

Signed-off-by: Wang Xingchao <xingchao.wang@intel.com>
2012-12-19 12:31:50 +02:00
Tanu Kaskinen
0c5e39a961 memblockq: Use pa_xnew0() to avoid manual zeroing.
2012-12-19 12:31:50 +02:00
Tanu Kaskinen
19c058dd08 Fix pa_parse_boolean() return value checking.
pa_parse_boolean() return value shouldn't be stored in
pa_bool_t, because 1 and -1 need to be distinguished.
2012-12-19 12:31:50 +02:00
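The pitfall fixed in 19c058dd08 can be shown with a stand-in parser. `parse_boolean_sketch` is hypothetical and accepts only a few spellings. It is assumed to share the return convention the message relies on: 1 for true, 0 for false, -1 for a parse error.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in: 1 = true, 0 = false, -1 = not a boolean. */
static int parse_boolean_sketch(const char *s) {
    assert(s);
    if (!strcmp(s, "1") || !strcmp(s, "yes") || !strcmp(s, "true"))
        return 1;
    if (!strcmp(s, "0") || !strcmp(s, "no") || !strcmp(s, "false"))
        return 0;
    return -1;
}

/* Correct pattern: keep the result in an int, check for the error,
 * and only then narrow it to a boolean. */
static int get_flag(const char *s, bool *flag) {
    assert(flag);
    int r = parse_boolean_sketch(s);
    if (r < 0)
        return -1;
    *flag = r;
    return 0;
}
```

Storing the result directly in a `bool` would turn -1 into `true`, so an invalid string would be indistinguishable from "yes".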
Tanu Kaskinen
02d6aa6480 core-util: Improve get_path() documentation
2012-12-19 12:31:48 +02:00
Flavio Ceolin
f9beb8e867 modargs: Adding pa_modargs_get_value_volume()
This function gets a pa_volume_t from a string.
2012-12-19 12:31:47 +02:00
Flavio Ceolin
9e2b6a0b5c sink-input: New volume_factor system
Implement setting of more than one volume factor.  The
real value of the volume_factor will be the multiplication of these
values.
2012-12-19 12:31:47 +02:00
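The rule from 9e2b6a0b5c, that the effective volume factor is the product of all factors that have been set, can be sketched as follows. The sketch uses plain linear gains as `double` (1.0 = unchanged), which is an assumption for readability and not the volume type used in the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Effective factor = product of all individual factors.
 * With no factors set, the result is the neutral gain 1.0. */
static double combined_volume_factor(const double *factors, size_t n) {
    assert(factors || n == 0);
    double result = 1.0;
    for (size_t i = 0; i < n; i++)
        result *= factors[i];
    return result;
}
```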
Tanu Kaskinen
0f44b1e820 Log the reason for every suspend/resume.
I was looking at a log that showed that a suspend happened (at
a strange time), but the log didn't tell me why the suspend was done.
This patch tries to make sure that that won't happen again.
2012-12-19 12:31:47 +02:00
Tanu Kaskinen
28c49a12fc esound: Suspend/resume also sources on STANDBY/RESUME commands.
2012-12-19 12:31:47 +02:00
Arun Raghavan
968c9c45ac core: Remove bad free() call
The string created when trying to use XDG_RUNTIME_DIR is freed before it
is used in a debug message, and is freed again.

https://bugs.freedesktop.org/show_bug.cgi?id=57280
2012-11-19 21:32:18 +05:30
Arun Raghavan
3effdfc16f sink-input, source-output: Check rate update success for passthrough
This makes sure we don't try to plug in a passthrough stream if the
final sink/source sample spec doesn't match what we want. In the future,
we might want to change rate updates to try a full sample spec update
for passthrough streams.

https://bugs.freedesktop.org/show_bug.cgi?id=50951
2012-11-19 13:10:36 +05:30
Arun Raghavan
da4163a85e source-output: Fix volume fixup for rate update
The code that should have been after the rate update ended up being
before, which is incorrect.
2012-11-19 12:56:33 +05:30
Tanu Kaskinen
29f064aa3d sink: Process rewind requests also when suspended.
When a rewind is requested on a sink input, the request parameters are
stored in the pa_sink_input struct. The parameters are reset during
rewind processing, and if the sink decides to ignore the rewind
request due to being suspended, stale parameters are left in
pa_sink_input. It's particularly problematic if the rewrite_bytes
parameter is left at -1, because that will prevent all future rewind
processing on that sink input. So, in order to avoid stale parameters,
every rewind request needs to be processed, even if the sink is
suspended.

Reported-by: Uoti Urpala
2012-11-16 23:16:05 +05:30
Arun Raghavan
cd1102cce0 sink, source: Prevent unnecessary rate update attempts
We don't need to try a rate update if the desired sample rate is the
same as the one the sink or source is already using.
2012-11-16 23:16:04 +05:30
Frédéric Dalleau
153e17e3bb resampler: Fix crash if 'auto' resampler chooses ffmpeg with variable rate
To reproduce, add resampler-method = ffmpeg in daemon.conf
then start PA, and load module-loopback

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xb2f1db40 (LWP 23047)]
0x00000000 in ?? ()
(gdb) bt
0  0x00000000 in ?? ()
1  0xb7c463cb in pa_resampler_set_input_rate (r=0x80e9438, rate=44011) at pulsecore/resampler.c:365
2  0xb7c6321d in pa_sink_input_process_msg (o=0x80e87a0, code=3, userdata=0xabeb, offset=0, chunk=0x0)
    at pulsecore/sink-input.c:1833
3  0xb7e9840b in sink_input_process_msg_cb (obj=0x80e87a0, code=3, data=0xabeb, offset=0, chunk=0x0)
        at modules/module-loopback.c:538
4  0xb7c2709b in pa_asyncmsgq_dispatch (object=0x80e87a0, code=3, userdata=0xabeb, offset=0, memchunk=0xb2f1d17c)
        at pulsecore/asyncmsgq.c:322
5  0xb7c4c6e3 in asyncmsgq_read_work (i=0x80dd580) at pulsecore/rtpoll.c:564
6  0xb7c4b34a in pa_rtpoll_run (p=0x80fb7e0, wait_op=true) at pulsecore/rtpoll.c:238
7  0xb7dd90af in thread_func (userdata=0x80afe88) at modules/alsa/alsa-sink.c:1785
8  0xb7bf3291 in internal_thread_func (userdata=0x8095d08) at pulsecore/thread-posix.c:83
9  0xb7ab9d4c in start_thread (arg=0xb2f1db40) at pthread_create.c:308
10 0xb79f3ace in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:130
2012-11-04 10:07:31 +01:00
Arun Raghavan
eeab4efa98 Revert "core: adjust playing_for and underrun_for at rewind"
This reverts commit 5bc6cadcb2.

I wasn't meaning to push this out - just merged for review / testing.
2012-11-03 10:29:20 +01:00
Uoti Urpala
5bc6cadcb2 core: adjust playing_for and underrun_for at rewind
A rewind may erase data that sink_input counted in playing_for or
underrun_for earlier. Add code adjusting those values after a rewind.

One visible symptom of this bug was problems recovering from an
underrun. When a client calls pa_stream_write() with a large block of
memory, the function can split that into smaller pieces before sending
it to the server. When receiving new data for a stream that had
silence queued due to underrun, the server would do a rewind to
replace the queued-but-not-played silence with the new data. Because
of the bug, this rewind itself would not change underrun_for. It's
possible for multiple rewinds to be done without filling the sink
buffer in between (which is what would eventually reset underrun_for).
In this case, the server rapidly processing the split packets would
rewind the stream for _each_ of them (as underrun_for would stay set),
erasing valid audio as a result.
2012-10-31 15:27:15 +05:30
Arun Raghavan
94039790f8 svolume: Fix ARM alignment issues
As Peter Meerwald <p.meerwald@bct-electronic.com> discovered, our ARM
svolume code performance is quite terrible when the incoming samples are
not word-aligned. This can very easily be the case, since the
architecture only requires that the samples be 16-bit aligned, and we
might end up running the innermost loop after processing modulo-4
samples. The performance degradation was ~50x on a Cortex A9
(Pandaboard).

This reworks the svolume logic to first consume enough samples to make
sure the rest is word aligned, and reordering the processing to work
with 4 samples at a time first, and then finally deal with the
remainder.

With this, performance is comparable for arbitrary alignments (~3x
faster than the C code).
2012-10-30 20:34:21 +05:30
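The head/body/tail structure described in 94039790f8 can be sketched in portable C. This is not the ARM assembly from the patch: the function names are hypothetical, the volume is assumed to be 16.16 fixed point, and the inner group of four is written as a plain loop where an optimized path would use SIMD instructions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scale one sample by a 16.16 fixed-point volume, with saturation. */
static int16_t scale_s16(int16_t s, int32_t vol) {
    int64_t v = (int64_t) s * vol / 65536;
    if (v > INT16_MAX) v = INT16_MAX;
    if (v < INT16_MIN) v = INT16_MIN;
    return (int16_t) v;
}

static void volume_s16_sketch(int16_t *samples, size_t n, int32_t vol) {
    assert(samples || n == 0);
    size_t i = 0;

    /* Head: single samples until the pointer is 4-byte (word) aligned. */
    while (i < n && ((uintptr_t) (samples + i) & 3) != 0) {
        samples[i] = scale_s16(samples[i], vol);
        i++;
    }

    /* Body: aligned groups of 4 samples; a SIMD path would go here. */
    for (; i + 4 <= n; i += 4)
        for (size_t k = 0; k < 4; k++)
            samples[i + k] = scale_s16(samples[i + k], vol);

    /* Tail: whatever is left after the last full group. */
    for (; i < n; i++)
        samples[i] = scale_s16(samples[i], vol);
}
```

The result is the same for any starting alignment; only the split between head, body and tail changes.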
Tanu Kaskinen
e4adf9c4d8 resampler: Make sure that there are no overflows when multiplying potentially big numbers.
This fixes at least one crash that has been observed. The
multiplication in trivial_resample() overflowed when
resampling from 96 kHz to 48 kHz, causing an assertion
error:

Assertion 'o_index * fz < pa_memblock_get_length(output->memblock)' failed at pulsecore/resampler.c:1521, function trivial_resample(). Aborting.

Without the assertion, the memcpy() after the assertion
would have overwritten some random heap memory.
2012-10-30 16:21:35 +02:00
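The kind of overflow addressed in e4adf9c4d8 can be reproduced with a small sketch. The function names and the expression are illustrative assumptions, not the code from trivial_resample(). They map an output frame counter to an input frame index, once with a 32-bit product and once with the product widened to 64 bits.

```c
#include <assert.h>
#include <stdint.h>

/* 32-bit product: wraps modulo 2^32 once the counter is large enough. */
static uint32_t input_index_32(uint32_t o_counter, uint32_t in_rate,
                               uint32_t out_rate) {
    assert(out_rate > 0);
    return o_counter * in_rate / out_rate;
}

/* Widen one operand first so the product is computed in 64 bits. */
static uint64_t input_index_64(uint32_t o_counter, uint32_t in_rate,
                               uint32_t out_rate) {
    assert(out_rate > 0);
    return (uint64_t) o_counter * in_rate / out_rate;
}
```

With a counter of 50000 and rates of 96000 and 48000, the product is 4.8e9, which no longer fits in 32 bits, so the 32-bit variant returns a wrong index while the 64-bit variant returns 100000.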
Tanu Kaskinen
9bcb9f1a62 memblockq: Fix the order of setting minreq and prebuf.
2012-10-30 16:16:03 +02:00
Thomas Martitz
a8e7d8bc2c core-util: Don't error out on existing runtime directory.
When compiling without HAVE_SYMLINK the runtime dir is a real directory,
which is attempted to be created. In the case it already exists we shouldn't
error out. The HAVE_SYMLINK-enabled code already does this.
2012-10-30 16:22:30 +05:30
Thomas Martitz
7e344b5ff0 core: Proper poll() emulation to fix pacat and friends on Windows
Currently, Windows versions of pacat and friends fail because the current
poll emulation is not sufficient (it only works for socket fds).

Luckily Gnulib has a much better emulation that seems to work well enough.
The implementation has been largely copied (except for a few bug fixes
regarding timeout handling, to be pushed upstream) and works on pipes
and files as well. The copy has been obtained through their gnulib-tool utility,
which gives an LGPLv2.1+ licensed file.

This fixes the "Assertion (!e->dead) failed" error and lets pacat
and friends stream happily to/from a server (I didn't actually test parec).
2012-10-30 16:22:18 +05:30