If a provider uses the stream API and pushes a buffer to the stream
after the stream is set to paused, the buffer_id of the last buffer
remains in the io.
If a consumer starts streaming in this state, the buffer_id of the old
buffer is still in the io. The consumer receives a stale buffer_id and
may discard the buffer. Now the buffer is lost, since it is still marked
as busy on the producer.
This can be reproduced by starting Weston with the PipeWire backend and
repeatedly restarting a GStreamer pipeline that connects to the Weston
output. Eventually Weston won't be able to dequeue buffers since the
lost buffer is still busy.
Clear the buffer_id in the io when the buffers are cleared to avoid
sending an old buffer_id to the consumer.
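Roughly (a sketch, not the exact code; the spa_io_buffers fields are
real, the surrounding names are illustrative):

    /* When clearing the queued buffers, also reset the io area so a
     * newly connected consumer never sees a stale buffer_id. */
    static void clear_buffers(struct stream *impl)
    {
            /* ... return queued buffers to the producer ... */
            if (impl->io != NULL) {
                    impl->io->buffer_id = SPA_ID_INVALID;
                    impl->io->status = SPA_STATUS_OK;
            }
    }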
When we write samples, check if we make a jump in the ringbuffer and
clear the samples we jumped over.
If we don't do this, the reader side might pick up old samples that we
didn't write or clear but that are now available for reading after we
made a jump in the ringbuffer.
This might not be exactly what PulseAudio does but it is good for now.
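A sketch of the idea with hypothetical helper names; only the
clear-on-jump step matters:

    static void write_samples(struct ring *r, uint32_t idx,
                    const void *data, uint32_t len)
    {
            /* if the write position jumped ahead, zero the skipped
             * span first so the reader can't pick up stale samples
             * that just became available for reading */
            if (idx != r->write_idx)
                    ring_clear(r, r->write_idx, idx - r->write_idx);
            ring_copy(r, idx, data, len);
            r->write_idx = idx + len;
    }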
Fixes #4464
Since fc49c1697a ("context: improve negotiation") it is possible
that the out parameter `format` will be set to `filter`. However,
`filter` is a SPA POD from the local SPA POD builder `fb`, which
references the local buffer `fbuf`.
In that case, if the caller then makes use of the returned SPA POD, a
stack use-after-free occurs, such as the one shown below.
The issue could be reliably triggered by executing the `video-play`
example program, and then trying to use the same camera in firefox.
As seen below, the input node (firefox's) provides no format
preference, causing the output format to be used. Previously, this led
to the use-after-free described above.
pw.link | [impl-link.c: 130 link_update_state()] (46.0.1 -> 114.0.0) init -> negotiating (paused-configure)
pw.context | [ context.c: 935 pw_context_find_format()] 0x51e000000080: finding best format 3 1
pw.context | [ context.c: 943 pw_context_find_format()] 0x51e000000080: states 3 1
pw.context | [ context.c: 958 pw_context_find_format()] 0x51e000000080: Got output format:
pw.context | [ context.c: 959 pw_context_find_format()] video/raw
pw.context | [ context.c: 959 pw_context_find_format()] format : (Id) YUY2
pw.context | [ context.c: 959 pw_context_find_format()] size : (Rectangle) 640x480
pw.context | [ context.c: 959 pw_context_find_format()] framerate : (Fraction) 30/1
pw.context | [ context.c: 966 pw_context_find_format()] 0x51e000000080: no input format filter, using output format: Success
=================================================================
==418404==ERROR: AddressSanitizer: stack-use-after-return on address 0x73993ee46200 at pc 0x739941d31020 bp 0x7fff526b4670 sp 0x7fff526b4660
READ of size 4 at 0x73993ee46200 thread T0
#0 0x739941d3101f in spa_pod_builder_raw ../spa/include/spa/pod/builder.h:150
#1 0x739941d3b35d in do_negotiate ../src/pipewire/impl-link.c:294
#2 0x739941d46214 in check_states ../src/pipewire/impl-link.c:727
#3 0x739941f14405 in process_work_queue ../src/pipewire/work-queue.c:64
#4 0x73993d0dbe99 in source_event_func ../spa/plugins/support/loop.c:894
#5 0x73993d0d6881 in loop_iterate ../spa/plugins/support/loop.c:727
#6 0x739941d76b05 in spa_loop_control_enter ../spa/include/spa/support/loop.h:264
#7 0x739941d76d93 in spa_loop_control_leave ../spa/include/spa/support/loop.h:268
#8 0x739941d78946 in pw_main_loop_quit ../src/pipewire/main-loop.c:109
#9 0x5a64b3cb1cec in main ../src/daemon/pipewire.c:130
#10 0x739940c34e07 (/usr/lib/libc.so.6+0x25e07) (BuildId: 98b3d8e0b8c534c769cb871c438b4f8f3a8e4bf3)
#11 0x739940c34ecb in __libc_start_main (/usr/lib/libc.so.6+0x25ecb) (BuildId: 98b3d8e0b8c534c769cb871c438b4f8f3a8e4bf3)
#12 0x5a64b3caf3b4 in _start (/pipewire/build/src/daemon/pipewire+0x173b4) (BuildId: f9e8403a377e28bf8bd9cf0a5b89d33f08499917)
Address 0x73993ee46200 is located in stack of thread T0 at offset 512 in frame
#0 0x739941c6ed5e in pw_context_find_format ../src/pipewire/context.c:907
This frame has 15 object(s):
[...]
[432, 480) 'fb' (line 911)
[512, 4608) 'fbuf' (line 912) <== Memory access at offset 512 is inside this variable
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
(longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-use-after-return ../spa/include/spa/pod/builder.h:150 in spa_pod_builder_raw
[...]
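One possible shape of a fix (a sketch, not necessarily the actual
patch): never hand out a POD that lives in the local builder's stack
buffer, e.g. by copying it to the heap first:

    /* spa_pod_copy() allocates a heap copy, so the returned format
     * outlives the on-stack fbuf; ownership handling is elided */
    if (*format == filter)
            *format = spa_pod_copy(filter);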
Fixes: fc49c1697a ("context: improve negotiation")
Don't hide pulsedeviceprovider, so that pulsesink/pulsesrc are listed
by the device provider, and exclude PipeWire's audio devices in its
device provider.
Continue showing video devices in pipewiredeviceprovider, so that
pipewiresrc is listed for the video devices.
Make sure we copy the DSP functions in the convolver before leaving the
function because we need them to clear memory.
Don't store the DSP functions in the head and tail convolvers but pass
them from the main convolver because the convolvers might be NULL but we
still need the DSP functions to clear memory.
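An illustrative sketch of the change (helper names hypothetical):

    /* dsp is passed in from the main convolver; conv may be NULL but
     * the memory still has to be cleared */
    static void convolver_zero(struct dsp_ops *dsp, struct convolver *conv,
                    float *data, uint32_t n_samples)
    {
            dsp_ops_clear(dsp, data, n_samples);
            if (conv != NULL)
                    convolver_reset(dsp, conv);
    }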
Fixes #4433
Don't use %f to serialize floats to JSON but use the json formatter,
because in some locales the decimal point becomes a ',', which no
longer parses as a float.
Also reformat some lines.
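For illustration, a minimal locale-independent serializer along these
lines (hypothetical helper, not the actual formatter):

    #include <stdio.h>
    #include <string.h>

    static void json_write_float(char *dst, size_t size, double val)
    {
            char *p;
            snprintf(dst, size, "%f", val);
            /* in e.g. a de_DE locale %f yields "0,500000"; JSON
             * requires '.' as the decimal point */
            if ((p = strchr(dst, ',')) != NULL)
                    *p = '.';
    }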
Fixes #4418
When the driver starts, save all previous node timestamps, not just the
previous signal time.
For async nodes, use the previous timestamps in the profiler messages
so that we get stats with 1 cycle of delay instead of bogus values
while the node is still processing.
Fixes pw-top for async nodes.
The src can listen to RequestProcess commands and so gets the
node.supports-request = 1 property.
The play-pull example can be either a driver that listens to
RequestProcess commands or a lazy driver, so set this in the properties
as well.
We can now remove the priority.driver property because automatic
priority selection will make us a driver or not.
With this change, video-src to video-play-pull will use lazy scheduling
with play-pull as the driver. video-play-pull to v4l2src will use
normal scheduling and video-src to video-play will also use normal
scheduling.
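For example (pw_properties_new() is the real API, the property value is
as described above):

    /* advertise that this node reacts to RequestProcess commands */
    struct pw_properties *props = pw_properties_new(
                    "node.supports-request", "1",
                    NULL);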
Collect the request scheduling flags and when there is a lazy driver,
set the lazy scheduling flag in its clock. This means that the driver
can expect RequestProcess commands to start the scheduling.
Pass the lazy scheduling clock flag to the node.
Make sure lazy scheduling drivers have a higher priority than their
request drivers.
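A driver can then check its clock, roughly as follows (the flag name is
assumed from the description above):

    /* with the lazy flag set, processing is kicked off by
     * RequestProcess commands instead of running freely */
    if (SPA_FLAG_IS_SET(clock->flags, SPA_IO_CLOCK_FLAG_LAZY))
            handle_request_process();  /* hypothetical */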
Make a plugin structure that is dynamically allocated for each plugin
and pass it around to the descriptor instance structures, so that they
all have access to dsp_ops without sharing a static pointer.
The problem with the static pointer is that the dsp_ops structure is
actually allocated in module-filter-chain's instance structure,
so it always points to the instance of the last filter-chain that was
loaded in the process. When this is unloaded, the other filter-chains
crash.
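A sketch of the resulting ownership (names illustrative):

    /* one heap-allocated handle per loaded plugin; instances reach
     * dsp_ops through their plugin instead of a shared static */
    struct plugin {
            struct dsp_ops *dsp;    /* valid for this plugin's lifetime */
            /* ... dlopen handle, descriptor table, refcount ... */
    };

    struct descriptor_instance {
            struct plugin *plugin;  /* use instance->plugin->dsp */
            /* ... per-instance ports and state ... */
    };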
Take the current cycle times early and in all cases. We can use this to
get the current frame time for debugging purposes instead of the
heavier jack_frame_time().
Rate limit the xrun warnings.
Client-side bugs that call methods on destroyed proxies, like

    pw_proxy_ref(p);
    pw_proxy_destroy(p);
    /* ... wait for the server to ack the remove ... */
    pw_core_destroy(p->core, p);  /* p already removed & destroyed */

should not send messages with the stale proxy id to the server.
Set id to SPA_ID_INVALID when removing a proxy from the id map, so that
such client bugs only result in ENOENT.
First make instances of all the plugins and then try to link them up.
Otherwise, depending on the order the plugins are defined in the config,
a link will try to create port data and set it on the instance, which is
still NULL and we crash.
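In outline (names hypothetical), setup becomes two passes:

    /* pass 1: create every plugin instance first */
    for (i = 0; i < graph->n_nodes; i++)
            if ((res = node_instantiate(&graph->nodes[i])) < 0)
                    return res;
    /* pass 2: only now create links; every instance they reference
     * already exists, regardless of config order */
    for (i = 0; i < graph->n_links; i++)
            if ((res = link_setup(&graph->links[i])) < 0)
                    return res;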
Flush with drain calls the drained callback for each cycle until paused
or resumed. Setting the stream to active again clears the drained state
and makes things resume.
Flush without drain does not set the state to PAUSED but simply clears
the queued data. This is mostly useful when pausing or stopping.
At no point should the flush operation result in a PAUSED state change.
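Both modes go through pw_stream_flush() (real API; the comments restate
the behavior above):

    pw_stream_flush(stream, true);   /* drain: drained callback fires
                                      * until paused or resumed */
    pw_stream_flush(stream, false);  /* flush only: drop queued data,
                                      * no PAUSED state change */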
Don't queue an async state change completion for exported nodes. The
server sends a ping to check for completion and we want this ping reply
to happen after the state completion.
Consider the case where we have a follower and a driver: the follower
is sent the Start/Ping commands and replies to the ping but is still
processing the state change async. The server can then Start the driver,
which will then try to schedule the (still starting) follower and fail.
We could add the ping to the work queue as well but that creates
complications because modules (clients) and server share the same work
queues right now and block each other's completions.
We could also make a method to process the work queue immediately but
that would be dangerous as well because it could contain a BUSY item
from some module that would block things.
Follow the state of the node and update the stream state accordingly.
The most important part is that Start is async and so it's better to
wait for completion from the node before emitting the STREAMING state.
Don't wait for the completion of the Pause command of the node but send
it to the stream immediately. Delaying it might make it come after the
set_param calls are done and confuse the stream.
When we receive a message with fds and we are at the end of the
buffer, we will call clear_buffer, which moves the next fds over the
fds of this message before we copy the fds into the message. This
results in the fd being leaked and the message using the fd of the next
message instead.
Avoid this by first copying the fds into the message and then moving
the new ones over the old ones.
This fixes some wrong fds being used by clients.
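A sketch of the corrected ordering (names illustrative):

    /* first take this message's fds out of the buffer ... */
    memcpy(msg->fds, conn->fds, msg->n_fds * sizeof(int));
    /* ... then compact the remaining fds over the consumed ones, so
     * a message can never pick up fds of the next message */
    conn->n_fds -= msg->n_fds;
    memmove(conn->fds, conn->fds + msg->n_fds,
                    conn->n_fds * sizeof(int));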