139 commits

Author SHA1 Message Date
Wim Taymans
f93b3b23a3 loop: fix use after free case
Because we can now destroy sources (and free the source structure) by
simply holding the lock, there is a window where we might access the
freed source.

When iterate releases the lock and goes into epoll, another
thread might acquire the lock and delete the fd from epoll. This might
happen right after epoll detected activity on the fd. When iterate
manages to acquire the lock again, it will proceed to dispatch the
active fd and deref the ep.data pointer, which is now pointing to freed
memory.

Fix this by incrementing a removed_count whenever we remove a source.
After the epoll, check that the counter is still the same as before; if
it is not, we can't assume all sources are still alive. Return in that
case as if there were no fds to poll. The caller should reenter the
iterate at some point and we will return all the fds with activity, minus
the one that got destroyed. We need to give control to the caller because
part of the removal could be to stop the loop iteration altogether.
2025-06-30 12:44:15 +02:00
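The removed_count check from f93b3b23a3 can be sketched roughly as follows. This is a single-threaded illustration and not the actual spa loop code: the `sketch_` names, the structure layout and the simulated poll are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified loop state; not the real spa loop structures. */
struct sketch_loop {
	unsigned int removed_count;	/* bumped on every source removal */
};

/* Called with the lock held whenever a source is removed. */
static void sketch_remove_source(struct sketch_loop *loop)
{
	loop->removed_count++;
}

/* Stand-in for the blocking poll, which runs with the lock released.
 * `concurrent_removals` simulates other threads removing sources during
 * the wait; `ready` is the number of fds that became active. */
static int sketch_poll_unlocked(struct sketch_loop *loop,
		int concurrent_removals, int ready)
{
	for (int i = 0; i < concurrent_removals; i++)
		sketch_remove_source(loop);
	return ready;
}

/* Returns the number of events that are safe to dispatch. */
static int sketch_iterate(struct sketch_loop *loop,
		int concurrent_removals, int ready)
{
	assert(loop != NULL);

	unsigned int before = loop->removed_count;
	int n = sketch_poll_unlocked(loop, concurrent_removals, ready);

	/* The lock would be re-acquired here. If anything was removed in
	 * the meantime, the event data may point to freed memory: report
	 * no activity and let the caller iterate again. */
	if (loop->removed_count != before)
		return 0;
	return n;
}
```

Returning 0 instead of filtering the event list keeps the check cheap; the still-pending activity is expected to show up again on the next iterate.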
Wim Taymans
9a6f8d31dc loop: unlock the lock when blocking on invoke
When we are the owners of the loop lock and we are not in the loop
thread itself, release all locks so that the loop can start processing
our invoke items and we get a chance to make progress. After that
re-acquire the locks.

This can happen when you change some of the core loop_locked() calls to
blocking _invoke functions that are called with the loop locked.

All core blocking invoke functions have been removed now, so this is not
actually going to be used. But just in case an application tries to do a
blocking invoke while holding the loop lock, this will now at least do
something other than deadlock.
2025-06-26 14:23:36 +02:00
Wim Taymans
8e32afb863 loop: don't call the hooks around blocking wait
The hooks were previously used to unlock the loop but now that the
lock is handled inside the loop itself and we don't unlock before the
blocking read anymore, we should also not call the hooks.

The blocking invoke function is not meant to be called with any of the
loop context locks acquired in order to avoid a deadlock. Make this (and
other blocking risks) clear in the documentation.

See #4472
2025-06-10 11:57:38 +02:00
Arun Raghavan
420c510d47 spa: loop: Fix potential uninitialised result 2025-06-06 12:50:56 +05:30
Arun Raghavan
56c6e19f99 Revert "spa: loop: Change get_time() timeout to unsigned"
This reverts commit c515f9bf8e. The PW
APIs use int64_t (partly because SPA_NSEC_PER_SEC is an LL), and we
don't want to change already public API.
2025-06-03 15:20:15 +05:30
Arun Raghavan
c515f9bf8e spa: loop: Change get_time() timeout to unsigned
A signed value doesn't really make sense in this context, so let's keep
it unsigned so the semantics are clear. This does break the interface,
but should be okay since it's not in a release yet.
2025-06-03 09:39:30 +00:00
Wim Taymans
34796d5bb8 loop: keep a free_list of sources
When a source is destroyed, move it to a free_list and reuse the memory
when a new source is made. This way we can avoid doing a free() from
the epoll thread.
2025-05-29 10:17:16 +02:00
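A minimal sketch of the free_list idea from 34796d5bb8, assuming a singly linked list of recycled source structures. The `sketch_` types and functions are hypothetical and much simpler than the real source bookkeeping; locking is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct sketch_source {
	int fd;
	struct sketch_source *next;	/* link used only while on the free list */
};

struct sketch_loop {
	struct sketch_source *free_list;
};

/* Take a recycled source if one is available, otherwise allocate. */
static struct sketch_source *sketch_source_new(struct sketch_loop *loop, int fd)
{
	struct sketch_source *s = loop->free_list;

	assert(loop != NULL);
	if (s != NULL)
		loop->free_list = s->next;
	else if ((s = malloc(sizeof(*s))) == NULL)
		return NULL;
	s->fd = fd;
	s->next = NULL;
	return s;
}

/* No free() here: the memory is parked for later reuse. */
static void sketch_source_destroy(struct sketch_loop *loop, struct sketch_source *s)
{
	s->fd = -1;
	s->next = loop->free_list;
	loop->free_list = s;
}

/* Release the parked memory when the loop itself goes away. */
static void sketch_loop_clear(struct sketch_loop *loop)
{
	while (loop->free_list != NULL) {
		struct sketch_source *s = loop->free_list;
		loop->free_list = s->next;
		free(s);
	}
}
```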
Wim Taymans
f2452a6af7 spa: some more invoke -> locked calls 2025-05-29 10:17:16 +02:00
Wim Taymans
f7fdafc203 loop: add method to run a function with the lock
Convert some _invoke to _locked
2025-05-29 10:17:16 +02:00
Wim Taymans
fb49e0795c loop: move thread-loop to support loop
Add more synchronization primitives to spa loop so that we can replace
the thread-loop with it.
2025-05-29 10:17:16 +02:00
Wim Taymans
65cbbf1a02 spa: add locking to the loop
We can add a PTHREAD_PRIO_INHERIT lock to the loop to protect the
callbacks and then use this to update shared data in an RT-safe way.

This can avoid some invoke calls, which require a context switch and,
due to the nature of epoll, also cause locking in the kernel with non-RT
guarantees.

Because we use PRIO_INHERIT, the code executed in the lock must not use
any RT-unsafe functions.
2025-05-29 10:17:16 +02:00
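For reference, creating a priority-inheriting mutex with the POSIX API looks roughly like this. The helper name `sketch_lock_init` is made up for the example, and whether the real loop lock sets further attributes (for example a recursive type) is not stated in 65cbbf1a02.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Initialise `lock` as a priority-inheriting mutex. Returns 0 on success
 * or a positive errno value, like the pthread functions it wraps. */
static int sketch_lock_init(pthread_mutex_t *lock)
{
	pthread_mutexattr_t attr;
	int res;

	assert(lock != NULL);
	if ((res = pthread_mutexattr_init(&attr)) != 0)
		return res;
	/* A low-priority holder is boosted while an RT thread waits. */
	res = pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);
	if (res == 0)
		res = pthread_mutex_init(lock, &attr);
	pthread_mutexattr_destroy(&attr);
	return res;
}
```

Priority inheritance only bounds the wait if the critical section itself is short and RT-safe, which is why the commit forbids RT-unsafe functions under the lock.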
Wim Taymans
a4b553f3d4 spa: serialize in_thread flushes with a mutex
When we have no thread running the loop, we need to flush the queues
from the invoking thread. Make sure that when multiple threads attempt
this, we serialize the flushing because the flushing code is not
thread safe.
2024-12-03 16:38:28 +01:00
Wim Taymans
8c59fae42d loop: add overflow queues again
Add the overflow queues again. We can easily iterate atomically over the
overflow queues and flush them.

Overflowing a queue is quite common when heavy swapping is done and
should never cause a lockup, so allocate new queues as we need them. We
can share the eventfd with the main queue to avoid wasting fds.

The limit on the number of queues is then only for when concurrent
threads want to invoke things, so 128 is plenty.
2024-11-14 17:38:43 +01:00
Wim Taymans
6cf320e387 loop: handle queue overflow better
When a queue overflows we place the queue back in the stack and try
again. Because it's at the top of the stack we take exactly the same
queue and keep on looping forever if the other thread is blocked for
some reason.

Instead, mark the queue as overflowed and only place it back in the
stack when we have flushed it.

This avoids a deadlock when the main-thread invokes on the data loop
and blocks and when the data loop invokes on the main-thread and
overflows the queue.
2024-11-14 15:58:09 +01:00
Wim Taymans
9c19284f7f support: make the loop queue handling lockfree
Don't use TSS to store per-thread queues but keep a lockfree stack of
queues. We can then pick off a queue and write to that one and place it
back after use.

We need to keep the queues indexed by id in the stack because otherwise
we would need to compare-and-swap 128 bits (pointer + tag), which is
more problematic.

Because we keep the queues in an array and no queue is ever removed and
the array can only grow, we can quite easily just iterate the array
without a lock. Without the lock we also fix one of the potential
problems with ardour where the queue_flush thread is canceled while
flushing and the queue_mutex remains locked.

Because we end up with all queues in the array now, we can overflow the
fixed max amount of queues we can manage. When that happens, sleep for a
while and try again. This is a case where more than QUEUES_MAX (128) threads
are invoking at the same time and is rather unlikely.

There is also the queue overflow case which we now also must handle with
a retry. This potentially uses more eventfds but again this should be
unlikely and cause no further problems.

See #4356
2024-11-04 17:41:14 +01:00
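The reason for indexing queues by id in 9c19284f7f is that an index plus a tag fits into a single 64-bit word, so an ordinary compare-and-swap is enough. A rough sketch of such a tagged, index-based lock-free stack with C11 atomics, under the assumption of a 32-bit tag and 32-bit index (the actual layout in the loop code may differ, and all `sketch_` names are hypothetical):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define SKETCH_QUEUES_MAX 128u
#define SKETCH_NONE       UINT32_MAX	/* empty stack / end of chain */

struct sketch_stack {
	_Atomic uint64_t head;				/* (tag << 32) | index */
	_Atomic uint32_t next[SKETCH_QUEUES_MAX];	/* per-queue link, by index */
};

static uint64_t sketch_pack(uint32_t tag, uint32_t idx)
{
	return ((uint64_t)tag << 32) | idx;
}

static void sketch_stack_init(struct sketch_stack *s)
{
	atomic_init(&s->head, sketch_pack(0, SKETCH_NONE));
}

/* Place queue `idx` back on top of the stack. */
static void sketch_push(struct sketch_stack *s, uint32_t idx)
{
	uint64_t old = atomic_load(&s->head);

	assert(idx < SKETCH_QUEUES_MAX);
	do {
		/* link to the current top, then try to become the new top */
		atomic_store(&s->next[idx], (uint32_t)old);
	} while (!atomic_compare_exchange_weak(&s->head, &old,
			sketch_pack((uint32_t)(old >> 32) + 1, idx)));
}

/* Pick a queue off the stack; returns SKETCH_NONE when it is empty. */
static uint32_t sketch_pop(struct sketch_stack *s)
{
	uint64_t old = atomic_load(&s->head);
	uint32_t idx;

	do {
		idx = (uint32_t)old;
		if (idx == SKETCH_NONE)
			return SKETCH_NONE;
		/* the tag changes on every update, which guards against ABA */
	} while (!atomic_compare_exchange_weak(&s->head, &old,
			sketch_pack((uint32_t)(old >> 32) + 1,
				atomic_load(&s->next[idx]))));
	return idx;
}
```

With pointers instead of indices, head and tag together would need a 128-bit compare-and-swap, which is the case the commit message wants to avoid.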
Wim Taymans
22c45af7e0 loop: refcount the queues
The loop in the TSS gets an extra refcount and is unreffed when the TSS
destroy is called.

We can then also ref the queue during the function callback. When the
queue (thread) was destroyed during the callback, ignore the result and
continue with the next queues.

See #4356
2024-10-21 17:47:31 +02:00
Wim Taymans
bb53aa08ad loop: warn when some queues are still in TSS
When we clear, all our queues need to be removed from the TSS before
we delete the tss key, or else they are leaked. Check and warn about this
using a refcount of the queues in the TSS.

See #4356
2024-10-21 17:08:10 +02:00
Wim Taymans
c4fece74a5 loop: fix race in shutdown
Make it possible to call loop_queue_destroy() from both the TSS destroy
and impl_clear() without races. We make sure that only one can remove
the queue from the queue list and cleanup. We also store the IN_TSS flag
in the flags so that we can see them before the queue is added to the
queue list. Only free the IN_TSS queue when the TSS destroy is called.

See #4356
2024-10-21 16:45:17 +02:00
Wim Taymans
dca11e6c41 loop: remove extra allocation
We don't actually need the extra allocation for the tss. We can just
mark the queue as being in the tss. When a queue is destroyed, mark it
as destroyed but when it is still in the tss, don't free the structure
yet. We free the structure when we destroy the tss.

We can also free the overflow queues of a queue immediately when it is
destroyed.
2024-10-02 09:40:54 +02:00
Wim Taymans
5d3aac313d loop: free tss from the thread calling impl::clear
The thread that calls the impl_clear method might be the main thread and
is certainly not going to call the invoke function anymore so free the
tss if there is any.

Fixes a leak in the unit test.
2024-10-02 09:21:00 +02:00
Wim Taymans
82585b7475 loop: improve tss cleanup
Store a pointer to a pointer to a queue in the tss and point to it from
the queue.

When we destroy the queue when we _clear the support, we can clear the
pointer in the tss as well. This way, when the thread is later
destroyed, it will see the NULL pointer and not try to free the queue
again.
2024-10-01 13:25:15 +02:00
Gleb Popov
67ddfc3053 Use the 'thrd_success' constant when checking for tss_create result 2024-09-23 08:09:45 +00:00
Wim Taymans
fff52bb7a2 Revert "spa: support: loop: do not call control hooks on blocking invoke"
This reverts commit 9ae89b4247.

All invokes should be paired with a lock/unlock if the loop requires
this. For internal calls of invoke, this will also be true because all
pipewire functions should be called with the lock.

Fixes #4215
2024-08-21 14:48:54 +02:00
Wim Taymans
8c1a69f1b5 loop: don't usleep when queue is full
When the queue is full, before this patch we used to go into usleep in
the hope that the other thread will run and empty the queue and that we
can retry after the usleep.

This however does not always work because the other thread might be waiting
for the thread that does the invoke call, and then we block forever.

Therefore we should always try to make progress in some way. Instead of
waiting, allocate an overflow queue (or use the previously allocated one)
and write to that one. We can chain together as many overflow queues as
we need (but we might want to bound that as well).

The loop.retry-timeout property is now deprecated.

See #4114
2024-08-06 12:05:11 +02:00
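The overflow chaining from 8c1a69f1b5 might look something like the sketch below. It uses a plain array per queue instead of a ringbuffer, leaves out all synchronization, and does not bound the chain; every `sketch_` name is an assumption for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_QUEUE_SIZE 2	/* tiny on purpose, to force an overflow */

struct sketch_queue {
	int items[SKETCH_QUEUE_SIZE];
	size_t used;
	struct sketch_queue *overflow;	/* next queue in the chain, or NULL */
};

/* Append `item`, walking (and growing) the overflow chain instead of
 * sleeping when a queue is full. Returns 0, or -1 if allocation fails. */
static int sketch_queue_push(struct sketch_queue *q, int item)
{
	assert(q != NULL);
	while (q->used == SKETCH_QUEUE_SIZE) {
		if (q->overflow == NULL &&
		    (q->overflow = calloc(1, sizeof(*q->overflow))) == NULL)
			return -1;
		q = q->overflow;
	}
	q->items[q->used++] = item;
	return 0;
}

static size_t sketch_queue_chain_length(const struct sketch_queue *q)
{
	size_t n = 0;

	for (; q != NULL; q = q->overflow)
		n++;
	return n;
}

/* Free the overflow chain; the head queue is owned by the caller. */
static void sketch_queue_clear(struct sketch_queue *q)
{
	struct sketch_queue *o = q->overflow;

	while (o != NULL) {
		struct sketch_queue *next = o->overflow;
		free(o);
		o = next;
	}
	q->overflow = NULL;
	q->used = 0;
}
```

The writer never waits on the reader here, which is the property that removes the deadlock described above.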
Barnabás Pőcze
9ae89b4247 spa: support: loop: do not call control hooks on blocking invoke
The control hooks of a loop are called before the loop starts polling
and after it has finished polling. Currently, this is used to implement
the locking in pw_thread_loop. This is used to guarantee that the thread
loop's lock is taken while the thread loop is dispatching, and that
the lock can be taken while the loop is polling, when it is running
no user-space code.

However, calling the thread control hooks of thread A when doing a
blocking invoke from thread B serves little purpose, and in fact
can cause issues: for example, issuing a blocking invoke on a
pw_thread_loop does not work unless the lock thereof is taken.

This behaviour, of calling the control hooks from other threads,
is also not documented, and goes contrary to what is currently
stated in the loop.h header file:

  /** Executed right before waiting for events. It is typically used to
   * release locks. */
  ...
  /** Executed right after waiting for events. It is typically used to
   * reacquire locks. */

At the moment the implementation allows any thread to queue invoke
items on any other thread without restrictions; calling the control
hooks only places extra restrictions on the usability of this mechanism
(in case of pw_thread_loop, having to take the loop's lock).
So do not call the control hooks when doing a blocking invoke.
2024-08-05 18:14:39 +00:00
Wim Taymans
494600d46a loop: release queue lock before calling invoke function
We don't actually need to hold the lock while calling the invoke
function, we only need the lock to protect the list of queues.
2024-07-30 12:04:42 +02:00
Wim Taymans
2a8a08f303 loop: signal when queue is full
When our queue is full, signal the wakeup event to make sure the thread
will wake up and try to clear the queue before we go to sleep.
2024-07-20 14:05:09 +02:00
David Coles
2770e96e08 loop: fix update_timer handling of solo repeat argument
I believe the intent here is that if an `interval` is provided
but `value` is unset, then `value` should default to `period`
so the timer first fires after one `interval`.

Since `interval` is always a relative duration, `value` should
be interpreted as a relative duration, not an absolute one.
2024-06-30 18:37:49 +00:00
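The defaulting rule described in 2770e96e08 can be written down as a small normalisation step before the timer is armed. This is only a sketch of the rule as stated in the commit message; `sketch_fix_timer` and its parameters are hypothetical and do not mirror the real update_timer signature.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <time.h>

/* If only `interval` is set, let the timer first fire after one interval
 * and force the start value to be treated as a relative duration. */
static void sketch_fix_timer(struct timespec *value,
		const struct timespec *interval, bool *absolute)
{
	assert(value != NULL && interval != NULL && absolute != NULL);

	bool value_unset = value->tv_sec == 0 && value->tv_nsec == 0;
	bool has_interval = interval->tv_sec != 0 || interval->tv_nsec != 0;

	if (value_unset && has_interval) {
		*value = *interval;
		*absolute = false;	/* an interval is never an absolute time */
	}
}
```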
Wim Taymans
8b23a8a89e loop: flush items in the order they were added
Add a count to each invoke item that is updated with an increasing
loop atomic counter. Flush items from the queues based on their count
so that items are flushed in the order they were added even if they
were added to different queues.
2024-05-08 12:21:54 +02:00
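One way to flush several queues in global insertion order, as 8b23a8a89e describes, is to stamp each item with a shared counter and always pick the queue whose oldest item has the lowest stamp. The sketch below is single-threaded (the counter would be atomic in a real loop) and all names in it are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_ITEMS_MAX 8

struct sketch_item {
	uint32_t count;		/* global sequence number, assigned on enqueue */
	int payload;
};

struct sketch_queue {
	struct sketch_item items[SKETCH_ITEMS_MAX];
	size_t read, write;	/* FIFO indices, no wrap-around in this sketch */
};

static uint32_t sketch_seq;	/* would be an atomic counter in a threaded loop */

static int sketch_enqueue(struct sketch_queue *q, int payload)
{
	if (q->write == SKETCH_ITEMS_MAX)
		return -1;
	q->items[q->write++] = (struct sketch_item){ sketch_seq++, payload };
	return 0;
}

/* Pop the oldest pending item across all queues; returns 0 when idle. */
static int sketch_flush_one(struct sketch_queue *queues, size_t n_queues, int *payload)
{
	struct sketch_queue *oldest = NULL;

	assert(payload != NULL);
	for (size_t i = 0; i < n_queues; i++) {
		struct sketch_queue *q = &queues[i];

		if (q->read == q->write)
			continue;
		/* signed difference keeps the comparison valid across wrap-around */
		if (oldest == NULL ||
		    (int32_t)(q->items[q->read].count -
			      oldest->items[oldest->read].count) < 0)
			oldest = q;
	}
	if (oldest == NULL)
		return 0;
	*payload = oldest->items[oldest->read++].payload;
	return 1;
}
```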
Wim Taymans
8ff40e6252 loop: improve in_thread handling of invoke queue
Because we now have a dedicated queue per thread, we can simply add our
invoke item to the queue and then flush all the queues when we are
running in the thread of the loop.

This simplifies some things and removes potential out-of-order messages
that got queued while flushing.
2024-04-29 15:56:00 +02:00
Wim Taymans
de0db48f17 loop: create a per-thread queue
Keep a thread local queue. This makes it possible for multiple threads
to write to the ringbuffer.

There is a lock to protect the list of queues. It can only be contended
when new queues are created in the threads but this can be done at
thread startup.

Fixes #3983
2024-04-29 15:17:45 +02:00
Wim Taymans
c76424da36 loop: move invoke queue to separate object
Make an internal queue object that implements the invoke queue.

Because we can not do invokes concurrently from different threads, this
is required to make per-thread invoke queues later.
2024-04-29 12:10:48 +02:00
Wim Taymans
eb33145691 loop: fix clang compilation 2024-02-05 23:16:36 +01:00
Wim Taymans
e7e6742200 loop: sleep and retry when the invoke queue is full
When the invoke ringbuffer is full, sleep a little and try again.
Add an option to set the retry timeout; setting this to 0 restores
the old behaviour of returning -EPIPE.

Most callers don't check the return values and might assume the invoke
call is queued or executed, which could cause crashes or leaks.

When the queue overruns, it's better to log a warning and hope that the
problem is resolved soon. We might abort or return the error to the
caller later if we want to break the retry loop.

See !1887
2024-02-05 19:44:02 +01:00
Pauli Virtanen
eaea03c26c spa: export log topic enumerations 2024-01-04 10:02:55 +00:00
Wim Taymans
ee6e7021f0 loop: rate limit xrun messages
When the reader thread locks up for some reason, avoid excessive
logs about the invoke queue being filled.

See #3532
2023-09-30 09:29:20 +02:00
Wim Taymans
efea7ad060 hooks: add and use _fast callback function
Add a _fast callback function that skips the version and method check.
We can use this in places where performance is critical, provided we do
the check outside of the critical loops.

Make all system methods _fast calls. We expect them to exist and have
the right version. If we add new versions we can make them slow.
2023-05-06 00:27:12 +02:00
Wim Taymans
4b5b94303e loop: clear rmask after dispatching all sources
To make the unit tests work again.
2023-05-05 18:36:50 +02:00
Wim Taymans
fbf17cf980 loop: add optimized non-cancellable iterate
Only use the heavier cancellable loop when the loop.cancel property
was set. Makes pipewire go from 5% to 3% in high frequency wakeups.
2023-05-05 17:41:37 +02:00
Wim Taymans
67c38490a5 move some trace to trace_fp 2023-05-05 17:41:13 +02:00
Wim Taymans
74831aa967 support: add support for checking loop context
Add a check for running in the loop context and thread.

Add checks in filter and stream to avoid doing things when not run from
the context main-loop because this can crash things when doing IPC from
concurrent threads.
2023-04-04 16:19:41 +02:00
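A check of this kind usually boils down to remembering which thread runs the loop and comparing it with the caller. A minimal sketch, using pthread_equal() for the comparison as in f2be2923e6; the structure and function names are hypothetical and do not correspond to the actual support API.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_loop {
	pthread_t thread;	/* thread currently running the loop */
	bool running;
};

/* Record the calling thread as the one that runs the loop. */
static void sketch_loop_enter(struct sketch_loop *loop)
{
	assert(loop != NULL);
	loop->thread = pthread_self();
	loop->running = true;
}

static bool sketch_loop_in_thread(const struct sketch_loop *loop)
{
	/* pthread_t is opaque, so compare with pthread_equal() rather than == */
	return loop->running && pthread_equal(loop->thread, pthread_self());
}
```

A caller such as a stream or filter method could then bail out early when `sketch_loop_in_thread()` is false instead of touching shared state from the wrong thread.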
Wim Taymans
f2be2923e6 thread: use pthread_equal to compare thread ids 2023-04-04 12:43:25 +02:00
Barnabás Pőcze
0e0a2627aa treewide: print pthread_t as a pointer
On glibc, `pthread_t` is `unsigned long int` while on musl
it has a pointer type. To avoid format string warnings,
cast it to `void *` and use the `%p` format specifier.
2023-02-25 20:45:28 +01:00
Barnabás Pőcze
934ab3036e treewide: use SPDX tags to specify copyright information
SPDX tags make the licensing information easy to understand and clear,
and they are machine parseable.

See https://spdx.dev for more information.
2023-02-16 10:54:48 +00:00
Wim Taymans
ddf6e7ae91 loop: don't write from multiple threads
We can only write from one thread to the ringbuffer so bypass the
ringbuffer when doing in-thread invoke. Only flush the current
items so that out-of-thread items don't get inserted.
2022-12-08 08:01:40 +01:00
Wim Taymans
8ecfcbf884 loop: support recursive loop flush
Always append the item to the ringbuffer, even if we are invoking from
the thread itself. This ensures all items are always invoked in the
right order.

If we invoke from the thread, flush all items of the ringbuffer and
return.

Make sure to set the callback to NULL before invoking so that recursive
invoke doesn't call it again.

When we get a recursive invoke while flushing the items, detect this
with a counter and return immediately.
2022-12-07 22:00:58 +01:00
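The recursion guard described in 8ecfcbf884 can be illustrated with a plain array in place of the ringbuffer. This is a single-threaded sketch under that assumption; the `sketch_` names, the fixed-size array and the example callbacks are all made up.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_ITEMS_MAX 16

typedef void (*sketch_func_t)(void *data);

struct sketch_item {
	sketch_func_t func;
	void *data;
};

struct sketch_loop {
	struct sketch_item items[SKETCH_ITEMS_MAX];
	size_t read, write;
	int flushing;		/* > 0 while a flush is in progress */
};

static void sketch_flush(struct sketch_loop *loop)
{
	if (loop->flushing++ > 0) {
		/* recursive call: the outer flush will get to the new items */
		loop->flushing--;
		return;
	}
	while (loop->read != loop->write) {
		struct sketch_item *item = &loop->items[loop->read++];
		sketch_func_t func = item->func;

		item->func = NULL;	/* never run the same item twice */
		if (func != NULL)
			func(item->data);
	}
	loop->flushing--;
}

/* In-thread invoke: append first to preserve ordering, then flush. */
static int sketch_invoke(struct sketch_loop *loop, sketch_func_t func, void *data)
{
	assert(loop != NULL);
	if (loop->write == SKETCH_ITEMS_MAX)
		return -1;
	loop->items[loop->write++] = (struct sketch_item){ func, data };
	sketch_flush(loop);
	return 0;
}

/* Example callbacks: `first` queues `second` from inside the flush. */
struct sketch_trace {
	struct sketch_loop *loop;
	int order[4];
	int n;
};

static void sketch_second(void *data)
{
	struct sketch_trace *t = data;
	t->order[t->n++] = 2;
}

static void sketch_first(void *data)
{
	struct sketch_trace *t = data;

	t->order[t->n++] = 1;
	sketch_invoke(t->loop, sketch_second, t);	/* queued, not run yet */
	t->order[t->n++] = 3;
}
```

In this sketch, `sketch_second` only runs after `sketch_first` has returned, because the nested flush returns immediately and the outer flush picks the new item up afterwards.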
Wim Taymans
97f95f51c5 loop: only flush pending items
Mostly useful for when invoking from the thread itself so that the new
invoke item is executed before new items are added.

Imagine this case with module-loopback:
  - data-loop goes into the capture process function
      - mainloop invokes node remove of capture and waits
  - data-loop invokes trigger -> node remove is first executed,
    mainloop is woken up
      - mainloop continues
      - mainloop invokes remove of playback and waits
  - data-loop continues flushing the ringbuffer -> playback remove
    is executed, mainloop wakes up
      - mainloop continues destroying items, frees playback and
        capture streams
  - data-loop finally gets to flush the trigger and crashes because
    the streams are gone.
2022-12-07 19:52:13 +01:00
Wim Taymans
61e600970b loop: improve error handling from fds
When we try to read one of the events and there was an error, don't
signal the callback. If the error is anything other than EAGAIN, log
a warning.

Especially for timerfd, EAGAIN can happen when the timer changed
while polling. This can happen when running the profiler because it
polls and updates the timer from different threads.
2022-12-01 20:03:06 +01:00
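The error handling from 61e600970b amounts to something like the following sketch for an eventfd or timerfd on Linux. The helper names are hypothetical, the warning goes to stderr rather than the spa log, and treating a short read as EIO is an assumption of this example.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Read the 8-byte counter of an eventfd or timerfd. Returns 0 and fills
 * `count` on success, or a negative errno; the caller only signals its
 * callback on success. */
static int sketch_read_event(int fd, uint64_t *count)
{
	assert(count != NULL);

	ssize_t n = read(fd, count, sizeof(*count));
	if (n == (ssize_t)sizeof(*count))
		return 0;

	int err = n < 0 ? errno : EIO;	/* short read treated as an I/O error */
	if (err != EAGAIN)
		fprintf(stderr, "read fd %d failed: %s\n", fd, strerror(err));
	return -err;
}

/* Usage: a non-blocking eventfd with nothing pending yields -EAGAIN,
 * without a warning. */
static int sketch_demo(void)
{
	uint64_t count;
	int fd = eventfd(0, EFD_NONBLOCK);

	if (fd < 0)
		return -errno;
	int res = sketch_read_event(fd, &count);
	close(fd);
	return res;
}
```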
Wim Taymans
67dcb72295 loop: don't assert in cleanup
Just issue a warning instead of asserting. Firefox does strange things
to the fds that make things crash otherwise.
2022-11-08 15:45:55 +01:00
Wim Taymans
0ad7cb3298 loop: flush items before stopping
Before leaving the loop, flush out any pending items in the invoke
queue.

See #2631
2022-08-09 20:38:06 +02:00