Add a function that accepts the size of the position array when reading
the audio positions. This makes it possible to decouple the position
array size from SPA_AUDIO_MAX_CHANNELS.
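A minimal sketch of what such a function could look like; the name and
signature are hypothetical, not the actual API:

    #include <stdint.h>
    #include <string.h>

    /* Hypothetical reader: the caller passes the size of its own
     * position array, so the array no longer needs to be exactly
     * SPA_AUDIO_MAX_CHANNELS entries long. */
    static uint32_t read_position(const uint32_t *src, uint32_t n_src,
                                  uint32_t *dst, uint32_t max_dst)
    {
        uint32_t n = n_src < max_dst ? n_src : max_dst;
        memcpy(dst, src, n * sizeof(uint32_t));
        return n;
    }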
Also use SPA_N_ELEMENTS to pass the number of array elements to
functions instead of a fixed constant. This makes it easier to change
the array size later to a different constant without having to patch up
all the places where the size is used.
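For example, using the real SPA_N_ELEMENTS macro from spa/utils/defs.h
together with the hypothetical reader sketched above:

    #include <spa/param/audio/raw.h>
    #include <spa/utils/defs.h>

    static uint32_t get_position(const uint32_t *src, uint32_t n_src)
    {
        uint32_t position[SPA_AUDIO_MAX_CHANNELS];
        /* SPA_N_ELEMENTS(position) expands to
         * sizeof(position) / sizeof(position[0]), so this call site
         * stays correct if the array size constant changes. */
        return read_position(src, n_src, position,
                             SPA_N_ELEMENTS(position));
    }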
If the timer was canceled, the discont flag needs to be set. But in the
next cycle, unless the timer was canceled again, that flag should not
remain set.
If the user alters the realtime clock (for example by using the "date"
command in the shell), and the node driver uses the realtime clock as
the timerfd clock, then the scheduled graph cycle invocation may not
take place, or may take place much later than planned, because the
timestamp that was passed to spa_system_timerfd_settime() is now invalid.
Configure the timer to automatically be canceled if the realtime clock
is modified so that the graph cycle can be rescheduled with an updated
timestamp that is actually usable with the altered realtime clock.
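A sketch with the plain timerfd API, which the SPA system interface
wraps: with TFD_TIMER_CANCEL_ON_SET, a read() on the timerfd fails with
ECANCELED when the realtime clock is changed, so the cycle can be
rearmed with a fresh timestamp.

    #include <errno.h>
    #include <stdint.h>
    #include <sys/timerfd.h>
    #include <unistd.h>

    /* tfd is assumed to come from timerfd_create(CLOCK_REALTIME, ...) */
    static int arm_realtime_timer(int tfd, const struct timespec *deadline)
    {
        struct itimerspec ts = { .it_value = *deadline };
        return timerfd_settime(tfd,
                TFD_TIMER_ABSTIME | TFD_TIMER_CANCEL_ON_SET, &ts, NULL);
    }

    static int check_expired(int tfd)
    {
        uint64_t expirations;
        if (read(tfd, &expirations, sizeof(expirations)) < 0 &&
            errno == ECANCELED)
            return -ECANCELED; /* clock was changed: reschedule the cycle */
        return 0;
    }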
Because we can now destroy sources (and free the source structure) by
simply holding the lock, there is a window where we might access the
freed source.
When iterate releases the lock and enters the epoll wait, another
thread might acquire the lock and delete the fd from epoll. This might
happen right after epoll detected activity on the fd. When iterate
manages to acquire the lock again, it will proceed to dispatch the
active fd and dereference the ep.data pointer, which is now pointing to
freed memory.
Fix this by incrementing a removed_count whenever we remove a source.
After the epoll wait, check whether the counter is still the same as
before; otherwise we can't assume all sources are still alive. Return
in that case as if there were no fds to poll. The caller should reenter
iterate at some point and we will return all the fds with activity,
minus the one that got destroyed. We need to give control to the caller
because part of the removal could be to stop the loop iteration
altogether.
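A sketch of the check, with hypothetical field names:

    #include <pthread.h>
    #include <stdint.h>
    #include <sys/epoll.h>

    #define MAX_EP 32

    struct impl {
        pthread_mutex_t lock;
        int epoll_fd;
        uint32_t removed_count;      /* bumped on every source removal */
        struct epoll_event ep[MAX_EP];
    };

    static int loop_iterate(struct impl *impl, int timeout)
    {
        uint32_t count;
        int nfds;

        pthread_mutex_lock(&impl->lock);
        count = impl->removed_count;        /* snapshot under the lock */
        pthread_mutex_unlock(&impl->lock);

        nfds = epoll_wait(impl->epoll_fd, impl->ep, MAX_EP, timeout);

        pthread_mutex_lock(&impl->lock);
        if (count != impl->removed_count) {
            /* a source was removed while we were waiting; its ep.data
             * pointer may dangle, so report no activity and return
             * control to the caller */
            pthread_mutex_unlock(&impl->lock);
            return 0;
        }
        /* ... dispatch the nfds active sources under the lock ... */
        pthread_mutex_unlock(&impl->lock);
        return nfds;
    }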
When we are the owners of the loop lock and we are not in the loop
thread itself, release all locks so that the loop can start processing
our invoke items and we get a chance to make progress. After that
re-acquire the locks.
This can happen when you change some of the core loop_locked() calls to
blocking _invoke functions that are called with the loop locked.
All core blocking invoke functions have been removed now, so this is
not actually going to be used; but just in case an application tries a
blocking invoke while holding the loop lock, this will now at least do
something other than deadlock.
The hooks were previously used to unlock the loop but now that the
lock is handled inside the loop itself and we don't unlock before the
blocking read anymore, we should also not call the hooks.
To avoid a deadlock, the blocking invoke function is not meant to be
called with any of the loop context locks acquired. Make this (and
other blocking risks) clear in the documentation.
See #4472
A signed value doesn't really make sense in this context, so let's keep
it unsigned so the semantics are clear. This does break the interface,
but should be okay since it's not in a release yet.
When a source is destroyed, move it to a free_list and reuse the memory
when a new source is made. This way we can avoid doing a free() from
the epoll thread.
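A sketch of the idea with hypothetical names; in the real code the list
would be protected by the loop lock, which is omitted here:

    #include <stdlib.h>

    struct source {
        struct source *next;    /* link on the free list */
        /* ... event source state ... */
    };

    static struct source *free_list;

    static void source_destroy(struct source *s)
    {
        /* only unlink: no free() ever runs on the epoll thread */
        s->next = free_list;
        free_list = s;
    }

    static struct source *source_new(void)
    {
        struct source *s = free_list;
        if (s != NULL)
            free_list = s->next;
        else
            s = calloc(1, sizeof(*s));
        return s;
    }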
We can add a PTHREAD_PRIO_INHERIT lock to the loop to protect the
callbacks and then use this to update shared data in an RT-safe way.
This can avoid some invoke calls, which require a context switch and,
due to the nature of epoll, also cause locking in the kernel with
non-RT guarantees.
Because we use PRIO_INHERIT, the code executed in the lock must not use
any RT-unsafe functions.
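Setting up such a lock uses the standard pthread protocol attribute; a
minimal sketch:

    #include <pthread.h>

    /* A priority-inheriting mutex: a low-priority holder is boosted to
     * the priority of the highest waiter, avoiding unbounded priority
     * inversion between RT and non-RT threads. */
    static int rt_lock_init(pthread_mutex_t *lock)
    {
        pthread_mutexattr_t attr;
        int res;

        pthread_mutexattr_init(&attr);
        pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);
        res = pthread_mutex_init(lock, &attr);
        pthread_mutexattr_destroy(&attr);
        return res;
    }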
In timestamps, support different clocks and local time as formats.
Local real time timestamps are useful when trying to correlate logs from
different sources.
When we have no thread running the loop, we need to flush the queues
from the invoking thread. Make sure that when multiple threads attempt
this, we serialize the flushing, because the flushing code is not
thread safe.
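One possible way to serialize the flush, sketched with a C11 atomic
flag (the actual implementation may differ):

    #include <stdatomic.h>

    static atomic_flag flushing = ATOMIC_FLAG_INIT;

    static void flush_queues_serialized(void)
    {
        /* only one invoking thread may run the flush at a time */
        while (atomic_flag_test_and_set_explicit(&flushing,
                    memory_order_acquire))
            ; /* another thread is flushing; spin until it is done */
        /* ... flush all queues; this code is not thread safe ... */
        atomic_flag_clear_explicit(&flushing, memory_order_release);
    }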
Add the overflow queues again. We can easily iterate atomically over the
overflow queues and flush them.
Overflowing a queue is quite common when heavy swapping is done and
should never cause a lockup, so allocate new queues as we need them. We
can share the eventfd with the main queue to avoid wasting fds.
The limit on the number of queues then only matters when concurrent
threads want to invoke things, so 128 is plenty.
When a queue overflows we place the queue back in the stack and try
again. Because it's at the top of the stack we take exactly the same
queue and keep on looping forever if the other thread is blocked for
some reason.
Instead, mark the queue as overflowed and only place it back in the
stack when we have flushed it.
This avoids a deadlock when the main-thread invokes on the data loop
and blocks and when the data loop invokes on the main-thread and
overflows the queue.
Don't use TSS to store per-thread queues but keep a lockfree stack of
queues. We can then pick off a queue and write to that one and place it
back after use.
We need to keep the queues indexed by id in the stack because otherwise
we would need to compare-and-swap 128 bits (pointer + tag), which is
more problematic.
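A sketch of such an id-based stack; packing the head id and an ABA tag
into a single 64-bit word keeps the compare-and-swap at 64 bits (names
and layout are illustrative):

    #include <stdatomic.h>
    #include <stdint.h>

    #define QUEUES_MAX  128
    #define INVALID_ID  0xffffffffu

    struct queue {
        uint32_t id;        /* index into the queues array */
        uint32_t next;      /* id of the next queue on the stack */
        /* ... ringbuffer, eventfd, ... */
    };

    static struct queue *queues[QUEUES_MAX];
    /* low 32 bits: id of the top queue, high 32 bits: ABA tag */
    static _Atomic uint64_t head = INVALID_ID;

    static struct queue *pop_queue(void)
    {
        uint64_t old = atomic_load(&head), new;
        uint32_t id;
        do {
            id = (uint32_t)old;
            if (id == INVALID_ID)
                return NULL;
            /* safe to read: queues are never freed, only reused */
            new = ((old >> 32) + 1) << 32 | queues[id]->next;
        } while (!atomic_compare_exchange_weak(&head, &old, new));
        return queues[id];
    }

    static void push_queue(struct queue *q)
    {
        uint64_t old = atomic_load(&head), new;
        do {
            q->next = (uint32_t)old;
            new = ((old >> 32) + 1) << 32 | q->id;
        } while (!atomic_compare_exchange_weak(&head, &old, new));
    }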
Because we keep the queues in an array and no queue is ever removed and
the array can only grow, we can quite easily just iterate the array
without a lock. Without the lock we also fix one of the potential
problems with ardour where the queue_flush thread is canceled while
flushing and the queue_mutex remains locked.
Because we end up with all queues in the array now, we can overflow the
fixed maximum number of queues we can manage. When that happens, sleep
for a while and try again. This is a case where more than QUEUES_MAX
(128) threads are invoking at the same time, which is rather unlikely.
There is also the queue overflow case which we now also must handle with
a retry. This potentially uses more eventfds but again this should be
unlikely and cause no further problems.
See #4356
The loop in the TSS gets an extra refcount and is unreffed when the TSS
destroy is called.
We can then also ref the queue during the function callback. When the
queue (thread) was destroyed during the callback, ignore the result and
continue with the next queues.
See #4356
When we clear, all our queues need to have been removed from the TSS by
the time we delete the tss key, or else they are leaked; check and warn
about this using a refcount of the queues in the TSS.
See #4356
Make it possible to call loop_queue_destroy() from both the TSS destroy
and impl_clear() without races. We make sure that only one can remove
the queue from the queue list and clean up. We also store the IN_TSS flag
in the flags so that we can see it before the queue is added to the
queue list. Only free the IN_TSS queue when the TSS destroy is called.
See #4356
We don't actually need the extra allocation for the tss. We can just
mark the queue as being in the tss. When a queue is destroyed, mark it
as destroyed but when it is still in the tss, don't free the structure
yet. We free the structure when we destroy the tss.
We can also immediately free the overflow queues of a queue when it is
destroyed.
The thread that calls the impl_clear method might be the main thread and
is certainly not going to call the invoke function anymore, so free the
tss if there is any.
Fixes a leak in the unit test.
Store a pointer to a pointer to a queue in the tss and point to it from
the queue.
When we destroy the queue when we _clear the support, we can clear the
pointer in the tss as well. This way, when the thread is later
destroyed, it will see the NULL pointer and not try to free the queue
again.
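A sketch of the double indirection with hypothetical names:

    #include <pthread.h>
    #include <stdlib.h>

    struct queue {
        struct queue **tss_slot;    /* points into the TSS allocation */
        /* ... */
    };

    /* impl_clear() frees the queue and clears *tss_slot, so when the
     * thread later exits, the destructor sees NULL and does not free
     * the queue a second time. */
    static void tss_destroy(void *data)
    {
        struct queue **slot = data, *q = *slot;
        if (q != NULL)
            free(q);
        free(slot);
    }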
Use the helper instead of duplicating the same code.
Also add some helpers to parse a JSON array of uint32_t.
Move some functions to convert between type name and id.
Add spa_json_begin_array/object to replace
spa_json_init+spa_json_enter_array/object.
These functions are better because they do not waste a useless spa_json
structure as an iterator. The relaxed versions also error out when the
container is mismatched, because parsing a mismatched container is not
going to give any results anyway.
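Roughly, assuming the signatures in spa/utils/json.h:

    #include <errno.h>
    #include <stddef.h>
    #include <spa/utils/json.h>

    /* before: an extra spa_json only holds the top-level context */
    static int parse_before(const char *str, size_t len)
    {
        struct spa_json it[2];
        spa_json_init(&it[0], str, len);
        if (spa_json_enter_array(&it[0], &it[1]) <= 0)
            return -EINVAL;
        /* ... iterate the array with it[1] ... */
        return 0;
    }

    /* after: one iterator, and a mismatched container is an error */
    static int parse_after(const char *str, size_t len)
    {
        struct spa_json it;
        if (spa_json_begin_array(&it, str, len) <= 0)
            return -EINVAL;
        /* ... iterate the array with it ... */
        return 0;
    }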
This reverts commit 9ae89b4247.
All invokes should be paired with a lock/unlock if the loop requires
this. For internal calls of invoke, this will also be true because all
pipewire functions should be called with the lock.
Fixes #4215
When the queue is full, before this patch we used to go into a usleep in
the hope that the other thread would run and empty the queue and that we
could retry after the usleep.
This however does not always work, because the other thread might be
waiting for the thread that does the invoke call, and then we block
forever.
Therefore we should always try to make progress in some way. Instead of
waiting, allocate an overflow queue (or use the previously allocated
one) and write to that one. We can chain together as many overflow
queues as we need (but we might want to bound that as well).
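A sketch of the overflow path; queue_new() and ringbuffer_write() are
hypothetical helpers standing in for the real queue internals:

    #include <errno.h>
    #include <stddef.h>

    struct queue {
        struct queue *overflow;     /* next overflow queue, or NULL */
        /* ... ringbuffer, eventfd shared with the main queue, ... */
    };

    struct queue *queue_new(void);
    int ringbuffer_write(struct queue *q, const void *item, size_t size);

    static int queue_write_item(struct queue *q, const void *item, size_t size)
    {
        /* never sleep on a full queue: move down the overflow chain,
         * allocating new queues on demand, so we always make progress */
        while (ringbuffer_write(q, item, size) < 0) {
            if (q->overflow == NULL &&
                (q->overflow = queue_new()) == NULL)
                return -ENOMEM;    /* the chain could be bounded here */
            q = q->overflow;
        }
        return 0;
    }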
The loop.retry-timeout property is now deprecated.
See #4114
The control hooks of a loop are called before the loop starts polling
and after it has finished polling. Currently, this is used to implement
the locking in pw_thread_loop: it guarantees that the thread loop's
lock is taken while the thread loop is dispatching, and that the lock
can be taken while the loop is polling, when it is running no
user-space code.
However, calling the thread control hooks of thread A when doing a
blocking invoke from thread B serves little purpose, and in fact
can cause issues: for example, issuing a blocking invoke on a
pw_thread_loop does not work unless the lock thereof is taken.
This behaviour, of calling the control hooks from other threads,
is also not documented, and goes contrary to what is currently
stated in the loop.h header file:
/** Executed right before waiting for events. It is typically used to
* release locks. */
...
/** Executed right after waiting for events. It is typically used to
* reacquire locks. */
At the moment the implementation allows any thread to queue invoke
items on any other thread without restrictions; calling the control
hooks only places extra restrictions on the usability of this mechanism
(in case of pw_thread_loop, having to take the loop's lock).
So do not call the control hooks when doing a blocking invoke.
Make a new flag that is set when the process function is called because
of a recovery from a graph xrun.
Use this flag in the freewheel driver to detect a recovery and to avoid
scheduling a new timeout. We should schedule a new timeout only when the
process function was called after completion.
This fixes export in ardour some more when the initial driver timeout
didn't complete (when, for example, some nodes were still starting up).
I believe the intent here is that if an `interval` is provided
but `value` is unset, then `value` should default to `interval`
so the timer first fires after one `interval`.
Since `interval` is always a relative duration, `value` should
be interpreted as a relative duration, not an absolute one.
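In timerfd terms, the fix amounts to something like this sketch:

    #include <sys/timerfd.h>

    static int arm_timer(int tfd, struct timespec value, struct timespec interval)
    {
        struct itimerspec ts;
        /* no initial value given: default it to one interval so the
         * timer first fires after one interval */
        if (value.tv_sec == 0 && value.tv_nsec == 0)
            value = interval;
        ts.it_value = value;          /* relative: no TFD_TIMER_ABSTIME */
        ts.it_interval = interval;
        return timerfd_settime(tfd, 0, &ts, NULL);
    }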
Add a count to each invoke item that is updated with an increasing
loop atomic counter. Flush items from the queues based on their count
so that items are flushed in the order they were added even if they
were added to different queues.
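A sketch of the stamping side, with hypothetical field names; the
flusher would then repeatedly dispatch the queue whose head item
carries the lowest count:

    #include <stdatomic.h>
    #include <stdint.h>

    struct invoke_item {
        uint32_t count;             /* global submission order */
        /* ... callback, user data, ... */
    };

    static _Atomic uint32_t loop_counter;

    static void item_stamp(struct invoke_item *item)
    {
        /* one loop-wide counter orders items across all queues */
        item->count = atomic_fetch_add(&loop_counter, 1);
    }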
Because we now have a dedicated queue per thread, we can simply add our
invoke item to the queue and then flush all the queues when we are
running in the thread of the loop.
This simplifies some things and removes potential out-of-order messages
that got queued while flushing.
Keep a thread local queue. This makes it possible for multiple threads
to write to the ringbuffer.
There is a lock to protect the list of queues. It can only be contended
when new queues are created in the threads but this can be done at
thread startup.
Fixes #3983
Make an internal queue object that implements the invoke queue.
Because we cannot do invokes concurrently from different threads, this
is required to make per-thread invoke queues later.
Debug and trace log messages are often written with the stderr logging
in mind, where the code location is always visible.
journalctl does not show the code location without extra tricks,
which makes user-submitted debug logs from the journal more cryptic.
Make the journal log more similar to the stderr log by prepending the
code location and log level to the log message when the log topic
level is >= DEBUG.
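A sketch of the prefixing, with illustrative names:

    #include <stdio.h>

    static void format_journal_message(char *buf, size_t maxsize,
            const char *level, const char *file, int line,
            const char *func, const char *text)
    {
        /* mirror the stderr logger: "[level][file:line func()] msg" */
        snprintf(buf, maxsize, "[%s][%s:%d %s()] %s",
                level, file, line, func, text);
    }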