/** \page page_scheduling Graph Scheduling

This document tries to explain how the PipeWire graph is scheduled.

Graphs are constructed from linked nodes together with their ports. This
results in a dependency graph between nodes. Special care is taken for
loopback links so that the graph remains a directed graph.

# Processing threads

The server (and clients) have two types of processing threads:

- A main thread that does all IPC with clients and the server and configures the
  nodes in the graph for processing.
- One (or more) data processing threads that only do the data processing.

The data processing threads are given realtime priority and are designed to
run with as little overhead as possible. All of the node resources such as
buffers, io areas and metadata are set up in shared memory before the
node is scheduled to run.

This document describes the processing that happens in the data processing
thread after the main thread has configured it.

# Nodes

Nodes are objects with 0 or more input and output ports.

Each node also has:

- an eventfd to signal the node that it can start processing
- an activation record that lives in shared memory, backed by a memfd

```
            eventfd
         +-^---------+
         |           |
         in         out
         |           |
         +-v---------+
  activation {
      status:OK,     // bitmask of NEED_DATA, HAVE_DATA or OK
      pending:0,     // number of unsatisfied dependencies to be able to run
      required:0     // number of dependencies with other nodes
  }
```

The activation record has the following information:

- Processing state and pending dependencies. As long as there are pending dependencies,
  the node can not be processed. This is the only information relevant for actually
  scheduling the graph and it is shown in the illustration above.
- Current status of the node and profiling info (TRIGGERED, AWAKE, FINISHED, and the
  timestamps of when the node changed state).
- Timing information, mostly for drivers: when the processing started, the time, the
  duration and the rate (quantum), etc.
- Information about repositions (seeks) and timebase owners.

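The scheduling part of the activation record can be summarized with the following
sketch. The type and field names are illustrative only; the real record (struct
pw_node_activation in the PipeWire sources) contains more state than is shown here.

```
#include <stdint.h>

/* Simplified sketch of the dependency-counting part of an activation
 * record. It lives in a memfd that is mapped by both the server and the
 * client that owns the node, next to the node's eventfd. */
struct activation_state {
	uint32_t status;     /* NEED_DATA, HAVE_DATA or OK */
	int32_t required;    /* number of dependencies on other nodes */
	int32_t pending;     /* dependencies still unsatisfied this cycle */
};

struct node_activation {
	struct activation_state state;
	/* ... node status, profiling timestamps, timing information,
	 *     reposition and timebase info ... */
};
```
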
# Links

When two nodes are linked together, the output node becomes a dependency of the input
node. This means that the input node can only start processing when the output node is
finished.

This dependency is reflected in the required counter of the activation record. In the
illustration below, B's required field is incremented by 1. The pending field is set to
the required field when the graph is started. Node A keeps a list of all targets (B)
that depend on it.

This dependency update is only performed when the link is ready (negotiated) and the
nodes are ready to schedule (runnable).

```
      eventfd                                eventfd
   +-^---------+                          +-^---------+
   |           |            link          |           |
   in    A    out ----------------------> in    B    out
   |           |                          |           |
   +-v---------+                          +-v---------+
  activation {            target         activation {
      status:OK,   ---------------------->    status:OK,
      pending:0,                              pending:1,
      required:0                              required:1
  }                                       }
```

Multiple links between A and B will only result in 1 target link between A and B.

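In code, the bookkeeping for a link that becomes runnable could look roughly like the
sketch below. It continues the hypothetical structures from the previous sketch and is
not the actual PipeWire implementation.

```
#define MAX_TARGETS 32

/* Hypothetical node representation, continuing the sketch above. */
struct node {
	struct node_activation *activation;  /* in shared memory */
	int eventfd;                         /* signals that the node can run */
	struct node *targets[MAX_TARGETS];   /* peers to trigger when done */
	uint32_t n_targets;
};

/* Called when a link from output node A to input node B is negotiated
 * and both nodes are runnable. */
static void link_activate(struct node *A, struct node *B)
{
	uint32_t i;

	/* multiple links between A and B share a single target entry */
	for (i = 0; i < A->n_targets; i++)
		if (A->targets[i] == B)
			return;
	A->targets[A->n_targets++] = B;

	/* B now has one more dependency that must complete every cycle */
	__atomic_add_fetch(&B->activation->state.required, 1, __ATOMIC_SEQ_CST);
}
```
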
# Drivers

The graph can only run if there is a driver node that is in some way linked to an
active node.

The driver is special because it has to initiate the processing in the graph. It
uses a timer or some sort of interrupt from the hardware to start a cycle.

Any node can also be a candidate for a driver (when the node.driver property is true).
PipeWire selects the node with the highest priority.driver property as the driver.

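For example, a client could mark its node as a driver candidate by setting these
properties when it creates the node (a sketch; the priority value is arbitrary):

```
#include <pipewire/pipewire.h>

/* Sketch: properties for a node that wants to be a driver candidate.
 * Among all candidates, the node with the highest priority.driver
 * property is selected as the driver. */
static struct pw_properties *make_driver_props(void)
{
	return pw_properties_new(
			PW_KEY_NODE_DRIVER, "true",      /* driver candidate */
			PW_KEY_PRIORITY_DRIVER, "10000", /* selection preference */
			NULL);
}
```
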
Nodes are assigned to the driver node they will be scheduled with. Each node holds
a reference to the driver and increments the required field of the driver.

When a node is ready to be scheduled, the driver adds the node to its list of targets
and increments the required field.

```
      eventfd                                eventfd
   +-^---------+                          +-^---------+
   |           |            link          |           |
   in    A    out ----------------------> in    B    out
   |           |                          |           |
   +-v---------+                          +-v---------+
  activation {            target         activation {
      status:OK,   ---------------------->    status:OK,
      pending:0,                              pending:0,
      required:1                              required:2
  }                                       }
      |  ^                                     ^
      |  |                                    / /
      |  |                                   / /
      |  |                                  / /
      |  |                                 / /
      |  |                                / /
      v  |      /------------------------/ /
  activation {                             /
      status:OK,      v-------------------/
      pending:0,
      required:2
  }
   +-^---------+
   |           |
   |  driver   |
   |           |
   +-v---------+
      eventfd
```

As seen in the illustration above, the driver holds a link to each node it needs to
schedule and each node holds a link to the driver. Some nodes hold a link to other
nodes.

It is possible that the driver is the same as a node in the graph (for example node A)
but conceptually the links above are still valid.

The driver then starts processing the graph by emitting the ready signal. PipeWire
will then:

- Check the previous cycle. Did it complete? Mark an xrun on unfinished nodes.
- Perform reposition requests (if any), timebase changes, etc.
- Set the pending counter of each follower node to its required field.
- Update the cycle counter in the driver activation io area.
- Loop over all targets of the driver and atomically decrement the pending
  field of each activation record. When the pending field becomes 0, the eventfd is
  signaled and the node can be scheduled (see the sketch below).

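The per-target trigger operation and the start of a cycle could be sketched as follows.
This again uses the hypothetical structures from the earlier sketches and leaves out
much of the bookkeeping that the real data loop performs.

```
#include <stdint.h>
#include <unistd.h>

/* Decrement one pending dependency of a target and wake it up when no
 * unsatisfied dependencies are left. This is used by the driver when it
 * starts a cycle and by every node when it finishes processing. */
static void trigger_target(struct node *t)
{
	uint64_t one = 1;

	if (__atomic_sub_fetch(&t->activation->state.pending, 1, __ATOMIC_SEQ_CST) == 0)
		write(t->eventfd, &one, sizeof(one));   /* node can be scheduled */
}

/* Sketch of a driver starting a new graph cycle. */
static void driver_start_cycle(struct node *driver)
{
	uint32_t i;

	/* ... check completion of the previous cycle, mark xruns, handle
	 *     reposition and timebase changes, bump the cycle counter ... */

	/* re-arm the dependency counters of all targets */
	for (i = 0; i < driver->n_targets; i++) {
		struct node_activation *a = driver->targets[i]->activation;
		a->state.pending = a->state.required;
	}

	/* wake up every node that has no dependencies left */
	for (i = 0; i < driver->n_targets; i++)
		trigger_target(driver->targets[i]);
}
```
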
In our example above, node A and node B have their pending state decremented. Node A's
pending count reaches 0 and it is triggered first (node B has 2 pending dependencies to
start with and will not be triggered yet). The driver itself also has 2 dependencies
left and will not be triggered (complete) yet.

## Scheduling node A

When the eventfd of a node is signaled, we say the node is triggered and it is able
to process data. It consumes the input on its input ports and produces data on its
output ports.

After processing, node A goes through its list of targets and decrements each pending
field (node A has a reference to B and to the driver).

In our example above, the driver is decremented (from 2 to 1) but is not yet triggered.
Node B is decremented (from 1 to 0) and is triggered by writing to its eventfd.

## Scheduling node B

Node B is scheduled and processes the input from node A. It then goes through its list
of targets and decrements the pending fields. It decrements the pending field of the
driver (from 1 to 0) and triggers the driver.

## Scheduling the driver

The graph always completes after the driver is triggered and scheduled. All pending
fields of all the nodes in the target list of the driver are now 0.

The driver calculates some stats about CPU time etc.

# Async scheduling

When a node has the node.async property set to true, it is considered an async
node and is scheduled differently.

Async nodes don't increment the pending counter of their peers and the upstream peers
also don't increment the pending counter of the async node. Only the driver increments
the pending counter of the async node.

This means that async nodes do not depend on any other node and are also not a
dependency for other nodes. It also means that async nodes can be scheduled as
soon as the driver has started the graph.

The completion of an async node does not influence the completion of the graph in
any way. Async nodes are therefore interesting when real-time performance can not
be guaranteed, for example when the processing threads are not running with real-time
priority.

A link between a port of an async node and another port (async or not) is called an
async link and has the link.async=true property.

Because async nodes run concurrently with other nodes, a mechanism must be in place
to avoid concurrent access to the buffer data. This is done by sending a
spa_io_async_buffers io area to the (mixer) ports of an async link. The
spa_io_async_buffers has 2 spa_io_buffers slots.

The driver increments a cycle counter for each cycle that it starts. Output ports
write to the spa_io_async_buffers (cycle+1)&1 slot and input ports read from the
(cycle&1) slot. This way an async node always consumes the output of the previous
cycle and provides data for the next cycle. Async links therefore always add 1 cycle
of latency to the graph.

A special exception is made for the output ports of the driver node. When the driver
is started, the output port buffers are copied to the previous cycle's spa_io_buffers
slot. This way, the async nodes immediately pick up the new data from the driver
source.

Because there are 2 buffers in flight in the spa_io_async_buffers io area, the link
needs to negotiate at least 2 buffers for this to work.

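A port on an async link could select its slot for the current cycle as in the sketch
below. It assumes the two-slot spa_io_async_buffers layout described above; the real
port implementations do more work around this.

```
#include <stdint.h>
#include <spa/node/io.h>

/* Sketch: pick the spa_io_buffers slot for the current graph cycle.
 * "cycle" is the counter that the driver increments every cycle.
 * Assumed layout: spa_io_async_buffers holds two spa_io_buffers slots. */
static struct spa_io_buffers *
async_output_slot(struct spa_io_async_buffers *ab, uint64_t cycle)
{
	/* output ports write one cycle ahead */
	return &ab->buffers[(cycle + 1) & 1];
}

static struct spa_io_buffers *
async_input_slot(struct spa_io_async_buffers *ab, uint64_t cycle)
{
	/* input ports consume what was produced in the previous cycle */
	return &ab->buffers[cycle & 1];
}
```
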
## Example

A, B and C are async nodes with async links between their ports. Each async
link has a spa_io_async_buffers with 2 slots (named 0 and 1 below). All the
slots are empty.

```
 +--------+           +-------+           +-------+
 |   A    |           |   B   |           |   C   |
 |        0 -(   )->  0       0 -(   )->  0       |
 |        1  (   )    1       1  (   )    1       |
 +--------+           +-------+           +-------+
```

cycle 0: A produces a buffer AB0 on the output port in the (cycle+1)&1 slot (1).
         B consumes slot cycle&1 (0) with the empty buffer and produces BC0 in slot 1.
         C consumes slot cycle&1 (0) with the empty buffer.

```
 +--------+           +-------+           +-------+
 |   A    |           |   B   |           |   C   |
 | (AB0)  0 -(   )->  0 (   ) 0 -(   )->  0 (   ) |
 |        1  (AB0)    1       1  (BC0)    1       |
 +--------+           +-------+           +-------+
```

cycle 1: A produces a buffer AB1 on the output port in the (cycle+1)&1 slot (0).
         B consumes slot cycle&1 (1) with buffer AB0 and produces BC1 in slot 0.
         C consumes slot cycle&1 (1) with buffer BC0.

```
 +--------+           +-------+           +-------+
 |   A    |           |   B   |           |   C   |
 | (AB1)  0 -(AB1)->  0 (AB0) 0 -(BC1)->  0 (BC0) |
 |        1  (AB0)    1       1  (BC0)    1       |
 +--------+           +-------+           +-------+
```

cycle 2: A produces a buffer AB2 on the output port in the (cycle+1)&1 slot (1).
         B consumes slot cycle&1 (0) with buffer AB1 and produces BC2 in slot 1.
         C consumes slot cycle&1 (0) with buffer BC1.

```
 +--------+           +-------+           +-------+
 |   A    |           |   B   |           |   C   |
 | (AB2)  0 -(AB1)->  0 (AB1) 0 -(BC1)->  0 (BC1) |
 |        1  (AB2)    1       1  (BC2)    1       |
 +--------+           +-------+           +-------+
```

Each async link adds 1 cycle of latency to the chain. Notice how AB0 from cycle 0
produces BC1 in cycle 1, which arrives in node C in cycle 2.

## Latency reporting

Because the latency is really introduced by the links, the additional cycle of
latency is added when the SPA_PARAM_Latency param is copied between the output and
input ports of a link.

It is possible for a sync node A to be linked to another sync node D and to an
async node B:

```
 +--------+           +-------+
 |   A    |           |   B   |
 | (AB1)  0 -(AB1)->  0 (AB0) 0 ...
 |        1 \ (AB0)   1       1
 +--------+  \        +-------+
              \
               \            +-------+
                \           |   D   |
                 -(AB1)->   0 (AB1) |
                            |       |
                            +-------+
```

The output latency on A's output port is what A reports. When it is copied to the
input port of B, 1 cycle is added; when it is copied to D, nothing is added.

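The latency param is described by struct spa_latency_info, where whole cycles can be
expressed in the min_quantum/max_quantum fields. Propagating the latency over a link
could then be sketched as below; this is illustrative and not the actual PipeWire code.

```
#include <stdbool.h>
#include <spa/param/latency-utils.h>

/* Sketch: copy the SPA_PARAM_Latency info from an output port to the
 * peer input port. An async link adds one extra quantum of latency. */
static void propagate_latency(const struct spa_latency_info *output,
			      struct spa_latency_info *input,
			      bool link_is_async)
{
	*input = *output;
	if (link_is_async) {
		input->min_quantum += 1.0f;   /* async links add 1 cycle */
		input->max_quantum += 1.0f;
	}
}
```
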
# Remote nodes

For remote nodes, the eventfd and the activation record are transferred from the
server to the client.

This means that writing to the remote client eventfd will wake the client directly
without going to the server first.

All remote clients also get the activation and eventfd of the peers and drivers they
are linked to and can directly trigger peers and drivers without going to the
server first.

## Remote driver nodes

Remote drivers start the graph cycle directly without going to the server first.

After they complete (and only when the profiler is active), they will trigger an
extra eventfd to signal the server that the graph completed. This is used by the
server to generate the profiler info.

# Lazy scheduling

Normally, a driver wakes up the graph and all the followers need to process
the data in sync. There are cases where:

1. the follower might not be ready to process the data
2. the driver rate is not ideal, the follower rate is better
3. the driver might not know when new data is available in the follower and
   might wake up the graph too often.

In these cases, the driver and follower roles need to be reversed and a mechanism
needs to be provided so that the follower can let the driver know when it is worth
processing the graph.

To notify the driver that the graph is ready to be processed, (non-driver) nodes can
send a RequestProcess event, which arrives as a RequestProcess command in the driver.
The driver can then decide to run the graph or not.

When the graph is started or partially controlled by RequestProcess events and
commands, we say we have lazy scheduling. The driver is then not always scheduling
according to its own rhythm but also depending on the followers.

We can't just enable lazy scheduling blindly: not every follower will emit
RequestProcess events and not every driver will listen for RequestProcess commands.
Two new node properties are therefore defined:

- node.supports-lazy = 0 | 1 | ...

   0 means lazy scheduling as a driver is not supported
   >0 means lazy scheduling as a driver is supported, with increasing preference

- node.supports-request = 0 | 1 | ...

   0 means request events as a follower are not supported
   >0 means request events as a follower are supported, with increasing preference

Lazy scheduling is only enabled when the driver has the node.supports-lazy property
and at least one follower has the node.supports-request property.

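As a sketch, a lazy driver candidate and a requesting follower could be configured with
properties like the following. PW_KEY_NODE_DRIVER is an existing property key; the
lazy/request properties are given as literal strings here.

```
#include <pipewire/pipewire.h>

/* Sketch: a node that can drive the graph lazily; it listens for
 * RequestProcess commands to decide when to start a cycle. */
static struct pw_properties *make_lazy_driver_props(void)
{
	return pw_properties_new(
			PW_KEY_NODE_DRIVER, "true",
			"node.supports-lazy", "2",
			NULL);
}

/* Sketch: a follower that emits RequestProcess when it has new data. */
static struct pw_properties *make_requesting_props(void)
{
	return pw_properties_new(
			"node.supports-request", "1",
			NULL);
}
```
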
A node can end up as a driver (is_driving()) and lazy scheduling can be enabled
(is_lazy()), which results in the following cases:

driver producer
    -> node.driver = true
    -> is_driving() && !is_lazy()
    -> calls trigger_process() to start the graph

lazy producer
    -> node.driver = true
    -> node.supports-lazy = 1
    -> is_driving() && is_lazy()
    -> listens for RequestProcess and calls trigger_process() to start the graph

requesting producer
    -> node.supports-request = 1
    -> !is_driving() && is_lazy()
    -> emits RequestProcess to suggest starting the graph

follower producer
    -> !is_driving() && !is_lazy()

driver consumer
    -> node.driver = true
    -> is_driving() && !is_lazy()
    -> calls trigger_process() to start the graph

lazy consumer
    -> node.driver = true
    -> node.supports-lazy = 1
    -> is_driving() && is_lazy()
    -> listens for RequestProcess and calls trigger_process() to start the graph

requesting consumer
    -> node.supports-request = 1
    -> !is_driving() && is_lazy()
    -> emits RequestProcess to suggest starting the graph

follower consumer
    -> !is_driving() && !is_lazy()

Some use cases:

1. Screensharing - driver producer, follower consumer
   - The producer starts the graph when a new frame is available.
   - The consumer consumes the new frames.
   -> throttles to the rate of the producer and idles when no frames
      are available.

   producer
    - node.driver = true

   consumer
    - node.driver = false

   -> producer selected as driver, consumer is a simple follower.
      lazy scheduling inactive (no lazy driver or no request follower)

2. headless server - requesting producer, (semi) lazy driver consumer

   - The producer emits RequestProcess when new frames are available.
   - The consumer requests new frames from the producer according to its
     refresh rate when there are RequestProcess commands.
   -> this throttles the framerate to the consumer but idles when there is
      no activity on the producer.

   producer
    - node.driver = true
    - node.supports-request = 1

   consumer
    - node.driver = true
    - node.supports-lazy = 2

   -> consumer is selected as driver (lazy > request)
      lazy scheduling active (1 lazy driver and at least 1 request follower)

3. frame encoder - lazy driver producer, requesting follower consumer

   - The consumer pulls a frame when it is ready to encode the next one.
   - The producer produces the next frame on demand.
   -> throttles the speed to the consumer without idling.

   producer
    - node.driver = true
    - node.supports-lazy = 1

   consumer
    - node.driver = true
    - node.supports-request = 1

   -> producer is selected as driver (lazy <= request)
      lazy scheduling active (1 lazy driver and at least 1 request follower)

*/