/** \page page_scheduling Graph Scheduling
This document explains how the PipeWire graph is scheduled.
Graphs are constructed from linked nodes together with their ports. This
results in a dependency graph between nodes. Special care is taken for
loopback links so that the graph remains acyclic.
# Processing threads
The server (and each client) has two kinds of processing threads:
- A main thread that does all IPC with clients and the server and configures the
nodes in the graph for processing.
- One (or more) data processing threads that only do the data processing.
The data processing threads are given realtime priority and are designed to
run with as little overhead as possible. All of the node resources such as
buffers, I/O areas and metadata will be set up in shared memory before the
node is scheduled to run.
This document describes the processing that happens in the data processing
thread after the main thread has configured it.
# Nodes
Nodes are objects with 0 or more input and output ports.
Each node also has:
- an eventfd to signal the node that it can start processing
- an activation record that lives in shared memory backed by a memfd.
```
     eventfd
   +-^---------+
   | |         |
  in |         | out
   | |         |
   +-v---------+
   activation {
       status:OK,   // bitmask of NEED_DATA, HAVE_DATA or OK
       pending:0,   // number of unsatisfied dependencies needed to be able to run
       required:0   // number of dependencies with other nodes
   }
```
The activation record has the following information:
- processing state and pending dependencies. As long as there are pending dependencies
the node cannot be processed. This is the only relevant information for actually
scheduling the graph and is shown in the above illustration.
- Current status of the node and profiling info (TRIGGERED, AWAKE, FINISHED, and
timestamps of when the node changed state).
- Timing information, mostly for drivers: when the processing started, the time, duration
and rate (quantum), etc.
- Information about repositions (seek) and timebase owners.
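For reference, the scheduling-relevant part of the record can be sketched in C as
follows. The names are illustrative only and modeled on the description above; the
real record (struct pw_node_activation in the PipeWire sources) carries much more
state and lives in shared memory:
```
#include <stdatomic.h>

/* Illustrative sketch, not PipeWire's actual declarations. */
struct activation_state {
    int status;            /* bitmask of NEED_DATA, HAVE_DATA or OK */
    atomic_int pending;    /* unsatisfied dependencies in this cycle */
    int required;          /* total number of dependencies */
};
```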
# Links
When two nodes are linked together, the output node becomes a dependency of the input
node. This means the input node can only start processing when the output node is finished.
This dependency is reflected in the required counter of the activation record. In the
illustration below, B's required field is incremented by 1. The pending field is set to the
required field when the graph is started. Node A keeps a list of all targets (such as B) for
which it is a dependency.
This dependency update is only performed when the link is ready (negotiated) and the nodes
are ready to schedule (runnable).
```
     eventfd                              eventfd
   +-^---------+                        +-^---------+
   | |         |          link          | |         |
  in      A    out -------------------> in      B   out
   | |         |                        | |         |
   +-v---------+                        +-v---------+
   activation {          target         activation {
       status:OK,  -------------------->    status:OK,
       pending:0,                           pending:1,
       required:0                           required:1
   }                                    }
```
Multiple links between A and B will only result in 1 target link between A and B.
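The link bookkeeping can be sketched as follows, reusing the activation_state
sketch from above. Again, the structures and names are illustrative, not PipeWire's
actual API:
```
struct target {
    struct activation_state *state;  /* peer activation record */
    int eventfd;                     /* peer eventfd to signal */
    struct target *next;
};

struct node {
    struct activation_state state;
    int eventfd;
    struct target *targets;          /* nodes this node must release */
};

/* Called when a link is negotiated and both nodes are runnable: the
 * input node gains a dependency, the output node gains a target that
 * it will release after processing. */
static void link_ready(struct node *out, struct node *in, struct target *t)
{
    in->state.required++;
    t->state = &in->state;
    t->eventfd = in->eventfd;
    t->next = out->targets;
    out->targets = t;
}
```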
# Drivers
The graph can only run if there is a driver node that is in some way linked to an
active node.
The driver is special because it will have to initiate the processing in the graph. It
will use a timer or some sort of interrupt from hardware to start the cycle.
Any node can also be a candidate for a driver (when the node.driver property is true).
PipeWire will select the node with the highest priority.driver property as the driver.
Each node is assigned to the driver node it will be scheduled with. Each node holds
a reference to its driver and increments the driver's required field.
When a node is ready to be scheduled, the driver adds the node to its list of targets
and increments the node's required field.
```
     eventfd                              eventfd
   +-^---------+                        +-^---------+
   | |         |          link          | |         |
  in      A    out -------------------> in      B   out
   | |         |                        | |         |
   +-v---------+                        +-v---------+
   activation {          target         activation {
       status:OK,  -------------------->    status:OK,
       pending:0,                           pending:0,
       required:1                           required:2
   }                                    }
      |  ^                                 ^
      |  |                                /  /
      |  |                               /  /
      |  |                              /  /
      |  |                             /  /
      |  |                            /  /
      v  |     /---------------------/  /
   activation {                        /
       status:OK,  <------------------/
       pending:0,
       required:2
   }
   +-^---------+
   | |         |
   |  driver   |
   |           |
   +-v---------+
     eventfd
```
As seen in the illustration above, the driver holds a link to each node it needs to
schedule and each node holds a link to the driver. Some nodes hold a link to other
nodes.
It is possible that the driver is the same as a node in the graph (for example node A)
but conceptually, the links above are still valid.
The driver will then start processing the graph by emitting the ready signal. PipeWire
will then:
- Check the previous cycle. Did it complete? Mark xrun on unfinished nodes.
- Perform reposition requests if any, timebase changes, etc..
- The pending counter of each follower node is set to the required field.
- Update the cycle counter in the driver activation io.
- It then loops over all targets of the driver and atomically decrements the pending
field of each activation record. When the pending field reaches 0, the eventfd is
signaled and the node can be scheduled.
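This wakeup path can be sketched as follows, continuing the illustrative structures
from the sketches above (the real logic lives in the PipeWire data loop):
```
#include <stdint.h>
#include <unistd.h>

/* Remove one pending dependency from a target; satisfying the last
 * dependency signals the target's eventfd so the node can run. */
static void trigger_target(struct target *t)
{
    uint64_t one = 1;
    if (atomic_fetch_sub(&t->state->pending, 1) == 1)
        write(t->eventfd, &one, sizeof(one));
}

/* Driver side of a cycle start: re-arm every target with its
 * dependency count, then release the dependency that the driver
 * itself represents. Targets with no other dependencies left wake
 * up immediately. */
static void driver_start_cycle(struct node *driver)
{
    struct target *t;

    for (t = driver->targets; t != NULL; t = t->next)
        atomic_store(&t->state->pending, t->state->required);
    for (t = driver->targets; t != NULL; t = t->next)
        trigger_target(t);
}
```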
In our example above, nodes A and B have their pending counter decremented. Node A
reaches 0 and is triggered first (node B started with 2 pending dependencies and is
not triggered yet). The driver itself also has 2 dependencies left and is not
triggered (completed) yet.
## Scheduling node A
When the eventfd is signaled on a node, we say the node is triggered and it will be able
to process data. It consumes the input on the input ports and produces more data on the
output ports.
After processing, node A goes through the list of targets and decrements each pending
field (node A has a reference to B and the driver).
In our above example, the driver is decremented (from 2 to 1) but is not yet triggered.
Node B is decremented (from 1 to 0) and is triggered by writing to the eventfd.
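One iteration of a follower's real-time loop then looks roughly like this, building
on the sketches above. node_process() is a hypothetical stand-in for the node's
actual processing:
```
static void node_process(struct node *node);  /* hypothetical: consume inputs,
                                               * produce outputs */

/* Block until all dependencies are satisfied, process, then release
 * every target (downstream peers and the driver). */
static void node_run_cycle(struct node *node)
{
    struct target *t;
    uint64_t count;

    read(node->eventfd, &count, sizeof(count));

    node_process(node);

    for (t = node->targets; t != NULL; t = t->next)
        trigger_target(t);
}
```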
## Scheduling node B
Node B is scheduled and processes the input from node A. It then goes through the list of
targets and decrements the pending fields. It decrements the pending field of the
driver (from 1 to 0) and triggers the driver.
## Scheduling the driver
The graph always completes after the driver is triggered and scheduled. The pending
fields of all the nodes in the target list of the driver are now 0.
The driver then calculates some stats about CPU time etc.
# Async scheduling
When a node has the node.async property set to true, it will be considered an async
node and will be scheduled differently.
Async nodes don't increment the required counter of their peers, and upstream peers
also don't increment the async node's required counter. Only the driver increments the
required counter of the async node.
This means that the async nodes do not depend on any other node and also are not a
dependency for other nodes. This also means that the async nodes can be scheduled as
soon as the driver has started the graph.
The completion of an async node does not influence the completion of the graph in
any way. Async nodes are therefore interesting when real-time performance cannot
be guaranteed, for example when the processing threads are not running with real-time
priority.
A link between a port of an async node and another port (async or not) is called an
async link and will have the link.async=true property.
Because async nodes run concurrently with other nodes, a mechanism must be in place
to avoid concurrent access to the buffer data. This is done by sending a spa_io_async_buffers
I/O area to the (mixer) ports of an async link. The spa_io_async_buffers has 2 spa_io_buffers
slots.
The driver increments a cycle counter for each cycle it starts. Output ports
write to the (cycle+1)&1 slot of the spa_io_async_buffers and input ports read from
the cycle&1 slot. This way an async node always consumes the output of the previous
cycle and provides data for the next cycle. Async links therefore always add 1 cycle
of latency to the graph.
A special exception is made for the output ports of the driver node. When the driver is
started, the output port buffers are copied to the previous cycle's spa_io_buffers slot.
This way, async nodes immediately pick up the new data from the driver source.
Because there are 2 buffers in flight on the spa_io_async_buffers I/O area, the link needs
to negotiate at least 2 buffers for this to work.
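The slot arithmetic can be summarized in a small sketch. It assumes the two-slot
layout of spa_io_async_buffers described above; the helper names are illustrative:
```
#include <stdint.h>
#include <spa/node/io.h>

/* The writer fills the slot the reader will drain in the next cycle;
 * the reader drains the slot written during the previous cycle.
 * Assumes struct spa_io_async_buffers holds two spa_io_buffers slots. */
static struct spa_io_buffers *async_out_slot(struct spa_io_async_buffers *ab,
        uint64_t cycle)
{
    return &ab->buffers[(cycle + 1) & 1];
}

static struct spa_io_buffers *async_in_slot(struct spa_io_async_buffers *ab,
        uint64_t cycle)
{
    return &ab->buffers[cycle & 1];
}
```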
## Example
A, B and C are async nodes with async links between their ports. Each async
link has the spa_io_async_buffers with 2 slots (named 0 and 1 below). All the
slots start out empty.
```
   +--------+               +-------+               +-------+
   |   A    |               |   B   |               |   C   |
   |      0 -(   )->  0         0 -(   )->  0          |
   |      1  (   )    1         1  (   )    1          |
   +--------+               +-------+               +-------+
```
cycle 0: A produces a buffer AB0 on the output port in the (cycle+1)&1 slot (1).
B consumes slot cycle&1 (0) with the empty buffer and produces BC0 in slot 1.
C consumes slot cycle&1 (0) with the empty buffer.
```
   +--------+               +-------+               +-------+
   |   A    |               |   B   |               |   C   |
   | (AB0) 0 -(   )->  0 (   )  0 -(   )->  0 (   )    |
   |      1  (AB0)    1         1  (BC0)    1          |
   +--------+               +-------+               +-------+
```
cycle 1: A produces a buffer AB1 on the output port in the (cycle+1)&1 slot (0).
B consumes slot cycle&1 (1) with buffer AB0 and produces BC1 in slot 0.
C consumes slot cycle&1 (1) with buffer BC0.
```
   +--------+               +-------+               +-------+
   |   A    |               |   B   |               |   C   |
   | (AB1) 0 -(AB1)->  0 (AB0)  0 -(BC1)->  0 (BC0)    |
   |      1  (AB0)    1         1  (BC0)    1          |
   +--------+               +-------+               +-------+
```
cycle 2: A produces a buffer AB2 on the output port in the (cycle+1)&1 slot (1).
B consumes slot cycle&1 (0) with buffer AB1 and produces BC2 in slot 1.
C consumes slot cycle&1 (0) with buffer BC1.
```
   +--------+               +-------+               +-------+
   |   A    |               |   B   |               |   C   |
   | (AB2) 0 -(AB1)->  0 (AB1)  0 -(BC1)->  0 (BC1)    |
   |      1  (AB2)    1         1  (BC2)    1          |
   +--------+               +-------+               +-------+
```
Each async link adds 1 cycle of latency to the chain. Notice how AB0, produced in
cycle 0, leads to BC1 in cycle 1, which arrives at node C in cycle 2.
## Latency reporting
Because the latency is really introduced by the links, the additional cycle of
latency is added when the SPA_PARAM_Latency is copied between the output and
input ports of a link.
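In terms of the parsed latency info, the copy across an async link could be
sketched like this; spa_latency_info comes from spa/param/latency-utils.h and its
quantum fields count whole cycles, the helper name is illustrative:
```
#include <spa/param/latency-utils.h>

/* Account for the one extra cycle of latency that an async link adds. */
static void add_async_link_latency(struct spa_latency_info *info)
{
    info->min_quantum += 1.0f;
    info->max_quantum += 1.0f;
}
```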
It is possible for a sync node A to be linked to another sync node D and an
async node B:
```
   +--------+               +-------+
   |   A    |               |   B   |
   | (AB1) 0 -(AB1)->  0 (AB0)  0 ...
   |      1 \(AB0)    1         1
   +--------+ \             +-------+
               \
                \           +-------+
                 \          |   D   |
                  -(AB1)-> 0 (AB1)  |
                           |        |
                           +--------+
```
The output latency on A's output port is what A reports. When it is copied to the
input port of B, 1 cycle is added; when it is copied to D, nothing is added.
# Remote nodes
For remote nodes, the eventfd and the activation are transferred from the server
to the client.
This means that writing to the remote client eventfd will wake the client directly
without going to the server first.
All remote clients also get the activation and eventfd of the peer and driver they
are linked to and can directly trigger peers and drivers without going to the
server first.
## Remote driver nodes
Remote drivers start the graph cycle directly without going to the server first.
After they complete (and only when the profiler is active), they will trigger an
extra eventfd to signal the server that the graph completed. This is used by the
server to generate the profiler info.
# Lazy scheduling
Normally, a driver will wake up the graph and all the followers need to process
the data in sync. There are cases where:
1. the follower might not be ready to process the data
2. the driver rate is not ideal, the follower rate is better
3. the driver might not know when new data is available in the follower and
might wake up the graph too often.
In these cases, the driver and follower roles need to be reversed and a mechanism
needs to be provided so that the follower can know when it is worth processing the
graph.
To notify when the graph is ready to be processed, (non-driver) nodes can send
a RequestProcess event, which will arrive as a RequestProcess command in the driver.
The driver can then decide to run the graph or not.
When the graph is started or partially controlled by RequestProcess events and
commands, we say we have lazy scheduling. The driver is then not always scheduling
according to its own rhythm but also depends on the followers.
Lazy scheduling cannot simply be enabled unconditionally: it makes no sense when no
follower will emit RequestProcess events or when no driver will listen for
RequestProcess commands. Two new node properties are therefore defined:
- node.supports-lazy = 0 | 1 | ...
  - 0 means lazy scheduling as a driver is not supported
  - >= 1 means lazy scheduling as a driver is supported, with increasing preference
- node.supports-request = 0 | 1 | ...
  - 0 means request events as a follower are not supported
  - >= 1 means request events as a follower are supported, with increasing preference
We can only enable lazy scheduling when both the driver and (at least one) follower
have the node.supports-lazy and node.supports-request properties respectively.
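This rule can be sketched as follows, with illustrative structures holding the
parsed property values (the real decision is made by the server when it assigns
nodes to a driver):
```
#include <stdbool.h>

struct lazy_caps {
    int supports_lazy;      /* parsed node.supports-lazy, 0 = unsupported */
    int supports_request;   /* parsed node.supports-request, 0 = unsupported */
};

/* Lazy scheduling needs a lazy-capable driver and at least one
 * follower that can emit RequestProcess events. */
static bool can_enable_lazy(const struct lazy_caps *driver,
        const struct lazy_caps *followers, int n_followers)
{
    int i;

    if (driver->supports_lazy <= 0)
        return false;
    for (i = 0; i < n_followers; i++)
        if (followers[i].supports_request > 0)
            return true;
    return false;
}
```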
Nodes can end up as a driver (is_driving()) and lazy scheduling can be enabled
(is_lazy()), which results in the following cases:

- driver producer
  - node.driver = true
  - is_driving() && !is_lazy()
  - calls trigger_process() to start the graph
- lazy producer
  - node.driver = true
  - node.supports-lazy = 1
  - is_driving() && is_lazy()
  - listens for RequestProcess commands and calls trigger_process() to start the graph
- requesting producer
  - node.supports-request = 1
  - !is_driving() && is_lazy()
  - emits RequestProcess events to suggest starting the graph
- follower producer
  - !is_driving() && !is_lazy()
- driver consumer
  - node.driver = true
  - is_driving() && !is_lazy()
  - calls trigger_process() to start the graph
- lazy consumer
  - node.driver = true
  - node.supports-lazy = 1
  - is_driving() && is_lazy()
  - listens for RequestProcess commands and calls trigger_process() to start the graph
- requesting consumer
  - node.supports-request = 1
  - !is_driving() && is_lazy()
  - emits RequestProcess events to suggest starting the graph
- follower consumer
  - !is_driving() && !is_lazy()
Some use cases:

1. Screensharing: driver producer, follower consumer.
   - The producer starts the graph when a new frame is available.
   - The consumer consumes the new frames.
   - This throttles to the rate of the producer and idles when no frames
     are available.

   Properties:
   - producer: node.driver = true
   - consumer: node.driver = false

   The producer is selected as driver and the consumer is a simple follower.
   Lazy scheduling is inactive (no lazy driver or no request follower).

2. Headless server: requesting producer, (semi) lazy driver consumer.
   - The producer emits RequestProcess when new frames are available.
   - The consumer requests new frames from the producer according to its
     refresh rate when there are RequestProcess commands.
   - This throttles the framerate to the consumer but idles when there is
     no activity on the producer.

   Properties:
   - producer: node.driver = true, node.supports-request = 1
   - consumer: node.driver = true, node.supports-lazy = 2

   The consumer is selected as driver (lazy > request).
   Lazy scheduling is active (1 lazy driver and at least 1 request follower).

3. Frame encoder: lazy driver producer, requesting follower consumer.
   - The consumer pulls a frame when it is ready to encode the next one.
   - The producer produces the next frame on demand.
   - This throttles the speed to the consumer without idling.

   Properties:
   - producer: node.driver = true, node.supports-lazy = 1
   - consumer: node.driver = true, node.supports-request = 1

   The producer is selected as driver (lazy <= request).
   Lazy scheduling is active (1 lazy driver and at least 1 request follower).
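For completeness, this is roughly how a client could declare the lazy producer from
use case 3 when creating its node. pw_properties_new() and PW_KEY_NODE_DRIVER are the
regular PipeWire property API; the supports-lazy key is spelled out literally as in
the text above:
```
#include <pipewire/pipewire.h>

static struct pw_properties *lazy_producer_props(void)
{
    /* node.driver = true, node.supports-lazy = 1 (use case 3 above) */
    return pw_properties_new(
            PW_KEY_NODE_DRIVER, "true",
            "node.supports-lazy", "1",
            NULL);
}
```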
*/