Extend version 3 protocol with message footers, which are for passing
around global state data that is not addressed to a specific object.
The extension is backward compatible with previous v3 clients, and won't
e.g. result to error spam in logs.
The footer is a single SPA POD, appended after the main message POD.
Because both the protocol message and the message POD record their
length, it's possible to append trailing data. Earlier clients will
ignore any data trailing the message POD.
The footer POD contains a sequence [Id opcode, Struct {...}]*,
so there is room to extend with new opcodes later as necessary.
There are separate marshal/demarshal routines for messages aimed at
resources and proxies.
At the moment, file descriptors may be leaked
due to a malicious/buggy client:
1. If the control messages have been truncated, some file descriptors
may still have been successfully transferred. Currently, seeing
the MSG_CTRUNC bit causes `refill_buffer()` to immediately return
-EPROTO without doing anything with the control messages, which
may contain file descriptors.
2. When there is no truncation, it is still possible that the current
batch of file descriptors causes the total file descriptor count
to go over the maximum number of fds for the given buffer (currently 1024).
In this case, too, `refill_buffer()` immediately returns -EPROTO
without closing the file descriptors that can not be saved.
Fix both of these cases by closing all file descriptors in all
remaining cmsgs when one of the mentioned conditions occur.
This is more complicated than a normal module because we have two
logging topics: mod.protocol-native and conn.protocol-native for wire
messages. Because the latter use spa_debug (through spa_debug_pod) we
need to #define our way around so those too use the right topics.
Note that this removes the previous "connection" category, it is now
"conn.protocol-native" instead.
We can't recover from truncated control data so return a fatal error
that should stop the client. Truncated control data can happen when
there are no more fds available, for example.
See #1305
SPA_MEMBER is misleading, all we're doing here is pointer+offset and a
type-casting the result. Rename to SPA_PTROFF which is more expressive (and
has the same number of characters so we don't need to re-indent).
The branch should be taken if errno is neither EAGAIN,
nor EWOULDBLOCK.
Previously,
if (errno != EAGAIN || errno != EWOULDBLOCK)
would be taken for all values of errno if EAGAIN != EWOULDBLOCK.
(Except for the ones that are filtered out before.)
Fix that by changing `||` to `&&`.
The message structures returned by pw_protocol_native_connection_get_next
point to data that is contained in the buffer of the connection.
The data was invalidated when pw_protocol_native_connection_get_next was
called the next time, which made the connection loop non-reentrant, in
cases where it was re-entered from demarshal callbacks.
Fix this by allocating new buffers when reentering and stashing the old
buffers onto a stack. The returned message structure is also stored on
the stack to make lifetimes to match.
Make sure the hook lists are emptied so that the removed callbacks
are called. The callers should really remove the hook they installed
themselves but this is a last chance to fix things up.
This tool detects and fixes common English spelling mistakes, with
generally very few mistakes.
Here is the command I used to generate this commit. There were a few
changes that had to be done manually, and of course adding the ignore
file:
```shell
codespell -I .codespell-ignore -x .codespell-ignore -w
```
I didn’t add it to the CI, but this would be a good place for it.
dup the fd when added to the outgoing buffer and close it againç
when sent. This ensures the fd remains valid in the buffer. A
quick add/remove of memory before a buffer flush could close the
fd before we can send it and then we get a bad fd and disconnect
the client.
Log an error when we send an error to the client so that we don't need
to log and error anymore.
Improve the error messages when we can
Move some warnings and errors to debug
Check the type of the pod in the message instead. Old versions
should not have 0 there, new versions keep the number of file
descriptors, which should be 0 for the first message.
The proxy API is the one that we would like to expose for applications
and the other API is used internally when implementing modules or
factories.
The current pw_core object is really a context for all objects so
name it that way. It also makes it possible to rename pw_core_proxy
to pw_proxy later.
Make the connection as soon as we create the client. We create it
without file descriptor and then set it when we connect. This
makes it possible to use the connection to queue messages before
we connect.
For flatpaks we need to be able to support older v0 protocol clients.
To handle this we have:
- the connection detects an old client when it receives the first
message. It can do this by checking the sequence number, on old
versions it contains the message size and is never 0, on new
clients the sequence number is 0.
- We add a new signal at the start of the connection with the detected
version number. This installs the right version of the core proxy.
We also move the binding of the client until the hello message is
received. This way we can have a new client connect (portal),
hand over the connection to an old client, which then removes the
client binding again in the hello request with a v0 version.
There are some changes to the passing of fds in v0 vs v3 which need
to investigated some more.
- bump version of our interfaces to 3. This makes it possible to
have v0 and v3 protocol marshal functions.
- Add version number in the proxy. This is mostly automatically done
internally based on the version numbers the library is compiled
with. Where the version number was in the API before, it is now
actually used to look up the right protocol marshal functions. For
Proxies there is usually just 1 version, the current one. It is the
server that will support different versions.
- Add v0 compat marshal functions to convert from and to v0 format.
This has some complications. v0 has a type map it keeps in sync
with the server. For this we have a static type map with mappings
to our own v3 types. Pods are mostly the same except for objects
that used to have arbitrary pods in v0 vs spa_pod_prop in v3. Also
convert between v0 spa_pod_prop and v3 spa_pod_choice.
Formats and commands are also slightly different so handle those
mappings as well.
We only have marshal functions for the server side (resource)
v0 functions.
- Add v0 compatible client-node again. It's a bit tricky to map, v0
client-node basically lets the server to the mixing and teeing
and just does the processing of the internal node.
Pass a message around to make things more extensible later.
Keep fds per message if we ever want to write individual
messages.
Pass number of fds in the message header. We might need this to
close the fds when the proxy is gone.
Don't use special callback in node to receive the results. Instead,
use a generic result callback to receive the result. This makes things
a bit more symetric and generic again because then you can choose how
to match the result to the request and you have a generic way to handle
both the sync and async case. We can then also remove the wait method.
This also makes the remote interface and spa interface to objects very
similar.
Make a helper object to receive and dispatch results. Use this in the
helper for enum_params.
Make device use the same result callbacks.
Remove the done and error callbacks. The error callback is in an
error message. The done callback is replace with spa_pending.
Make enum_params take a callback and data for the results. This allows
us to push the results one after another to the app and avoids ownership
issues of the passed data. We can then extend this to handle the async
case by doing a _wait call with a spa_pending+callback+data that will
be called when the _enum_params returns and async result.
Add a sync method.
All methods can now return SPA_RESULT_IS_ASYNC return values and you
can use spa_node_wait() to register a callback when they complete
with optional extra parameters. This makes it easier to sync and
handle the reply.
Make helper methods to simulate the sync enum_params behaviour for
sync nodes.
Let the transport generate the sequence number for pw_resource_sync()
and pw_proxy_sync(). That way we don't need to keep track of numbers
ourselves and we can match the reply to the request easily.