It's possible, but unlikely, that we've pushed a "pre-apply damage"
job to the renderer thread queue (or that we've pushed it, and the
thread is now working on it) when we shut down a terminal instance.
This is sometimes caught by an assertion in term_destroy(), where we
check that the queue length is 0. Other times, or in release builds, we
might crash in the thread, or in the shutdown logic when freeing the
buffer chains associated with the terminal instance.
Fix by ensuring there's no pre-apply damage operation queued, or
running, before shutting down a terminal instance.
Closes #2263
Foot likes it when the compositor releases buffers immediately, as that
means we only have to re-render the cells that have changed since the
last frame.
For various reasons, not all compositors do this. In that case, foot
is typically forced to switch between two buffers, i.e. to
double-buffer. Each frame then starts by copying the damage from the
previous frame over to the new frame. Only then do we start rendering
the updated cells.
Copying over the previous frame's damage can be slow if the changed
area is large (e.g. when scrolling one or a few lines, or on full
screen updates). It's also done single-threaded. Thus it not only
slows down frame rendering, but also pauses everything else (e.g.
input processing). All in all, it reduces performance and increases
input latency.
But we don't have to wait until it's time to render a frame to copy
over the previous frame's damage. We can do that as soon as the
compositor has released the buffer (for the frame _before_ the
previous frame). And we can do this in a thread.
This frees up foot to continue processing input, and reduces frame
rendering time since we can now start rendering the modified cells
immediately, without first doing a large memcpy(3).
In worst-case scenarios (or perhaps we should consider them best-case
scenarios...), I've seen up to a 10x improvement in frame rendering
times (this obviously does *not* include the time it takes to copy
over the previous frame's damage, since that affects neither input
processing nor frame rendering).
Implemented by adding a callback mechanism to the shm abstraction
layer. Use it for the grid buffers, and kick off a thread that copies
over the previous frame's damage and resets the buffer's age to 0 (so
that foot understands it can start rendering to it immediately when it
later needs to render a frame).
Since we have no certain way of knowing whether a compositor releases
buffers immediately or not, use a bit of heuristics; if we see 10
consecutive non-immediate releases (that is, we reset the counter as
soon as we do see an immediate release), this new "pre-apply damage"
logic is enabled. It can be force-disabled with
tweak.pre-apply-damage=no.
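The heuristic itself is just a counter; something along these lines
(hypothetical names, not foot's actual identifiers):

    #include <stdbool.h>

    #define NON_IMMEDIATE_RELEASE_THRESHOLD 10

    struct release_stats {
        int non_immediate_releases;
        bool pre_apply_damage_enabled;
    };

    static void
    note_buffer_release(struct release_stats *stats, bool was_immediate)
    {
        if (was_immediate) {
            /* Reset the counter as soon as we see an immediate release */
            stats->non_immediate_releases = 0;
            return;
        }

        if (++stats->non_immediate_releases >= NON_IMMEDIATE_RELEASE_THRESHOLD)
            stats->pre_apply_damage_enabled = true;
    }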
We also need to take care to wait for the thread before resetting the
render's "last_buf" pointer (or we'll SEGFAULT in the thread...).
We must also ensure we wait for the thread to finish before we start
rendering a new frame. Under normal circumstances the wait time is 0;
the thread has almost always finished long before we need to render
the next frame. But it _can_ happen.
Closes #2188
When set to 'auto', use 10-bit surfaces if gamma-correct blending is
enabled, and 8-bit surfaces otherwise.
Note that we may still fall back to 8-bit surfaces (without disabling
gamma-correct blending) if the compositor does not support 10-bit
surfaces.
Closes #2082
This implements gamma-correct blending, which mainly affects font
rendering.
The implementation requires compile-time availability of the new
color-management protocol (available in wayland-protocols >= 1.41),
and run-time support for the same in the compositor (specifically, the
EXT_LINEAR TF function and sRGB primaries).
How it works: all colors are decoded from sRGB to linear (using a
lookup table, generated in the exact same way pixman generates its
internal conversion tables) before being used by pixman. The resulting
image buffer is thus in decoded/linear format. We use the
color-management protocol to inform the compositor of this, by tagging
the wayland surfaces with the 'ext_linear' image attribute.
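The decode step boils down to a 256-entry lookup table built from the
standard sRGB transfer function; roughly like this (sketch only; the
exact table width and layout may differ from what pixman and foot use):

    #include <math.h>
    #include <stdint.h>

    static uint16_t srgb_to_linear[256];

    static void
    build_srgb_to_linear_table(void)
    {
        for (int i = 0; i < 256; i++) {
            double s = i / 255.0;

            /* Standard sRGB EOTF: linear segment below the knee, power
             * curve above it */
            double l = s <= 0.04045
                ? s / 12.92
                : pow((s + 0.055) / 1.055, 2.4);

            srgb_to_linear[i] = (uint16_t)(l * 65535.0 + 0.5);
        }
    }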
Sixels: all colors are sRGB internally, and decoded to linear before
being used in any sixels. Thus, the image buffers will contain linear
colors. This is important, since otherwise there would be a
decode/encode penalty every time a sixel is blended to the grid.
Emojis: we require fcft >= 3.2, which adds support for decoding color
glyphs from sRGB. Meaning, the emoji pixman surfaces can be blended
directly to the grid, just like sixels.
Gamma-correct blending is enabled by default *when the compositor
supports it*. There's a new option to explicitly enable/disable it:
gamma-correct-blending=no|yes. If set to 'yes', and the compositor
does not implement the required color-management features, warning
logs are emitted.
There's a loss of precision when storing linear pixels in 8-bit
channels. For this reason, this patch also adds support for 10-bit
surfaces. For now, this is disabled by default since such surfaces
only have 2 bits for alpha. It can be enabled with
tweak.surface-bit-depth=10-bit.
Perhaps, in the future, we can enable it by default if:
* gamma-correct blending is enabled
* the user has not enabled a transparent background
* The toplevel icon is now set to the app-id, unless "overridden" by
OSC-1 or OSC-0.
* Implemented OSC-1
* OSC-0 extended to also set the icon
* Implemented CSI 20 t - report window icon
* Implemented CSI 21 t - report window title
* Implemented CSI 22 ; 1 t - push window icon
* Implemented CSI 23 ; 1 t - pop window icon
* Extended CSI 22/23 ; 0 t to also push/pop the icon
* Verify app-id set by OSC-176 is valid UTF-8
* Verify icon set by OSC-0/1 is valid UTF-8
This adds support for a new OSC escape sequence, OSC 176, which lets
terminal programs tell the terminal the name of the app that is
running. foot then sets the app ID of the toplevel to that ID,
which lets the compositor know which app is running, and typically
sets the appropriate icon, window grouping, ...
See: https://gist.github.com/delthas/d451e2cc1573bb2364839849c7117239
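For reference, a terminal program would emit something like the
following (assuming the usual OSC framing with an ST terminator; see
the gist above for the authoritative definition):

    #include <stdio.h>

    /* Tell the terminal which app is running: ESC ] 176 ; <app-id> ESC \ */
    static void
    set_app_id(const char *app_id)
    {
        printf("\033]176;%s\033\\", app_id);
        fflush(stdout);
    }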
POSIX.1-2008 has marked gettimeofday(2) as obsolete, recommending the
use of clock_gettime(2) instead.
CLOCK_MONOTONIC has been used instead of CLOCK_REALTIME because it is
unaffected by manual changes in the system clock. This makes it better
for our purposes, namely, measuring the difference between two points in
time.
tv_sec has been cast to long in most places, since POSIX does not
define the actual type of time_t.
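The pattern now used is roughly this (illustrative example, not code
from the patch):

    #include <stdio.h>
    #include <time.h>

    int
    main(void)
    {
        struct timespec start, end;

        clock_gettime(CLOCK_MONOTONIC, &start);
        /* ... work being measured ... */
        clock_gettime(CLOCK_MONOTONIC, &end);

        /* Cast tv_sec to long, since POSIX doesn't specify time_t's type */
        long secs = (long)end.tv_sec - (long)start.tv_sec;
        long nsecs = end.tv_nsec - start.tv_nsec;
        if (nsecs < 0) {
            secs--;
            nsecs += 1000000000L;
        }

        printf("elapsed: %ld.%09ld s\n", secs, nsecs);
        return 0;
    }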
* xcursor always set for all pointers
* xcursor sometimes not updated when it should be
* mouse grabbed state wasn't per seat, but global (i.e. "does at least
one seat enable mouse grabbing")
* selection enabled state wasn't per seat
Handle the CSDs and the search box the same way we handle the main
grid; when we need to redraw them, call
render_refresh_{csd,search}(). This sets a flag that is checked after
each FDM iteration. All actual rendering is done there.
This also ties the commits of the Wayland sub-surfaces to the commit
of the main surface.
With a badly behaving client (e.g. 'less' with mouse support enabled),
we can end up with a *lot* of xcursor updates (so many that we flooded
the Wayland socket, before we implemented a blocking wayl_flush()).
Since there's little point in updating the cursor more than once per
frame interval, use frame callbacks to throttle the updates.
This works more or less like normal terminal refreshes:
render_xcursor_set() stores the last terminal (window) that had (and
updated) the cursor.
The renderer's FDM hook checks if we have such a pending terminal set,
and if so, tries to refresh the cursor.
This is done by first checking if we're already waiting for a callback
from a previous cursor update, and if so we do nothing; the callback
will update the cursor for the next frame. If we're *not* already
waiting for a callback, we update the cursor immediately.
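Conceptually, the frame callback throttling looks something like this
(simplified sketch; the struct members and helpers are made-up names,
and the real code runs via the renderer's FDM hook as described above):

    #include <stdbool.h>
    #include <stdint.h>
    #include <wayland-client.h>

    struct seat {
        struct wl_surface *cursor_surface;
        struct wl_callback *xcursor_callback;  /* non-NULL while waiting */
        bool xcursor_pending;
    };

    static void update_xcursor_now(struct seat *seat);  /* hypothetical */

    static void frame_done(void *data, struct wl_callback *cb, uint32_t time);
    static const struct wl_callback_listener frame_listener = {.done = &frame_done};

    static void
    xcursor_refresh(struct seat *seat)
    {
        if (seat->xcursor_callback != NULL) {
            /* Already waiting for a callback from a previous update; the
             * callback will apply this update for the next frame. */
            seat->xcursor_pending = true;
            return;
        }

        update_xcursor_now(seat);
        seat->xcursor_callback = wl_surface_frame(seat->cursor_surface);
        wl_callback_add_listener(seat->xcursor_callback, &frame_listener, seat);
        wl_surface_commit(seat->cursor_surface);
    }

    static void
    frame_done(void *data, struct wl_callback *cb, uint32_t time)
    {
        struct seat *seat = data;

        (void)time;
        wl_callback_destroy(cb);
        seat->xcursor_callback = NULL;

        if (seat->xcursor_pending) {
            seat->xcursor_pending = false;
            xcursor_refresh(seat);
        }
    }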
In some cases, we end up calling render_refresh() multiple times in
the same FDM iteration. This means we will render the first update
immediately, and then set the 'pending' flag, causing the updated
content to be rendered in the next frame.
This can cause flicker, or flashes, since we're presenting one or more
intermediate frames until the final content is shown.
Not to mention that it is inefficient to render multiple frames like
this.
Fix by:
* render_refresh() now only sets a flag in the terminal
* install an FDM hook; this hook loops over all terminals that have the
'refresh_needed' flag set, and executes what render_refresh() _used_ to
do (that is, render immediately if we're not waiting for a frame
callback, otherwise set the 'pending' flag). See the sketch below.
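In sketch form (names are illustrative, not foot's actual identifiers):

    #include <stdbool.h>
    #include <stddef.h>

    struct terminal {
        bool refresh_needed;        /* set by render_refresh() */
        bool render_pending;        /* render when the frame callback fires */
        bool frame_callback_armed;  /* a frame callback is outstanding */
        struct terminal *next;
    };

    extern struct terminal *terminals;        /* hypothetical global list */
    void grid_render(struct terminal *term);  /* hypothetical: render now */

    /* New render_refresh(): only mark the terminal as needing a refresh */
    void
    render_refresh(struct terminal *term)
    {
        term->refresh_needed = true;
    }

    /* FDM hook, run once per event-loop iteration, after all fd handlers */
    void
    fdm_render_hook(void)
    {
        for (struct terminal *term = terminals; term != NULL; term = term->next) {
            if (!term->refresh_needed)
                continue;

            term->refresh_needed = false;

            if (term->frame_callback_armed) {
                /* A frame is already in flight; render the new content
                 * when the frame callback arrives, instead of emitting an
                 * extra, intermediate frame. */
                term->render_pending = true;
            } else {
                grid_render(term);
            }
        }
    }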
Our surface may be on multiple outputs at the same time. In this case,
we use the largest scale factor, and let the compositor downscale on
the "other" output(s).
To avoid having to re-generate glyphs, cache the glyphs.
For now, we only cache ASCII characters, as this allows us to look up
the cache by simply indexing with the character (into a 256-entry
array).
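A minimal sketch of such a cache (hypothetical structure and helper
functions, not foot's actual code):

    /* Direct-indexed cache for ASCII glyphs; 256 entries so any 8-bit
     * value indexes safely. NULL means "not cached yet". */
    struct fcft_glyph;  /* rasterized glyph, from fcft */

    struct glyph_cache {
        const struct fcft_glyph *glyphs[256];
    };

    static const struct fcft_glyph *
    glyph_cache_lookup(const struct glyph_cache *cache, unsigned char c)
    {
        return cache->glyphs[c];
    }

    static void
    glyph_cache_store(struct glyph_cache *cache, unsigned char c,
                      const struct fcft_glyph *glyph)
    {
        cache->glyphs[c] = glyph;
    }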