Reflowing a large scrollback is *slow*. During an interactive resize,
it can easily take long enough that the compositor fills the Wayland
socket with configure events. Eventually, the socket becomes full and
the compositor terminates the connection, causing foot to exit.
This patch is work-in-progress, and the first step towards alleviating
this.
It delays the reflow by:
* Snapshotting (copying) the original grid when an interactive resize
is started.
* While resizing, we apply a simple truncation resize of the
grid (like we handle the alt screen).
* When the resize is done, or paused for ‘resize-delay-ms’, the grid
is reflowed.
TODO: we *must* not allow any changes to the temporary (truncated)
grid during the resize. Any changes to the grid would be lost when the
final reflow is applied. That is, we must completely pause the ptmx
pipe while a resize is in progress.
Future improvements:
The initial copy can be slow. We should be able to avoid it by
rewriting the reflow algorithm to not free anything. This is
complicated by the fact that some resources (e.g. sixel images) are
currently *moved* to the new grid. They’d instead have to be copied.
Before this patch, we would line-wrap the last row, just like any
other row, and then afterwards try to reverse this, by adjusting the
offset and free:ing and NULL:ing the "last row".
The problem with this is if the scrollback is full. In this case, the
row we’re freeing is the first row in the scrollback history. This
means we’ll crash as soon as the viewport is moved to the top of the
scrollback.
The fix is fairly, simple. Skip the post-processing logic, and instead
detect when we’re line-wrapping the last row, and skip the call to
line_wrap().
This way, the last row in the new grid corresponds to the last row in
the old grid.
Do this by using scrollback relative coordinates, and ensure the new
viewport is not larger than (grid_rows - screen_rows), as that would
mean the viewport crosses the scrollback wrap-around.
When the window is resized and we reflow the text, we ended up
inserting an empty row at the bottom.
This happens whenever the actual last row has a hard linebreak (which
almost always is the case); we then end the reflow with a line break,
causing an extra, empty, row to be allocated and inserted.
This patch fixes this by detecting when:
1) the last row is empty
2) the next to last row has a hard line break
In this case, we roll back the last line break, by adjusting the new
offset we just calculated, and free:ing the empty row.
TODO: it would be nice if we could detect this in the reflow loop
instead, and avoid doing the last line break all together. I haven’t
yet been able to find a way to do this correctly.
Closes#1108
When reflowing the grid, we truncate lines with a hard linebreak after
the last non-empty cell. This way we don’t reflow trailing empty cells
to a new line when resizing the window to a smaller size.
However, “logical” lines (i.e. those without a hard linebreak)
are *not* truncated. This is to ensure we don’t trim empty cells in
the middle of a logical line spanning multiple physical lines.
Since newly allocated rows are initialized with linebreak=false, we
need to ensure _those_ are still truncated - otherwise all that empty
space under the current prompt will be reflowed.
Note that this is a temporary workaround. The correct solution, I
think, is to track whether a line has been printed to or not, and
simply ignore (not reflow) lines that haven’t yet been touched.
This patch adds support for the OSC-133;A sequence, introduced by
FinalTerm and implemented by iTerm2, Kitty and more. See
https://iterm2.com/documentation-one-page.html#documentation-escape-codes.html.
The shell emits the OSC just before printing the prompt. This lets the
terminal know where, in the scrollback, there are prompts.
We implement this using a simple boolean in the row struct ("this row
has a prompt"). The prompt marker must be reflowed along with the text
on window resizes.
In an ideal world, erasing, or overwriting the cell where the OSC was
emitted, would remove the prompt mark. Since we don't store this
information in the cell struct, we can't do that. The best we can do
is reset it in erase_line(). This works well enough in the "normal"
screen, when used with a "normal" shell. It doesn't really work in
fullscreen apps, on the alt screen. But that doesn't matter since we
don't support jumping between prompts on the alt screen anyway.
To be able to jump between prompts, two new key bindings have been
added: prompt-prev and prompt-next, bound to ctrl+shift+z and
ctrl+shift+x respectively.
prompt-prev will jump to the previous, not currently visible, prompt,
by moving the viewport, ensuring the prompt is at the top of the
screen.
prompt-next jumps to the next prompt, visible or not. Again, by moving
the viewport to ensure the prompt is at the top of the screen. If
we're at the bottom of the scrollback, the viewport is instead moved
as far down as possible.
Closes#30
When a line in the old grid (the one being reflowed) doesn’t have a
hard linebreak, don’t trim trailing empty cells.
Doing so means we’ll “compress” (remove) empty cells between text
if/when we revert to a larger window size.
The output from neofetch suffers from this; it prints a logo to the
left, and system information to the right. The logo and the system
info column is separated by empty cells (i.e. *not* spaces).
If the window is reduced in size such that the system info is pushed
to a new line, each logo line ends with a number of empty cells. The
next time the window is resized, these empty cells were
ignored (i.e. removed).
That meant that once the window was enlarged again, the system info
column was a) no longer aligned, and b) had been pulled closer to the
logo.
This patch doesn’t special case trailing empty cells when the line
being reflowed doesn’t have a hard linebreak. This means e.g. ‘ls’
output is unaffected.
Closes#1055
These functions convert row numbers between absolute coordinates and
“scrollback relative” coordinates.
Absolute row numbers can be used to index into the grid->rows[] array.
Scrollback relative numbers are ordered with the *oldest* row first,
and the *newest* row last. That is, in these coordinates, row 0 is the
*first* (oldest) row in the scrollback history, and row N is the
*last* (newest) row.
Scrollback relative numbers are used when we need to sort things after
their age, when determining if something has scrolled out, or when
limiting an operation to ensure we don’t go past the scrollback
wrap-around.
POSIX.1-2008 has marked gettimeofday(2) as obsolete, recommending the
use of clock_gettime(2) instead.
CLOCK_MONOTONIC has been used instead of CLOCK_REALTIME because it is
unaffected by manual changes in the system clock. This makes it better
for our purposes, namely, measuring the difference between two points in
time.
tv_sec has been casted to long in most places since POSIX does not
define the actual type of time_t.
Inserting elements into the URI range vector typically triggers a
vector resize. This is done using realloc(). Sometimes this causes the
vector to move, thus invalidating all existing pointers into the
vector.
The URI ranges are now an array, which means we can get the next URI
range by incrementing the pointer.
Instead of checking if range is not NULL, check that it isn’t the
range terminator (which is NULL when we don’t have any ranges, and
the first address *after* the last range otherwise).
grid_row_uri_range_add() was only used while reflowing. In this case,
we know the new URIs being added are always going at the end of the
URI list (since we’re going top-to-bottom, left-to-right).
Thus, we don’t need the insertion logic, and can simply append instead.
At first, an OSC-8 URI range was added when we received the closing
OSC-8 escape (i.e. with an empty URI).
But, this meant that cursor movements while the OSC-8 escape was in
effect wasn’t handled correctly, since we’d add a range that spanned
the cursor movements.
Attempts were made to handle this in the cursor movement functions, by
closing and re-opening the URI.
However, there are too many corner cases to make this a viable
approach. Scrolling is one such example, line-wrapping another.
This patch takes a different approach; emit, or update the URI range
when we print to the grid. This models the intended behavior much more
closely, where an active OSC-8 URI act like any other SGR attribute -
it is applied to all cells printed to, but otherwise have no effect.
To avoid killing performance, this is only done in the “generic”
printer. This means OSC-8 open/close calls must now “switch” the ASCII
printer.
Note that the “fast” printer still needs to *erase* pre-existing OSC-8
URIs.
Closes#816
The old algorithm always created a new URI, followed by (maybe)
removing the existing URI, when an URI needed to be modified.
That is, if e.g. the tail of an URI was being erased, the old
algorithm would create a new URI for the part of the URI that should
remain, and then removed the old URI.
This isn’t very effective. The new algorithm instead identifies all
possible overlap cases, and handles each one differently:
* URI ends *before* erase range starts - continue with the next URI
without further checks
* URI starts *after* the erase range ends - return, we’re done
* Erase range erases the entire URI - remove the URI
* Erase range erases a part in the middle - split the URI
* Erase range erases the head of the URI - adjust the URI’s start
* Erase range erases the tail of the URI - adjust the URI’s end
This function handles erasing of an URI range. That is, a range of the
row is being either erased, or overwritten (from the URI perspective,
these two are the same thing).
We handle both partial overwrites (split up, or truncate URI), as well
as complete overwrites (remove URI).
The previous implementation stored compose chains in a dynamically
allocated array. Adding a chain was easy: resize the array and append
the new chain at the end. Looking up a compose chain given a compose
chain key/index was also easy: just index into the array.
However, searching for a pre-existing chain given a codepoint sequence
was very slow. Since the array wasn’t sorted, we typically had to scan
through the entire array, just to realize that there is no
pre-existing chain, and that we need to add a new one.
Since this happens for *each* codepoint in a grapheme cluster, things
quickly became really slow.
Things were ok:ish as long as the compose chain struct was small, as
that made it possible to hold all the chains in the cache. Once the
number of chains reached a certain point, or when we were forced to
bump maximum number of allowed codepoints in a chain, we started
thrashing the cache and things got much much worse.
So what can we do?
We can’t sort the array, because
a) that would invalidate all existing chain keys in the grid (and
iterating the entire scrollback and updating compose keys is *not* an
option).
b) inserting a chain becomes slow as we need to first find _where_ to
insert it, and then memmove() the rest of the array.
This patch uses a binary search tree to store the chains instead of a
simple array.
The tree is sorted on a “key”, which is the XOR of all codepoints,
truncated to the CELL_COMB_CHARS_HI-CELL_COMB_CHARS_LO range.
The grid now stores CELL_COMB_CHARS_LO+key, instead of
CELL_COMB_CHARS_LO+index.
Since the key is truncated, collisions may occur. This is handled by
incrementing the key by 1.
Lookup is of course slower than before, O(log n) instead of
O(1).
Insertion is slightly slower as well: technically it’s O(log n)
instead of O(1). However, we also need to take into account the
re-allocating the array will occasionally force a full copy of the
array when it cannot simply be growed.
But finding a pre-existing chain is now *much* faster: O(log n)
instead of O(n). In most cases, the first lookup will either
succeed (return a true match), or fail (return NULL). However, since
key collisions are possible, it may also return false matches. This
means we need to verify the contents of the chain before deciding to
use it instead of inserting a new chain. But remember that this
comparison was being done for each and every chain in the previous
implementation.
With lookups being much faster, and in particular, no longer requiring
us to check the chain contents for every singlec chain, we can now use
a dynamically allocated ‘chars’ array in the chain. This was
previously a hardcoded array of 10 chars.
Using a dynamic allocated array means looking in the array is slower,
since we now need two loads: one to load the pointer, and a second to
load _from_ the pointer.
As a result, the base size of a compose chain (i.e. an “empty” chain)
has now been reduced from 48 bytes to 32. A chain with two codepoints
is 40 bytes. This means we have up to 4 codepoints while still using
less, or the same amount, of memory as before.
Furthermore, the Unicode random test (i.e. write random “unicode”
chars) is now **faster** than current master (i.e. before text-shaping
support was added), **with** test-shaping enabled. With text-shaping
disabled, we’re _even_ faster.
Instead of walking the old grid cell-by-cell, and checking for
tracking points, OSC-8 URIs etc on each cell, memcpy() sequences of
cells.
For each row, find the end column, by scanning backward, looking for
the first non-empty cell.
Chunk the row based on tracking point coordinates. If there aren’t any
tracking coordinates, or OSC-8 URIs on the current row, the entire row
is copied in one go.
The chunk of cells is copied to the new grid. We may have to split it
up into multiple copies, since not all cells may fit on the current
“new” row.
Care must also be taken to not line break in the middle of a
multi-column character.
When we resize the alt screen, we don’t reflow the text, we simply
truncate all the lines.
When doing this, make sure we don’t truncate in the middle of a
multi-column character.
Since we know the following:
* URI ranges are sorted
* URI coordinates are unique
* URI ranges don’t cross rows
We can optimize URI range reflowing by:
* Checking if the *first* URI range’s start coordinate is on the
current column. If so, keep a pointer to it.
* Use this pointer as source when instantiating the reflowed URI range
* If we already have a non-NULL range pointer, check its end
coordinate instead.
* If it matches, close the *last* URI range we inserted on the new
row, and free/remove the range from the old row.
* When line breaking, we only need to check if the *last* URI range is
unclosed.