At first, an OSC-8 URI range was added when we received the closing
OSC-8 escape (i.e. with an empty URI).
But this meant that cursor movements while the OSC-8 escape was in
effect weren’t handled correctly, since we’d add a range that spanned
the cursor movements.
Attempts were made to handle this in the cursor movement functions, by
closing and re-opening the URI.
However, there are too many corner cases to make this a viable
approach. Scrolling is one such example, line-wrapping another.
This patch takes a different approach: emit, or update, the URI range
when we print to the grid. This models the intended behavior much more
closely, where an active OSC-8 URI acts like any other SGR attribute -
it is applied to all cells printed to, but otherwise has no effect.
To avoid killing performance, this is only done in the “generic”
printer. This means OSC-8 open/close calls must now “switch” the ASCII
printer.
Note that the “fast” printer still needs to *erase* pre-existing OSC-8
URIs.
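As a rough illustration of the idea (a minimal sketch with assumed
names and a fixed-size range list; foot’s actual data structures
differ): while a URI is active, printing a cell either extends the
previous range or opens a new one.

    #include <stddef.h>
    #include <stdint.h>

    struct uri_range { int start; int end; uint64_t id; };

    struct row_uris {
        struct uri_range ranges[16];   /* fixed size, for the sketch only */
        size_t count;
    };

    /* Called from the generic printer, for each printed cell, while an
     * OSC-8 URI is active */
    static void
    uri_put(struct row_uris *uris, int col, uint64_t active_id)
    {
        if (uris->count > 0) {
            struct uri_range *last = &uris->ranges[uris->count - 1];

            /* Adjacent cell with the same URI: just grow the range */
            if (last->id == active_id && last->end == col - 1) {
                last->end = col;
                return;
            }
        }

        if (uris->count < sizeof(uris->ranges) / sizeof(uris->ranges[0])) {
            uris->ranges[uris->count++] =
                (struct uri_range){.start = col, .end = col, .id = active_id};
        }
    }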
Closes #816
This function handles erasing of a URI range. That is, a range of the
row is being either erased, or overwritten (from the URI perspective,
these two are the same thing).
We handle both partial overwrites (splitting up, or truncating, the
URI) and complete overwrites (removing the URI).
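A minimal sketch of the overwrite handling, under assumed types (the
real code operates on the row’s full list of ranges): erasing the
inclusive span [start, end] either leaves the range alone, removes it,
truncates it at one edge, or splits it in two.

    #include <stdint.h>

    struct uri_range { int start; int end; uint64_t id; };

    enum erase_result { URI_KEEP, URI_REMOVE, URI_SPLIT };

    static enum erase_result
    uri_erase(struct uri_range *r, int start, int end, struct uri_range *tail)
    {
        if (end < r->start || start > r->end)
            return URI_KEEP;                      /* no overlap at all */

        if (start <= r->start && end >= r->end)
            return URI_REMOVE;                    /* completely covered */

        if (start > r->start && end < r->end) {
            /* Hole in the middle: truncate the original, emit a new tail */
            *tail = (struct uri_range){
                .start = end + 1, .end = r->end, .id = r->id};
            r->end = start - 1;
            return URI_SPLIT;
        }

        /* Partial overlap at one edge: truncate */
        if (start <= r->start)
            r->start = end + 1;
        else
            r->end = start - 1;
        return URI_KEEP;
    }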
The previous implementation stored compose chains in a dynamically
allocated array. Adding a chain was easy: resize the array and append
the new chain at the end. Looking up a compose chain given a compose
chain key/index was also easy: just index into the array.
However, searching for a pre-existing chain given a codepoint sequence
was very slow. Since the array wasn’t sorted, we typically had to scan
through the entire array, just to realize that there is no
pre-existing chain, and that we need to add a new one.
Since this happens for *each* codepoint in a grapheme cluster, things
quickly became really slow.
Things were OK-ish as long as the compose chain struct was small, as
that made it possible to hold all the chains in the cache. Once the
number of chains reached a certain point, or when we were forced to
bump the maximum number of allowed codepoints in a chain, we started
thrashing the cache and things got much, much worse.
So what can we do?
We can’t sort the array, because
a) that would invalidate all existing chain keys in the grid (and
iterating the entire scrollback and updating compose keys is *not* an
option).
b) inserting a chain becomes slow as we need to first find _where_ to
insert it, and then memmove() the rest of the array.
This patch uses a binary search tree to store the chains instead of a
simple array.
The tree is sorted on a “key”, which is the XOR of all codepoints,
truncated to the CELL_COMB_CHARS_HI-CELL_COMB_CHARS_LO range.
The grid now stores CELL_COMB_CHARS_LO+key, instead of
CELL_COMB_CHARS_LO+index.
Since the key is truncated, collisions may occur. This is handled by
incrementing the key by 1.
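For illustration, the key derivation could look roughly like this (the
constants are placeholders, not foot’s actual values):

    #include <stddef.h>
    #include <stdint.h>

    /* Placeholder values; foot defines its own reserved range */
    #define CELL_COMB_CHARS_LO 0x40000000u
    #define CELL_COMB_CHARS_HI 0x400fffffu

    static uint32_t
    compose_key(const uint32_t *chars, size_t count)
    {
        uint32_t key = 0;
        for (size_t i = 0; i < count; i++)
            key ^= chars[i];

        /* Truncate to the number of keys the reserved range can hold.
         * On a collision (another chain already owns this key), the key
         * is bumped by 1 and the insert is retried. */
        return key % (CELL_COMB_CHARS_HI - CELL_COMB_CHARS_LO + 1);
    }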
Lookup is of course slower than before, O(log n) instead of
O(1).
Insertion is slightly slower as well: technically it’s O(log n)
instead of O(1). However, we also need to take into account that
re-allocating the array would occasionally force a full copy of the
array, when it could not simply be grown in place.
But finding a pre-existing chain is now *much* faster: O(log n)
instead of O(n). In most cases, the first lookup will either
succeed (return a true match), or fail (return NULL). However, since
key collisions are possible, it may also return false matches. This
means we need to verify the contents of the chain before deciding to
use it instead of inserting a new chain. But remember that this
comparison was being done for each and every chain in the previous
implementation.
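A sketch of how such a verified lookup could work (assumed names and
node layout; a real implementation would also need to report the final
key back to the caller, which is omitted here):

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    struct compose_chain {
        uint32_t key;
        uint32_t count;
        uint32_t *chars;                    /* dynamically allocated */
        struct compose_chain *left, *right;
    };

    static const struct compose_chain *
    chain_lookup(const struct compose_chain *root, uint32_t key,
                 const uint32_t *chars, size_t count)
    {
        while (true) {
            const struct compose_chain *node = root;

            /* O(log n) walk through the binary search tree */
            while (node != NULL && node->key != key)
                node = key < node->key ? node->left : node->right;

            if (node == NULL)
                return NULL;   /* no such key: caller inserts a new chain */

            /* The key is truncated, so it may be a false match: verify */
            if (node->count == count &&
                memcmp(node->chars, chars, count * sizeof(chars[0])) == 0)
            {
                return node;   /* true match: reuse this chain */
            }

            key++;             /* collision: retry with the next key */
        }
    }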
With lookups being much faster, and in particular, no longer requiring
us to check the chain contents for every single chain, we can now use
a dynamically allocated ‘chars’ array in the chain. This was
previously a hardcoded array of 10 chars.
Using a dynamically allocated array means looking into the array is slower,
since we now need two loads: one to load the pointer, and a second to
load _from_ the pointer.
As a result, the base size of a compose chain (i.e. an “empty” chain)
has now been reduced from 48 bytes to 32. A chain with two codepoints
is 40 bytes. This means we can store up to 4 codepoints while still
using the same amount of memory as before, or less.
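An illustrative layout that matches the numbers above on a typical
64-bit ABI (this is a guess at the shape, not foot’s actual
definitions):

    #include <stddef.h>
    #include <stdint.h>

    /* Before: 10 codepoints stored inline */
    struct compose_chain_old {
        uint32_t chars[10];
        size_t count;
    };

    /* After: tree node with a dynamically allocated codepoint array.
     * The base struct is 32 bytes; a chain with two codepoints adds an
     * 8-byte allocation, 40 bytes in total. */
    struct compose_chain_new {
        struct compose_chain_new *left, *right;
        uint32_t *chars;
        uint32_t key;
        uint32_t count;
    };

    _Static_assert(sizeof(struct compose_chain_old) == 48, "48-byte base");
    _Static_assert(sizeof(struct compose_chain_new) == 32, "32-byte base");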
Furthermore, the Unicode random test (i.e. write random “unicode”
chars) is now **faster** than current master (i.e. before text-shaping
support was added), **with** text-shaping enabled. With text-shaping
disabled, we’re _even_ faster.
Alt screen applications normally reflow/readjust themselves on a
window resize.
When we do so too, the result is graphical glitches/flashes, since our
re-flowed text is rendered in one frame, and the application’s
re-flowed text soon thereafter.
We can’t avoid rendering some kind of re-flowed frame, since we don’t
know when, or even if, the application will update itself. To avoid
glitches, we need to render, as closely as possible, what the
application itself will render shortly.
This is actually pretty simple; we just need to copy the visible
content over from the old grid to the new grid. We don’t bother with
text re-flow, but simply truncate long lines.
To simplify things, we cancel any active selection (since, more often
than not, it will be corrupted anyway when the application redraws
itself).
Since we’re not reflowing text, there’s no need to translate e.g. the
cursor position - we just keep the current position (but bounded to
the new dimensions).
Fun thing: ‘less’ gets corrupted if we don’t leave the cursor at
the (new) bottom row. To handle this, we check if the cursor (before
resize) is at the bottom row, and if so, we move it to the new bottom
row.
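A minimal sketch of the approach, with assumed types (error handling
and cell attributes glossed over):

    #include <stdint.h>
    #include <string.h>

    struct cell { uint32_t wc; uint32_t attrs; };

    struct simple_grid {
        int rows, cols;
        struct cell **row;             /* row[r]: array of 'cols' cells */
        int cursor_row, cursor_col;
    };

    static int min(int a, int b) { return a < b ? a : b; }

    static void
    alt_resize(const struct simple_grid *old_grid, struct simple_grid *new_grid)
    {
        int rows = min(old_grid->rows, new_grid->rows);
        int cols = min(old_grid->cols, new_grid->cols);

        /* No reflow: copy what fits, truncating long lines */
        for (int r = 0; r < rows; r++)
            memcpy(new_grid->row[r], old_grid->row[r],
                   cols * sizeof(struct cell));

        /* Keep the cursor position, bounded to the new dimensions... */
        new_grid->cursor_row = min(old_grid->cursor_row, new_grid->rows - 1);
        new_grid->cursor_col = min(old_grid->cursor_col, new_grid->cols - 1);

        /* ...except that a cursor on the old bottom row sticks to the new
         * bottom row, which keeps e.g. 'less' happy */
        if (old_grid->cursor_row == old_grid->rows - 1)
            new_grid->cursor_row = new_grid->rows - 1;
    }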
Closes #221
* match search buffer against scrollback content
* adjust view to ensure matched content is visible
* create selection on a successful match
* finalize selection when user presses enter (to "commit" the search)
* ctrl+r searches for the next match. Needs more work though.
With this assumption, we can replace 'a % b' with 'a & (b - 1)' (the
two are equivalent only when 'b' is a power of two). In terms of
instructions, this means a fast 'and' instead of a slow 'div'.
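In other words (small sketch):

    #include <assert.h>
    #include <stddef.h>

    /* Map an absolute row number to a slot in a ring buffer whose size
     * is a power of two */
    static size_t
    wrap_index(size_t a, size_t b)
    {
        assert(b != 0 && (b & (b - 1)) == 0);   /* b must be a power of two */
        return a & (b - 1);                     /* same as a % b, but no 'div' */
    }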
Further optimize scrolling by:
* not double-initializing empty rows. Previously, grid_row_alloc()
called calloc(), which was then followed by a memset() when
scrolling. This is of course unnecessary.
* Don't loop over the entire set of visible rows (this was done to
ensure all visible rows had been allocated, and to prefetch the cell
contents).
This isn't necessary; only newly pulled-in rows can be NULL. For
now, don't prefetch at all.
The row array may now contain NULL pointers. This means the
corresponding row hasn't yet been allocated and initialized.
On a resize, we explicitly allocate the visible rows.
Uninitialized rows are then allocated the first time they are
referenced.
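A sketch of the lazy allocation, with assumed names (error handling
omitted):

    #include <stdint.h>
    #include <stdlib.h>

    struct cell { uint32_t wc; uint32_t attrs; };
    struct row { struct cell *cells; };

    static struct row *
    grid_row_get(struct row **rows, int idx, int cols)
    {
        if (rows[idx] == NULL) {
            /* First reference: allocate and zero-initialize the row */
            struct row *row = calloc(1, sizeof(*row));
            row->cells = calloc(cols, sizeof(*row->cells));
            rows[idx] = row;
        }
        return rows[idx];
    }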
Can scroll up and down, and stops when the beginning/end of history is
reached.
However, it probably breaks when the entire scrollback buffer has been
filled - we need to detect when the view has wrapped around to the
current terminal offset.
The detection of when we've reached the bottom of the history is also
flawed, and only works when we overshoot the bottom by at least a
page.
Resizing the window while in a view most likely doesn't work.
The view will not detect a wrapped-around scrollback buffer. I.e. if
the user has scrolled back and is stationary at a view, but output is
still being produced, then eventually the scrollback buffer will wrap
around. In this case, the correct thing to do is to make the view
start following the beginning of the history. Right now it doesn't,
meaning that once the scrollback buffer wraps around, you'll start
seeing command output...
The grid is now represented with an array of row *pointers*. Each row
contains an array of cells (the row's columns).
The main point of having row pointers is that we can now move rows
around almost for free.
This is useful when scrolling with scroll margins for example, where
we previously had to copy the lines in the margins. Now it's just a
matter of swapping two pointers.
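For example, scrolling a region up by one line becomes a rotation of
row pointers instead of a copy of cell contents (sketch, with assumed
types):

    struct row;   /* opaque here; holds the actual cells */

    /* Scroll the rows in [region_start, region_end) up by one line */
    static void
    scroll_up_one(struct row **rows, int region_start, int region_end)
    {
        struct row *recycled = rows[region_start];

        /* Shift the row pointers, not the cell data */
        for (int r = region_start; r < region_end - 1; r++)
            rows[r] = rows[r + 1];

        /* Reuse the old top row as the new bottom row (clearing its
         * cells is left out of this sketch) */
        rows[region_end - 1] = recycled;
    }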
Vim, for example, changes the scroll region every time you scroll a
single line. Thus, resetting the damage queue is slow.
This reworks the damage handling of scroll updates:
* Split damage queue into two: one for scroll operations and one for
update/erase operations.
* Don't separate update/erase operations inside/outside the scroll
region
* Store the current scroll region in the scroll damage operation. This
allows us to stack multiple scroll operations with different scroll
regions.
* When adjusting update/erase operations after a scroll operation,
split them if necessary (the current scroll operation may have a
scroll region different from before, thus forcing us to split
existing update/erase operations).
* The renderer no longer erases after a scroll. The scroll operation
also adds an erase operation. This also means that erase operations
are subject to adjustments by later scroll operations.
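An illustrative shape of the damage records (assumed names): a scroll
record carries its own region, so several scroll operations with
different regions can be queued back to back.

    enum damage_type {
        DAMAGE_UPDATE,
        DAMAGE_ERASE,
        DAMAGE_SCROLL,
        DAMAGE_SCROLL_REVERSE,
    };

    struct damage {
        enum damage_type type;
        union {
            /* update/erase: a linear range of cells */
            struct { int start; int length; } range;

            /* scroll: the region in effect, and how many lines */
            struct {
                struct { int start; int end; } region;
                int lines;
            } scroll;
        };
    };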
This is largely untested, but existing scrolling code has been
converted to use a terminal-global scrolling region, defined as the
start and end of the scrollable region.
This is in contrast to the old code, where the scrolling region was
defined in terms of margins, counted in lines from the top and from
the bottom.
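The mapping from the old representation to the new one is
straightforward (sketch):

    struct scroll_region { int start; int end; };   /* [start, end), in rows */

    static struct scroll_region
    region_from_margins(int rows, int top_margin, int bottom_margin)
    {
        return (struct scroll_region){
            .start = top_margin,
            .end = rows - bottom_margin,
        };
    }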
Instead of having each cell in the grid track its own dirtiness, grid
operations now append "damage" to a list.
This list is consumed every time we render the grid.
This allows us to special-case some operations, like erase (and in the
future, scroll).