Commit graph

123 commits

Daniel Eklöf
f27ccd999e
grid: refactor grid_row_uri_range_erase()
The old algorithm always created a new URI, and then (maybe) removed
the existing one, whenever a URI needed to be modified.

That is, if e.g. the tail of a URI was being erased, the old algorithm
would create a new URI for the part that should remain, and then
remove the old URI.

This isn’t very efficient. The new algorithm instead identifies all
possible overlap cases, and handles each one differently (see the
sketch after this list):

* URI ends *before* erase range starts - continue with the next URI
  without further checks
* URI starts *after* the erase range ends - return, we’re done
* Erase range erases the entire URI - remove the URI
* Erase range erases a part in the middle - split the URI
* Erase range erases the head of the URI - adjust the URI’s start
* Erase range erases the tail of the URI - adjust the URI’s end
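
A minimal sketch of what that case analysis could look like; the range
struct, field names and helper functions below are illustrative, not
foot’s actual code:

    /* 'range' is one URI range on the row; [start, end] is the
     * (inclusive) erase range */
    if (range->end < start)
        continue;                      /* URI ends before the erase range */
    if (range->start > end)
        break;                         /* URI starts after it; we're done */

    if (start <= range->start && range->end <= end)
        uri_range_remove(row, range);            /* entire URI erased */
    else if (range->start < start && end < range->end)
        uri_range_split(row, range, start, end); /* hole in the middle */
    else if (range->start >= start)
        range->start = end + 1;        /* head erased: move start forward */
    else
        range->end = start - 1;        /* tail erased: move end backward */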
2021-11-21 18:09:27 +01:00
Daniel Eklöf
503c2ebd50
grid: row_uri_range_erase(): assume caller has checked row->extra != NULL 2021-11-21 18:09:27 +01:00
Daniel Eklöf
1a0de0017f
grid: add grid_row_uri_range_erase()
This function handles erasing a URI range. That is, a range of the
row is either erased or overwritten (from the URI’s perspective, these
two are the same thing).

We handle both partial overwrites (split or truncate the URI) and
complete overwrites (remove the URI).
2021-11-21 18:09:26 +01:00
Daniel Eklöf
fe8ca23cfe
composed: store compose chains in a binary search tree
The previous implementation stored compose chains in a dynamically
allocated array. Adding a chain was easy: resize the array and append
the new chain at the end. Looking up a compose chain given a compose
chain key/index was also easy: just index into the array.

However, searching for a pre-existing chain given a codepoint sequence
was very slow. Since the array wasn’t sorted, we typically had to scan
through the entire array, just to realize that there is no
pre-existing chain, and that we need to add a new one.

Since this happens for *each* codepoint in a grapheme cluster, things
quickly became really slow.

Things were OK-ish as long as the compose chain struct was small, as
that made it possible to hold all the chains in the cache. Once the
number of chains reached a certain point, or when we were forced to
bump the maximum number of allowed codepoints in a chain, we started
thrashing the cache and things got much, much worse.

So what can we do?

We can’t sort the array, because

a) that would invalidate all existing chain keys in the grid (and
iterating the entire scrollback and updating compose keys is *not* an
option).

b) inserting a chain becomes slow as we need to first find _where_ to
insert it, and then memmove() the rest of the array.

This patch uses a binary search tree to store the chains instead of a
simple array.

The tree is sorted on a “key”, which is the XOR of all codepoints,
truncated to the CELL_COMB_CHARS_HI-CELL_COMB_CHARS_LO range.

The grid now stores CELL_COMB_CHARS_LO+key, instead of
CELL_COMB_CHARS_LO+index.

Since the key is truncated, collisions may occur. This is handled by
incrementing the key by 1.
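
A sketch of the key derivation; the constant values and the exact
truncation below are assumptions based on the description above, not
foot’s actual definitions:

    #include <stdint.h>
    #include <stddef.h>

    #define CELL_COMB_CHARS_LO 0x40000000u   /* placeholder values */
    #define CELL_COMB_CHARS_HI 0x400fffffu

    /* XOR all codepoints in the chain, then truncate the result to the
     * number of keys available in the LO..HI range */
    static uint32_t
    chain_key(const uint32_t *codepoints, size_t count)
    {
        uint32_t key = 0;
        for (size_t i = 0; i < count; i++)
            key ^= codepoints[i];
        return key % (CELL_COMB_CHARS_HI - CELL_COMB_CHARS_LO);
    }

    /* on a collision (the key is already taken by a *different* chain),
     * the key is incremented by 1 and the insertion retried */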

Lookup is of course slower than before, O(log n) instead of
O(1).

Insertion is slightly slower as well: technically it’s O(log n)
instead of O(1). However, we also need to take into account that
re-allocating the array would occasionally force a full copy of it,
whenever it could not simply be grown in place.

But finding a pre-existing chain is now *much* faster: O(log n)
instead of O(n). In most cases, the first lookup will either
succeed (return a true match), or fail (return NULL). However, since
key collisions are possible, it may also return false matches. This
means we need to verify the contents of the chain before deciding to
use it instead of inserting a new chain. But remember that this
comparison was being done for each and every chain in the previous
implementation.

With lookups being much faster, and in particular no longer requiring
us to check the chain contents for every single chain, we can now use
a dynamically allocated ‘chars’ array in the chain. This was
previously a hardcoded array of 10 chars.

Using a dynamically allocated array means accessing it is slower,
since we now need two loads: one to load the pointer, and a second to
load _from_ the pointer.

On the other hand, the base size of a compose chain (i.e. an “empty”
chain) has been reduced from 48 bytes to 32. A chain with two
codepoints is 40 bytes. This means we can hold up to 4 codepoints
while still using the same amount of memory as before, or less.
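
One plausible 64-bit layout matching the numbers above; the actual
field names and ordering in foot may differ:

    struct composed {
        struct composed *left, *right;  /* search tree links,    16 bytes */
        uint32_t key;                   /* truncated XOR key,     4 bytes */
        uint8_t count;                  /* number of codepoints,  1 byte  */
        uint32_t *chars;                /* 'count' codepoints,    8 bytes */
    };  /* 32 bytes with padding, plus 4 bytes per codepoint in the
           separately allocated 'chars' array */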

Furthermore, the Unicode random test (i.e. writing random “unicode”
chars) is now **faster** than current master (i.e. before text-shaping
support was added), **with** text-shaping enabled. With text-shaping
disabled, we’re _even_ faster.
2021-06-24 17:30:49 +02:00
Daniel Eklöf
3292bb5b8e
grid: reflow: ‘amount’ has already been added to ‘from’ 2021-06-02 20:13:52 +02:00
Daniel Eklöf
a003e56fdc
grid: reflow: URI range start: take over ownership of URI string
Instead of strdup()ing the URI, take over ownership. This is OK since
the old URI range will be freed anyway.
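
In code, the difference is roughly the following (field names are
illustrative):

    /* before: duplicate the string; the old range later frees its copy */
    new_range->uri = strdup(old_range->uri);

    /* after: steal the pointer; the old range is about to be freed
     * anyway, so just make sure it no longer owns the string */
    new_range->uri = old_range->uri;
    old_range->uri = NULL;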
2021-06-02 20:13:52 +02:00
Daniel Eklöf
8a9643de67
grid: reflow: debug logging 2021-06-02 20:13:51 +02:00
Daniel Eklöf
ceab9b9367
grid: reflow: when determining row end coord, check *last* URI range 2021-06-02 20:13:51 +02:00
Daniel Eklöf
5a08ed641b
grid: reflow: when determining row end coord, check *last* tracking point 2021-06-02 20:13:51 +02:00
Daniel Eklöf
ef1fdc40c8
grid: reflow: check the *entire* row for non-empty cells 2021-06-02 20:13:51 +02:00
Daniel Eklöf
c2314d689e
grid: reflow: avoid unnecessary if-statements before chunking a row 2021-06-02 20:13:51 +02:00
Daniel Eklöf
2029d201b5
grid: disable reflow timing by default 2021-06-02 20:13:51 +02:00
Daniel Eklöf
ac97f20f99
grid: reflow: comments 2021-06-02 20:13:51 +02:00
Daniel Eklöf
5325ea042d
grid: no need to keep the tp_col/uri_col variables around 2021-06-02 20:13:51 +02:00
Daniel Eklöf
3453f091a3
grid: fix col max calculation when row contains URI ranges 2021-06-02 20:13:51 +02:00
Daniel Eklöf
a56b54ad2f
grid: set tp/uri break flags explicitly when we know them to be true 2021-06-02 20:13:51 +02:00
Daniel Eklöf
4b7e4fb885
grid: reflow: slightly simplified logic for end-coordinate calculation 2021-06-02 20:13:51 +02:00
Daniel Eklöf
315865f18c
grid: reflow: rename _range -> range 2021-06-02 20:13:51 +02:00
Daniel Eklöf
7c3a4b24d9
grid: reflow: remove dead code 2021-06-02 20:13:50 +02:00
Daniel Eklöf
40ca86b2d3
grid: reflow: memcpy() chunks of cells, instead of copying cell-by-cell
Instead of walking the old grid cell-by-cell, and checking for
tracking points, OSC-8 URIs etc on each cell, memcpy() sequences of
cells.

For each row, find the end column by scanning backward for the first
non-empty cell.

Chunk the row based on tracking point coordinates. If there aren’t any
tracking points or OSC-8 URIs on the current row, the entire row is
copied in one go.

The chunk of cells is copied to the new grid. We may have to split it
up into multiple copies, since not all cells may fit on the current
“new” row.

Care must also be taken to not line break in the middle of a
multi-column character.
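
A simplified sketch of the chunked copy, with hypothetical types and
names standing in for foot’s actual cell and row structures:

    #include <stdbool.h>
    #include <stdint.h>
    #include <string.h>

    #define CELL_SPACER 0x80000000u              /* placeholder value */

    struct cell { uint32_t cp; };

    static bool
    cell_is_spacer(const struct cell *c)
    {
        return c->cp >= CELL_SPACER;
    }

    /* Copy 'count' cells from 'src' into rows of 'cols' cells each,
     * starting at column *col of row *row, never splitting a
     * multi-column character across a line break. */
    static void
    copy_chunk(struct cell *const *rows, int cols, int *row, int *col,
               const struct cell *src, int count)
    {
        while (count > 0) {
            int space = cols - *col;
            int n = count < space ? count : space;

            /* back off if the cut would land in front of a spacer */
            while (n > 0 && n < count && cell_is_spacer(&src[n]))
                n--;

            if (n == 0) {              /* nothing fits here: wrap first */
                (*row)++;
                *col = 0;
                continue;
            }

            memcpy(&rows[*row][*col], src, n * sizeof(struct cell));
            src += n;
            count -= n;
            *col += n;

            if (*col >= cols) {        /* soft line break */
                (*row)++;
                *col = 0;
            }
        }
    }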
2021-06-02 20:13:50 +02:00
Daniel Eklöf
b7709cc013
grid: don’t cut multi-column chars in half when resizing the alt screen
When we resize the alt screen, we don’t reflow the text; we simply
truncate all the lines.

When doing this, make sure we don’t truncate in the middle of a
multi-column character.
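
A sketch of the boundary check, with illustrative names:

    /* when truncating a row to 'new_cols' columns, check whether the
     * first dropped cell is a spacer; if so, the character it belongs
     * to spans the cut, and its remaining (kept) cells are blanked */
    if (new_cols < old_cols && cell_is_spacer(&row->cells[new_cols])) {
        int col = new_cols - 1;
        while (col > 0 && cell_is_spacer(&row->cells[col]))
            col--;
        for (int i = col; i < new_cols; i++)
            row->cells[i].cp = 0;      /* erase the head and its spacers */
    }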
2021-06-02 19:32:05 +02:00
Daniel Eklöf
9a849b25cc
grid: reflow: uri-ranges: avoid looping URI ranges when reflowing
Since we know the following:

* URI ranges are sorted
* URI coordinates are unique
* URI ranges don’t cross rows

We can optimize URI range reflowing (see the sketch after this list) by:

* Checking if the *first* URI range’s start coordinate is on the
  current column. If so, keep a pointer to it.
* Use this pointer as source when instantiating the reflowed URI range
* If we already have a non-NULL range pointer, check its end
  coordinate instead.
* If it matches, close the *last* URI range we inserted on the new
  row, and free/remove the range from the old row.
* When line breaking, we only need to check if the *last* URI range is
  unclosed.
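
A sketch of the per-cell logic this describes; the helper functions
and range layout are assumptions:

    /* 'open' points at the URI range (on the old row) whose start
     * coordinate has been matched, but whose end has not yet been seen */
    if (open == NULL) {
        if (old_row_uri_count(old_row) > 0 &&
            old_row_first_uri(old_row)->start == old_col) {
            open = old_row_first_uri(old_row);
            new_row_open_uri(new_row, new_col, open->id, open->uri);
        }
    } else if (open->end == old_col) {
        new_row_close_uri(new_row, new_col);  /* close the *last* range */
        old_row_remove_uri(old_row, open);    /* ranges are sorted, so the
                                                 next one becomes "first" */
        open = NULL;
    }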
2021-05-23 10:29:45 +02:00
Daniel Eklöf
7272a5469e
grid: row_add_uri_range: ensure the URIs are sorted 2021-05-22 18:16:54 +02:00
Daniel Eklöf
09eefabf33
grid: disable timing of resize operations 2021-05-17 19:04:50 +02:00
Daniel Eklöf
1aa4a31c6f
grid: reflow: free old rows as soon as we’re done with them
This reduces the memory cost of reflowing text, as we no longer need
to hold both the old and the new grid, in their entirety, in memory at
the same time.
2021-05-17 17:57:41 +02:00
Daniel Eklöf
11c7990ec8
grid: reflow: don’t initialize newly allocated rows
We’re going to write to them immediately anyway. In most cases, *all*
newly allocated, and zero-initialized, cells are overwritten.

So let’s skip zero-initializing the new cells.

There are two cases where we now need to clear cells explicitly (see
the sketch after this list):

* When inserting a hard line break - erase the remaining cells
* When done, the *last* row may not have been completely written -
  erase the remaining cells
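
A sketch of those two clears, using memset() as a stand-in for foot’s
actual cell reset:

    /* 1. hard line break: the rest of the current row is never written */
    memset(&row->cells[col], 0, (cols - col) * sizeof(row->cells[0]));

    /* 2. when reflow is done: the last row may be partially written */
    memset(&last_row->cells[last_col], 0,
           (cols - last_col) * sizeof(last_row->cells[0]));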
2021-05-17 17:57:41 +02:00
Daniel Eklöf
8d1b724056
grid: reflow: qsort_r() is not portable
Replace with qsort() + global variable. Not thread safe!
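
A sketch of the portable replacement; the tracking point type and the
comparison itself are illustrative:

    #include <stdlib.h>

    struct tracking_point { int row; int col; };   /* stand-in type */

    /* qsort() has no user-data parameter, so the comparison context
     * lives in a file-scope variable instead of being passed via
     * qsort_r(). Not thread safe. */
    static int sort_num_rows;

    static int
    tp_cmp(const void *va, const void *vb)
    {
        const struct tracking_point *a = va, *b = vb;
        if (sort_num_rows == 0)
            return 0;                  /* nothing meaningful to compare */
        if (a->row != b->row)
            return a->row - b->row;
        return a->col - b->col;
    }

    static void
    sort_tracking_points(struct tracking_point *tps, size_t count,
                         int num_rows)
    {
        sort_num_rows = num_rows;
        qsort(tps, count, sizeof(tps[0]), &tp_cmp);
    }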
2021-05-15 13:37:46 +02:00
Daniel Eklöf
aa1f589e3f
grid: include <stdlib.h>, for qsort_r() 2021-05-15 13:32:10 +02:00
Daniel Eklöf
c7e51bdf72
grid: reflow: always run qsort_r(), handle rows == 0 in tp_cmp() instead 2021-05-15 13:00:46 +02:00
Daniel Eklöf
528e91aece
grid: take scrollback start into account when sorting the tracking points array
The row numbers in the tracking points are absolute. However, when we
walk the old grid, we do so starting at the beginning of the
scrollback history.

We must ensure the tracking points are sorted such that the *first*
one we see is the “oldest” one, i.e. the one furthest back in the
scrollback history.
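
A sketch of how an absolute row number could be rebased against the
scrollback start for sorting purposes; the parameter names are
assumptions:

    /* convert an absolute row number into a scrollback-relative one,
     * so that sorting by it puts the oldest tracking point first;
     * 'offset' is the index of the first (oldest) row in the scrollback */
    static int
    rebase_row(int abs_row, int offset, int num_rows)
    {
        return (abs_row - offset + num_rows) % num_rows;
    }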
2021-05-15 12:54:59 +02:00
Daniel Eklöf
60a55d04ac
grid: fix 32-bit compilation 2021-05-15 12:11:58 +02:00
Daniel Eklöf
a5d7f2e592
grid: reflow: tag tracking point if-statements with likely/unlikely 2021-05-15 11:44:13 +02:00
Daniel Eklöf
0d6abf1515
grid: reflow: use a sorted array for tracking points
Instead of iterating a linked list of tracking points for *each and
every* cell in the old grid, use a sorted array.

This allows us to step through the array of tracking points as we walk
the old grid; each time we match a tracking point, we move to the next
one.

This means we only have to check a single tracking point for each cell.
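
A sketch of that per-cell check inside the reflow loop, with
illustrative names:

    /* 'next_tp' only ever moves forward through the sorted array */
    if (next_tp < tp_count &&
        tps[next_tp].row == old_row_no && tps[next_tp].col == old_col) {
        tps[next_tp].new_row = new_row_no;  /* record the reflowed position */
        tps[next_tp].new_col = new_col;
        next_tp++;
    }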
2021-05-15 11:40:39 +02:00
Daniel Eklöf
a5ec26ccc9
grid: reflow: no need to check for combining characters
Since we no longer call wcwidth(), we don’t need the base character.
2021-05-15 00:12:51 +02:00
Daniel Eklöf
8e05f42a1c
grid: don’t depend on wcwidth()
Calling wcwidth() on every character in the entire scrollback history
is slow.

We already have the character width encoded in the grid; it’s in the
CELL_SPACERs following a multi-column character.

Thus, when we see a non-SPACER character that isn’t in the last
column, peek at the next cell. If it’s a SPACER, get the current
character’s width from it.

The only thing we need the width for is to be able to insert padding
SPACERs at the right margin, when there isn’t enough space on the
current row for the current character.
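
A sketch of the width lookup, using the CELL_SPACER encoding described
in the commit below; names and the exact encoding check are assumptions:

    /* peek at the next cell: a following CELL_SPACER+N means the
     * current character occupies N more cells */
    int width = 1;
    if (col + 1 < cols && row->cells[col + 1].cp >= CELL_SPACER)
        width = 1 + (int)(row->cells[col + 1].cp - CELL_SPACER);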
2021-05-14 16:32:06 +02:00
Daniel Eklöf
d9e1aefb91
term: rename CELL_MULT_COL_SPACER -> CELL_SPACER, and change its definition
Instead of storing the same value in *all* cells that previously used
CELL_MULT_COL_SPACER, each spacer cell now also encodes the number of
spacers remaining, counting itself. This count is added to the
CELL_SPACER value.

So, a double-width character now stores the character itself in the
first cell (just like before), and CELL_SPACER+1 in the second cell.

A three-cell character stores the character itself, then
CELL_SPACER+2, and finally CELL_SPACER+1.

In other words, the last spacer is always CELL_SPACER+1.

CELL_SPACER+0 is used for padding at the right margin: when writing
e.g. a double-width character in the last column, we insert a
CELL_SPACER+0 pad character, and then write the double-width character
in the first column of the next row.
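
A sketch of writing a ‘width’-column character under this encoding
(illustrative cell layout):

    row->cells[col].cp = cp;           /* the character itself */
    for (int i = 1; i < width; i++)
        row->cells[col + i].cp = CELL_SPACER + (width - i);

    /* width == 2: { cp, CELL_SPACER+1 }
     * width == 3: { cp, CELL_SPACER+2, CELL_SPACER+1 } */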
2021-05-14 14:41:02 +02:00
Daniel Eklöf
5bec83c406
grid: add compile-time define to enable timing of the reflow operation 2021-05-14 14:30:18 +02:00
Daniel Eklöf
792202bf29
grid: snapshot: don’t mark all cells as dirty - copy state from source grid 2021-02-26 09:25:27 +01:00
Daniel Eklöf
ed47a65afc
csi: remove extra ‘;’ 2021-02-26 09:25:17 +01:00
Daniel Eklöf
a0f021b7db
grid: snapshot: copy scroll damage 2021-02-26 09:15:45 +01:00
Daniel Eklöf
2cb624ee43
grid: grid_free(): free scroll damage list 2021-02-26 09:15:45 +01:00
Daniel Eklöf
ae3ec52507
grid: add grid_snapshot()
This function deep copies a grid into a newly *allocated* grid struct.
2021-02-26 09:15:45 +01:00
Daniel Eklöf
bb74fe3f7d
grid: add grid_free() 2021-02-26 09:15:45 +01:00
Daniel Eklöf
8ffa0f731b
grid: reflow_uri_ranges(): URI end point is *inclusive*
This means that, when we match URI start and end points against the
current column index, we must *not* use ‘if ... else if’, but two
separate ‘if’ statements.

Fixes an assertion when resizing a window with a URI range of just
one cell.
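
Roughly, with illustrative helper names:

    if (range->start == col)
        uri_open(new_row, new_col, range);
    if (range->end == col)             /* note: *not* 'else if' */
        uri_close(new_row, new_col);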
2021-02-25 20:54:05 +01:00
Daniel Eklöf
a0b977fcee
grid: refactor: break out allocation of ‘extra’ member to separate function 2021-02-21 20:15:32 +01:00
Daniel Eklöf
d42b129814
grid: refactor: use grid_row_add_uri_range() 2021-02-21 20:15:32 +01:00
Daniel Eklöf
5eea06cff9
grid: add new function grid_row_add_uri_range() 2021-02-21 20:15:32 +01:00
Daniel Eklöf
fd505f2274
grid: resize_without_reflow: allocate ‘extra’ on-demand on ‘new’ rows
Even if we have URI ranges on the old row, all those ranges may lie
outside the new grid’s range.
2021-02-21 20:15:32 +01:00
Daniel Eklöf
8da82c897b
grid: grid_resize_without_reflow: transfer URI ranges 2021-02-21 20:15:32 +01:00
Daniel Eklöf
3ca5a65c33
grid: reflow: translate URI ranges
URI ranges are per row. Translate by detecting URI range start/end
coordinates, and opening and closing a corresponding URI range on the
new grid.

We need to take care when line-wrapping the new grid; here we need to
manually close the still-open URI ranges (on the new grid), and
re-open them on the next row.
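
A sketch of the line-wrap handling; helper names and bookkeeping are
assumptions:

    /* when wrapping to the next row of the new grid */
    bool reopen = false;
    if (row_has_unclosed_uri(new_row)) {
        uri_close(new_row, new_cols - 1);   /* close on the row being left */
        reopen = true;
    }
    new_row = grid_next_row(new_grid);
    new_col = 0;
    if (reopen)
        uri_open(new_row, 0, pending_uri);  /* re-open at column 0 */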
2021-02-21 20:15:32 +01:00