mirrors/foot

mirror of https://codeberg.org/dnkl/foot.git synced 2026-02-05 04:06:08 -05:00

Author	SHA1	Message	Date
Daniel Eklöf	69e2bff8c8	extract: ensure line-based selections are terminated with a newline Closes #869	2022-02-03 17:58:25 +01:00
Daniel Eklöf	fe8ca23cfe	composed: store compose chains in a binary search tree The previous implementation stored compose chains in a dynamically allocated array. Adding a chain was easy: resize the array and append the new chain at the end. Looking up a compose chain given a compose chain key/index was also easy: just index into the array. However, searching for a pre-existing chain given a codepoint sequence was very slow. Since the array wasn’t sorted, we typically had to scan through the entire array, just to realize that there is no pre-existing chain, and that we need to add a new one. Since this happens for each codepoint in a grapheme cluster, things quickly became really slow. Things were ok:ish as long as the compose chain struct was small, as that made it possible to hold all the chains in the cache. Once the number of chains reached a certain point, or when we were forced to bump maximum number of allowed codepoints in a chain, we started thrashing the cache and things got much much worse. So what can we do? We can’t sort the array, because a) that would invalidate all existing chain keys in the grid (and iterating the entire scrollback and updating compose keys is not an option). b) inserting a chain becomes slow as we need to first find _where_ to insert it, and then memmove() the rest of the array. This patch uses a binary search tree to store the chains instead of a simple array. The tree is sorted on a “key”, which is the XOR of all codepoints, truncated to the CELL_COMB_CHARS_HI-CELL_COMB_CHARS_LO range. The grid now stores CELL_COMB_CHARS_LO+key, instead of CELL_COMB_CHARS_LO+index. Since the key is truncated, collisions may occur. This is handled by incrementing the key by 1. Lookup is of course slower than before, O(log n) instead of O(1). Insertion is slightly slower as well: technically it’s O(log n) instead of O(1). 
However, we also need to take into account that re-allocating the array will occasionally force a full copy of the array when it cannot simply be grown. But finding a pre-existing chain is now much faster: O(log n) instead of O(n). In most cases, the first lookup will either succeed (return a true match), or fail (return NULL). However, since key collisions are possible, it may also return false matches. This means we need to verify the contents of the chain before deciding to use it instead of inserting a new chain. But remember that this comparison was being done for each and every chain in the previous implementation. With lookups being much faster, and in particular, no longer requiring us to check the chain contents for every single chain, we can now use a dynamically allocated ‘chars’ array in the chain. This was previously a hardcoded array of 10 chars. Using a dynamically allocated array means looking in the array is slower, since we now need two loads: one to load the pointer, and a second to load _from_ the pointer. As a result, the base size of a compose chain (i.e. an “empty” chain) has now been reduced from 48 bytes to 32. A chain with two codepoints is 40 bytes. This means we have up to 4 codepoints while still using less, or the same amount, of memory as before. Furthermore, the Unicode random test (i.e. write random “unicode” chars) is now faster than current master (i.e. before text-shaping support was added), with text-shaping enabled. With text-shaping disabled, we’re _even_ faster.	2021-06-24 17:30:49 +02:00
Daniel Eklöf	b9ef703eb1	wip: grapheme shaping	2021-06-24 17:30:45 +02:00
Daniel Eklöf	4d56dd430b	extract: consume spaces following a tab That is, we choose to copy the tab character, and not the spaces it represents. Most importantly, we don’t copy both the tab and the spaces.	2021-06-08 19:53:26 +02:00
Daniel Eklöf	a6d9f01c0d	extract: move ‘strip_trailing_empty’ parameter from extract_finish() to extract_begin()	2021-05-17 18:14:10 +02:00
Daniel Eklöf	1bc9fd5fe1	extract: add extract_finish_wide(), and optionally skip stripping trailing empty cells extract_finish() returns the extracted text in UTF-8, while extract_finish_wide() returns the extracted text in Unicode. This patch also adds a new argument to extract_finish{,_wide}, that when set to true, skips stripping trailing empty cells.	2021-05-17 18:14:09 +02:00
Daniel Eklöf	d9e1aefb91	term: rename CELL_MULT_COL_SPACER -> CELL_SPACER, and change its definition Instead of using CELL_SPACER for all cells that previously used CELL_MULT_COL_SPACER, include the remaining number of spacers following, and including, itself. This is encoded by adding to the CELL_SPACER value. So, a double width character will now store the character itself in the first cell (just like before), and CELL_SPACER+1 in the second cell. A three-cell character would store the character itself, then CELL_SPACER+2, and finally CELL_SPACER+1. In other words, the last spacer is always CELL_SPACER+1. CELL_SPACER+0 is used when padding at the right margin. I.e. when writing e.g. a double width character in the last column, we insert a CELL_SPACER+0 pad character, and then write the double width character in the first column on the next row.	2021-05-14 14:41:02 +02:00
Craig Barnes	e56136ce11	debug: rename assert() to xassert(), to avoid clashing with <assert.h>	2021-01-16 20:16:00 +00:00
Daniel Eklöf	3a9172342f	selection: combine enum selection_kind with selection_semantic	2021-01-06 10:53:27 +01:00
Daniel Eklöf	fd5d68c819	extract: finish: increase ‘idx’ when pushing new data, for consistency We don’t write anything more to the buffer after this, but this makes this code consistent with all other code that pushes new data to the buffer. This makes it easier to search, and validate, the ensure_size()+push-data pattern.	2021-01-04 19:48:44 +01:00
Daniel Eklöf	07078da0f0	extract: finish: fix bad assertion - ‘idx’ may be equal to ‘size’ ‘idx’ is where _new_ data should be pushed into the buffer. Thus it is perfectly valid for it to be equal to ‘size’ - it just means we need to allocate more space before pushing data to it.	2021-01-04 19:48:44 +01:00
Daniel Eklöf	f85cf47b65	extract: only emit newlines if followed by non-empty cells Closes #97	2020-08-20 19:22:13 +02:00
Craig Barnes	7a77958ba2	Convert most dynamic allocations to use functions from xmalloc.h	2020-08-08 20:37:57 +01:00
Daniel Eklöf	d4ee9be4d7	config: add 'hide-when-typing' When enabled, the mouse cursor is hidden when the user types in the terminal. It is un-hidden when the user moves the mouse, or when the window loses keyboard focus.	2020-07-31 17:09:06 +02:00
Daniel Eklöf	e5401c845c	extract: finish: allocate buffer before writing the terminator When all cells were empty, we'll have written 0 bytes to the buffer and hence it will still be un-allocated (i.e. NULL).	2020-07-16 17:46:02 +02:00
Daniel Eklöf	e6acafa118	extract: extract_one() sets a fail flag that extract_finish() reads This allows us to safely call extract_finish() when extract_one() failed, and we'll get the expected result; false, indicating extract_finish() failed.	2020-07-15 11:32:40 +02:00
Daniel Eklöf	2539e3cbb2	extract: extract_one: make arguments const	2020-07-15 11:31:38 +02:00
Daniel Eklöf	aafa120f92	selection: refactor: break out text extraction to a separate file	2020-07-15 11:19:18 +02:00
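
The spacer encoding described in commit d9e1aefb91 can be illustrated with a short sketch. This is not foot's code: the numeric value of `CELL_SPACER`, the `put_wide()` helper, and the use of a plain `uint32_t` array as a row are all assumptions made for illustration. Only the counting scheme (last spacer is always `CELL_SPACER+1`, and `CELL_SPACER+0` is the right-margin pad) comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical value; the real constant in foot may differ. */
#define CELL_SPACER 0x40000000u

/*
 * Write a character that occupies `width` cells, starting at column `col`.
 * The first cell holds the character itself. Each following cell holds
 * CELL_SPACER plus the number of spacers remaining, counting itself, so
 * the last spacer is always CELL_SPACER + 1. (CELL_SPACER + 0 is reserved
 * for padding at the right margin and is never written here.)
 * Returns the column following the character.
 */
static size_t
put_wide(uint32_t *row, size_t col, uint32_t ch, int width)
{
    assert(width >= 1);

    row[col++] = ch;
    for (int remaining = width - 1; remaining > 0; remaining--)
        row[col++] = CELL_SPACER + (uint32_t)remaining;
    return col;
}
```

With this layout, code that lands on a spacer cell can subtract `CELL_SPACER` to learn how many cells of the character remain, without scanning backwards for the character itself.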

18 commits
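
The compose-chain lookup described in commit fe8ca23cfe (XOR key, collisions resolved by incrementing the key, contents verified before reuse) might look roughly like the sketch below. It is a minimal illustration under stated assumptions, not the code from that commit: `KEY_COUNT`, `struct chain`, `chain_key()`, `find_by_key()` and `find_or_insert()` are hypothetical names, the key is reduced with a modulo, the increment wraps around at the end of the key range, and the tree is not balanced.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical size of the key range, standing in for
 * CELL_COMB_CHARS_HI - CELL_COMB_CHARS_LO. */
#define KEY_COUNT 0x01000000u

struct chain {
    uint32_t key;               /* the grid would store CELL_COMB_CHARS_LO + key */
    size_t count;               /* number of codepoints in the chain */
    uint32_t *chars;            /* dynamically allocated codepoints */
    struct chain *left, *right;
};

/* Key: XOR of all codepoints, reduced to the key range. */
static uint32_t
chain_key(const uint32_t *chars, size_t count)
{
    uint32_t key = 0;
    for (size_t i = 0; i < count; i++)
        key ^= chars[i];
    return key % KEY_COUNT;
}

/* Plain binary search tree lookup on the key. */
static struct chain *
find_by_key(struct chain *node, uint32_t key)
{
    while (node != NULL && node->key != key)
        node = key < node->key ? node->left : node->right;
    return node;
}

/* Return the existing chain for `chars`, or insert a new one.
 * Returns NULL on allocation failure, empty input, or a full key range. */
static struct chain *
find_or_insert(struct chain **root, const uint32_t *chars, size_t count)
{
    if (count == 0)
        return NULL;

    uint32_t key = chain_key(chars, count);
    uint32_t probes = 0;
    struct chain *c;

    while ((c = find_by_key(*root, key)) != NULL) {
        /* A key match may be a false match: verify the contents. */
        if (c->count == count &&
            memcmp(c->chars, chars, count * sizeof(chars[0])) == 0)
            return c;
        if (++probes == KEY_COUNT)
            return NULL;
        key = (key + 1) % KEY_COUNT;    /* collision: try the next key */
    }

    /* `key` is unused: create the chain and link it into the tree. */
    c = malloc(sizeof(*c));
    if (c == NULL)
        return NULL;
    c->chars = malloc(count * sizeof(chars[0]));
    if (c->chars == NULL) {
        free(c);
        return NULL;
    }
    memcpy(c->chars, chars, count * sizeof(chars[0]));
    c->key = key;
    c->count = count;
    c->left = c->right = NULL;

    struct chain **link = root;
    while (*link != NULL)
        link = key < (*link)->key ? &(*link)->left : &(*link)->right;
    *link = c;
    return c;
}
```

For example, the sequences `{1, 2}` and `{3}` both XOR to 3. Inserting them in that order gives the first chain key 3 and the second key 4, and later lookups of either sequence return the same chain because the contents are compared before a match is accepted. Existing keys never change after insertion, which is the property the commit message needs so that keys already stored in the grid stay valid.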