
mirror of https://codeberg.org/dnkl/foot.git synced 2026-02-05 04:06:08 -05:00

Author	SHA1	Message	Date
Craig Barnes	2a75da4143	Merge branch 'charset-shift-fixes'	2021-06-09 10:18:52 +01:00
Craig Barnes	e030a2ca08	terminal: add 'charset_designator' enum to make code more self-documenting This commit also renames the term_set_single_shift_ascii_printer() function to term_single_shift(), since the former is overly verbose and not really even accurate.	2021-06-09 10:00:25 +01:00
Craig Barnes	a2c9c56f19	vt: fix SS2/SS3 escape sequences to act correctly as single shifts These sequences are supposed to affect the next printable ASCII character and then reset to the previous character set, but before this commit they were behaving like locking shifts.	2021-06-08 21:09:40 +01:00
Craig Barnes	e72e8b1b8e	vt: add support for LS2 and LS3 locking shifts	2021-06-08 21:06:18 +01:00
Daniel Eklöf	9d3351472d	vt: TAB: don’t print a ‘\t’ to the grid if the current cell isn’t empty If the cursor is already at the right edge, our logic that checked for non-empty cells failed; it didn’t check the current cell. Fix by initializing ‘emit_tab_char’ to true/false, depending on the contents of the current cell.	2021-06-08 19:53:26 +02:00
Daniel Eklöf	94b549f93e	vt: emit a tab character if all cells between cursor and tab stop are empty TAB (\t) moves the cursor to the next tab stop. That’s it, according to the specification. However, many terminal emulators try to keep tabs in the grid, to be able to e.g. copy them. That is, copying a text chunk containing tabs should result in tabs being pasted, not spaces. In order to do that, we need to print a tab character to the grid. To improve text reflow of tabs, we also print spaces to the subsequent cells, up until (but not including) the next tab stop. However, we can only do this if all the cells between the cursor and the next tab stop are empty, since (obviously), we cannot overwrite pre-existing characters. Finally, while some fonts render tabs as spaces (i.e. an empty glyph), some use a glyph representing “unprintable” characters, or similar. Thus, we need to exclude cells with tab characters when rendering.	2021-06-08 19:53:26 +02:00
Craig Barnes	620fe8e764	vt: fix buggy chains of ternary expressions in action_esc_dispatch() Only the first character in the chain was being compared with `priv` and the rest were just being evaluated as simple expressions. This was causing the G2 and G3 operations to erroneously use the G1 index. Since the characters are a contiguous range, we can just subtract the start of the range to get the appropriate index. The outer switch statement already ensures the values are in range.	2021-06-08 16:52:00 +01:00
Daniel Eklöf	95c4a8ccfb	vt: \E#8: print ‘E’ using the default attributes	2021-06-07 21:35:17 +02:00
Craig Barnes	f14b294dcc	vt: remove action_utf8_print(term, 0) calls from UTF-8 state handlers These calls appear to be left over from a previous refactoring of the code. Calling this function with `wc == 0` is a no-op.	2021-05-25 21:45:55 +01:00
Craig Barnes	14a55de4e7	vt: remove partial support for 8-bit C1 control chars These are part of the "anywhere" state in Paul Flo Williams' VT parser state diagram[1]. That means that they should be accepted anywhere in a byte sequence, including in the middle of other sequences or even in the middle of a multi-byte UTF-8 sequence. Adhering to this requirement makes them incompatible with the use of UTF-8 as a universal encoding. Not adhering to the aforementioned requirement by making a special case for UTF-8 sequences may seem tempting, but it's much more at odds with the relevant standards[2] than it appears on the surface. UTF-8 is not an "8-bit code", at least not according to the parlance of ECMA-43, nor does it map the C1 control range in a compatible way. [1]: https://vt100.net/emu/dec_ansi_parser [2]: ECMA-35, ECMA-43, ECMA-48	2021-05-25 21:37:38 +01:00
Daniel Eklöf	3405a9c81c	Merge branch 'reflow-performance' Part of #504	2021-05-16 18:48:19 +02:00
Craig Barnes	d37b2a7f7b	Update `term->vt.state` for each iteration of vt_from_slave() loop Otherwise it may be stale when read by the anywhere() function.	2021-05-15 19:20:36 +01:00
Daniel Eklöf	d9e1aefb91	term: rename CELL_MULT_COL_SPACER -> CELL_SPACER, and change its definition Instead of using CELL_SPACER for all cells that previously used CELL_MULT_COL_SPACER, include the remaining number of spacers following, and including, itself. This is encoded by adding to the CELL_SPACER value. So, a double width character will now store the character itself in the first cell (just like before), and CELL_SPACER+1 in the second cell. A three-cell character would store the character itself, then CELL_SPACER+2, and finally CELL_SPACER+1. In other words, the last spacer is always CELL_SPACER+1. CELL_SPACER+0 is used when padding at the right margin. I.e. when writing e.g. a double width character in the last column, we insert a CELL_SPACER+0 pad character, and then write the double width character in the first column on the next row.	2021-05-14 14:41:02 +02:00
Craig Barnes	e4ff8d83d1	vt: make anywhere() function return `term->vt.state` by default Instead of passing a `default_return` parameter, which is always just the current state anyway.	2021-05-13 07:47:32 +01:00
Craig Barnes	8bb69f22b7	vt: clean up handling of "anywhere" actions	2021-05-13 07:47:26 +01:00
Daniel Eklöf	5be2c53d8c	term/vt: only do reverse-wrapping (‘bw’) on cub1 Foot currently does reverse-wrapping (‘auto_left_margin’, or ’bw’) on everything that calls ‘term_cursor_left()’. This is wrong; it should only be done for cub1. From man terminfo: auto_left_margin \| bw \| bw \| cub1 wraps from column 0 to last column This patch moves the reverse-wrapping logic from term_cursor_left() to the handling of BS (backspace). Closes #441	2021-04-08 13:11:58 +02:00
Daniel Eklöf	60b3ccc641	term: runtime switch between a ‘fast’ and a ‘generic’ ASCII print function term_print() is called whenever the client application “prints” something to the grid. It is called for both ASCII and UTF-8 characters, and needs to handle sixels, insert mode and ASCII vs. graphical charsets. Since it’s on the hot path, this becomes unnecessarily slow. This patch adds a “fast” version of term_print(), tailored for the common case: ASCII characters in non-insert mode, without any sixels and non-graphical charsets. A new function, term_update_ascii_printer(), has been added, and must be called whenever: * The currently selected charset index changes * The currently selected charset changes (from ASCII to graphical, or vice versa) * Sixels are added to the grid * Sixels are removed from the grid * Insert mode is enabled/disabled	2021-03-16 08:45:18 +01:00
Daniel Eklöf	cb60ddd090	vt: remove xassert(), that cannot be optimized out, from action_print() action_print() is in the hot path, and having an if-statement here does have an impact on performance. Much more so when that if-statement involves a function call to wcwidth(). Closes #330	2021-02-07 11:14:07 +01:00
Craig Barnes	e56136ce11	debug: rename assert() to xassert(), to avoid clashing with <assert.h>	2021-01-16 20:16:00 +00:00
Craig Barnes	22f25a9e4f	Print stack trace on assert() failure or when calling fatal_error() Note: this uses the __sanitizer_print_stack_trace() function from the AddressSanitizer runtime, so it only works when AddressSanitizer is in use.	2021-01-16 19:56:33 +00:00
Daniel Eklöf	bcf46d9eab	Merge branch 'decset-1047-and-1048'	2021-01-16 15:27:20 +01:00
Daniel Eklöf	bc053e4879	vt: document correct BS behavior, and why we do differently	2021-01-15 18:40:07 +01:00
Daniel Eklöf	bae3c871bb	term/vt/csi: break out cursor save/restore to dedicated functions	2021-01-15 17:08:30 +01:00
Craig Barnes	39b2e46e72	Use wrappers from macros.h instead of bare GCC attributes/pragmas	2021-01-03 08:56:47 +00:00
Daniel Eklöf	69cd5fd3ab	vt: codespell: ony -> only	2020-12-16 15:06:34 +01:00
Daniel Eklöf	2e137c0a7e	vt: don’t ignore extra private/intermediate characters Take ‘\E(#0’ for example - this is not the same as ‘\E(0’. Up until now, foot has however treated them as the same escape, because the handler for ‘\E(0’ didn’t verify there weren’t any _other_ private characters present. Fix this by turning the ‘private’ array into a single 4-byte integer. This allows us to match all privates with a single comparison. Private characters are added to the LSB first, and MSB last. This means we can check for single privates in pretty much the same way as before: switch (term->vt.private) { case ‘?’: ... break; } Checking for two (or more) is much uglier, but foot only supports a single escape with two privates, and no escapes with three or more: switch (term->vt.private) { case 0x243f: /* ‘?$’ */ ... break; } The ‘clear’ action remains simple (and fast), with a single write operation. Collecting privates is potentially _slightly_ more complex than before; we now need to mask and compare, instead of simply comparing, when checking how many privates we already have. We _could_ add a counter, which would make collecting privates easier, but this would add an additional write to the ‘clear’ action which is really bad since it’s in the hot path.	2020-12-16 14:30:49 +01:00
Daniel Eklöf	7c6686221f	bell: optionally render margins in red when receiving BEL Add a new config option, ‘bell=none|set-urgency’. When set to ‘set-urgency’, the margins will be painted in red (if the window did not have keyboard focus). This is intended as a cheap replacement for the ‘urgency’ hint, that doesn’t (yet) exist on Wayland. Closes #157	2020-10-08 19:55:32 +02:00
Daniel Eklöf	377f1b7ad3	vt: BS: only reset lcf if cursor is beyond right margin: don’t move cursor This is needed to make reverse auto-wrap work correctly. Without it, we’ll end up moving the cursor left one cell extra.	2020-10-02 21:40:30 +02:00
Daniel Eklöf	9db78c3122	vt: hide pedantic warnings around the VT state machine's switch cases The switch statements use the GCC extension "case X ... Y", and here it doesn't really make any sense to convert it to "case X: case Y:", so hide the warnings instead.	2020-08-23 10:07:08 +02:00
Craig Barnes	44499bbfe1	Fix spelling mistake in vt.c	2020-08-16 14:37:27 +01:00
Craig Barnes	7a77958ba2	Convert most dynamic allocations to use functions from xmalloc.h	2020-08-08 20:37:57 +01:00
Daniel Eklöf	6f2cffd8c0	vt: never call term_print() with a width <= 0	2020-07-16 08:04:12 +02:00
Daniel Eklöf	9508804b18	vt: ignore 0x7f (DEL) in ground state This ensures all bytes mapped to action_print() have wcwidth == 1. DEL has wcwidth == -1, and would thus have been ignored by term_print() anyway.	2020-07-16 08:01:37 +02:00
Daniel Eklöf	6183f7f64a	vt: utf8: handle multi-column spacer values correctly when combining	2020-07-16 07:41:51 +02:00
Daniel Eklöf	5c99e8013b	term: rename COMB_CHARS_LO,HI -> CELL_COMB_CHARS_LO,HI	2020-07-14 16:41:57 +02:00
Daniel Eklöf	cabcc615c1	vt: change HT (horizontal tab) to not clear LCF According to the specification, HT should clear LCF. However, nearly all emulators do not. In particular, XTerm doesn't. So we follow suit.	2020-07-14 10:47:17 +02:00
Daniel Eklöf	b9719673a1	term: rename term_formfeed() -> term_carriage_return()	2020-07-14 09:29:10 +02:00
Daniel Eklöf	7f7ab00e11	vt: implement C0::FF - processed in the same way as C0::LF	2020-07-14 09:18:52 +02:00
Daniel Eklöf	4849a16f37	vt: process C0::VT the same way we process C0::LF Previously, C0::VT was implemented as a simple 'cursor down'. I.e. it would behave as LF until it reached the bottom of the screen, where instead of scrolling, it became a no-op. See https://vt100.net/docs/vt102-ug/chapter5.html	2020-07-14 09:15:15 +02:00
Daniel Eklöf	7357bb54eb	vt: sort C0's in the switch statement, and use escaped characters where possible	2020-07-14 09:11:17 +02:00
Daniel Eklöf	fb001ee7a7	unicode combining: don't log overflow errors unless LOG_ENABLE_DBG == 1	2020-06-09 17:31:58 +02:00
Daniel Eklöf	97221dd09b	vt: utf8-print: check width == 0 first, when deciding whether to do combining	2020-06-09 17:30:49 +02:00
Daniel Eklöf	9452aff020	vt: initial version of UTF-8 decoding built-in into the VT parser	2020-06-07 16:16:50 +02:00
Daniel Eklöf	d9028b2394	vt: utf8: use mbtowc() instead of mbrtowc() This is slightly faster, since we don't need to initialize an mbstate_t struct (using mbrtowc() with a NULL-pointer for 'ps' also works). Also, avoid a branch by setting wc=0 and then ignoring the result/error code from mbtowc().	2020-05-31 12:41:35 +02:00
Daniel Eklöf	c38b9be6a4	vt: utf8: don't need one entry action for each UTF8 variant	2020-05-31 12:41:07 +02:00
Daniel Eklöf	00df12f1a3	unicode-combine: simplify - remove -Dunicode-precompose option Since the pre-composing functionality is now part of fcft, it makes little sense to have a compile time option - there's no size benefit to be had. Furthermore, virtually all terminal emulators do pre-composing (alacritty being an exception), this really isn't that controversial.	2020-05-10 17:10:33 +02:00
Daniel Eklöf	b1b32152c1	unicode-precompose: use fcft's precompose functionality This allows us more options when determining whether to use a pre-composed character or not: We now only use the pre-composed character if it's from the primary font, or if at least one of the base or combining characters are from a fallback font. I.e. use glyphs from the primary font if possible. But, if one or more of the decomposed glyphs are from a fallback font, use the pre-composed character anyway.	2020-05-09 12:06:11 +02:00
Daniel Eklöf	4d4df92f66	unicode-combining: limit maximum number of allowed composed chains	2020-05-03 11:31:59 +02:00
Daniel Eklöf	1ebdc01162	unicode-combining: detect when we've reached the chain limit We currently store up to 5 combining characters in any given base+combining chain. This adds a check for when that limit is about to be exceeded. When this happens, we log the chain + the new combining character. Since things will break anyway, we simply overwrite the last combining character.	2020-05-03 11:27:06 +02:00
Daniel Eklöf	62e0774319	unicode-combining: store seen combining chains "globally" in the term struct Instead of storing combining data per cell, realize that most combinations are re-occurring and that there's lots of available space left in the unicode range, and store seen base+combining combination chains in a per-terminal array. When we encounter a combining character, we first try to pre-compose, like before. If that fails, we then search for the current base+combining combo in the list of previously seen combinations. If not found there either, we allocate a new combo and add it to the list. Regardless, the result is an index into this array. We store this index, offset by COMB_CHARS_LO=0x40000000ul, in the cell. When rendering, we need to check if the cell character is a plain character, or if it's a composed character (identified by checking if the cell character is >= COMB_CHARS_LO). Then we render the grapheme pretty much like before.	2020-05-03 11:03:22 +02:00
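
Illustration for commit 2e137c0a7e (packing private/intermediate characters into one 4-byte integer). The sketch below is not foot's code. The struct `vt_sketch`, its field `priv`, and the functions `vt_clear()` and `vt_collect()` are hypothetical names chosen for this example; only the LSB-first packing and the 0x243f value for ‘?$’ come from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the parser state; not foot's actual struct. */
struct vt_sketch {
    uint32_t priv; /* up to four private bytes, first one in the LSB */
};

/* The 'clear' action stays a single write. */
static void vt_clear(struct vt_sketch *vt)
{
    vt->priv = 0;
}

/*
 * Collect one private byte into the lowest free byte slot.
 * Finding the free slot needs a mask-and-compare per slot, since
 * there is no separate counter. Returns false if all four slots
 * are already taken.
 */
static bool vt_collect(struct vt_sketch *vt, uint8_t c)
{
    assert(c != 0); /* a zero byte would be indistinguishable from "empty" */

    for (unsigned shift = 0; shift < 32; shift += 8) {
        if ((vt->priv & (UINT32_C(0xff) << shift)) == 0) {
            vt->priv |= (uint32_t)c << shift;
            return true;
        }
    }
    return false;
}

/* Dispatch on the whole integer: extra privates no longer match by accident. */
static int vt_classify(const struct vt_sketch *vt)
{
    switch (vt->priv) {
    case 0:      return 0; /* no privates */
    case '?':    return 1; /* exactly '?' */
    case 0x243f: return 2; /* exactly '?$' ('?' = 0x3f in LSB, '$' = 0x24 above it) */
    default:     return -1; /* any other combination */
    }
}
```

With this layout, collecting ‘(’ and then ‘#’ yields 0x2328, which compares unequal to a lone ‘(’, so ‘\E(#0’ and ‘\E(0’ can be told apart with one comparison.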


213 commits
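
Illustration for commit d9e1aefb91 (CELL_SPACER counting the remaining spacers). This is a minimal sketch under stated assumptions, not foot's implementation: the constant `SKETCH_CELL_SPACER`, its numeric value, and the functions `put_wide()` and `spacers_remaining()` are hypothetical. Only the encoding rule (spacer value = base + number of spacers remaining, including itself; +0 reserved for right-margin padding) is taken from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder base value (assumption); foot's real CELL_SPACER may differ. */
#define SKETCH_CELL_SPACER UINT32_C(0x1fffff00)

/*
 * Write a character occupying 'width' columns at column 'col' of a row
 * with 'cols' cells. The first cell holds the character; each following
 * cell holds SKETCH_CELL_SPACER plus the number of spacers remaining,
 * counting itself, so the last spacer is always SKETCH_CELL_SPACER + 1.
 *
 * Returns the column after the character, or 'col' unchanged if the
 * character does not fit in the row.
 */
static size_t put_wide(uint32_t *row, size_t cols, size_t col,
                       uint32_t wc, unsigned width)
{
    assert(width >= 1);

    if (col >= cols || width > cols - col)
        return col;

    row[col] = wc;
    for (unsigned i = 1; i < width; i++)
        row[col + i] = SKETCH_CELL_SPACER + (width - i);

    return col + width;
}

/*
 * Number of spacers remaining for a spacer cell (0 means a right-margin
 * pad), or -1 if the cell is not a spacer. The upper bound of 255 is an
 * assumption made for this sketch.
 */
static int spacers_remaining(uint32_t cell)
{
    if (cell < SKETCH_CELL_SPACER || cell > SKETCH_CELL_SPACER + 255)
        return -1;
    return (int)(cell - SKETCH_CELL_SPACER);
}
```

When the character does not fit at the right margin, a caller following the commit's description would store SKETCH_CELL_SPACER + 0 in the last column and write the character at column 0 of the next row.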