Since the pre-composing functionality is now part of fcft, it makes
little sense to have a compile time option - there's no size benefit
to be had.
Furthermore, virtually all terminal emulators do
pre-composing (alacritty being an exception), this really isn't that
controversial.
This allows us more options when determining whether to use a
pre-composed character or not:
We now only use the pre-composed character if it's from the primary
font, or if at least one of the base or combining characters are from
a fallback font.
I.e. use glyphs from the primary font if possible. But, if one or more
of the decomposed glyphs are from a fallback font, use the
pre-composed character anyway.
This means command line parsing stops when it encounters the first
nonoption argument.
The result is that one no longer need to use '--' to ensure arguments
are passed to the shell/command, instead of parsed by foot.
That is, instead of
foot -- sh -c true
one can now do
foot sh -c true
Arguments to foot *must* go before the command:
foot --fullscreen sh -c true
We currently store up to 5 combining characters in any given
base+combining chain.
This adds a check for when that limit is about to be exceeded. When
this happens, we log the chain + the new combining character.
Since things will break anyway, we simply overwrite the last combining
character.
Instead of storing combining data per cell, realize that most
combinations are re-occurring and that there's lots of available space
left in the unicode range, and store seen base+combining combinations
chains in a per-terminal array.
When we encounter a combining character, we first try to pre-compose,
like before. If that fails, we then search for the current
base+combining combo in the list of previously seen combinations. If
not found there either, we allocate a new combo and add it to the
list. Regardless, the result is an index into this array. We store
this index, offsetted by COMB_CHARS_LO=0x40000000ul in the cell.
When rendering, we need to check if the cell character is a plain
character, or if it's a composed character (identified by checking if
the cell character is >= COMB_CHARS_LO).
Then we render the grapheme pretty much like before.
We only used utf8proc to try to pre-compose a glyph from a base and
combining character.
We can do this ourselves by using a pre-compiled table of valid
pre-compositions. This table isn't _that_ big, and binary searching it
is fast.
That is, for a very small amount of code, and not too much extra RO
data, we can get rid of the utf8proc dependency.
If the client sent the sequence SAB, where SA does NOT have a composed
representation, but SB does, the old code would compose SB and throw
away A.
This patch fixes this by only allowing a compose if there aren't
any pre-existing combining characters.