unicode-precompose: use fcft's precompose functionality

This gives us more options when deciding whether to use a
pre-composed character or not:

We now only use the pre-composed character if it's from the primary
font, or if at least one of the base or combining characters is from
a fallback font.

I.e. use glyphs from the primary font if possible. But, if one or more
of the decomposed glyphs are from a fallback font, use the
pre-composed character anyway.
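
The rule above can be sketched roughly as follows. This is only an illustration of the decision, not the code from this commit: struct glyph_info and use_precomposed() are hypothetical names, and they do not correspond to fcft's actual types or API. A missing base or combining glyph is treated the same as a fallback glyph here, which is also an assumption.

```c
#include <stdbool.h>

/* Hypothetical description of a looked-up glyph; NOT fcft's real type. */
struct glyph_info {
    bool available;     /* a glyph was found in some font */
    bool from_primary;  /* it came from the primary font, not a fallback */
};

/*
 * Decide whether to render the pre-composed character instead of the
 * base + combining pair.
 */
static bool
use_precomposed(struct glyph_info composed,
                struct glyph_info base,
                struct glyph_info comb)
{
    if (!composed.available)
        return false;   /* nothing to use */

    if (composed.from_primary)
        return true;    /* primary font has the pre-composed glyph */

    /*
     * Pre-composed glyph is from a fallback font: use it anyway if the
     * decomposed form would also need a fallback font for at least one
     * of its parts (assumption: an unavailable part counts as fallback).
     */
    return !base.from_primary || !comb.from_primary;
}
```

With this sketch, a pre-composed glyph from a fallback font is rejected only when both the base and the combining glyph are available in the primary font.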
Daniel Eklöf 2020-05-08 23:36:33 +02:00
parent c090a0664f
commit b1b32152c1
4 changed files with 26 additions and 33878 deletions

@@ -1,33 +0,0 @@
#!/usr/bin/sh
unicodedata_txt="${1}"
output="${2}"
cat <<EOF > "${output}"
#pragma once
#include <wchar.h>
static const struct {
    wchar_t replacement;
    wchar_t base;
    wchar_t comb;
} precompose_table[] = {
EOF
# extract canonical decomposition data from UnicodeData.txt,
# - pad hex values to 5 digits,
# - sort numerically on base character, then combining character,
# - then reduce to 4 digits again where possible
#
# "borrowed" from xterm/unicode/make-precompose.sh
cut "${unicodedata_txt}" -d ";" -f 1,6 |
grep ";[0-9,A-F]" | grep " " |
sed -e "s/ /, 0x/;s/^/{ 0x/;s/;/, 0x/;s/$/},/" |
sed -e "s,0x\(....\)\([^0-9A-Fa-f]\),0x0\1\2,g" |
(sort -k 3 || sort +2) |
sed -e "s,0x0\(...[0-9A-Fa-f]\),0x\1,g" |
sed 's/^/ /' >> "${output}"
echo "};" >> "${output}"
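
For context, a table in the shape this script generated (rows of replacement, base, comb, sorted on base and then on combining character) lends itself to a binary search. The sketch below is an assumption about how such a table could be queried, not the lookup code that was actually used: sample_table, cmp_entry() and precompose() are illustrative names, and the three rows are a hand-picked sample rather than generated output.

```c
#include <stddef.h>
#include <stdlib.h>
#include <wchar.h>

struct precompose_entry {
    wchar_t replacement;
    wchar_t base;
    wchar_t comb;
};

/* Sample rows only, sorted on (base, comb) like the generated table. */
static const struct precompose_entry sample_table[] = {
    {0x00C0, 0x0041, 0x0300},  /* A + combining grave -> U+00C0 */
    {0x00C1, 0x0041, 0x0301},  /* A + combining acute -> U+00C1 */
    {0x00E9, 0x0065, 0x0301},  /* e + combining acute -> U+00E9 */
};

/* Order entries on base first, then on combining character. */
static int
cmp_entry(const void *a, const void *b)
{
    const struct precompose_entry *key = a;
    const struct precompose_entry *elem = b;

    if (key->base != elem->base)
        return key->base < elem->base ? -1 : 1;
    if (key->comb != elem->comb)
        return key->comb < elem->comb ? -1 : 1;
    return 0;
}

/* Returns the pre-composed character, or 0 if the pair is not in the table. */
static wchar_t
precompose(wchar_t base, wchar_t comb)
{
    const struct precompose_entry key = {.base = base, .comb = comb};
    const struct precompose_entry *hit = bsearch(
        &key, sample_table,
        sizeof(sample_table) / sizeof(sample_table[0]),
        sizeof(sample_table[0]), &cmp_entry);

    return hit != NULL ? hit->replacement : 0;
}
```

The search only works because the rows are sorted on the same key the comparison function uses, which is why the script pads the hex values before sorting.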