Using Python's own PRNG should make the code cleaner and allows for
reproducible stimulus files, if desired, by setting --seed. This only
holds for the same version of the script: changing the kind and/or
order of the random calls in the future will of course change the
output.
I did the following substitutions:
* rand.read(1)[0] % n and struct.unpack('@H', rand.read(2))[0] % n →
random.randrange(n)
* rand.read(1)[0] → random.randrange(256)
* rand.read(n) → [random.randrange(256) for _ in range(n)]
(random.randbytes(n) would have been a better alternative, but it is
only available in Python >= 3.9; switching to it in the future will
change the output)
* list[rand.read(1)[0] % len(list)] → random.choice(list)
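As a rough illustration of why seeding matters, the sketch below uses
the same random-module calls as the substitutions above. The helper
names (random_bytes, sample) and the seed value are hypothetical and
not taken from the script; how --seed is parsed is not shown.

```python
import random


def random_bytes(n):
    """Return n random byte values (0..255) as a list of ints."""
    return [random.randrange(256) for _ in range(n)]


def sample(seed, choices, n):
    """Seed the module-level PRNG, then make a fixed sequence of calls.

    Hypothetical helper: the real script would seed once, e.g. from the
    value passed via --seed, and then generate the stimulus file.
    """
    random.seed(seed)
    return random.randrange(n), random_bytes(4), random.choice(choices)


# Same seed and same order of calls give the same values.
first = sample(1234, ["a", "b", "c"], 10)
second = sample(1234, ["a", "b", "c"], 10)
print(first == second)
```

Reordering the three calls inside sample(), or replacing one of them
with a different kind of call, would yield different values for the
same seed, which is the caveat mentioned above.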
This gives us more options when deciding whether to use a
pre-composed character or not: we now only use the pre-composed
character if it is from the primary font, or if at least one of the
base or combining characters is from a fallback font.
That is, use glyphs from the primary font if possible. However, if one
or more of the decomposed glyphs are from a fallback font, use the
pre-composed character anyway.
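The rule can be sketched as a small predicate. This is only an
illustration in Python: the function name and its three boolean
parameters are hypothetical, and the actual code presumably asks the
font objects which font each glyph was loaded from.

```python
def use_precomposed(composed_in_primary, base_in_primary, comb_in_primary):
    """Decide whether to render the pre-composed character.

    Each argument says whether the corresponding glyph comes from the
    primary font (True) or from a fallback font (False).
    """
    if composed_in_primary:
        # Pre-composed glyph is in the primary font: always use it.
        return True
    # Pre-composed glyph is from a fallback font. Use it only if the
    # decomposed form would need a fallback font as well.
    return not (base_in_primary and comb_in_primary)


# Base and combining glyphs both in the primary font, pre-composed
# glyph only in a fallback font: keep the decomposed form.
print(use_precomposed(False, True, True))
```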
We only used utf8proc to try to pre-compose a glyph from a base and
combining character.
We can do this ourselves by using a pre-compiled table of valid
pre-compositions. This table isn't _that_ big, and binary searching it
is fast.
That is, for a very small amount of code, and not too much extra
read-only data, we can get rid of the utf8proc dependency.
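A minimal sketch of the table lookup, written in Python for brevity.
The table layout (tuples of base, combining and pre-composed code
points, sorted by base and combining) and all names are assumptions;
the five entries are only a tiny excerpt of what a generated table
would contain.

```python
from bisect import bisect_left

# (base, combining, pre-composed), sorted by (base, combining).
# Illustrative excerpt only; a real table would be generated from the
# Unicode data files.
PRECOMPOSED = [
    (0x0041, 0x0300, 0x00C0),  # A + combining grave     -> À
    (0x0041, 0x0301, 0x00C1),  # A + combining acute     -> Á
    (0x0065, 0x0301, 0x00E9),  # e + combining acute     -> é
    (0x006F, 0x0308, 0x00F6),  # o + combining diaeresis -> ö
    (0x0075, 0x0308, 0x00FC),  # u + combining diaeresis -> ü
]

# Search keys, in the same order as the table.
KEYS = [(base, comb) for base, comb, _ in PRECOMPOSED]


def precompose(base, comb):
    """Return the pre-composed code point, or None if the pair is not
    in the table."""
    i = bisect_left(KEYS, (base, comb))
    if i < len(KEYS) and KEYS[i] == (base, comb):
        return PRECOMPOSED[i][2]
    return None


print(hex(precompose(0x0065, 0x0301)))
```

Each lookup takes O(log n) comparisons, so even a table with a few
thousand entries needs only a dozen or so steps per query.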