Commit 859b4c89 (url-mode: wip: more work on regex matching,
2025-01-30) regressed URL detection in weechat. Some of the URLs
still work but others don't. This is because regexec() stops at the
first NUL, thus skipping the rest of the line. weechat seems create
NUL cells between their UI widgets.
Work around this by replacing NUL with space. This is probably correct
because selecting and copying those cells also translates to space
(not sure where in the code).
When auto-matching URLs (or custom regular expressions), use the
first *subexpression* as URL, rather than the while regex match.
This allows us to write custom regular expressions with prefix/suffix
strings that should not be included in the presented match.
Users can now define their own regex patterns, and use them via key
bindings:
[regex:foo]
regex=foo(bar)?
launch=path-to-script-or-application {match}
[key-bindings]
regex-launch=[foo] Control+Shift+q
regex-copy=[foo] Control+Mod1+Shift+q
That is, add a section called 'regex:', followed by an
identifier. Define a regex and a launcher command line.
Add a key-binding, regex-launch and/or regex-copy (similar to
show-urls-launch and show-urls-copy), and connect them to the regex
with the "[regex-name]" syntax (similar to how the pipe-* bindings
work).
That is, try to match e.g. Control+shift+a, before trying to match
Control+A.
In most cases, order doesn't matter. There are however a couple of
symbols where the layout consumes the shift-modifier, and the
generated symbol is the same in both the shifted and unshifted
form. One such example is backspace.
Before this patch, key-bindings with shift-backspace would be ignored,
if there were another key-binding with backspace.
So, for example, if we had one key-binding with Control+Backspace, and
another with Control+Shift+Backspace, the latter would never trigger,
as we would always match the first one.
By checking for unshifted matches first, we ensure
Control+Shift+Backspace does match.
When matching the unshifted symbol, or the raw key code, ignore all
key bindings that don't have any modifiers.
This fixes an issue where it was impossible to enter (some of the)
numbers on the keypad, **if** there was a key-binding for
e.g. KP_Page_Up, or KP_Page_Down.
When trying to match key bindings, we do three types of matching:
* Match the _translated_ symbol (e.g. Control+C)
* Match the _untranslated_ symbol (e.g. Control+Shift+c)
* Match raw keyboard codes
This was done for *each* key binding. This meant we sometimes matched
a keybinding in raw mode, even though there was a
translated/untranslated binding that would match it too. All depending
on the internal order of the key binding list.
This patch changes it, so that we first try all bindings in translated
mode, then all bindings in untranslated mode, and finally all bindings
in raw mode.
Closes#1929
For the purpose of matching key bindings, "significant" modifiers are
no more.
We're really only interested in filtering out "locked"
modifiers. We're already doing this, so there's no need to *also*
match against a set of "significant" modifiers.
Furthermore, we *never* want to consider locked keys (e.g. when
emitting escapes to the client application), thus we can filter those
out already when retrieving the set of active modifiers.
The exception is the kitty keyboard protocol, which has support for
CapsLock and NumLock. Since we're already re-retrieving the "consumed"
modifiers (using the GTK style, rather than normal "XKB" style, to
better match the kitty terminal), we might as well re-retrieve the
effective modifiers as well.
Before this patch, the file://<host> prefix was stripped from URIs,
when the hostname matched the current host (that is, for "local"
URLs).
Unfortunately, the way this was done caused other parts of the URI to
be stripped as well. For example, the 'query' and 'fragment' parts.
This patch simply removes all special casing of file:// URIs.
Since the URL is passed to a generic opener (i.e. we don't have a
special opener application for file:// URIs), the opener helper must
handle the file:// prefix anyway.
Closes#1474
* Skip spacer cells. This fixes an issue where characters following a
double-width character weren't detect properly.
* Unpack grapheme clusters (i.e. cells with multiple codepoints), and
iterate all their codepoints.
Closes#1465
When unmapping a sub-surface, Sway <= 1.8 does not damage the surface
beneath the sub-surface.
https://github.com/swaywm/sway/issues/6960
The workaround is to manually damage the main surface. Previously,
this was done when exiting scrollback search, and after the ‘flash’
OSC. But other sub-surfaces, that may also be unmapped, did not.
This patch adds a quirk handler that does this, and calls it when:
* Exiting scrollback search
* Ending the ‘flash’ OSC
* Exiting unicode input mode
* Clearing URL labels
* Removing the scrollback position indicator
Closes#1335
Before this patch, hyperlinked cells were tagged with the “URL”
attribute (thus instructing the renderer to draw an
underline) *before* the grid was snapshot.
When exiting URL mode, the cells were once again updated, this time
removing the URL attribute.
But what if an escape sequence had modified the grid _while we were in
URL mode_? Depending on the sequence, it could move cells around in
such a way, that when exiting URL mode, the affected cells weren’t
updated correctly. I.e. we left some cells with the URL attribute
still set.
The fix is simple: tag cells in the snapshot:ed grid only (which isn’t
affected by any escape sequence received while in URL mode). Not in
the *actual* grid (which _is_ affected).
First, add a ‘token’ argument to spawn(). When non-NULL, spawn() will
set the ‘XDG_ACTIVATION_TOKEN’ environment variable in the forked
process. If DISPLAY is non-NULL, we also set DESKTOP_STARTUP_ID, for
compatibility with X11 applications. Note that failing to set either
of these environment variables are considered non-fatal - i.e. we
ignore failures.
Next, add a helper function, wayl_get_activation_token(), to generate
an XDG activation token, and call a user-provided callback when it’s
‘done (since token generation is asynchronous). This function takes an
optional ‘seat’ and ‘serial’ arguments - when both are non-NULL/zero,
we set the serial on the token. ‘win’ is a required argument, used to
set the surface on the token.
Re-write wayl_win_set_urgent() to use the new helper function.
Finally, rewrite activate_url() to first try to get an activation
token (and spawn the URL launcher in the token callback). If that
fails, or if we don’t have XDG activation support, spawn the URL
launcher immediately (like before this patch).
Closes#1058
Up until now, our Wayland seats have been tracking key bindings. This
makes sense, since the seat’s keymap determines how the key bindings
are resolved.
However, tying bindings to the seat/keymap alone isn’t enough, since
we also depend on the current configuration (i.e. user settings) when
resolving a key binding.
This means configurations that doesn’t match the wayland object’s
configuration, currently don’t resolve key bindings correctly. This
applies to footclients where the user has overridden key bindings on
the command line (e.g. --override key-bindings.foo=bar).
Thus, to correctly resolve key bindings, each set of key bindings must
be tied *both* to a seat/keymap, *and* a configuration.
This patch introduces a key-binding manager, with an API to
add/remove/lookup, and load/unload keymaps from sets of key bindings.
In the API, sets are tied to a seat and terminal instance, since this
makes the most sense (we need to instantiate, or incref a set whenever
a new terminal instance is created). Internally, the set is tied to a
seat and the terminal’s configuration.
Sets are *added* when a new seat is added, and when a new terminal
instance is created. Since there can only be one instance of each
seat, sets are always removed when a seat is removed.
Terminals on the other hand can re-use the same configuration (and
typically do). Thus, sets ref-count the configuration. In other words,
when instantiating a new terminal, we may not have to instantiate a
new set of key bindings, but can often be incref:ed instead.
Whenever the keymap changes on a seat, all key bindings sets
associated with that seat reloads (re-resolves) their key bindings.
Closes#931
We have a number of sub-surfaces for which we are *not* interrested in
pointer (or touch) input.
Up until now, we’ve manually dealt with these, by recognizing these
surfaces in all pointer events, and ignoring them.
But, lo and behold, there are better ways of doing this. By clearing
the subsurface’s input region, the compositor will do this for us -
when a pointer is outside a surface’s input region, the event is
passed to the next surface underneath it.
This is exactly what we want! Do this for all subsurfaces, *except*
the CSDs.
When matching “untranslated” bindings (by matching the base symbol of
the key, e.g. ctrl+shift+2 in US layout), require that no
non-significant modifiers are active.
This fixes an issue where AltGr was “ignored”, and would cause certain
combinations to match a key binding.
Example: ctrl+altgr+0, on many European layouts matched against the
default ctrl+0 (reset the font size), instead of emitting ^]
To make this work, we now need to filter out “locked”
modifiers (e.g. NumLock and CapsLock). Otherwise having e.g. NumLock
active would prevent *all* untranslated matching to fail.
Closes#983
Fcft no longer uses wchar_t, but plain uint32_t to represent
codepoints.
Since we do a fair amount of string operations in foot, it still makes
sense to use something that actually _is_ a string (or character),
rather than an array of uint32_t.
For this reason, we switch out all wchar_t usage in foot to
char32_t. We also verify, at compile-time, that char32_t used
UTF-32 (which is what fcft expects).
Unfortunately, there are no string functions for char32_t. To avoid
having to re-implement all wcs*() functions, we add a small wrapper
layer of c32*() functions.
These wrapper functions take char32_t arguments, but then simply call
the corresponding wcs*() function.
For this to work, wcs*() must _also_ be UTF-32 compatible. We can
check for the presence of the __STDC_ISO_10646__ macro. If set,
wchar_t is at least 4 bytes and its internal representation is UTF-32.
FreeBSD does *not* define this macro, because its internal wchar_t
representation depends on the current locale. It _does_ use UTF-32
_if_ the current locale is UTF-8.
Since foot enforces UTF-8, we simply need to check if __FreeBSD__ is
defined.
Other fcft API changes:
* fcft_glyph_rasterize() -> fcft_codepoint_rasterize()
* font.space_advance has been removed
* ‘tags’ have been removed from fcft_grapheme_rasterize()
* ‘fcft_log_init()’ removed
* ‘fcft_init()’ and ‘fcft_fini()’ must be explicitly called
This option specifies the characters allowed in the auto-detected
URLs.
Any character not in this set constitutes an URL delimiter, and will
never be included in auto-detected URLs.
This option does not affect OSC-8 URLs.
Closes#654
If we have lots of URLs, we use up a *lot* of SHM buffers (and thus
pools). Each and every one is a single mmap(), of at least 4K.
Since all URL labels are created and destroyed at the same time, it
makes sense to use a single pool for all buffers.
To do this, we must now do two passes; first one to generate the
actual string (label content), and to calculate the label sizes.
Then we use this information to allocate all SHM buffers at once, from
the same pool.
Finally, we iterate the URLs again, this time to actually render them.
When tagging URL cells (in preparation for rendering URL mode), we
loop the URL’s entire range, setting the `url` attribute of all cells,
and dirtying the rows.
It is possible to create URLs that are invalid, and wrap around the
scrollback, even though the scrollback hasn’t yet been filled. For
example, by starting an OSC-8 URL, moving the cursor, and then closing
the OSC-8 URL.
These URLs are invalid, but are still rendered just fine. “Fine” being
relative - they will typically fill the entire screen. But at least
that’s a very clear indication for the user that’s something is wrong.
The problem is when we hit un-allocated scrollback rows. We didn’t
check for NULL rows, and crashed.
This has now been fixed.
Removing overlaping and duplicated URLs is done by running two nested
loops, that both iterate the same URL list.
When a duplicate is found, one of the URLs is destroyed and removed
from the list.
Deleting and removing an item *is* safe, but only as long as _no
other_ iterator has references to it.
In this case, if we remove an item from e.g. the inner iterator, we’ll
crash if the outer iterator’s *next* item is the item being removed.
Closes#627
Unlike other surface types, the SHM cookie depends on the address of
each URL instance. This means if we enable, disable, and then enable
URL mode again (thus showing exactly the same URLs as the first time),
the URLs will have new addresses, and thus the old SHM pixmaps will
not get purged automatically.
So, manually purge them when destroying the URLs.
When an auto-detected URL ended *on* the right-most column, the URL
endpoint was off by one, resulting in the underline in URL mode being
one character short.
echo -e '\e]8;;https://www.foo.bar\e\\https://www.foo\e]8;;\e\\.bar'
will produce an OSC-8 URL (https://www.foo) that is slightly shorter
than the auto-detected one (https://www.foo.bar).
This produces strange results in URL mode. For example, if
url.osc8-underline=always, the OSC8 underline will be removed when
url-mode is exited.
This patch changes the behavior so that auto-detected URLs that
overlap OSC-8 URLs are removed.
Note that OSC-8 URLs cannot overlap with each other, and that
auto-detected URLs also cannot overlap with each other.