On 1/22/19 6:26 PM, Kent Karlsson via Unicode wrote:
> Ok. One thing to note is that escape sequences (including control sequences,
> for those who care to distinguish those) probably should be "default
> ignorable" for display. Requiring, or even recommending, them to be default
> ignorable for other processing (like sorting, searching, and other things)
> may be a tall order. So, for display, (maximal) substrings that match:
>
> \u001B[\u0020-\u002F]*[\u0030-\u007E]|
> (\u001B'['|\u009B)[\u0030-\u003F]*[\u0020-\u002F]*[\u0040-\u007E]
>
> should be default ignorable (i.e. invisible, but a "show invisibles" mode
> would show them; not interpreted ones should be kept, even if interpreted
> ones need not, just (re)generated on save). That is as far as Unicode
> should go.
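To make the quoted pattern concrete, here is a minimal Python sketch of
"ignore these substrings for display". The names ESCAPE_SEQUENCE and
strip_for_display are made up for illustration and come from no standard
or library. One assumption to flag: Python's re module takes the first
alternative that matches rather than the longest, so the control-sequence
branch is listed first to get the "maximal" match the quoted text asks
for.

```python
import re

# Hypothetical names; a sketch of the quoted pattern, not a normative one.
# The control-sequence branch comes first: otherwise "ESC [" alone would
# match the generic escape-sequence branch and leave the parameters behind.
ESCAPE_SEQUENCE = re.compile(
    r"(?:\x1b\[|\x9b)[\x30-\x3f]*[\x20-\x2f]*[\x40-\x7e]"  # control sequence
    r"|\x1b[\x20-\x2f]*[\x30-\x7e]"                        # other escape sequence
)


def strip_for_display(text: str) -> str:
    """Return text with every matched escape sequence removed."""
    return ESCAPE_SEQUENCE.sub("", text)


if __name__ == "__main__":
    print(strip_for_display("\x1b[1;31mred\x1b[0m text"))
```

A "show invisibles" mode would presumably substitute some visible
placeholder instead of the empty string.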
So it isn't just "these characters should be default ignorable", but
"anything matching this regular expression should be default
ignorable." This gets back to "things that span more than a character"
again, only this time the "span" isn't the text being styled, it's the
annotation that styles it.

The bash shell has special escape sequences (\[ and \]) for use in
defining its prompt. They tell the system that the text enclosed by
them is not rendered and should not be counted when it comes to
cursor control and line editing (so you put them around, yep, the
escape sequences for coloring or boldfacing or whatever you want in
your prompt). That would seem to be at least simpler than a big ol'
regexp, but really not much of an improvement. It also goes to show
how things like this require all kinds of special handling, even (or
especially) in a "simple" shell prompt, which could make a strong case
for being "plain text" (though, yes, terminal escape codes are a
thing).
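For illustration, the bookkeeping that the \[ ... \] markers make
possible can be sketched in a few lines of Python. This is not how bash
or readline implement it; visible_prompt_length is a made-up name, and
the sketch assumes the prompt string has already had its other prompt
escapes expanded and that one code point occupies one column.

```python
import re

# Hypothetical helper: everything between a literal \[ and the next
# literal \] is treated as non-printing and dropped before counting.
NON_PRINTING = re.compile(r"\\\[.*?\\\]", re.DOTALL)


def visible_prompt_length(prompt: str) -> int:
    """Count the characters of prompt that fall outside \\[ ... \\] spans."""
    return len(NON_PRINTING.sub("", prompt))


if __name__ == "__main__":
    # Green "user@host", then reset, then "$ ".
    ps1 = "\\[\x1b[1;32m\\]user@host\\[\x1b[0m\\]$ "
    print(visible_prompt_length(ps1))
```

No pattern for the escape sequences themselves is needed here, but only
because whoever wrote the prompt marked every one of them by hand.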
~mark
Received on Wed Jan 23 2019 - 20:21:54 CST