Kenneth Whistler scripsit:
> UTF-8 implementations are generally
> driven by [inter alia]
> the cost tradeoffs of software adaptation to process 16-bit strings
> versus the processing inefficiencies of dealing with variable-width
> characters.
In addition, in some applications those processing inefficiencies are
not present, thanks to the self-segregating nature of UTF-8: lead bytes
and trail bytes come from disjoint ranges, so a byte-wise match of one
well-formed UTF-8 string inside another always begins on a character
boundary. For example, the Plan 9 "fgrep" program (which searches a
stream of text for the presence of one or more of a list of strings)
need never convert to a fixed-width UCS form at all; the strings are
UTF-8 and so is the text, and in fact the program looks the same as the
corresponding 8-bit program.
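
To make the point concrete, here is a minimal sketch in Python (not the
Plan 9 source; the function name find_all and the sample strings are my
own, purely for illustration). It searches raw UTF-8 bytes for several
UTF-8 patterns without decoding anything, and it assumes both the text
and the patterns are well-formed UTF-8.

```python
def find_all(haystack, needles):
    """Return sorted (byte_offset, needle) pairs for every occurrence.

    Both arguments are raw bytes; nothing is decoded to code points.
    Hypothetical helper for illustration, not the Plan 9 fgrep code.
    """
    hits = []
    for needle in needles:
        start = haystack.find(needle)
        while start != -1:
            hits.append((start, needle))
            # Resume one byte later so overlapping matches are found too.
            start = haystack.find(needle, start + 1)
    return sorted(hits)


# Sample data (assumed well-formed UTF-8 on both sides).
text = "naïve café, 日本語 text".encode("utf-8")
patterns = [s.encode("utf-8") for s in ("café", "本", "é")]

matches = find_all(text, patterns)
for offset, needle in matches:
    # A trail byte has the bit pattern 10xxxxxx; a match never starts on one.
    assert (text[offset] & 0xC0) != 0x80
    print(offset, needle.decode("utf-8"))
```

The search loop is exactly what one would write for 8-bit text; the
assertion is only there to show that every hit falls on a character
boundary. Note that the offsets are byte offsets, not character counts.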
-- John Cowan cowan@ccil.org e'osai ko sarji la lojban.