From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Tue May 31 2005 - 16:06:16 CDT
From: "Страхиња Радић" <vilinkamen@mail.ru>
> By using this kind of reasoning, we would end up asking why the heck
> was ``fi'' or ``ffi'' encoded when these two can be expressed with their
> corresponding atoms
Today, they would not be encoded. They date from the origins of Unicode, a
time when the normalization rules and encoding policies were still in their
infancy, but when round-trip compatibility with other popular encodings was
considered desirable. These characters were present in MacOS charsets and in
the default PostScript charset, so they were used in plain-text files that
were encoded for direct printing without further reencoding...
At that time, fonts had no ligature processing, for lack of a uniform
technology to support it portably across heterogeneous systems. That time is
over, and ligature processing is now a required feature to support even
legacy charsets such as ISO 8859 Arabic or the Indian standard charsets
(ISCII). Those charsets are now fully integrated into Unicode, and such a
mechanism is required in practice for almost all scripts that benefit from
fine typographic features such as ligatures and contextual forms.
Unicode, however, cannot remove those characters. They remain there for
compatibility and are not recommended. They are considered compatibility
characters with compatibility decompositions, and are not preserved by the
compatibility normalization forms (NFKC and NFKD), because their plain-text
semantics are strictly equal to those of their decompositions in every human
language that uses them.
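
To illustrate the point about decompositions, here is a minimal sketch using
Python's standard unicodedata module. The helper name describe() is only an
illustration, not part of any standard. It shows that the Unicode Character
Database records a compatibility (<compat>) mapping for these ligatures, so
NFC leaves them alone while NFKC replaces them with their atoms:

```python
import unicodedata

# U+FB01 (fi ligature) and U+FB03 (ffi ligature), the characters discussed above.
LIGATURES = ["\uFB01", "\uFB03"]


def describe(ch):
    """Return the UCD decomposition and the NFC/NFKC forms of one character.

    Hypothetical helper for illustration only.
    """
    return {
        "name": unicodedata.name(ch),
        # For these ligatures the mapping carries a <compat> tag.
        "decomposition": unicodedata.decomposition(ch),
        # Canonical normalization keeps the ligature unchanged.
        "NFC": unicodedata.normalize("NFC", ch),
        # Compatibility normalization replaces it with plain letters.
        "NFKC": unicodedata.normalize("NFKC", ch),
    }


for ch in LIGATURES:
    info = describe(ch)
    print(f"U+{ord(ch):04X} {info['name']}")
    print(f"  decomposition: {info['decomposition']}")
    print(f"  NFC keeps ligature: {info['NFC'] == ch}")
    print(f"  NFKC result: {info['NFKC']}")
```

A search or comparison that should treat the ligature and its atoms alike
would therefore normalize both sides to NFKC (or NFKD) before comparing.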