From: Hans Aberg (haberg@math.su.se)
Date: Sat Apr 23 2005 - 07:39:23 CST
It strikes me that one could assign multiple names to a character,
provided that exactly one of them is the preferred name. This could
be used to correct errors. In a file such as "UnicodeData.txt", one
would simply add a new line carrying the corrected character name,
with the rule that, when translating from code point to character
name, the last name listed for that code point is the one to use.
So, say one wants to correct "BRAKCET" to "BRACKET"; the new
version of UnicodeData.txt would then look like:
FE17;PRESENTATION FORM FOR VERTICAL LEFT WHITE LENTICULAR BRACKET;Ps;...
FE18;PRESENTATION FORM FOR VERTICAL RIGHT WHITE LENTICULAR BRAKCET;Pe;...
FE18;PRESENTATION FORM FOR VERTICAL RIGHT WHITE LENTICULAR BRACKET;Pe;...
FE19;PRESENTATION FORM FOR VERTICAL HORIZONTAL ELLIPSIS;Po;...
If somebody refers to U+FE18 as "PRESENTATION FORM FOR VERTICAL RIGHT
WHITE LENTICULAR BRAKCET", it will be recognized, but if that name
is first translated into its code point, and then back to a
character name, one gets back "PRESENTATION FORM FOR VERTICAL RIGHT
WHITE LENTICULAR BRACKET". Of course, this last name will be
recognized as well.
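
As a rough illustration of the lookup rule described above, here is a minimal Python sketch. It is only an assumption about how a reader of such a file might behave, not an existing tool: the function name build_tables and the SAMPLE data are hypothetical, and the sample lines are cut off after the third field, since the remaining fields are elided above. Every name maps to its code point, while for the reverse direction a later line simply overwrites an earlier one.

```python
# Hypothetical sketch of "last name wins"; not a real Unicode tool.
# SAMPLE mirrors the four lines above, truncated after the third field.
SAMPLE = """\
FE17;PRESENTATION FORM FOR VERTICAL LEFT WHITE LENTICULAR BRACKET;Ps
FE18;PRESENTATION FORM FOR VERTICAL RIGHT WHITE LENTICULAR BRAKCET;Pe
FE18;PRESENTATION FORM FOR VERTICAL RIGHT WHITE LENTICULAR BRACKET;Pe
FE19;PRESENTATION FORM FOR VERTICAL HORIZONTAL ELLIPSIS;Po
"""


def build_tables(text):
    """Return (name -> code point, code point -> preferred name)."""
    name_to_cp = {}
    cp_to_name = {}
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        fields = line.split(";")
        cp = int(fields[0], 16)
        name = fields[1]
        # Every name, old or new, stays recognized.
        name_to_cp[name] = cp
        # A later line for the same code point overwrites the earlier
        # one, so the last name listed becomes the preferred name.
        cp_to_name[cp] = name
    return name_to_cp, cp_to_name


name_to_cp, cp_to_name = build_tables(SAMPLE)

old = "PRESENTATION FORM FOR VERTICAL RIGHT WHITE LENTICULAR BRAKCET"
new = "PRESENTATION FORM FOR VERTICAL RIGHT WHITE LENTICULAR BRACKET"

# Name -> code point -> name yields the corrected spelling.
round_trip = cp_to_name[name_to_cp[old]]
print(round_trip)
```

Both spellings resolve to the same code point here, and only the reverse lookup prefers the corrected one.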
-- Hans Aberg