From: Arcane Jill (arcanejill@ramonsky.com)
Date: Mon Apr 25 2005 - 01:02:42 CST
-----Original Message-----
From: unicode-bounce@unicode.org [mailto:unicode-bounce@unicode.org]On
Behalf Of Asmus Freytag
Sent: 23 April 2005 22:52
To: Hans Aberg
Cc: Unicode
Subject: Re: String name and Character Name
> At 02:40 PM 4/23/2005, Hans Aberg wrote:
> >At 13:46 -0700 2005/04/23, Asmus Freytag wrote:
> >>>So, say one wants to correct "BRAKCET" to "BRACKET", then the new
> >>>version of UnicodeData.txt will look like:
> >>> FE17;PRESENTATION FORM FOR VERTICAL LEFT WHITE LENTICULAR
> >>> BRACKET;Ps;...
> >>> FE18;PRESENTATION FORM FOR VERTICAL RIGHT WHITE LENTICULAR
> >>> BRAKCET;Pe;...
> >>> FE18;PRESENTATION FORM FOR VERTICAL RIGHT WHITE LENTICULAR
> >>> BRACKET;Pe;...
> >>> FE19;PRESENTATION FORM FOR VERTICAL HORIZONTAL ELLIPSIS;Po;...
> >I leave it to the engineers to
> >figure out what might be considered a less painful method.
>
> --"Just leave the driving to us."
Well, I'm a software engineer too, so I guess I'm allowed to comment here. This
suggestion:
> >>> FE18;PRESENTATION FORM FOR VERTICAL RIGHT WHITE LENTICULAR
> >>> BRAKCET;Pe;...
> >>> FE18;PRESENTATION FORM FOR VERTICAL RIGHT WHITE LENTICULAR
> >>> BRACKET;Pe;...
will break existing software. Or at least, it will break software which /I/
have written, which is perhaps not so bad as mine is not commercially deployed,
but I'm guessing there's commercially deployed software out there which made
the same assumption I did: that UnicodeData.txt contains at most one line per
codepoint, with the lines listed in ascending numerical order.
If that assumption is invalid, my code breaks.
Okay, so that's not much of a big deal as it wouldn't be /that/ much effort for
me to write in a fix, but it might be more difficult for code which is already
deployed.
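
To make the assumption concrete, here is a minimal sketch of the kind of
parser that would trip over a repeated codepoint. It is not my actual code;
the function name parse_unicode_data is made up for illustration, and it
reads only the first two semicolon-separated fields of each line.

```python
def parse_unicode_data(lines):
    """Parse UnicodeData.txt-style lines into {code_point: name}.

    Assumes at most one line per code point, in ascending order, and
    raises ValueError as soon as that assumption is violated.
    """
    names = {}
    previous = -1
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        fields = line.split(";")
        code_point = int(fields[0], 16)  # field 0: hexadecimal code point
        if code_point <= previous:
            raise ValueError(
                f"U+{code_point:04X} is duplicated or out of order"
            )
        names[code_point] = fields[1]  # field 1: character name
        previous = code_point
    return names
```

Fed the proposed file, with two FE18 lines in a row, this raises ValueError on
the second one. A parser that skipped the check and simply overwrote the entry
would silently keep whichever name came last, which is arguably worse.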
On the other hand...
-----Original Message-----
From: unicode-bounce@unicode.org [mailto:unicode-bounce@unicode.org]On
Behalf Of Asmus Freytag
Sent: 23 April 2005 21:43
To: Peter Kirk; Doug Ewell
Cc: Unicode Mailing List
Subject: Re: String name and Character Name
> But in the spirit of hypothesizing a solution, I would consider using an
> alias mechanism in the way aliases are used for Property names the best
> solution. For properties (and their values) there exist multiple aliases,
> which are all considered unique.
That would work, and wouldn't break anything.
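
As a rough sketch of why it wouldn't break anything: the aliases can live in a
separate table, so the primary name data is untouched. The table names NAMES
and ALIASES and the function lookup are hypothetical, and nothing here
reflects an actual Unicode data file format.

```python
# Primary names, exactly as published (typo included).
NAMES = {
    0xFE18: "PRESENTATION FORM FOR VERTICAL RIGHT WHITE LENTICULAR BRAKCET",
}

# Hypothetical alias table: extra unique names for the same code point.
ALIASES = {
    "PRESENTATION FORM FOR VERTICAL RIGHT WHITE LENTICULAR BRACKET": 0xFE18,
}


def lookup(name):
    """Return the code point for a primary name or an alias, else None."""
    for code_point, primary in NAMES.items():
        if primary == name:
            return code_point
    return ALIASES.get(name)
```

Both spellings resolve to U+FE18, and code that only reads the primary table
sees no change.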
BUT... I still don't see the point. If the purpose of names is to be a unique
identifier, then aliases are not needed. The existing names /already serve that
purpose/. On the other hand, if the purpose of names is to be meaningful to
humans (regardless of their language), then the CLDR suggestion still seems
like the best idea to me.
And though the names presented by BabelPad and its ilk may be sometimes
misleading, it is difficult to criticise them too harshly while an alternative
does not (yet) exist (although it would be nicer if TUS had gone to more effort
to point out "these names are not supposed to be meaningful").
My vote ... (if I had one) ... would go to the CLDR idea, indexed by codepoint
(so we can ignore the names altogether). And when that's complete (at least for
English), TUS should encourage applications to present CLDR-localized names to
end-users in place of the ISO name. From a software point of view, that's kinda
easy - if you've already got a locale discrimination mechanism in place, then
it's just one more file to parse.
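
A minimal sketch of what that lookup might look like, assuming the localized
names have already been parsed into a table keyed by locale and codepoint. The
tables, their contents, and the function display_name are all invented for
illustration; no such data exists yet.

```python
# Hypothetical localized names, keyed by locale and then by code point.
LOCALIZED_NAMES = {
    "en": {0xFE18: "vertical right white lenticular bracket"},
}

# Formal names, used only as a fallback.
FORMAL_NAMES = {
    0xFE18: "PRESENTATION FORM FOR VERTICAL RIGHT WHITE LENTICULAR BRAKCET",
}


def display_name(code_point, locale):
    """Pick a name to show the user: exact locale, then language, then formal."""
    language = locale.split("-")[0].split("_")[0]  # "en-GB" or "en_GB" -> "en"
    for key in (locale, language):
        name = LOCALIZED_NAMES.get(key, {}).get(code_point)
        if name:
            return name
    # No localized name available: fall back to the formal name,
    # or to the bare code point if even that is missing.
    return FORMAL_NAMES.get(code_point, f"U+{code_point:04X}")
```

Because the table is indexed by codepoint, the formal name never has to be
parsed or matched; it only appears when no localized name is available.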
Jill