From: Mark E. Shoulson (mark@kli.org)
Date: Fri Oct 26 2007 - 11:56:57 CDT
John H. Jenkins wrote:
>
> On Oct 26, 2007, at 8:14 AM, Mark E. Shoulson wrote:
>
>> Unicode generally tries to code what's written more than what's
>> meant, I thought.
>>
>
>
> Well, not really.
>
> Unicode tries to formalize the informal understanding that users of a
> script bring to it. In the case of x, "everybody" knows that it's the
> same letter in English as in Spanish. In East Asia, there are a number
> of cases where "everybody" knows that two entities are separate
> characters even if they look almost the same and in fact may be
> indistinguishable in practice.
OK, I'll buy that. It also answers other common questions: "everybody"
knows that Cyrillic А is distinct from Latin A, despite common origin
and identical typographic treatment, etc. And it leads to the problems
you mention, when different "everybodies" disagree.
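
(For what it's worth, the Cyrillic/Latin case is easy to check for yourself. A minimal Python sketch using the standard unicodedata module; the helper name describe is just mine for illustration. It shows the two look-alike capitals have different code points and names, and that compatibility normalization does not fold one into the other.)

```python
import unicodedata

def describe(ch):
    """Return (code point, Unicode character name) for one character."""
    return f"U+{ord(ch):04X}", unicodedata.name(ch)

latin_a = "A"           # Latin capital A
cyrillic_a = "\u0410"   # Cyrillic capital A, visually the same in most fonts

print(describe(latin_a))
print(describe(cyrillic_a))

# The two are different characters as far as Unicode is concerned.
print(latin_a == cyrillic_a)

# NFKC normalization leaves the Cyrillic letter alone; it is not
# treated as a compatibility variant of the Latin one.
print(unicodedata.normalize("NFKC", cyrillic_a) == latin_a)
```

Both comparisons come out False, which is the "everybody knows they're distinct" judgment baked into the encoding.
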
Andrew West's clarification of how they got to be same-but-different was
also helpful. It's a little like the y in "ye olde shoppe", which by
rights should be coded "þe", since it isn't a y but a thorn; the
distinction between the two wore away. OK, not a good example because
there it IS unified with the y and here the argument is for keeping them
separate, but whatever.
(For that matter, vunzndi@vfemail.net's point about i/j and I/l was good
too. Just didn't want people to think I was ignoring their arguments...)
~mark