Re: Surrogate points

From: Hans Aberg (haberg@math.su.se)
Date: Mon Jan 31 2005 - 12:24:13 CST

Next message: Hans Aberg: "Re: Surrogate points"

Previous message: Peter Kirk: "Re: Surrogate points"
Maybe in reply to: Hans Aberg: "Surrogate points"
Next in thread: Hans Aberg: "Re: Surrogate points"

At 14:18 -0800 2005/01/30, Doug Ewell wrote:
>In any case, it is incorrect to state that the choice of this block was
>due to "failure to give UTF-16 a proper design." Other blocks, such as
>the "obvious" 0xF800 through 0xFFFF, were already occupied.

It makes the character number allocations dependent on a particular
encoding, which is wholly unnecessary.

>2. Noncharacters 0xFFFE and 0xFFFF
>
>The designation of 0xFFFE and 0xFFFF as "noncharacters" goes back to
>Unicode 1.0 (1991), although that term was not used at the time. The
>numeric value -1 has a long history of being used as a "sentinel" value,
>to indicate the end of a series of real values. This works fine for
>non-negative numeric data, such as inventory counts, but caused problems
>in existing 8-bit character sets where the value 0xFF might have a real
>meaning.
>
>To solve this problem, Unicode 1.0 set aside the value 0xFFFF as NOT
>corresponding to an actual character. This way, programs that used
>16-bit values (i.e. all Unicode programs at the time) could safely use
>it as a sentinel without fear of colliding with a real character
>assignment. This was completely intentional.

Again, one sets these values aside in the encoding, if necessary, not in the
character model.
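
To make the sentinel use described in the quoted text concrete, here is a
minimal sketch, assuming 16-bit code units held in a plain Python list. The
names SENTINEL and read_until_sentinel are hypothetical and come from no
library or from the standard itself; the second function only illustrates
the 8-bit collision mentioned in the quote.

```python
# 0xFFFF is set aside as a noncharacter, so a 16-bit program can use it
# as an end marker without colliding with a real character assignment.
SENTINEL = 0xFFFF


def read_until_sentinel(units):
    """Return the 16-bit code units that precede the first SENTINEL."""
    out = []
    for unit in units:
        if unit == SENTINEL:
            break
        out.append(unit)
    return out


def latin1_ff_is_real_character():
    """Show the 8-bit problem: in ISO 8859-1, byte 0xFF is a real letter."""
    return bytes([0xFF]).decode("latin-1") == "\u00ff"
```

For example, read_until_sentinel([0x48, 0x69, 0xFFFF, 0x21]) stops at the
marker and returns only the first two units. Whether the reserved value
belongs to the encoding or to the character model is exactly the point in
dispute; the sketch takes no side on that.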

>Claiming that either of these features of Unicode is the result of poor
>design of UTF-16 is simply wrong. It is an uninformed opinion based on
>inadequate consideration of the facts.

So obviously the people who did this design did not understand the need to
separate the character model clearly from the encoding.

>Hans, I don't know how long you spent on this list as a silent observer
>("lurker") before you began posting, but evidently not long enough.
>
>When I joined this list, I spent almost a year lurking before I made my
>first post. I listened to the experts. I made plenty of wrong
>statements of my own, but accepted the criticisms and corrections of
>those who obviously knew more than I did. I learned the history of why
>things are, and perhaps most importantly, I learned the importance of
>Unicode's stability policies, which explain why it is TOO LATE to make
>major architectural changes that would invalidate all existing
>implementations.
>
>While I admit a year may be excessive, I strongly suggest you take some
>time off to READ the list, read the FAQ's, read the book (on-line or
>hardcover), read the UAX's and UTS's and UTR's, and THINK about why the
>Unicode Standard is the way it is, and what can -- and cannot -- be done
>to change it. The choice is entirely up to you, but if you do not do
>the necessary homework to draw reasonable conclusions and ask reasonable
>questions, your posts will continue to reflect your lack of
>understanding, and will be ignored by more and more people.

I was already more or less aware of the facts you set out at some length
before I posted. The idea was that an intelligent reader would notice this
before replying.

So, evidently, your one year of lurking didn't help you.

Hans Aberg

This archive was generated by hypermail 2.1.5 : Mon Jan 31 2005 - 12:25:57 CST