Re: GB18030

From: Yung-Fong Tang (ftang@netscape.com)
Date: Thu Sep 27 2001 - 18:27:44 EDT


Markus Scherer wrote:

> Yung-Fong Tang wrote:
> > ... But you
> > still need to know what U+4ff3a is to define such a mapping table, right?
>
> Wrong. You just need to know the mapping between code points, whether assigned, used, or whatever.
>
> > ... So, whatever software the user has today, without an
> > upgrade (of either the code or the mapping table) it still won't know how to
> > convert U+4ff3a to lower case or upper case, right?
>
> No, but that's irrelevant for character conversion. Once you update the Unicode character database in your product, your software will do it - if it knows how to deal with supplementary characters in general. (That part is a technicality which is, again, independent of whether there _are_ assigned characters.)

It still takes a "Once you update the Unicode character database in your product" step to make it happen, right? From a software distribution point of view, that means a different version number, and therefore usually a QA cycle. As I said, you CANNOT do it WITHOUT an upgrade. Anything could happen WITH an upgrade: either a change to the code or a change to the mapping table.

> > But how can you generate such a mapping table without knowing that character?
>
> By specifying which _code point_ in one encoding gets mapped to which other _code point_ in the other encoding.
> Character conversion never looks at whether the code points that it maps are actual _characters_.
>
> When you map between the GBK or Shift-JIS user-defined areas and Unicode PUA or similar, then you also map code points that don't have characters. What's new?
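
To make the code-point argument concrete: for the supplementary planes, GB18030 defines the four-byte form arithmetically, so the conversion can be written without any knowledge of what is assigned. The following is only a minimal sketch in Python; the function name gb18030_supplementary is made up for illustration, and the comparison assumes that Python's built-in "gb18030" codec follows the same linear mapping.

```python
def gb18030_supplementary(cp: int) -> bytes:
    """Map a supplementary code point (U+10000..U+10FFFF) to four GB18030 bytes.

    Pure arithmetic on the code point value: nothing here asks whether
    the code point has been assigned a character.
    """
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("not a supplementary code point")
    linear = cp - 0x10000
    linear, b4 = divmod(linear, 10)    # 4th byte: 0x30..0x39
    linear, b3 = divmod(linear, 126)   # 3rd byte: 0x81..0xFE
    b1, b2 = divmod(linear, 10)        # 2nd byte: 0x30..0x39, 1st byte: 0x90..0xE3
    return bytes([0x90 + b1, 0x30 + b2, 0x81 + b3, 0x30 + b4])


# U+4FF3A is the code point discussed in this thread.
converted = gb18030_supplementary(0x4FF3A)
matches_codec = converted == chr(0x4FF3A).encode("gb18030")
```

The same four bytes decode back to U+4FF3A, whether or not that code point ever gets a character.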

Case mapping? You have no way to generate a mapping table for case mapping without knowing the characters, unless you have already defined those characters as having no case or only one case.
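
A small Python sketch of the point about case mapping: the answer comes from whatever version of the Unicode character database ships with the software. It assumes the interpreter's database does not assign U+4FF3A; the variable names are illustrative only.

```python
import unicodedata

cp = 0x4FF3A
ch = chr(cp)

# The general category comes from the database bundled with this build;
# "Cn" means the code point is unassigned in that version.
category = unicodedata.category(ch)

# With no character data, case conversion can only leave the code point alone.
case_unchanged = ch.upper() == ch and ch.lower() == ch

print(unicodedata.unidata_version, category, case_unchanged)
```

If a later Unicode version gave this code point a cased character, the result would change only after the bundled database was updated, which is the upgrade in question.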

> > ...
> > How many years does it take for people to realize that giving a new mapping to
> > their customers still needs a complete life cycle of QA and distribution? And
> > there will be a new version number attached to the software for that.
>
> Is this about the existence of supplementary characters again?
> They exist since 1996, and a vendor who followed the UTC/ISO negotiations could see it coming since 1993.
> Surely most everyone had the time to roll out a new release of their software to get the support for them in - in more than five years?

Don't tell me there are any people who implemented HanCharacterStokeNumber(U+20000) in 1996. Nobody had an implementation of HanCharacterStokeNumber(U+20000) until U+20000 got defined.

>
>
> (I know that few actually worked on this in time. But time there was.)
>
> markus


