Re: Compression through normalization

From: Peter Kirk (peterkirk@qaya.org)
Date: Fri Dec 05 2003 - 17:50:00 EST

Next message: Michael Everson: "Re: Missing African Latin letters (bis)"

Previous message: Michael Everson: "Re: Missing African Latin letters"
In reply to: Philippe Verdy: "RE: Compression through normalization"
Next in thread: Kenneth Whistler: "Re: Compression through normalization"

On 05/12/2003 14:01, Philippe Verdy wrote:

> ...
>
>It's just a shame that what was considered as equivalent in the Korean
>standards is considered as canonically distinct (and even compatibility
>distinct) in Unicode. This means that the same exact abstract Korean text
>can have two distinct representations in Unicode and there's no way to match
>these Unicode representations together. And also that when mapping Korean
>charsets to Unicode, care must be taken, before making the mapping, that all
>compound jamos will be used each time it is possible.
>
>
Agreed.

>If now the text is stored and handled entirely in Unicode without returning
>to the KSC standard, you won't have any other tool than just UCA to collate
>strings (but collation does not produce strings, just collation weights,
>and there's currently no tool to reverse a list of weights back to a
>Unicode string...)
>
>...
>
I note the following which is part of the text explaining C10:

> All processes and higher-level protocols are required to abide by C10
> as a minimum. However, higher-level protocols may define additional
> equivalences that do not constitute modifications under that protocol.
> For example, a higher-level protocol may allow a sequence of spaces to
> be replaced by a single space.

Presumably a higher level protocol could transform Korean text into a
standardised form, doing what (in your opinion and mine at least)
Unicode normalisation ought to have done.
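
As a rough illustration of such a higher-level protocol, here is a minimal
Python sketch. The names COMPOUND_JAMO and fold_korean are hypothetical, and
the table is deliberately incomplete: it lists only a few sequences of simple
conjoining jamos alongside the compound jamo that Unicode encodes separately.
It assumes the protocol has decided to treat those sequences as equivalent,
which Unicode normalisation itself does not do.

```python
import unicodedata

# Hypothetical, incomplete table for illustration only:
# sequence of simple conjoining jamos -> separately encoded compound jamo.
COMPOUND_JAMO = {
    "\u1100\u1100": "\u1101",  # choseong KIYEOK + KIYEOK -> SSANGKIYEOK
    "\u1103\u1103": "\u1104",  # choseong TIKEUT + TIKEUT -> SSANGTIKEUT
    "\u1107\u1107": "\u1108",  # choseong PIEUP + PIEUP -> SSANGPIEUP
    "\u1109\u1109": "\u110A",  # choseong SIOS + SIOS -> SSANGSIOS
    "\u110C\u110C": "\u110D",  # choseong CIEUC + CIEUC -> SSANGCIEUC
    "\u1169\u1161": "\u116A",  # jungseong O + A -> WA
}


def fold_korean(text):
    """Fold listed simple-jamo sequences into compound jamos, then apply NFC."""
    # Decompose precomposed syllables so that every jamo is visible.
    text = unicodedata.normalize("NFD", text)
    # Apply the protocol's extra equivalences (an assumption, not part of Unicode).
    for sequence, compound in COMPOUND_JAMO.items():
        text = text.replace(sequence, compound)
    # Recompose into precomposed syllables where Unicode allows it.
    return unicodedata.normalize("NFC", text)


if __name__ == "__main__":
    two_simple = "\u1100\u1100\u1161"  # KIYEOK + KIYEOK + A
    compound = "\u1101\u1161"          # SSANGKIYEOK + A
    # NFC alone keeps the two spellings apart; the extra folding step unifies them.
    print(unicodedata.normalize("NFC", two_simple) == unicodedata.normalize("NFC", compound))
    print(fold_korean(two_simple) == fold_korean(compound))
```

Plain string replacement is used only to keep the sketch short; a real
protocol would need a complete table and a considered policy for isolated
jamos that happen to precede a syllable.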

-- 
Peter Kirk
peter@qaya.org (personal)
peterkirk@qaya.org (work)
http://www.qaya.org/

