From: Mark Davis ☕ (mark@macchiato.com)
Date: Thu Feb 18 2010 - 11:26:34 CST
We probably should add to the FAQ the following information.
1. The duplicate accented characters were encoded as a result of Greek national
body requests in ISO 10646 during the merger with Unicode, nearly 20 years
ago. This happened even though the equivalent characters were already
encoded.
http://unicode.org/cldr/utility/list-unicodeset.jsp?a=\p{sc=greek}%26\p{nfcqc=n}
However, these duplicates are not in canonical form: when text is normalized
to NFC (the recommended form), they are converted to the regular characters.
Example:
http://unicode.org/cldr/utility/transform.jsp?a=::nfc;+(.)+>+%26hex($1)+\u00a0+%26name($1)&b=\u1F71
2. As to the uppercase forms, the Consortium has decided that any
language-specific casing information should be in the Unicode locales
project (CLDR). People can submit a proposal for exactly how the casing
should work for Greek or other languages at
http://unicode.org/cldr/trac/newticket. Such a proposal should completely
specify the desired processing not only for uppercasing, but also for
titlecasing and lowercasing.
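
For anyone who wants to check point 1 locally, here is a minimal Python
sketch using only the standard unicodedata module. The helper names
(nfc_report, non_nfc_greek_extended) are made up for illustration. Since the
standard library has no script property, the second helper scans the Greek
Extended block (U+1F00..U+1FFF) as a rough stand-in for the
\p{sc=greek}&\p{nfcqc=n} query, so it will miss the few such characters
outside that block.

```python
import unicodedata


def nfc_report(ch):
    """Return 'HEX NAME' for each code point of NFC(ch), like the transform demo."""
    normalized = unicodedata.normalize("NFC", ch)
    return ", ".join(f"{ord(c):04X} {unicodedata.name(c)}" for c in normalized)


def non_nfc_greek_extended():
    """Assigned code points in U+1F00..U+1FFF that NFC replaces with something else.

    Assumption: the block range approximates the script-based query above.
    """
    result = []
    for cp in range(0x1F00, 0x2000):
        ch = chr(cp)
        # unicodedata.name(ch, "") is empty for unassigned code points.
        if unicodedata.name(ch, "") and unicodedata.normalize("NFC", ch) != ch:
            result.append(cp)
    return result


if __name__ == "__main__":
    # U+1F71 GREEK SMALL LETTER ALPHA WITH OXIA maps to the tonos form.
    print(nfc_report("\u1F71"))  # 03AC GREEK SMALL LETTER ALPHA WITH TONOS
    print(" ".join(f"U+{cp:04X}" for cp in non_nfc_greek_extended()))
```

The exact list printed depends on the Unicode version bundled with the
Python build in use.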
Mark
On Thu, Feb 18, 2010 at 08:25, Apostolos Syropoulos
<ijdt.editor@gmail.com> wrote:
>
>
> 2010/2/18 Jon Hanna <jon@hackcraft.net>
>
> Apostolos Syropoulos wrote:
>>
>>> Not really as this makes no sense! The whole issue
>>> seems to me as yet another Unicode error.
>>>
>>
>> What way would you have provided round-trip compatibility with previous
>> encodings?
>>
>>
> Do you consider this absolutely necessary, because I don't! I am using
> Unicode for my
> books and articles and I have to do many tricks to get things right...
>
> A.S.
>
>
> --
> Apostolos Syropoulos
> 366, 28th October Str.
> GR-671 00 Xanthi, GREECE
>
>