ISO-15924 script nodes and UAX#24 script IDs

From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Mon May 17 2004 - 13:19:28 CDT

Next message: E. Keown: "Re: Vertical BIDI"

Previous message: E. Keown: "Re: Archaic-Greek/Palaeo-Hebrew (was, interleaved ordering; was, Phoenician)"
Next in thread: Philippe Verdy: "Re: ISO-15924 script nodes and UAX#24 script IDs"
Maybe reply: Philippe Verdy: "Re: ISO-15924 script nodes and UAX#24 script IDs"
Maybe reply: Michael Everson: "Re: ISO-15924 script nodes and UAX#24 script IDs"
Maybe reply: jameskass@att.net: "Re: ISO-15924 script nodes and UAX#24 script IDs"
Reply: Doug Ewell: "Re: ISO-15924 script nodes and UAX#24 script IDs"

I know that ISO-15924 now publishes 4-letter codes for scripts used in
bibliographic references, and that it contains more scripts than Unicode does,
because ISO-15924 needs separate codes for variants that are unified in the
Unicode encoding.

I also understand that Unicode defines its own "Hiragana_or_Katakana" code,
which is needed to classify a few characters. This specific code is not used
as a script code for bibliographic references, and that is a good
justification for listing the "script" name in ISO-15924 between parentheses,
since it is only a technical requirement. ISO-15924 nevertheless agreed to
encode it under N°=412, Code=Hrkt.

I understand as well that Unicode needs script IDs for "Common" and "Inherited".

ISO-15924 also includes an "ID" column that should reflect the script ID used in
Unicode character properties.

BUT:

I note these quirks:
- The ISO-15924 text incorrectly references the IDs "Old Italic", "Linear B",
"Canadian Aboriginal" with spaces, but the actual script IDs as defined in UAX
#24 use underscores.
- Unicode already defines three script IDs that have no correspondence in
ISO-15924: "Limbu", "Tai_Le", "Cypriot". Should there now be a request to map
these scripts to codes in ISO-15924?
- Doesn't the Unicode script ID "Common" map to the ISO-15924 code "Zyyy"
(N°=998), for undetermined script?
- Should there exist a "Zwww" code in ISO 15924 for the Unicode "Inherited"
script ID?
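
To make the mismatches above concrete, here is a rough Python sketch of the
lookup I have in mind. The table and both function names are hypothetical. The
4-letter codes for the first three rows are given as I recall them from the
ISO-15924 list and should be checked against the published table. The
"Common" -> "Zyyy" row is only the assumption raised in the question above, and
None marks the IDs for which no ISO-15924 code is identified in this message.

```python
# Hypothetical table: UAX #24 script ID -> ISO-15924 code.
# None means "no corresponding code identified in this message".
SCRIPT_ID_TO_ISO15924 = {
    "Old_Italic": "Ital",
    "Linear_B": "Linb",
    "Canadian_Aboriginal": "Cans",
    "Common": "Zyyy",   # assumption: this is the open question above
    "Inherited": None,
    "Limbu": None,
    "Tai_Le": None,
    "Cypriot": None,
}


def normalize_script_id(name):
    """Turn a spaced name such as 'Old Italic' into 'Old_Italic'."""
    return "_".join(name.split())


def iso15924_code(name):
    """Look up the ISO-15924 code for a script ID, spaced or underscored."""
    return SCRIPT_ID_TO_ISO15924.get(normalize_script_id(name))
```

With this, the spaced spelling "Linear B" used in the ISO-15924 text and the
underscored ID "Linear_B" from UAX #24 both resolve to the same entry.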

Many ISO-15924 codes exist for scripts that are candidates for encoding within
Unicode, with their own script IDs to be defined later. For example, these have
already been discussed here:
    %N°;code;English name;nom français
    100;Mero;Meroitic;méroïtique
    115;Phnx;Phoenician;phénicien
    120;Tfng;Tifinagh (Berber);tifinagh (berbère)
    140;Mnda;Mandaean;mandéen //Is it same as Mende "Kikakui" Syllabic?
    282;Plrd;Pollard Phonetic;phonétique de Pollard
    300;Brah;Brahmi;brâhmî
    360;Java;Javanese;javanais
    365;Batk;Batak;batak
    ...
and many others that are in the Unicode roadmap.
For these scripts, does Unicode need to define its own script IDs?
Or shouldn't Unicode simply deprecate its script IDs in favor of ISO-15924 codes?
This may be important because UAX#24 is a normative reference in the W3C CSS3
specification, and maybe the existing Unicode IDs should become aliases of
ISO-15924 codes.
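
As a rough illustration of the aliasing idea, the Python sketch below parses a
few rows in the semicolon-separated format quoted above and builds a table in
which a long name is only an alias of the 4-letter code. The function names and
the underscore-style long names are hypothetical: Unicode has not defined
script IDs for these scripts, so the long names here are derived mechanically
from the English names purely for the sake of the example.

```python
import csv
import io

# A few rows copied from the list above; '%' marks the header line.
CODE_LIST = """\
%N°;code;English name;nom français
115;Phnx;Phoenician;phénicien
120;Tfng;Tifinagh (Berber);tifinagh (berbère)
300;Brah;Brahmi;brâhmî
365;Batk;Batak;batak
"""


def parse_code_list(text):
    """Return {ISO code: (number, English name, French name)}."""
    table = {}
    for row in csv.reader(io.StringIO(text), delimiter=";"):
        if not row or row[0].startswith("%"):
            continue  # skip blank lines and the header line
        number, code, english, french = row
        table[code] = (int(number), english, french)
    return table


def build_aliases(table):
    """Map the 4-letter code and a hypothetical long name to the same code."""
    aliases = {}
    for code, (_number, english, _french) in table.items():
        # Drop any parenthesized qualifier, then join words with underscores.
        long_name = "_".join(english.split("(")[0].split())
        aliases[code] = code
        aliases[long_name] = code
    return aliases


ALIASES = build_aliases(parse_code_list(CODE_LIST))
```

Under this arrangement the ISO-15924 code is the primary key, and a lookup by
either ALIASES["Phnx"] or ALIASES["Phoenician"] yields the same code.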


This archive was generated by hypermail 2.1.5 : Mon May 17 2004 - 13:19:53 CDT