From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Mon May 17 2004 - 13:19:28 CDT
I know that ISO-15924 now publishes 4-letter codes for scripts, for use in
bibliographic references, and that it contains more scripts than Unicode does,
because ISO-15924 needs separate codes for variants that are unified in the
Unicode encoding.
I also understand that Unicode defines its own "Hiragana_or_Katakana" code, which
is needed to classify a few characters. This code is not used as a script code
in bibliographic references, which is a good justification for listing the
"script" name in parentheses in ISO-15924, since it exists only to meet a
technical requirement. ISO-15924 nevertheless agreed to encode it under N°=412,
Code=Hrkt.
I also understand that Unicode needs script IDs for "Common" and "Inherited".
ISO-15924 also includes an "ID" column that should reflect the script ID used in
Unicode character properties.
BUT:
I note these quirks:
- The ISO-15924 text incorrectly references the IDs "Old Italic", "Linear B",
"Canadian Aboriginal" with spaces, but the actual script IDs as defined in UAX
#24 use underscores.
- Unicode already defines three script IDs that have no correspondence in
ISO-15924: "Limbu", "Tai_Le", "Cypriot". Should there now be a request to give
these scripts IDs in ISO-15924?
- Doesn't the Unicode script ID "Common" map to the ISO-15924 code "Zyyy"
(N°=998) for undetermined script?
- Should there be a "Zwww" code in ISO-15924 for the Unicode "Inherited"
script ID?
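To make the points above concrete, here is a minimal Python sketch of the kind of lookup I have in mind. The function names and the table are hypothetical, the pairings are illustrative only and not an official mapping, and "Zyyy" for "Common" is just the proposal made above.

```python
# Illustrative data only: these pairings follow the discussion in this
# message and are assumptions, not an official mapping table.
SCRIPT_ID_TO_ISO15924 = {
    "Old_Italic": "Ital",
    "Linear_B": "Linb",
    "Canadian_Aboriginal": "Cans",
    "Hiragana_or_Katakana": "Hrkt",  # N°=412, as spelled in this message
    "Common": "Zyyy",                # proposed above, not confirmed
    "Limbu": None,                   # no ISO-15924 code yet
    "Tai_Le": None,
    "Cypriot": None,
    "Inherited": None,               # "Zwww" is only a suggestion
}


def normalize_script_id(name):
    """Turn a spaced name such as 'Old Italic' into the underscore form."""
    return "_".join(name.split())


def iso_code_for(name):
    """Return the ISO-15924 code for a script name, or None if unmapped."""
    return SCRIPT_ID_TO_ISO15924.get(normalize_script_id(name))


def unmapped_ids():
    """List the script IDs in the table that have no ISO-15924 code."""
    return sorted(k for k, v in SCRIPT_ID_TO_ISO15924.items() if v is None)
```

With such a table, a name written with spaces resolves to the same code as its underscore form, and the IDs still lacking a code can be listed directly.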
Many ISO-15924 codes exist that are candidates for encoding in Unicode, with
their own script IDs to be defined later. For example, these have already been
discussed here:
%N°;code;English name;nom français
100;Mero;Meroitic;méroïtique
115;Phnx;Phoenician;phénicien
120;Tfng;Tifinagh (Berber);tifinagh (berbère)
140;Mnda;Mandaean;mandéen //Is it same as Mende "Kikakui" Syllabic?
282;Plrd;Pollard Phonetic;phonétique de Pollard
300;Brah;Brahmi;brâhmî
360;Java;Javanese;javanais
365;Batk;Batak;batak
...
and many others that are in the Unicode roadmap.
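As a rough sketch, records in the semicolon-separated layout quoted above could be read like this in Python. The class and function names are hypothetical, the sample rows are copied from this message as written, and treating "//" as the start of a remark is my assumption.

```python
from typing import NamedTuple


class ScriptEntry(NamedTuple):
    number: int
    code: str
    english: str
    french: str


def parse_entry(line):
    """Parse one 'N°;code;English name;nom français' record."""
    # Drop a trailing "//" remark, then split the four ';'-separated fields.
    data = line.split("//", 1)[0].strip()
    number, code, english, french = data.split(";")
    return ScriptEntry(int(number), code, english, french)


SAMPLE = """\
%N°;code;English name;nom français
115;Phnx;Phoenician;phénicien
120;Tfng;Tifinagh (Berber);tifinagh (berbère)
140;Mnda;Mandaean;mandéen //Is it same as Mende "Kikakui" Syllabic?
300;Brah;Brahmi;brâhmî
"""

# Skip blank lines and the '%' header line, then index the records by code.
entries = [
    parse_entry(line)
    for line in SAMPLE.splitlines()
    if line and not line.startswith("%")
]
by_code = {entry.code: entry for entry in entries}
```

Indexing by the 4-letter code makes it easy to check whether a given code already has a Unicode script ID attached.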
For these scripts, does Unicode need to define its own script IDs?
Or shouldn't Unicode simply deprecate script IDs in favor of ISO-15924 codes?
This may be important because UAX #24 is a normative reference in the W3C CSS3
specification, and maybe the existing Unicode IDs should become aliases of
ISO-15924 codes.