From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Thu Jul 29 2010 - 11:07:44 CDT
"Mark Davis ☕" <mark@macchiato.com>
> It is not so strange. Read
> http://www.unicode.org/reports/tr24/proposed.html#Multiple_Script_Values,
> and other parts of #24 describing Common.
It is exactly because I had read this proposed update for UAX#24 that
I made my argument (otherwise I would not have mentioned the
ExtendedScript property in my report: isn't it meant to allow more
precise mappings to ISO 15924, including script variants?).
Nothing would be special about "Common": "sc=Arabic", alias "sc=Arab",
could use the same formalism (also used for "Hani" and "Jpan", which
are defined as multiple scripts or script variants) to be subdivided
with the new "extended script" property.
It is true that, for now, Unicode is unable to distinguish between
"Hans" and "Hant" from the encoded abstract characters alone (so for
them we have only "sc=Hani"), but an "extended script" property could
provide more precise mappings without being completely bound by the
stability policy.
But that does not mean that texts and localization resources cannot
make such distinctions through external tagging, in stylesheets, or in
romanization schemes. Librarians (and book readers) also already
distinguish between the Eastern and Western versions of the unified
Arabic script.
It could even be of benefit within IDNA, to help diagnose those digits
that have confusable forms in the two variants (even though there is
work in progress on defining the confusables needed for IDNA). And
adding the extra ISO 15924 codes (for Arabic variants) won't break
Unicode (after all, there are already variants for Latin and
Sinograms, exactly because of these "font variants").
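
To make the IDNA point a bit more concrete, here is a rough Python
sketch of the kind of diagnosis I have in mind. The two code point
ranges (U+0660..U+0669 and U+06F0..U+06F9) are the real Arabic-Indic
and Extended Arabic-Indic digits; the variant labels and the function
names are only my assumptions for illustration, not anything defined
by Unicode, ISO 15924 or IDNA.

```python
import unicodedata

# Two digit ranges encoded in the Arabic block.
ARABIC_INDIC = range(0x0660, 0x066A)           # ARABIC-INDIC DIGIT ZERO..NINE
EXTENDED_ARABIC_INDIC = range(0x06F0, 0x06FA)  # EXTENDED ARABIC-INDIC DIGIT ZERO..NINE


def digit_set(ch):
    """Return a hypothetical variant label for an Arabic-script digit, else None."""
    cp = ord(ch)
    if cp in ARABIC_INDIC:
        return "arabic-indic"
    if cp in EXTENDED_ARABIC_INDIC:
        return "extended-arabic-indic"
    return None


def mixes_digit_sets(label):
    """True if the label contains digits from both ranges."""
    found = {digit_set(ch) for ch in label} - {None}
    return len(found) > 1


def counterpart(ch):
    """Return the digit with the same numeric value in the other range, or None."""
    kind = digit_set(ch)
    if kind is None:
        return None
    value = unicodedata.digit(ch)
    base = 0x06F0 if kind == "arabic-indic" else 0x0660
    return chr(base + value)
```

A label for which mixes_digit_sets() returns True would be a candidate
for closer inspection; whether a given pair of digits is actually
confusable still has to come from the confusables data, not from this
range test.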