Re: Common and Inherited Unicode scripts

From: Markus Scherer <markus.icu_at_gmail.com>
Date: Mon, 10 Jun 2013 11:48:58 -0700

On Mon, Jun 10, 2013 at 10:29 AM, Colosi, John <jcolosi_at_verisign.com> wrote:

> All,
>
> Per UTR 24, Section 2.8 <http://www.unicode.org/reports/tr24/#Multiple_Script_Values>, the COMMON and INHERITED script values indicate that a code point can be
> used with 2 or more other Scripts. But the document is not broadly
> explicit about which scripts are compatible with which COMMON/INHERITED
> code points. An example in this section indicates that *“U+30FC ( ー )
> KATAKANA-HIRAGANA PROLONGED SOUND MARK is shared between Hiragana and
> Katanana [sic]”* and that it cannot be *“used with other scripts, such as
> Latin or Greek”*.
>

It does not say "cannot be used"; it says "Neither character is used".
UAX #24 and the related properties are descriptive -- they document usage,
they don't forbid it.

Please also read the following section, "2.9 Script_Extensions Property".
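
As a rough illustration of what Script_Extensions adds over Script, here is a
minimal Python sketch. The two tables are a tiny hand-copied excerpt, not the
real data (which lives in Scripts.txt and ScriptExtensions.txt in the Unicode
Character Database), and the function names are made up for this example. The
result describes documented usage only; it is not a rule about what may be
combined.

```python
# Hypothetical excerpt of the Script (sc) property, using 4-letter script codes.
# Real code should parse Scripts.txt from the Unicode Character Database.
SCRIPT = {
    0x0041: "Latn",  # LATIN CAPITAL LETTER A
    0x30FC: "Zyyy",  # KATAKANA-HIRAGANA PROLONGED SOUND MARK (Common)
}

# Hypothetical excerpt of the Script_Extensions (scx) property.
# Real code should parse ScriptExtensions.txt instead.
SCRIPT_EXTENSIONS = {
    0x30FC: {"Hira", "Kana"},
}


def documented_scripts(cp):
    """Return the set of scripts a code point is documented as used with.

    If the code point has an explicit Script_Extensions entry, use it;
    otherwise fall back to its single Script value.
    Raises KeyError for code points outside this small excerpt.
    """
    if cp in SCRIPT_EXTENSIONS:
        return set(SCRIPT_EXTENSIONS[cp])
    return {SCRIPT[cp]}


def is_documented_with(cp, script):
    """True if `script` is among the documented scripts for `cp`.

    This is descriptive: False means "no documented usage", not "forbidden".
    """
    return script in documented_scripts(cp)


if __name__ == "__main__":
    for cp in (0x0041, 0x30FC):
        print(f"U+{cp:04X}: sc={SCRIPT[cp]} scx={sorted(documented_scripts(cp))}")
```

With this excerpt, U+30FC reports Hiragana and Katakana even though its Script
value is Common, while a character with no explicit Script_Extensions entry
simply falls back to its Script value.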

> In my reading, it feels like the document stops short of saying “U+0660
> must only be used in Arabic and Syriac”.
>

Unicode does not forbid combinations of characters.

> Given that these statements appear as an example, they feel
> non-normative. So generally speaking, I’d love to get some guidance about
> how registries should treat COMMON/INHERITED code points. Specifically,
> should registries impose restrictions on the use of certain COMMON code
> points? Is there a document that describes those restrictions, mapping
> COMMON/INHERITED code points to a set of scripts?
>

http://www.unicode.org/reports/tr36/
http://www.unicode.org/reports/tr39/
http://www.unicode.org/reports/tr46/
http://www.unicode.org/reports/tr31/

markus

-- 
Google Internationalization Engineering
Received on Mon Jun 10 2013 - 13:52:42 CDT
