PRI #227: Changes to Script_Extensions Property Values

The Unicode Technical Committee (UTC) is requesting feedback on a proposal to change the Script_Extensions property values of certain combining marks. The proposal is intended to better reflect the set of scripts with which those characters are used.

In Unicode 6.1, each of these characters has the property value Script_Extensions={Common} or Script_Extensions={Inherited}. Those values are appropriate for characters that are used with a wide variety of scripts. However, there is reason to believe that the characters listed below are instead used primarily with certain scripts only.

The UTC seeks corroboration that these choices are appropriate, or alternatively, information on whether other scripts should be listed instead. Additional scripts may be added to the Script_Extensions property value of any character at any time. If use with another script is suspected but not yet confirmed, the value can be updated in the future when more definite information becomes available.

The UTC is particularly interested in information about any additional scripts (besides Devanagari) with which the Vedic combining marks are commonly used. In some cases, where characters are firmly associated with a single script, an alternative approach would be to change the Script property value instead. This would apply to the two Greek groups listed first below, in section 1.

The header of each numbered section below suggests a specific change to the Script_Extensions property value for a specific list of characters.

1. Change to Script_Extensions={Greek}

Combining Diacritical Marks — Additions for Greek

U+0342 ( ͂ ) COMBINING GREEK PERISPOMENI

U+0345 ( ͅ ) COMBINING GREEK YPOGEGRAMMENI


Combining Diacritical Marks Supplement — Used for Ancient Greek

U+1DC0 ( ᷀ ) COMBINING DOTTED GRAVE ACCENT

U+1DC1 ( ᷁ ) COMBINING DOTTED ACUTE ACCENT


2. Change to Script_Extensions={Latin}

Combining Diacritical Marks — Medieval superscript letter diacritics

U+0363 ( ͣ ) COMBINING LATIN SMALL LETTER A

U+0364 ( ͤ ) COMBINING LATIN SMALL LETTER E

U+0365 ( ͥ ) COMBINING LATIN SMALL LETTER I

U+0366 ( ͦ ) COMBINING LATIN SMALL LETTER O

U+0367 ( ͧ ) COMBINING LATIN SMALL LETTER U

U+0368 ( ͨ ) COMBINING LATIN SMALL LETTER C

U+0369 ( ͩ ) COMBINING LATIN SMALL LETTER D

U+036A ( ͪ ) COMBINING LATIN SMALL LETTER H

U+036B ( ͫ ) COMBINING LATIN SMALL LETTER M

U+036C ( ͬ ) COMBINING LATIN SMALL LETTER R

U+036D ( ͭ ) COMBINING LATIN SMALL LETTER T

U+036E ( ͮ ) COMBINING LATIN SMALL LETTER V

U+036F ( ͯ ) COMBINING LATIN SMALL LETTER X


3. Change to Script_Extensions={Latin, Cyrillic}

Cyrillic — Historic miscellaneous

U+0485 ( ҅ ) COMBINING CYRILLIC DASIA PNEUMATA

U+0486 ( ҆ ) COMBINING CYRILLIC PSILI PNEUMATA


4. Change to Script_Extensions={Latin, Devanagari}

Devanagari — Vedic tone marks

U+0951 ( ॑ ) DEVANAGARI STRESS SIGN UDATTA

U+0952 ( ॒ ) DEVANAGARI STRESS SIGN ANUDATTA


5. Change to Script_Extensions={Devanagari}

Vedic Extensions — Tone mark for the Atharvaveda

U+1CE1 ( ᳡ ) VEDIC TONE ATHARVAVEDIC INDEPENDENT SVARITA


Vedic Extensions — Ardhavisarga

U+1CF2 ( ᳲ ) VEDIC SIGN ARDHAVISARGA

U+1CF3 ( ᳳ ) VEDIC SIGN ROTATED ARDHAVISARGA


Vedic Extensions — Tone marks for the Samaveda

U+1CD0 ( ᳐ ) VEDIC TONE KARSHANA

U+1CD1 ( ᳑ ) VEDIC TONE SHARA

U+1CD2 ( ᳒ ) VEDIC TONE PRENKHA


Vedic Extensions — Sign for Yajurvedic

U+1CD4 ( ᳔ ) VEDIC SIGN YAJURVEDIC MIDLINE SVARITA

...

U+1CDD ( ᳝ ) VEDIC TONE DOT BELOW

U+1CF4 ( ᳴ ) VEDIC TONE CANDRA ABOVE


Vedic Extensions — Tone marks for the Satapathabrahmana

U+1CDE ( ᳞ ) VEDIC TONE TWO DOTS BELOW

U+1CDF ( ᳟ ) VEDIC TONE THREE DOTS BELOW


Vedic Extensions — Tone mark for the Rigveda

U+1CE0 ( ᳠ ) VEDIC TONE RIGVEDIC KASHMIRI INDEPENDENT SVARITA


Vedic Extensions — Diacritics for visarga

U+1CE2 ( ᳢ ) VEDIC SIGN VISARGA SVARITA

...

U+1CE8 ( ᳨ ) VEDIC SIGN VISARGA ANUDATTA WITH TAIL


Vedic Extensions — Marks of nasalization

U+1CED ( ᳭ ) VEDIC SIGN TIRYAK


See also http://unicode.org/Public/6.1.0/ucd/ScriptExtensions.txt
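
For readers who want to compare the proposal against the published data, the following Python sketch parses one line in the layout commonly used by UCD data files: a code point or range, a semicolon, space-separated script codes, and an optional "#" comment. That layout is assumed here rather than taken from this notice, the function name parse_scx_line is hypothetical, and the sample line is made up to mirror section 4 rather than copied from the file.

```python
def parse_scx_line(line):
    """Parse one data line into (start, end, scripts), or None if it has no data.

    Assumed layout: "XXXX[..YYYY] ; Code1 Code2 ... # optional comment".
    """
    data = line.split("#", 1)[0].strip()  # drop any trailing comment
    if not data:
        return None  # blank line or comment-only line
    cp_field, scripts_field = (part.strip() for part in data.split(";", 1))
    first, _, last = cp_field.partition("..")
    start = int(first, 16)
    end = int(last, 16) if last else start  # single code point: end == start
    return start, end, frozenset(scripts_field.split())


# Illustrative line only, written to mirror section 4 above.
sample = "0951..0952 ; Deva Latn # illustrative"
start, end, scripts = parse_scx_line(sample)
print(f"U+{start:04X}..U+{end:04X}", sorted(scripts))
```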