L2/00-040

UTC/2000-013

 

Title: Property Changes for Unicode Character Database

Author: Ken Whistler

Date: January 31, 2000

Action: For approval by UTC

 

The following changes have been applied to various data files of the Unicode Character Database since the files were frozen for the Unicode 3.0.0 release (and the CD-ROM).

 

Most of these changes are informative only and are provided here for information. The change for Arabic Shaping affects a normative value, but it is a correction that brings the data file into line with the values printed in the standard.

 

1. Change for Arabic Shaping (normative)

 

The linking and shaping classes for HAMZAT WASL ON ALEF were not

corrected in ArabicShaping.txt. They must be updated to match the

specification printed in the standard for this character.

 

birdie:kenw/work/unicode/datawork> diff ../staging300/ArabicShaping-2d2.txt ArabicShaping-3d1.txt

 

< 0671; HAMZAT WASL ON ALEF; U; <no shaping>

> 0671; HAMZAT WASL ON ALEF; R; ALEF
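To make the effect of this correction concrete, the following minimal Python sketch splits an ArabicShaping.txt record into its four semicolon-separated fields. The names `ShapingRecord` and `parse_shaping_line`, and the field labels, are illustrative assumptions; they are not part of the data file or of any tool used to produce it.

```python
from typing import NamedTuple


class ShapingRecord(NamedTuple):
    code_point: int
    name: str
    joining_type: str   # third field, e.g. "R" or "U"
    joining_group: str  # fourth field, e.g. "ALEF" or "<no shaping>"


def parse_shaping_line(line: str) -> ShapingRecord:
    """Split one 'code; name; type; group' record into typed fields."""
    fields = [field.strip() for field in line.split(";")]
    if len(fields) != 4:
        raise ValueError(f"expected 4 fields, got {len(fields)}: {line!r}")
    code, name, joining_type, joining_group = fields
    return ShapingRecord(int(code, 16), name, joining_type, joining_group)


# The two sides of the diff shown above.
old = parse_shaping_line("0671; HAMZAT WASL ON ALEF; U; <no shaping>")
new = parse_shaping_line("0671; HAMZAT WASL ON ALEF; R; ALEF")
```

Comparing `old` and `new` field by field shows that only the third and fourth fields differ; the code point and name are untouched.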

 

2. UnicodeData.txt fixes (informative)

 

Add "dena sum" as an ISO comment for U+0FCF.

 

Add asterisks as ISO comments for U+01A6 and U+0280.

 

Both of these changes were to correct minor editorial problems discovered

in the printing of ISO/IEC 10646-1.

 

birdie:kenw/work/unicode/unidata> diff ../staging300/UnicodeData-3.0.0d13.txt UnicodeData-3.0.1d1.txt

 

< 01A6;LATIN LETTER YR;Lu;0;L;;;;;N;LATIN LETTER Y R;;;0280;

> 01A6;LATIN LETTER YR;Lu;0;L;;;;;N;LATIN LETTER Y R;*;;0280;

 

< 0280;LATIN LETTER SMALL CAPITAL R;Ll;0;L;;;;;N;;;01A6;;01A6

> 0280;LATIN LETTER SMALL CAPITAL R;Ll;0;L;;;;;N;;*;01A6;;01A6

 

< 0FCF;TIBETAN SIGN RDEL NAG GSUM;So;0;L;;;;;N;;;;;

> 0FCF;TIBETAN SIGN RDEL NAG GSUM;So;0;L;;;;;N;;dena sum;;;
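Because the records consist mostly of empty fields, it is easy to misplace a value by one or more semicolons. A minimal Python sketch for checking where the ISO comment lands, assuming the usual 15-field layout of UnicodeData.txt with the ISO comment at zero-based index 11; the helper name `iso_comment` is hypothetical:

```python
# Assumed layout: 15 semicolon-separated fields per record, where
# index 0 is the code point, index 10 the Unicode 1.0 name, and
# index 11 the ISO 10646 comment.
FIELD_COUNT = 15
ISO_COMMENT_FIELD = 11


def iso_comment(line: str) -> str:
    """Return the ISO comment field of one UnicodeData.txt record."""
    fields = line.rstrip("\n").split(";")
    if len(fields) != FIELD_COUNT:
        raise ValueError(f"expected {FIELD_COUNT} fields, got {len(fields)}")
    return fields[ISO_COMMENT_FIELD]


before = "0FCF;TIBETAN SIGN RDEL NAG GSUM;So;0;L;;;;;N;;;;;"
after = "0FCF;TIBETAN SIGN RDEL NAG GSUM;So;0;L;;;;;N;;dena sum;;;"
yr = "01A6;LATIN LETTER YR;Lu;0;L;;;;;N;LATIN LETTER Y R;*;;0280;"

for record in (before, after, yr):
    print(repr(iso_comment(record)))
```

For the three records above this prints an empty string, `'dena sum'`, and `'*'` respectively. A record whose asterisk had been placed in any other field would return an empty string instead.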

 

3. Names List fixes (informative)

 

The alias for APL quote was moved from 0022 to 0027. (This change

is already reflected in the printed book.)

 

The annotation "(dena sum)" at U+0FCF for the printing of ISO/IEC 10646-1

now appears in the name list. (This is automatically picked up in

generating the name list from UnicodeData.txt.)


 

birdie:kenw/work/unicode/namelist> diff ../staging300/NamesList-3.0.0d5.txt UC3M990914.lst

 

< @@@+      Final Draft UC3M990825.lst

<     More annotation fixes for Tibetan.

<       Annotation added for 2231..2233.

---

> @@@+      Final Draft UC3M990914.lst

>     Move APL quote annotation from 0022 to 0027.

 

90d88

<     = APL quote

111a110

>     = APL quote

 

< 0FCF      TIBETAN SIGN RDEL NAG GSUM

> 0FCF      TIBETAN SIGN RDEL NAG GSUM (dena sum)

 

Add an annotation for the numeric usage of digamma.

 

birdie:kenw/work/unicode/namelist> diff UC3M990914.lst UC3M000107.lst

 

< @@@+      Final Draft UC3M990914.lst

<     Move APL quote annotation from 0022 to 0027.

---

> @@@+      Final Draft UC3M000107.lst

>     Add annotation for numeric usage of digamma.

 

> 03DD      GREEK SMALL LETTER DIGAMMA

>     * used symbolically for numeral six

 

4. Changes for PropList.txt (informative)

 

Remove F8F0..F8FF from combining, non-spacing, and NSM listing. (This

was an oversight in the property dump utility, which left in some private

use values used in the generation of the UCA collation tables.)

 

Fix the default range of LR (bidi) to include the entire UDC area,

to match the specification of TR #9.

 

Remove 0E47 from alphabetic, and add it to diacritic. (U+0E47

THAI CHARACTER MAITAIKHU is apparently treated like the tones, rather

than the vowels.)
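As a rough cross-check of these corrections, the sketch below queries Python's standard `unicodedata` module. This rests on an assumption: the module reflects whatever (much later) version of the Unicode Character Database ships with the Python interpreter, not the 3.0.1 PropList.txt, and it exposes only the general category, combining class, and bidi class, not the alphabetic or diacritic listings. The helper `all_have` and the range names are illustrative.

```python
import unicodedata

# Ranges taken from the text above; the names are hypothetical.
PRIVATE_USE = range(0xE000, 0xF8FF + 1)   # the entire UDC area
UCA_RESIDUE = range(0xF8F0, 0xF8FF + 1)   # values wrongly listed as combining


def all_have(code_points, getter, expected):
    """True if getter(character) equals expected for every code point."""
    return all(getter(chr(cp)) == expected for cp in code_points)


# F8F0..F8FF are ordinary private use characters, not combining marks.
residue_is_private_use = all_have(UCA_RESIDUE, unicodedata.category, "Co")
residue_is_not_combining = all_have(UCA_RESIDUE, unicodedata.combining, 0)

# The whole private use area defaults to left-to-right.
private_use_is_left_to_right = all_have(
    PRIVATE_USE, unicodedata.bidirectional, "L"
)

# U+0E47 THAI CHARACTER MAITAIKHU is a non-spacing mark.
maitaikhu_category = unicodedata.category(chr(0x0E47))
```

With a current Python, the three booleans are expected to be true and `maitaikhu_category` to be `"Mn"`, which is consistent with the intent of the changes listed in this section.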