Accumulated Feedback on PRI #500

This page is a compilation of formal public feedback received so far. See Feedback for further information on this issue, how to discuss it, and how to provide feedback.

Date/Time: Sun Feb 11 06:11:52 CST 2024
ReportID: ID20240211061152
Name: Heiko
Report Type: Public Review Issue
Opt Subject: 500

Designation of hieroglyphs

In Unicode 15.1 there are Egyptian hieroglyphs in the range from U+13000 to
U+1342F. In the NamesList.txt file, these hieroglyphs have a hieroglyph
code in their name. For example, the character with the code point U+13000
has the name 'EGYPTIAN HIEROGLYPH A001'.

In Unicode 16.0 there are now newly added Egyptian hieroglyphs in the range
from U+13460 to U+143FA. I have now noticed that these new hieroglyphs in
the NamesList.txt file no longer have the hieroglyph code in their name,
but simply their respective code point. For example, the character with the
code point U+13460 has the name 'EGYPTIAN HIEROGLYPH-13460'.

There is a new file Unikemet.txt. This file contains the hieroglyph code in
the kEH_UniK field. For the character with the code point U+13460, the
hieroglyph code is A001F. This should also be in the name of the code point
in the NamesList.txt file. The hieroglyphs newly added in Unicode 16.0
should not be designated differently from the hieroglyphs previously known
in Unicode 15.1. It should be uniform.
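
As a minimal sketch of the two naming styles contrasted in this report: the function names below are hypothetical, and the only data used is the example given above (U+13460 with kEH_UniK value A001F).

```python
def name_from_unik(unik: str) -> str:
    """Name in the Unicode 15.1 style, built from a kEH_UniK value."""
    return f"EGYPTIAN HIEROGLYPH {unik}"


def name_from_code_point(code_point: int) -> str:
    """Name in the style used for the signs newly added in the 16.0 draft."""
    return f"EGYPTIAN HIEROGLYPH-{code_point:04X}"


# U+13460 and its kEH_UniK value A001F are the example from the report.
print(name_from_code_point(0x13460))  # EGYPTIAN HIEROGLYPH-13460
print(name_from_unik("A001F"))        # EGYPTIAN HIEROGLYPH A001F
```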

Date/Time: Sun Feb 25 09:55:38 CST 2024
ReportID: ID20240225095538
Name: Heiko
Report Type: Public Review Issue
Opt Subject: 500

I noticed the following issues in the file Unikemet.txt (Egyptian hieroglyphs) of
Unicode 16.0:

The following character (code point) has no description (kEH_Desc) and no hieroglyph
code (kEH_UniK, kEH_JSesh), but its name in file NamesList.txt is 'EGYPTIAN HIEROGLYPH L003'.
Its hieroglyph code in file Unikemet.txt could therefore be 'L003':

U+131A6

The following 9 characters exist in file Unikemet.txt, but have no description 
(kEH_Desc) and no hieroglyph code (kEH_UniK, kEH_JSesh):

U+1372D
U+13731
U+13807
U+13A73
U+13A9D
U+13C3F
U+13CCE
U+13CF3
U+1430D

The following 319 characters have a description (kEH_Desc), but no hieroglyph code
(kEH_UniK, kEH_JSesh) in file Unikemet.txt:

U+136C7
U+136D7
U+136D9
U+136EA
U+1370F
U+13711
U+1371E
U+1371F
U+13736
U+1373A
U+1373B
U+1373C
U+13747
U+1374B
U+13758
U+13776
U+13788
U+13789
U+1378F
U+137A8
U+137CB
U+137D2
U+137D4
U+137DE
U+137E8
U+137E9
U+137EA
U+137ED
U+137F6
U+13808
U+1382A
U+13835
U+1383C
U+1383F
U+13841
U+13842
U+1384B
U+13862
U+13865
U+13888
U+13889
U+1388B
U+13892
U+13893
U+13897
U+138A4
U+138A9
U+138C4
U+138CE
U+138D9
U+138EE
U+138EF
U+138FB
U+13900
U+13912
U+13914
U+13933
U+13934
U+13972
U+13975
U+13977
U+13985
U+13989
U+13991
U+13997
U+13998
U+139A0
U+139AB
U+139B9
U+139BA
U+139C3
U+139C9
U+139CA
U+139CF
U+139D1
U+139D6
U+139D7
U+139FA
U+139FF
U+13A02
U+13A09
U+13A19
U+13A1A
U+13A1E
U+13A21
U+13A26
U+13A59
U+13A66
U+13A7C
U+13A81
U+13A83
U+13A8B
U+13A8D
U+13A8F
U+13A92
U+13A93
U+13AB9
U+13ABE
U+13AC5
U+13AD1
U+13AD5
U+13AE4
U+13AE7
U+13AE9
U+13AEF
U+13AF7
U+13AFF
U+13B01
U+13B0A
U+13B0C
U+13B0E
U+13B12
U+13B1C
U+13B1E
U+13B31
U+13B53
U+13B58
U+13B5D
U+13B61
U+13B65
U+13B6A
U+13B6E
U+13B71
U+13B75
U+13B7B
U+13B83
U+13B85
U+13BB4
U+13BBD
U+13BC0
U+13BCD
U+13BD2
U+13BD3
U+13BDF
U+13C03
U+13C04
U+13C13
U+13C1D
U+13C30
U+13C43
U+13C46
U+13C5A
U+13C77
U+13C8E
U+13C93
U+13C94
U+13CA1
U+13CA2
U+13CB8
U+13CC6
U+13CC7
U+13CC8
U+13CCA
U+13CE0
U+13D12
U+13D13
U+13D30
U+13D32
U+13D36
U+13D37
U+13DA8
U+13DB1
U+13DBD
U+13DBE
U+13DCD
U+13DCE
U+13DD8
U+13DE1
U+13DE7
U+13E00
U+13E09
U+13E10
U+13E2E
U+13E33
U+13E3F
U+13E4C
U+13E86
U+13EB0
U+13EB1
U+13EB8
U+13EE4
U+13EE5
U+13EE6
U+13EE9
U+13EFB
U+13EFC
U+13F02
U+13F2A
U+13F2F
U+13F45
U+13F5E
U+13F5F
U+13F62
U+13F64
U+13F6B
U+13F6C
U+13F6D
U+13F70
U+13F71
U+13F81
U+13FB6
U+14008
U+1400D
U+1400E
U+14012
U+14019
U+1402D
U+14057
U+14058
U+1405A
U+14061
U+14076
U+14081
U+14088
U+1408C
U+14099
U+140A3
U+140B1
U+140B4
U+140BC
U+140C8
U+140CE
U+140DD
U+140E2
U+140F0
U+140F3
U+140FD
U+140FE
U+1410B
U+1410C
U+1410D
U+1410E
U+14110
U+1412A
U+14131
U+14141
U+14145
U+1414E
U+14158
U+1415A
U+1415E
U+14168
U+14175
U+1417D
U+14187
U+14189
U+14191
U+14196
U+1419F
U+141A2
U+141A8
U+141AD
U+141AF
U+141B1
U+141B9
U+141BF
U+141C1
U+141CE
U+141E4
U+141ED
U+141F0
U+141FB
U+1420D
U+14211
U+14221
U+14222
U+14228
U+1422D
U+14242
U+14245
U+14250
U+14251
U+14262
U+1426D
U+1427F
U+14282
U+14285
U+142A8
U+142A9
U+142AC
U+142AD
U+142B9
U+142BA
U+142BE
U+142C2
U+142C5
U+142CA
U+142CB
U+142D1
U+142D3
U+142D4
U+142D5
U+142D6
U+142D9
U+142DA
U+142DF
U+142E0
U+142EA
U+142FF
U+14308
U+14325
U+1432A
U+1434B
U+14350
U+14371
U+14380
U+14397
U+143AF
U+143BD
U+143C2
U+143C4
U+143D3
U+143D8
U+143DB
U+143DF
U+143E3
U+143E5
U+143F6
U+143F8
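
The three lists above could be produced by a check along the following lines. This is only a sketch: it assumes each data line has the tab-separated form "U+XXXXX, property, value" (as in the kEH_JSesh entries quoted elsewhere on this page), the function names are hypothetical, and the sample values are placeholders rather than real file content.

```python
from collections import defaultdict


def load_properties(lines):
    """Map each code point to a dict of its properties (assumed tab-separated)."""
    props = defaultdict(dict)
    for line in lines:
        line = line.rstrip("\r\n")
        if not line.startswith("U+"):
            continue  # skip comments, blank lines and anything else
        parts = line.split("\t", 2)
        if len(parts) == 3:
            code_point, name, value = parts
            props[code_point][name] = value
    return props


def missing_codes(props):
    """Return (description but no code, neither description nor code)."""
    desc_only, neither = [], []
    for code_point, fields in props.items():
        if "kEH_UniK" in fields or "kEH_JSesh" in fields:
            continue
        (desc_only if "kEH_Desc" in fields else neither).append(code_point)
    by_value = lambda cp: int(cp[2:], 16)
    return sorted(desc_only, key=by_value), sorted(neither, key=by_value)


# Placeholder sample, not an excerpt of the real file.
SAMPLE = [
    "# hypothetical excerpt",
    "U+13460\tkEH_UniK\tA001F",
    "U+13460\tkEH_Desc\tplaceholder description",
    "U+136C7\tkEH_Desc\tplaceholder description",
    "U+1372D\tkEH_Cat\tplaceholder",
]
desc_only, neither = missing_codes(load_properties(SAMPLE))
print(desc_only, neither)
```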

Date/Time: Thu Feb 29 13:22:24 CST 2024
ReportID: ID20240229132224
Name: Peter Constable
Report Type: Public Review Issue
Opt Subject: 57

In section 3.1, the acronym "IFAO" is used for the first time but without
any explanation of what it refers to. The acronym should be spelled out in
its first usage.

That acronym is spelled out in 4.1, in the description of kEH_IFAO. That
appears to be providing a bibliographic reference; bibliographic references
should probably be put in a section toward the end of the document.

The acronym is spelled out in 11.4.2 of the draft core spec
(https://unicode-org.github.io/core-spec/chapter-11/#G26607), but it spells
out IFAO as "Institut français d’archéologie orientale" (in italics). That
is making a reference to the organization, whereas the reference for
kEH_IFAO describes it as documentation provided by that organization. It
would be good to have better consistency between these two.

Date/Time: Mon Apr 22 06:44:29 CDT 2024
ReportID: ID20240422064429
Name: Alexander Kunde
Report Type: Error Report
Opt Subject: Unikemet.txt

This concerns the database file for Egyptian Hieroglyphs, Unikemet.txt
(https://unicode.org/Public/draft/UCD/ucd/Unikemet.txt). I'm not an
Egyptologist myself, but merely an interested observer (in & from
Germany), who noticed - not so much errata - but a few minor
inconsistencies or typos in Unikemet.txt, which I'd like to bring to your
attention.

1. Inconsistencies in opening comments of Unikemet.txt: some listed variable
names begin with "KEH_" (i.e. capital K), rather than 'kEH_'. Allow me
also to mention that, by comparison, Unihan_Readings.txt doesn't come with
the initial line "; charset=UTF-8", which produces an error when
auto-processing the file and only filtering empty lines or comments. Just
sayin'.

2. Typos in kEH_Func, e.g. (at least 1 instance of each): "Ckassifier",
"Classifer", "Anukis", "divinitiy", "divinty", "Logogrm", "phonemogram"
(capitalization), "Phonemogrom", "Phonemomgram", "Phonogram" (?)

3. Inconsistent wording in kEH_Func, e.g.:
	Logogram (Neith 4-5th nome of LE)
	Logogram (Neith, 4-5th nome of LE)
	Logogram Neith (4-5th nome of LE)

Points 2 & 3 were spotted after sorting the content of kEH_Func
alphabetically (and having my text editor remove [most] doubles). (I would
attach the resulting file for your convenience, but there seems to be no
option for attachments. I'll try sending it to humans@unicode separately).

Have a nice day.

Alex
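
The sort-and-deduplicate approach described in this report can be taken one step further by grouping values that differ only in punctuation or case. The sketch below is one possible way to do that; the normalization rule and the function names are assumptions, and the input is just the three kEH_Func variants quoted in point 3.

```python
import re
from collections import defaultdict


def normalize(value: str) -> str:
    """Crude comparison key: lowercase, with punctuation runs collapsed to a space."""
    return re.sub(r"[^0-9a-z]+", " ", value.lower()).strip()


def near_duplicates(values):
    """Group distinct spellings that share the same normalized key."""
    groups = defaultdict(set)
    for value in values:
        groups[normalize(value)].add(value)
    return {key: sorted(group) for key, group in groups.items() if len(group) > 1}


variants = [
    "Logogram (Neith 4-5th nome of LE)",
    "Logogram (Neith, 4-5th nome of LE)",
    "Logogram Neith (4-5th nome of LE)",
]
for key, group in near_duplicates(variants).items():
    print(key, "->", group)
```

This only catches differences in punctuation and capitalization; misspellings such as "Logogrm" would need a fuzzier comparison.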

Feedback above this line reviewed during UTC #179 in April 2024.

Date/Time: Wed May 08 10:10:47 CDT 2024
ReportID: ID20240508101047
Name: Heiko
Report Type: Public Review Issue
Opt Subject: 500


The character with code point U+132F0 has the description 'EGYPTIAN
HIEROGLYPH S026A' in file NamesList.txt, but in file Unikemet.txt its
kEH_UniK has been changed from S026A to S210.

Some descriptions in file Unikemet.txt start with a space and some others
end with a space.
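
A check for the leading and trailing spaces mentioned above might look like the following sketch. It assumes tab-separated data lines of the form "U+XXXXX, property, value"; the function name is hypothetical and the sample lines are placeholders, not real entries.

```python
def whitespace_issues(lines, prop="kEH_Desc"):
    """Return (code point, value) pairs whose value has leading or trailing whitespace."""
    issues = []
    for line in lines:
        line = line.rstrip("\r\n")  # remove only the line terminator
        if not line.startswith("U+"):
            continue
        parts = line.split("\t", 2)
        if len(parts) != 3 or parts[1] != prop:
            continue
        code_point, _, value = parts
        if value != value.strip():
            issues.append((code_point, value))
    return issues


# Placeholder sample lines, not taken from the real file.
SAMPLE = [
    "U+13000\tkEH_Desc\t leading space",
    "U+13001\tkEH_Desc\ttrailing space ",
    "U+13002\tkEH_Desc\tclean value",
]
for code_point, value in whitespace_issues(SAMPLE):
    print(code_point, repr(value))
```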

Date/Time: Thu May 09 17:46:21 CDT 2024
ReportID: ID20240509174621
Name: Debbie Anderson
Report Type: Public Review Issue
Opt Subject: 500

In the draft UAX #57, one instance still needs to be adjusted to say Unikemet.txt: 
"No hieroglyph may have more than one instance of a given property associated with it, 
and no empty properties are included in Unikemet.zip."

Date/Time: Fri Jun 07 14:25:56 CDT 2024
ReportID: ID20240607142556
Name: Michel Mariani
Report Type: Public Review Issue
Opt Subject: 500

- Concerned documents (2024-06-07):
https://www.unicode.org/Public/draft/UCD/ucd/Unikemet.txt
https://www.unicode.org/reports/tr57/tr57-2.html

- The syntax regular expression for the kEH_HG property is incorrect; 
the trailing letter(s) are apparently optional, and the string "US" 
never occurs.

The current regex:
([A-IK-Z]|AA)[0-9]{1,3}[A-Z]{1,2}
|US

should be:
([A-IK-Z]|AA)[0-9]{1,3}([A-Z]{1,2})?

or:
([A-IK-Z]|AA)[0-9]{1,3}[A-Z]{0,2}

- The syntax regular expression for the kEH_UniK property is incorrect;
the trailing letter(s) are apparently optional.

The current regex:
([A-IK-Z]|AA|NL|NU)[0-9]{3}[A-Z]{1,2}
| HJ ([A-IK-Z]|AA)[0-9]{3}[A-Z]{1,2}

should be:
([A-IK-Z]|AA|NL|NU)[0-9]{3}([A-Z]{1,2})?
| HJ ([A-IK-Z]|AA)[0-9]{3}([A-Z]{1,2})?

or:
([A-IK-Z]|AA|NL|NU)[0-9]{3}[A-Z]{0,2}
| HJ ([A-IK-Z]|AA)[0-9]{3}[A-Z]{0,2}

- Likewise, for the kEH_JSesh property, the syntax regular expression:

([A-IK-Z]|Aa|NL|NU)[0-9]{1,3}[A-Za-z]{1,4}
|(US1|US22|US248|US685)([A-IK-Z]|Aa|NL|NU)[0-9]{1,3}[A-Za-z]{1,4}

should probably be updated as follows (including {1,5} instead of {1,4}):
([A-IK-Z]|Aa|NL|NU)[0-9]{1,3}([A-Za-z]{1,5})?
|(US1|US22|US248|US685)([A-IK-Z]|Aa|NL|NU)[0-9]{1,3}([A-Za-z]{1,5})?

or:
([A-IK-Z]|Aa|NL|NU)[0-9]{1,3}[A-Za-z]{0,5}
|(US1|US22|US248|US685)([A-IK-Z]|Aa|NL|NU)[0-9]{1,3}[A-Za-z]{0,5}
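
The effect of making the trailing letters optional can be illustrated with a short sketch. It uses only the first alternative of the kEH_UniK expression as quoted above, and tests it against kEH_UniK-style codes that appear elsewhere on this page (A001F, S026A, I018F, L003, S210); the variable names are hypothetical.

```python
import re

# First alternative of the kEH_UniK syntax, as quoted in the report.
UNIK_CURRENT = re.compile(r"([A-IK-Z]|AA|NL|NU)[0-9]{3}[A-Z]{1,2}")
# The two proposed spellings of the fix; they should accept the same strings.
UNIK_OPTIONAL_GROUP = re.compile(r"([A-IK-Z]|AA|NL|NU)[0-9]{3}([A-Z]{1,2})?")
UNIK_ZERO_TO_TWO = re.compile(r"([A-IK-Z]|AA|NL|NU)[0-9]{3}[A-Z]{0,2}")

codes = ["A001F", "S026A", "I018F", "L003", "S210"]

for code in codes:
    current = UNIK_CURRENT.fullmatch(code) is not None
    proposed = UNIK_ZERO_TO_TWO.fullmatch(code) is not None
    print(f"{code}: current={current} proposed={proposed}")
```

Codes without a trailing letter, such as L003 and S210, are rejected by the current expression and accepted by either proposed form.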

- The following entries with the kEH_JSesh property have issues since the
semicolon they contain is not taken into account in the syntax regular
expression; perhaps it should be replaced by a space (the delimiter):

U+130C8	kEH_JSesh	D66;D245
U+131AF	kEH_JSesh	M1E;M48

- The following entries with the kEH_JSesh property trigger syntax 
errors too; they all start with 'Ff':

U+13115	kEH_JSesh	Ff4
U+13298	kEH_JSesh	Ff8
U+13299	kEH_JSesh	Ff8A
U+1336D	kEH_JSesh	Ff6
U+133F9	kEH_JSesh	Ff1
U+143F3	kEH_JSesh	Ff100
U+143F4	kEH_JSesh	Ff101
U+143F5	kEH_JSesh	Ff110

- This entry with the kEH_JSesh property is possibly a typo:
U+143CF kEH_JSesh US24BY1VARB
It should perhaps be:
U+143CF kEH_JSesh US248Y1VARB
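
The three kEH_JSesh observations above can be reproduced with a sketch like the one below. It uses the second proposed form of the expression (with {0,5}), the values are those quoted in this report, and the helper names are hypothetical.

```python
import re

# Proposed kEH_JSesh syntax from this report (the {0,5} variant).
JSESH_PROPOSED = re.compile(
    r"([A-IK-Z]|Aa|NL|NU)[0-9]{1,3}[A-Za-z]{0,5}"
    r"|(US1|US22|US248|US685)([A-IK-Z]|Aa|NL|NU)[0-9]{1,3}[A-Za-z]{0,5}"
)


def is_valid(value: str) -> bool:
    """True if the whole value matches the proposed syntax."""
    return JSESH_PROPOSED.fullmatch(value) is not None


def is_valid_list(value: str, delimiter: str) -> bool:
    """True if every delimiter-separated part matches the proposed syntax."""
    return all(is_valid(part) for part in value.split(delimiter))


print(is_valid("D66;D245"))            # semicolon is not part of the syntax
print(is_valid_list("D66;D245", ";"))  # each part is valid on its own
print(is_valid("Ff4"))                 # 'Ff' prefix is not covered
print(is_valid("US24BY1VARB"), is_valid("US248Y1VARB"))
```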

- In order to successfully pass various syntax validation tests in JavaScript, 
it seems to be necessary to add an 'i' (case-insensitive) flag when building 
the regular expressions. This should be either documented, or fixed by updating 
the syntax regexes or the data itself; for instance, the following entries 
trigger an error:

U+13256	kEH_HG	O5u
U+1327C	kEH_HG	O29v
U+133AD	kEH_HG	V20h
U+133DC	kEH_HG	Y1v
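
The case-sensitivity issue can be shown with the four entries above. The sketch uses Python's re.IGNORECASE as the counterpart of the JavaScript 'i' flag, and compares it with one possible alternative (an assumption, not something proposed in the report): widening only the trailing letter class to accept lowercase.

```python
import re

HG_BODY = r"([A-IK-Z]|AA)[0-9]{1,3}[A-Z]{0,2}"

HG_STRICT = re.compile(HG_BODY)
HG_IGNORECASE = re.compile(HG_BODY, re.IGNORECASE)
# Alternative: allow lowercase only in the trailing letters.
HG_LOWER_SUFFIX = re.compile(r"([A-IK-Z]|AA)[0-9]{1,3}[A-Za-z]{0,2}")

entries = ["O5u", "O29v", "V20h", "Y1v"]

for value in entries:
    print(
        value,
        HG_STRICT.fullmatch(value) is not None,
        HG_IGNORECASE.fullmatch(value) is not None,
        HG_LOWER_SUFFIX.fullmatch(value) is not None,
    )
```

The case-insensitive flag also accepts a lowercase leading letter (for example "o5u"), whereas the widened suffix class does not, so the second option is the tighter of the two.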

- This entry appears incorrect (invalid syntax):

U+13D47	kEH_IFAO	I018F

it is probably some kind of mismatch with:

U+13D46	kEH_UniK	I018F


Date/Time: Fri Jun 14 10:56:16 CDT 2024
ReportID: ID20240614105616
Name: Michel Mariani
Report Type: Public Review Issue
Opt Subject: 500

------------------------------------------------------------------------

- In https://www.unicode.org/Public/draft/UCD/ucd/Unikemet.txt, the kEH_UniK property is missing from the header.

Something like this should possibly be appended to the list:
#	Original catalog value: kEH_UniK

------------------------------------------------------------------------

- In https://www.unicode.org/Public/draft/UCD/charts/CodeCharts.pdf and https://www.unicode.org/Public/draft/UCD/charts/blocks/U13460.pdf:

    • Pages 1541 to 1543:
    
    Out of sync subgroup numbering:

    G17. Goose  -->  G08. Goose
    G08. Owl  -->  G09. Owl
    G09. Cormorant  -->  G10. Cormorant
    G10. Wader: ibis, flamingo, stork, heron, egret  -->  G11. Wader: ibis, flamingo, stork, heron, egret
    G11. Falcon  -->  G12. Falcon
    G12. Falcon opening his wings  -->  G13. Falcon opening his wings
    G13. Falcon legs bent  -->  G14. Falcon legs bent
    G14. Falcon mummified or emblem of falcon  -->  G15. Falcon mummified or emblem of falcon
    G15. Swallow, sparrow  -->  G16. Swallow, sparrow
    G16. Hoopoe  -->  G17. Hoopoe

    • Page 1565:

    Since there is already another type *2*:
    S20. Sunshade type 2 (S36),
    perhaps it should be:

    S21. Sunshade type 2 (S37D)  -->  S21. Sunshade type 3 (S37D)

    • Page 1570:

    Duplicate subgroup numbering, conflicting with:
    U21. Mill tool

    U21. U category, varia  --> U22. U category, varia

    • Page 1574:

    143EE EGYPTIAN HIEROGLYPH-143EE
    143EF EGYPTIAN HIEROGLYPH-143EF
    143F0 EGYPTIAN HIEROGLYPH-143F0
    143F1 EGYPTIAN HIEROGLYPH-143F1
    143F2 EGYPTIAN HIEROGLYPH-143F2

    are currently listed under:

    Z06. Geometric shapes, varia

    But, according to https://www.unicode.org/Public/draft/UCD/ucd/Unikemet.txt, 
    their kEH_Cat property starts with "Z-07-", which suggests they should be listed under Z07.

    Likewise:

    143F3 EGYPTIAN HIEROGLYPH-143F3
    143F4 EGYPTIAN HIEROGLYPH-143F4
    143F5 EGYPTIAN HIEROGLYPH-143F5
    143F6 EGYPTIAN HIEROGLYPH-143F6

    are currently listed under:

    Z07. Hieratic signs

    but their kEH_Cat property starts with "Z-08-", which suggests they should be listed under Z08.

    So, maybe:

    Z06. Geometric shapes, varia  -->  Z07. Geometric shapes, varia
    Z07. Hieratic signs  -->  Z08. Hieratic signs
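
The comparison between a chart subgroup heading and the kEH_Cat prefix could be automated roughly as follows. This is a sketch under assumptions: only the prefixes "Z-07-" and "Z-08-" are known from the report, so the full value "Z-07-001" and the general "letters, hyphen, digits, hyphen" shape are hypothetical, as are the function names.

```python
import re


def subgroup_from_cat(cat_value: str):
    """Derive a heading label such as 'Z07' from a kEH_Cat value (assumed shape)."""
    match = re.match(r"([A-Z][a-z]?)-(\d+)-", cat_value)
    if match is None:
        return None
    letters, number = match.groups()
    return f"{letters}{int(number):02d}"


def heading_label(heading: str) -> str:
    """Take the label before the first period, e.g. 'Z06' from 'Z06. Geometric ...'."""
    return heading.split(".", 1)[0].strip()


heading = "Z06. Geometric shapes, varia"
cat_value = "Z-07-001"  # hypothetical value; only the 'Z-07-' prefix is from the report
print(heading_label(heading), subgroup_from_cat(cat_value))
print(heading_label(heading) == subgroup_from_cat(cat_value))
```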

------------------------------------------------------------------------

Date/Time: Tue Jun 18 15:49:46 CDT 2024
ReportID: ID20240618154946
Name: Michel Mariani
Report Type: Public Review Issue
Opt Subject: 500


- Concerned data file: https://www.unicode.org/Public/draft/UCD/ucd/Unikemet.txt

- Possible typos in kEH_Desc property:

canels -> canals (or channels?)
cresent -> crescent
postule -> pustule



Date/Time: Wed Jun 19 17:20:20 CDT 2024
ReportID: ID20240619172020
Name: Michel Mariani
Report Type: Public Review Issue
Opt Subject: 500

- Concerned data file: https://www.unicode.org/Public/draft/UCD/ucd/Unikemet.txt

- Possible typos in kEH_Desc property:

45 occurrences: cresent  -->  crescent
3 occurrences: postule  -->  pustule
1 occurrence: irigation canels  -->  irrigation canals (or irrigation channels?)
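
Occurrence counts like those above could be gathered with a sketch along these lines. The suspect-word table comes from this report; the function name is hypothetical and the sample descriptions are invented placeholders, not real kEH_Desc values.

```python
import re
from collections import Counter

# Suspected misspellings and their suggested corrections, from the report.
SUSPECTS = {
    "cresent": "crescent",
    "postule": "pustule",
    "irigation": "irrigation",
    "canels": "canals",
}


def count_suspects(descriptions):
    """Count whole-word occurrences of each suspected misspelling."""
    counts = Counter()
    for text in descriptions:
        for word in re.findall(r"[a-z]+", text.lower()):
            if word in SUSPECTS:
                counts[word] += 1
    return counts


# Invented placeholder descriptions for illustration only.
sample = ["cresent moon", "two cresent shapes", "irigation canels"]
for word, count in sorted(count_suspects(sample).items()):
    print(f"{count} occurrence(s): {word}  -->  {SUSPECTS[word]}")
```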