Proposal to encode CID+15910 in Adobe-Japan1 as Latin small letter theta

Document Number	L2/23-129
Date	2023-6-6
Submitter	Nozomu Katō

The Adobe-Japan1 Character Collection, the de facto glyph set for mainstream OpenType Japanese fonts, contains two sets of glyphs that look like small letter beta, small letter theta, and small letter chi. The first set, CIDs 1036, 1042, and 1056, appears along with the Greek letters in the chart (PDF), and the second set, CIDs 15909-15911, appears as a group of just these three.

Although the chart of Adobe-Japan1 does not explain the details of each glyph, they can be identified through its resource files:

In the CMap file for Adobe-Japan1-7, the three glyphs in the first group are mapped to U+03B2, U+03B8, and U+03C7, whereas the ones in the second group are mapped to U+A7B5, (unassigned), and U+AB53, respectively:

Excerpt from cid2code.txt (Version 05/18/2022)
CID	UniJIS2004-UTF32-H	UniJISX0213-UTF32-H	UniJISX02132004-UTF32-H	Unicode Name
1036	000003b2	000003b2	000003b2	GREEK SMALL LETTER BETA
1042	000003b8	000003b8	000003b8	GREEK SMALL LETTER THETA
1056	000003c7	000003c7	000003c7	GREEK SMALL LETTER CHI
15909	0000a7b5	0000a7b5	0000a7b5	LATIN SMALL LETTER BETA
15910	*	*	*	-
15911	0000ab53	0000ab53	0000ab53	LATIN SMALL LETTER CHI
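As a minimal sketch of how the gap can be spotted mechanically, the following Python snippet parses only the excerpt above and lists the CIDs that have "*" in every UTF-32 CMap column. The function name `unmapped_cids` is hypothetical, and the real cid2code.txt has many more columns, comment lines, and multi-valued cells that this sketch does not handle.

```python
import csv
import io

# The excerpt above, re-entered by hand as tab-separated rows.
# "*" means that the CMap has no mapping for that CID.
ROWS = [
    ("CID", "UniJIS2004-UTF32-H", "UniJISX0213-UTF32-H",
     "UniJISX02132004-UTF32-H", "Unicode Name"),
    ("1036", "000003b2", "000003b2", "000003b2", "GREEK SMALL LETTER BETA"),
    ("1042", "000003b8", "000003b8", "000003b8", "GREEK SMALL LETTER THETA"),
    ("1056", "000003c7", "000003c7", "000003c7", "GREEK SMALL LETTER CHI"),
    ("15909", "0000a7b5", "0000a7b5", "0000a7b5", "LATIN SMALL LETTER BETA"),
    ("15910", "*", "*", "*", "-"),
    ("15911", "0000ab53", "0000ab53", "0000ab53", "LATIN SMALL LETTER CHI"),
]
EXCERPT = "\n".join("\t".join(row) for row in ROWS)


def unmapped_cids(text):
    """Return the CIDs whose UTF-32 CMap columns are all '*' (hypothetical helper)."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    cmap_columns = [name for name in reader.fieldnames if name.endswith("UTF32-H")]
    return [
        int(row["CID"])
        for row in reader
        if all(row[column] == "*" for column in cmap_columns)
    ]


print(unmapped_cids(EXCERPT))  # [15910]
```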

In the features file for Adobe-Japan1-7, none of the CIDs mapped to Greek and Cyrillic glyphs (CIDs 1011-1124 etc.), including the three in the first group, are defined in the 'vrt2' table, which means that they are intended to be shown upright even in vertical writing mode. In contrast, the three CIDs in the second group are defined in the 'vrt2' table and are intended to be replaced in vertical mode with CIDs 16712-16714, the prerotated forms of CIDs 15909-15911 respectively.
Having a prerotated glyph is customary for Latin and IPA characters in Japanese fonts based on Adobe-Japan1.
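The effect of the 'vrt2' entries described above can be modelled as a simple one-to-one glyph substitution. The sketch below is an illustration only, not an excerpt from the features file: the names `VRT2` and `shape_vertical` are hypothetical, and the table holds just the three substitutions mentioned in this proposal.

```python
# Hypothetical model of the three 'vrt2' substitutions discussed above:
# each upright CID is replaced by its prerotated form in vertical text.
VRT2 = {
    15909: 16712,  # Latin small beta  -> prerotated form
    15910: 16713,  # Latin small theta -> prerotated form
    15911: 16714,  # Latin small chi   -> prerotated form
}


def shape_vertical(cids):
    """Apply the substitutions; CIDs without an entry stay upright."""
    return [VRT2.get(cid, cid) for cid in cids]


# CID 1042 (Greek theta) has no entry and is left as-is,
# while CID 15910 is swapped for its prerotated glyph.
print(shape_vertical([1042, 15910]))  # [1042, 16713]
```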

Thus, it can be concluded that the first set is Greek small letters while the second set is Latin/IPA small letters.

The second set and their prerotated forms were added to Adobe-Japan1 in Supplement 5. According to Dr. Ken Lunde, who developed and maintained the Adobe-Japan1 Collection for many years, Supplement 5, unlike all other Supplements, came from what was called APGS (Apple Publishing Glyph Set), dating back to 2001, and even he is not familiar with the intention or purpose behind the addition of each glyph. The only possible clues that he has are that 1) Apple took a lot of its glyphs from Sha-ken（写研）'s glyph set, which included U+A7B5 ꞵ, 2) Adobe-Japan1-5 was published in 2002, and the three glyphs in question are in a Row Font whose name is RomanSupp, meaning that they were treated as Latin, not Greek.

According to him, the addition of the two mappings of U+A7B5 -> CID+15909 and U+AB53 -> CID+15911 to the Adobe-Japan1 CMap resources was done in 2017. Prior to this, they were unencoded glyphs.

Problem

It has been a common custom in Japanese fonts that Greek (and Cyrillic) characters are fullwidth and shown upright even in vertical mode. Because of this, when Greek small beta, theta, or chi is used as an IPA symbol, it can happen that only that character is shown upright while the other symbols are rotated in vertical text:

Hidemaru Editor 9.21 (default GDI). Many traditional applications would render like this.

It is inferred that APGS contained the glyphs that later became CIDs 15909-15911 in order to provide proportional-width versions of these three characters that have prerotated forms, mainly for use as IPA symbols.

The appearance of UTR #50 may be changing the situation. But it is still useful to have a small theta that always behaves the same as Latin/IPA letters, regardless of text orientation and regardless of whether the application supports UTR #50.

Hidemaru Editor 9.21 (DirectWrite enabled). Text orientation aligned, but theta is still fullwidth.

CIDs 15909 and 15911 have already been given Unicode code points, U+A7B5 and U+AB53, respectively. If an OpenType font that supports Adobe-Japan1 is built with a recent version of the CMap file, these two characters can be used in plain text. CID+15910 alone, however, has no code point assigned. It therefore cannot be used in plain text and is available only in applications that can access a glyph by CID/GID value.
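The asymmetry can be illustrated with Python's standard `unicodedata` module. This is only a sketch: the `MAPPED` table and the `describe` helper are hypothetical names, the table restates the two mappings cited above, and the character names are resolved only if the interpreter's Unicode database is version 8.0 or later.

```python
import unicodedata

# CID -> code point, restating the two mappings cited in this proposal.
# CID 15910 is deliberately absent because it has no code point.
MAPPED = {15909: 0xA7B5, 15911: 0xAB53}


def describe(cid):
    """Report the code point and character name for a CID, if it has one."""
    code_point = MAPPED.get(cid)
    if code_point is None:
        return f"CID+{cid}: no code point"
    name = unicodedata.name(chr(code_point))
    return f"CID+{cid}: U+{code_point:04X} {name}"


for cid in (15909, 15910, 15911):
    print(describe(cid))
```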

As today's mainstream OpenType Japanese fonts are almost all based on the Adobe-Japan1 Collection, they generally have glyphs intended for CIDs 15909-15911. To allow the use of CID+15910 in plain text, the following character is proposed to be encoded:

????;LATIN SMALL LETTER THETA;Ll;0;L;;;;;N;;;;;
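For readers less familiar with the semicolon-separated record above, the following sketch splits it into the 15 fields of the UnicodeData.txt format. The field labels in `FIELDS` and the helper name `parse_record` are informal choices made for this illustration, and the code point stays as the "????" placeholder given in the proposal.

```python
# Informal labels for the 15 fields of a UnicodeData.txt record, in order.
FIELDS = [
    "code_point", "name", "general_category", "canonical_combining_class",
    "bidi_class", "decomposition", "decimal_value", "digit_value",
    "numeric_value", "bidi_mirrored", "unicode_1_name", "iso_comment",
    "simple_uppercase", "simple_lowercase", "simple_titlecase",
]

PROPOSED = "????;LATIN SMALL LETTER THETA;Ll;0;L;;;;;N;;;;;"


def parse_record(line):
    """Split one record into a dict keyed by the labels above."""
    values = line.split(";")
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    return dict(zip(FIELDS, values))


record = parse_record(PROPOSED)
# Print only the fields that are filled in.
for label, value in record.items():
    if value:
        print(f"{label}: {value}")
```

Read this way, the record describes a lowercase letter (Ll) with combining class 0 and left-to-right directionality, no decomposition, and no case mappings.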

Or, as a second option, it is proposed to register three pairs of variation sequences so that CIDs 15909-15911 can be specified explicitly, as follows:

<03B2 FE00>, <03B8 FE00>, <03C7 FE00> (Normal Greek letters. Whether to rotate or not in the vertical text depends on whether or not the application supports UTR #50),
<03B2 FE01>, <03B8 FE01>, <03C7 FE01> (IPA symbols. Always rotates in the vertical writing mode).
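In plain text, each of the sequences listed above would be a base letter followed by a variation selector. The sketch below only builds those two-character strings; the sequences are proposed here and not registered, so no font or application should be expected to act on them, and the names `BASES` and `sequence` are hypothetical.

```python
# Base letters and variation selectors named in the second option.
BASES = {"beta": 0x03B2, "theta": 0x03B8, "chi": 0x03C7}
VS1 = 0xFE00  # proposed: normal Greek letter
VS2 = 0xFE01  # proposed: IPA symbol, always rotated in vertical text


def sequence(base, selector):
    """Build the two-character string for <base selector>."""
    return chr(base) + chr(selector)


greek_theta = sequence(BASES["theta"], VS1)
ipa_theta = sequence(BASES["theta"], VS2)

# Show the code points that make up each proposed sequence.
for label, text in (("Greek theta", greek_theta), ("IPA theta", ipa_theta)):
    print(label, " ".join(f"{ord(ch):04X}" for ch in text))
```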

(End of Proposal)