Reclassify Identifier_Type for characters not in common use | Selected-recommended-IdentifierType-in-MSR-but-not-in-RefLGR |
---|---|
This document is mechanically formatted from the above XML file for the LGR. It provides additional summary data and explanatory text. The XML file remains the sole normative specification of the LGR.
Date | 2025-02-16 |
---|---|
LGR Version | 16.0.0 |
Unicode Version | 16.0.0 |
Description
Partially updates L2/19-329R
Characters recommended in both UTS#39 and MSR but excluded from the Root Zone or Reference LGR
This document has been submitted as a UTC document. For convenience in documenting the character list it is presented using an LGR template format. A few minor details of the boilerplate in that template may not be applicable in this context and should be disregarded.
The collection comprises 274 characters included in the Maximal Starting Repertoire [MSR] for the DNS Root Zone that are also recommended in UTS#39 but are not part of the Second Level Reference LGR [RefLGR], as well as the uppercase equivalents for 88 of them (Latin), plus 70 decimal digits excluded from the Reference LGR, for a total of 432 characters.
Recommendation
- Proposed:Uncommon_Use (402)—The majority of these 432 characters are proposed for reclassification to Uncommon_Use, based on the fact that the expert teams charged with reviewing them for the ICANN Root Zone LGR and Reference LGR for the Second Level could not come up with evidence that they are used in common everyday writing, even for minority languages in reasonably widespread use. Consequently, they declined to include them in the respective LGRs.
- Proposed:Technical (12)—In a few cases, a different type, such as Technical, is proposed for the character.
- Recommended (18)—In review, some of the characters are found to have documented use compatible with retaining their Identifier_Type of Recommended. These characters are tagged with Identifier_Type Recommended and no change is proposed.
- Review_Needed (0)—Any characters tagged as Review_Needed would need additional review before a recommendation can be proposed for them.
The table in Section 2, “Repertoire” lists the characters with their proposed or existing Identifier_Type values in the “Tags” column, together with references to the source of their classification, as well as any other information. Because this document only includes characters not added to and therefore not listed in [RefLGR], this source is not documented on the character level.
Additional Recommendations
The document contains some other recommendations in addition to reassigning Identifier_Type. They are:
- Proposed language on combining marks for UTS#39. See “Combining Marks”
- Recommendation to change the Tibetan Script to Limited_Use until it is properly vetted for identifiers. See “Tibetan Script”
- Create an Invariance test to ensure that uppercase characters have the same Identifier_Type as their lowercase equivalents, so that identifier validity does not depend on case mapping.
Background
There are over a thousand non-Han characters with Identifier_Type Recommended that are proposed for consideration of reclassification because they appear to fail reasonable criteria for being needed in identifiers. They come in two sets. For one set, an independent analysis [MSR] has found indications that they should have been considered Uncommon_Use, Obsolete or Technical based on information available at the time of encoding, including their uppercase equivalents and any native digits that can be considered obsolete. That set is discussed in another document. The second set contains characters that were tentatively retained as Recommended in the [MSR] but upon further review by local expert teams from the [RZ-LGR] project were found to not be needed for any language or minority language in reasonably widespread use. That second set is discussed here.
The initial analysis was carried out for the purposes of defining the allowed repertoire for IDN Top Level Domain names for the DNS Root Zone. There are some restrictions that are specific to the Root Zone, such as a prohibition on digits, so a follow-on effort determined how to relax these restrictions in a manner appropriate for the needs of Second-Level Domains. This resulted in the Second-Level Reference Label Generation Rules [RefLGR]. The characters listed in this document are those that were not added to the [RefLGR], for lack of evidence of their use in everyday common writing for any language or minority language in vigorous and reasonably widespread use.
The set of characters discussed in this document starts with all characters of Identifier_Type Recommended, subtracting any character disallowed in [MSR] and then subtracting any character included in [RefLGR]. Added to the set are any uppercase equivalents as well as any native digits that were not added to the [RefLGR] due to not being in common use.
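The derivation described above can be sketched with plain set operations. This is only an illustration under stated assumptions: the function name `review_set` and the four tiny input sets are hypothetical stand-ins for the real UTS#39, [MSR] and [RefLGR] data, and uppercase equivalents are found with Python's built-in case mapping rather than with whatever tooling was actually used.

```python
def review_set(recommended, msr_disallowed, ref_lgr, excluded_digits):
    """Return the code points under review, following the steps in the text."""
    # Start with Recommended, subtract anything disallowed in the MSR,
    # then subtract anything included in the Reference LGR.
    base = recommended - msr_disallowed - ref_lgr
    # Add uppercase equivalents (simple one-to-one mappings only).
    uppercase = set()
    for cp in base:
        upper = chr(cp).upper()
        if len(upper) == 1 and ord(upper) != cp:
            uppercase.add(ord(upper))
    # Add native digits that were left out of the Reference LGR.
    return base | uppercase | excluded_digits


# Hypothetical toy inputs, not the real repertoires.
recommended = {0x0061, 0x0114, 0x0115, 0x0A66}
msr_disallowed = {0x0114, 0x0A66}   # uppercase and digits are not in the MSR
ref_lgr = {0x0061}
excluded_digits = {0x0A66}

result = review_set(recommended, msr_disallowed, ref_lgr, excluded_digits)
print(" ".join(f"U+{cp:04X}" for cp in sorted(result)))
```

In this toy run, U+0115 survives both subtractions, its uppercase U+0114 is added back, and the digit U+0A66 is added from the separate digit list.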
The implication here is that any character not included in the Reference LGR for lack of documented or identifiable usage should be considered Uncommon_Use for Unicode's default identifiers, until such time as independent evidence to the contrary is produced, or a different Identifier_Type is proposed as a better fit. Until then, absent a demonstrated use case, it does not seem helpful to continue to suggest that these characters should be supported as recommended. This also applies to any sets of native digits that local experts considered Uncommon_Use for the purpose of identifiers.
Arriving at a precise cutoff for Uncommon_Use is difficult because there is no single source or perfect information on the use of writing systems, and the details of such use are changing over time. In the research cited, that determination started with the [EGIDS] classification as a proxy for the likely level of modern use of the writing system, but made further adjustments in expert review. Accordingly, this document suggests that the UTC should consider the published results of the cited research as one of the better sources of information available and only deviate from it on the basis of even better information.
All decisions for the classification of characters in [MSR], or inclusion in [RZ-LGR] and [RefLGR] are documented and sourced on the character level; the same is not true for Unicode's classification, so it is not easily possible to verify any of the decisions that underlie the classification published in UTS#39. By first making the alignment proposed here, and then carefully documenting deviations, a positive side effect might be that the classification overall becomes more transparent and reviewable.
For further background on specific disposition for characters in the DNS Root Zone [RZ-LGR] and Second-Level Reference LGR [RefLGR] see the cited references and links therein.
Additional Notes
- U+0931 ऱ DEVANAGARI LETTER RRA is part of the Root Zone and Reference LGR via sequence (does not occur standalone)
- U+09BC ় BENGALI SIGN NUKTA is part of the Root Zone and Reference LGR via sequence (does not occur standalone)
- U+0DA6 ඦ SINHALA LETTER SANYAKA JAYANNA is part of the Root Zone and Reference LGR via sequence (does not occur standalone)
- U+0E45 ๅ THAI CHARACTER LAKKHANGYAO is part of the Root Zone and Reference LGR via sequence (does not occur standalone)
- U+1063 ၣ MYANMAR TONE MARK SGAW KAREN HATHI is part of the Root Zone and Reference LGR via sequence (does not occur standalone)
Although they are not listed as standalone, they are part of the set of code points used in the repertoire of Root Zone and Reference LGR and therefore no change in their existing Identifier_Type is proposed, and they are not listed below.
Discussion and Review
Domain names are an important and deliberately conservative set of identifiers. That said, there may be other classes of identifiers that don't require the same level of restrictions, so this proposal should not be understood to suggest that default Identifiers must be restricted to only those characters that are being recommended for IDNs. Rather, the purpose is to bring the facts discovered during the development of the IDN repertoire for the DNS Root Zone and the [RefLGR] to the attention of the Unicode Technical Committee, so that characters that were classified Recommended can be given additional scrutiny before confirming their status.
The set of Proposed:Uncommon_Use has been mechanically compared to existing sets of exemplars using the UnicodeSets utility. Only a very small number of discrepancies were found, but in all cases, the respective source documents (proposals) documented deliberate decisions to exclude them, confirming the proposal to mark these as Uncommon_Use. The only exceptions to this can be found in the Arabic script.
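The kind of comparison described above amounts to intersecting the proposed set with each language's exemplar set. The following is a minimal sketch of that idea, not the UnicodeSets utility itself; the function name and the exemplar sets shown are hypothetical and heavily abbreviated.

```python
def exemplar_discrepancies(proposed_uncommon, exemplars_by_language):
    """Map each language to the proposed Uncommon_Use code points in its exemplars."""
    report = {}
    for language, exemplars in exemplars_by_language.items():
        overlap = proposed_uncommon & exemplars
        if overlap:
            report[language] = sorted(overlap)
    return report


# Illustrative inputs only: a small slice of the proposed set and
# abbreviated, hypothetical exemplar sets keyed by language code.
proposed_uncommon = {0x014F, 0x021F, 0x1E81}
exemplars_by_language = {
    "szl": {0x0061, 0x014F},  # contains o with breve
    "lkt": {0x0068, 0x021F},  # contains h with caron
    "en": {0x0061, 0x0068},   # no overlap, so not reported
}

report = exemplar_discrepancies(proposed_uncommon, exemplars_by_language)
for language, cps in report.items():
    print(language, " ".join(f"U+{cp:04X}" for cp in cps))
```

Each reported overlap is a candidate discrepancy that then needs to be checked by hand against the relevant source proposal.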
Arabic Script
As review progresses, a number of characters have been identified with usage as documented in comments and references. For some, the nature of the use should be adjusted:
- U+0671 ٱ ARABIC LETTER ALEF WASLA - this letter is considered to be “an important Quranic character” (which would make it Technical, but not Uncommon_Use). It is also claimed to be used in a newly invented orthography for the Luri language in Iran.
- U+06C7 ۇ ARABIC LETTER U - this letter was considered by “Proposal for Arabic Script Root Zone LGR”, [Proposal-Arabic] only for Kirghiz and Azerbaijani, for which the proposal concludes “No evidence found for active use” but it turns out it is used in Kazakh and is also part of the name of the Uyghur language “ئۇيغۇر”.
- U+06CA ۊ ARABIC LETTER WAW WITH TWO DOTS ABOVE - this letter was considered only for Sorani Kurdish (https://en.wikipedia.org/wiki/Kurdish_alphabets), in which context widespread use could not be confirmed in [Proposal-Arabic]. However, the character has since been found listed in the exemplars for the Luri languages and is documented for them on Omniglot [LRC].
[Proposal-Arabic] does not appear to have included the repertoires needed for the Arabic alphabets used for Southern Azerbaijani in Iran [AZ], and for Uyghur [UG], Kazakh [KK], and Kyrgyz [KY] in China, the latter with their own published standards. A more detailed review based on documented repertoires for these writing systems suggests a few characters that should be considered attested after all.
Accordingly, the proposal was changed to Proposed:Technical for U+0671 ٱ and to retain as unchanged the assignment to Recommended for the others. See also “Combining Marks”.
Bengali (Bangla) Script
The following characters are excluded from the DNS Root Zone and Reference LGR for the Second level, though listed as exemplars.
- U+098C ঌ BENGALI LETTER VOCALIC L — listed in the exemplars for the Bengali (Bangla) language.
As discussed in Section 4.2.1.3, “No Rare and Obsolete Characters” of [Proposal-Bengali]:
There are characters which have been added to Unicode to accommodate rare forms such as Sanskritic ... VOCALIC L “ঌ” (U+098C) ... All such characters are excluded...
Devanagari
The following characters are excluded from the DNS Root Zone and Reference LGR for the Second level, though listed as exemplars.
- U+090C ऌ DEVANAGARI LETTER VOCALIC L — listed in the exemplars for several languages, including Hindi and Sanskrit (doi,hi,kok,mr,ne,sa)
As discussed in Section 5.3, “Code points not included” of [Proposal-Devanagari]:
Reason for Exclusion: Not in modern usage...
Ethiopic Script
The following characters are excluded from the DNS Root Zone and Reference LGR for the Second level, though listed as exemplars.
- U+1347 ፇ ETHIOPIC SYLLABLE TZOA — listed in the exemplars for the Tigrigna language.
As discussed in Section 5.3, “Code Points Excluded from the Repertoire” of [Proposal-Ethiopic]:
...the following code points have been excluded from the code point repertoire of the Ethiopic script LGR as there has not been any occurrence of the code point in the corpus data analysed for online contents published in Amharic and Tigrigna languages...
Gurmukhi Script
The following characters are excluded from the DNS Root Zone and Reference LGR for the Second level, though listed as exemplars.
- U+0A72 ੲ GURMUKHI IRI — listed in the exemplars for the Punjabi language.
- U+0A73 ੳ GURMUKHI URA — listed in the exemplars for the Punjabi language.
As discussed in Section 4.1.6, “No Vowel Carriers” of [Proposal-Gurmukhi]:
Gurmukhi script has three vowel carriers ( URA, ੳ (U+0A73), AIRA ਅ (U+0A05) and IRI, ੲ (U+0A72)). They are used as vowel carriers and thus always need to be followed by some matra when used in text. ... where these vowel carriers occur with a matra they will be identical with one of the independent vowels (ਉ (U+ 0A09), ਊ (U+ 0A0A), ਇ (U+ 0A07), ਈ (U+ 0A08), ਏ (U+ 0A0F), ਓ (U+ 0A13); this is also not allowed in Unicode. Thus ੳ (U+0A73) + ◌ੁ (U+0A41), which looks the same as ਉ (U+ 0A09), will create confusion and hence will not be allowed in the LGR. As the vowel carriers ੳ (U+0A73) and IRI, ੲ (U+0A72) cannot occur independently, ... these letters are not included in the repertoire.
Hebrew Script
The following characters are excluded from the DNS Root Zone and Reference LGR for the Second level, though listed as exemplars.
- U+05F2 ײ HEBREW LIGATURE YIDDISH DOUBLE YOD — listed in the exemplars for the Yiddish language.
As discussed in Section 5.1.5, “Special Hebrew Code Points” of [Proposal-Hebrew]:
...DOUBLE YOD. Intended for use in Yiddish texts, this code point provides a special combined ligature for two consecutive HEBREW LETTER YOD.
All [...] of these code points are excluded ... [These ligatures] might be confused with their respective combinations of two single letters. In addition, they can be adequately replaced by their respective combination of two consecutive single letters – DOUBLE YOD by two consecutive YOD, etc. Another advantage of these equivalent replacements is that they can be typed using a standard Hebrew-mapped keyboard.
Kannada Script
The following characters are excluded from the DNS Root Zone and Reference LGR for the Second level, though listed as exemplars.
- U+0C8C ಌ KANNADA LETTER VOCALIC L — listed in the exemplars for the Kannada language.
- U+0CB1 ಱ KANNADA LETTER RRA — listed in the exemplars for the Kannada language.
As discussed in Section 5.3, “Codepoints not included” of [Proposal-Kannada]:
Reason for Exclusion: U+0C8C ಌ Not in modern usage; U+0CB1 ಱ Obsolete character, not used in modern Kannada
Khmer Script
The following characters are excluded from the DNS Root Zone and Reference LGR for the Second level, though listed as exemplars.
- U+17A9 ឩ KHMER INDEPENDENT VOWEL QUU — listed in the exemplars for the Khmer language.
- U+17B2 ឲ KHMER INDEPENDENT VOWEL QOO TYPE TWO — listed in the exemplars for the Khmer language
As discussed in Section 5.2, “Independent Vowels” of [Proposal-Khmer]:
...two independent vowels have been excluded from the repertoire. ... The vowel ឩ KHMER INDEPENDENT VOWEL QUU (U+17A9) has been excluded because it is not commonly used in the Khmer dictionary, and [the] few words using this letter now have alternate spellings without it. For example, ឩដ្ឋ (ūdth, camel) is now spelled as អូដ្ឋ ... Moreover, no new words use this code point.
... ឲ KHMER INDEPENDENT VOWEL QOO TYPE TWO (U+17B2) ... is a variant for ឱ KHMER INDEPENDENT VOWEL QOO TYPE ONE (U+17B1). The vowel ឲ KHMER INDEPENDENT VOWEL QOO TYPE TWO (U+17B2) is used to write only one word, the verb “give” but now Choun Nath Khmer Dictionary, which is an authoritative source on Khmer, writes the word as ឱយ (ooy, give) with ឱ KHMER INDEPENDENT VOWEL QOO TYPE ONE (U+17B1) instead.
Latin Script
The following characters are excluded from the DNS Root Zone and Reference LGR for the Second level, though listed as CLDR exemplars for various languages.
- U+014F ŏ LATIN SMALL LETTER O WITH BREVE — listed in the exemplars for the Silesian language.
- U+0157 ŗ LATIN SMALL LETTER R WITH CEDILLA — listed in the exemplars for the Prussian language.
- U+01F9 ǹ LATIN SMALL LETTER N WITH GRAVE — listed in the exemplars for several languages (bas,blo,ewo,jgo,yo). The use of accents on nasal consonants is described in [YO-1], and n with grave can be found in a recent corpus [YO-2]. Therefore, U+01F9 ǹ should remain Recommended.
- U+021F ȟ LATIN SMALL LETTER H WITH CARON — listed in the exemplars for the Lakota language.
- U+1E0D ḍ LATIN SMALL LETTER D WITH DOT BELOW — listed in the exemplars for several languages (kab,kxv,shi-Latn,tzm). According to [KAB] there are a significant number of speakers, and the writing system is probably not as stable as the use as an oral language. The status and recognition seem to have seen some recent improvement so arguably this character should remain Recommended.
- U+1E11 ḑ LATIN SMALL LETTER D WITH CEDILLA — listed in the exemplars for the Prussian language.
- U+1E25 ḥ LATIN SMALL LETTER H WITH DOT BELOW — listed in the exemplars for several languages (ast,kab,shi-Latn,tzm). According to [KAB] there are a significant number of speakers, and the writing system is probably not as stable as the use as an oral language. The status and recognition seem to have seen some recent improvement so arguably this character should remain Recommended.
- U+1E3F ḿ LATIN SMALL LETTER M WITH ACUTE — listed in the exemplars for several languages (blo,jgo,nnh,yo). The use of accents on nasal consonants is described in [YO-1], and m with acute can be found in a recent corpus [YO-2]. Therefore, U+1E3F ḿ should remain Recommended.
- U+1E5B ṛ LATIN SMALL LETTER R WITH DOT BELOW — listed in the exemplars for several languages (kab,kxv,shi-Latn,tzm). According to [KAB] there are a significant number of speakers, and the writing system is probably not as stable as the use as an oral language. The status and recognition seem to have seen some recent improvement so arguably this character should remain Recommended.
- U+1E7D ṽ LATIN SMALL LETTER V WITH TILDE — listed in the exemplars for the Mundang language. While nasalization of vowels is indicated with tilde [MUA], there's no independent corroboration that this is used with the letter v.
- U+1E81 ẁ LATIN SMALL LETTER W WITH GRAVE — listed in the exemplars for the Welsh language. The use of grave with letter w is described as optional in [CY-1]. It does not appear a single time in a large corpus [CY-2].
- U+1E83 ẃ LATIN SMALL LETTER W WITH ACUTE — listed in the exemplars for the Welsh language. The use of acute with letter w is described as optional in [CY-1]. It is rare and only appears a few times in a large corpus [CY-2].
- U+1E85 ẅ LATIN SMALL LETTER W WITH DIAERESIS — listed in the exemplars for several languages (cy,jgo,nnh). The use of diaeresis with letter w in Welsh is described as optional in [CY-1]. It appears extremely rarely in a large corpus [CY-2]. While its use with Ngiemboon is described in [NNH], the EGIDS level is 5 with a small population of language users and unknown literacy. For Ngomba (related to Ngiemboon) no detailed information was found. Therefore, the proposal to make it Uncommon_Use is unchanged.
- U+1E93 ẓ LATIN SMALL LETTER Z WITH DOT BELOW — listed in the exemplars for the Kabyle language. According to [KAB] there are a significant number of speakers, and the writing system is probably not as stable as the use as an oral language. The status and recognition seem to have seen some recent improvement so arguably this character should remain Recommended.
The following lists the [EGIDS] levels for the languages involved. Where the level warranted a double-check, the results were added above. Unless noted there's no change to the proposal.
- Welsh (cy/cym): 2
- Yoruba (yo/yor): 2
- Ewondo (ewo): 3
- Kabyle (kab): 3
- Anii (blo): 5
- Basaa (bas): 5
- Kuvi (kxv): 5
- Mundang (mua): 5
- Ngiemboon (nnh): 5
- Ngomba (jgo): 5
- Tachelhit (shi): 5
- Silesian (szl): 6a
- Asturian (ast): 6b
- Central Atlas Tamazight (tzm): 6b
- Lakota (lkt): 8a
- Prussian (prg): 9
Oriya Script
The following characters are excluded from the DNS Root Zone and Reference LGR for the Second level, though listed as exemplars.
- U+0B35 ଵ ORIYA LETTER VA — listed in the exemplars for the Oriya language
As discussed in Section 5.2.1, “Code Points Excluded” of [Proposal-Oriya]:
The character Va “ଵ”(0B35) is used in form of handwritten character by few linguistics or has recently been used in a few websites on the internet. But this code point is not very frequently used in mass educational institutions, printed in mass communication, like newspapers or magazines. Some textbooks in the schools and college explain the plosive "Ba" ବ(0B2C) and non-plosive "Va" ଵ(0B35) separately as varga (plosive) and non-varga (non-plosive) respectively.
The public comment version of the Oriya Script LGR, NBGP had included the character Va “ଵ”(0B35) in the repertoire. However, the public comment feedback suggested that this use was not sufficiently common, and further analysis by NBGP concluded that "Va" ଵ(0B35) is not used in commonly printed books, magazines, etc. Therefore, it is excluded from the repertoire.
However, in coming years if this character is seen in frequent use, NBGP may reconsider including it in a later version of its LGR for the Oriya script.
Sinhala Script
The following characters are excluded from the DNS Root Zone and Reference LGR for the Second level, though listed as exemplars.
- U+0D9E ඞ SINHALA LETTER KANTAJA NAASIKYAYA — listed in the exemplars for the Sinhala language
As discussed in Section 5.4, “Code point not included” of [Proposal-Sinhala]:
Reason for Exclusion: Not in modern usage.
Telugu Script
The following characters are excluded from the DNS Root Zone and Reference LGR for the Second level, though listed as exemplars.
- U+0C0C ఌ TELUGU LETTER VOCALIC L — listed in the exemplars for the Telugu language
- U+0C31 ఱ TELUGU LETTER RRA — listed in the exemplars for the Telugu language
As discussed in Section 5.3, “5.4. Code Points Not Included” of [Proposal-Telugu]:
Reason for Exclusion: Not in widespread use; not used in modern Telugu.
Combining Marks
Needed for NFD:—Where combining marks (such as U+0654 ٔ ) are excluded, but needed for decompositions of recommended characters, it was proposed to focus on the NFC form for Identifier_Type, while documenting in UTS#39 that combining characters may be marked as Uncommon_Use even when they appear in the NFD form of a modern language's exemplar characters.
The following text is proposed for UTS#39:
Where combining marks, such as U+3099 ゙ COMBINING KATAKANA-HIRAGANA VOICED SOUND MARK, are only needed for NFD of recommended characters, they have been given Identifier_Type Uncommon_Use. Identifier systems that work with unnormalized text, or text in NFD, may wish to add the full set of characters required in canonical decompositions.
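As an illustration of the NFD issue, the following sketch uses Python's standard `unicodedata` module to collect the combining marks that appear only when a set of precomposed characters is decomposed. The helper name `marks_required_by_nfd` is hypothetical, and the two sample characters were chosen only because their canonical decompositions contain the marks mentioned in this section.

```python
import unicodedata


def marks_required_by_nfd(text):
    """Collect combining marks that show up only in the NFD form of text."""
    marks = set()
    for ch in text:
        for part in unicodedata.normalize("NFD", ch):
            # Skip characters that are unchanged by NFD; keep only
            # decomposition parts with a non-zero combining class.
            if part != ch and unicodedata.combining(part):
                marks.add(ord(part))
    return marks


# U+304C HIRAGANA LETTER GA and U+0623 ARABIC LETTER ALEF WITH HAMZA ABOVE
sample = "\u304C\u0623"
needed = marks_required_by_nfd(sample)
print(" ".join(f"U+{cp:04X}" for cp in sorted(needed)))
```

An identifier system that accepts NFD text could add the resulting marks to its allowed set, as the proposed wording suggests.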
Arabic Script combining marks:—Arabic combining marks are categorically excluded from domain names, see also RFC5564. The Internet Architecture Board [IAB] has issued a statement referencing this issue. Please also see the “Proposal for Arabic Script Root Zone LGR”, [Proposal-Arabic]. In consequence, their current assignment of Recommended should be specifically reviewed by the UTC. If it is felt that a change to Uncommon_Use is not the best classification, then perhaps Inclusion or Technical might be more appropriate.
In the context of this document, this affects the following:
- The major Arabic script combining marks U+064B ً ..U+0652 ْ , U+0654 ٔ ..U+0655 ٕ , U+0657 (Note: of this set, the current document does not mention U+0657 ٗ ). (See also the general discussion for "Combining Marks" above.)
Uppercase Characters
Uppercase characters must share the Identifier_Type of their lowercase equivalent. This is to ensure that default identifiers do not change validity if case mapped. Proposed Identifier_Type values included here match that of their lowercase equivalents. In addition, it is recommended that an invariance test be created to automatically verify the relation between Identifier_Type for lowercase and uppercase characters. Because uppercase characters were not independently evaluated, their table entries do not cite a reference containing source information attesting use.
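One possible shape for such an invariance test is sketched below. It assumes the Identifier_Type data has already been loaded into a dictionary from code point to type name; the function name, the dictionary, and the sample values are all hypothetical, and only simple one-to-one lowercase mappings are checked.

```python
def case_invariance_violations(id_type):
    """List (upper, lower, upper_type, lower_type) where the two types differ."""
    violations = []
    for cp, upper_type in sorted(id_type.items()):
        lower = chr(cp).lower()
        if len(lower) != 1 or ord(lower) == cp:
            continue  # no simple lowercase mapping, nothing to compare
        lower_type = id_type.get(ord(lower))
        if lower_type is not None and lower_type != upper_type:
            violations.append((cp, ord(lower), upper_type, lower_type))
    return violations


# Hypothetical sample data following the values proposed in this document.
consistent = {
    0x0114: "Uncommon_Use", 0x0115: "Uncommon_Use",
    0x01F8: "Recommended", 0x01F9: "Recommended",
}
# The same data with one deliberately mismatched uppercase entry.
broken = {**consistent, 0x01F8: "Uncommon_Use"}

print(case_invariance_violations(consistent))
print(case_invariance_violations(broken))
```

A real test would read the published Identifier_Type data file and would also need a policy for characters whose case mappings are not one-to-one.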
Native Digits
Not all communities use the native digits encoded for their scripts equally in everyday situations. Where their use is not preferred in that context, native digit sets are excluded from the [RefLGR]. They are tagged here with Identifier_Type Proposed:Uncommon_Use and comments indicate their lack of common use. In cases where native digits are clearly historical or obsolete, Proposed:Obsolete might be more appropriate. However, no such determination has been made here.
Tibetan Script
No attempt has been made to ascertain modern use for specific characters of the Tibetan script. The script is considered by ICANN as eligible for the Root Zone in principle, but work on defining the label generation rules has faced some difficulties and has not yet commenced. It might be reasonable to reflect that uncertainty by also removing these characters from Identifier_Type Recommended until some body, project, or group has created a definite analysis of this script for identifier purposes. (Tibetan characters have been excluded from the list of characters in this document.)
A suitable approach might be to change the Tibetan Script to Limited_Use. This does not prevent an implementer from explicitly adding characters from the script to their identifiers, but would prevent implementers from picking up, by default, a script that hasn't been properly vetted.
Character Classes
The table in Section 4.1, “Character Classes” presents information about a number of sets collecting characters with different attributes or properties. For each set, a count of members is given. An optional arrow (→) indicates that only a smaller subset of the given set is actually found in this document. For example the notation 2332→0 for the set of characters tagged with RefLGR indicates that none of the characters in this document are part of the Reference LGR, which is as expected, as this document only discusses characters that are not included there.
Contributors
The excerpt that this proposal is based on was prepared by Asmus Freytag, based on published data found in [RefLGR] and reference information from [MSR]. For details on the process and contributors to those projects, see [RefLGR-Overview], in particular, Section 1, “Overview” and Section 6, “Contributors”. Mark Davis, Michel Suignard and Roozbeh Pournader have contributed feedback to this proposal.
Appendix: Code Point Sets
The following lists the sets of characters for each proposed Identifier_Type value. These can be used to import the values into property data or to convert them to Unicode set notations for additional comparisons. Elements of the sets are space separated and consist either of bare hex codes for single characters or a pair of hex codes separated by HYPHEN-MINUS to indicate a range.
Review_Needed()
Recommended(01F8-01F9 0674 06C5 06C7-06CA 1E0C-1E0D 1E24-1E25 1E3E-1E3F 1E5A-1E5B 1E92-1E93)
Proposed:Technical(064B-0652 0654-0655 0670-0671)
Proposed:Uncommon_Use(0114-0115 012C-012D 014E-014F 0156-0157 0162-0163 01D5-01DC 01DE-01E3 01EA-01ED 01F0 01F4-01F5 01FA-01FF 021E-021F 0226-0233 0400 040D 0450 045D 04C1-04C2 04CB-04CC 04DA-04DB 04EA-04ED 05B4 05F0-05F2 0682 0690 0692 0694 069B-069E 06A1 06A3 06A5 06B2 06B4 06B6-06B9 06BF 06D3 06EE-06EF 06FA-06FC 06FF 0750 0753-0755 0757-075F 0761 0764-0765 0769 076B-076D 0772-077D 08A1 08AA-08AC 0904 090C 0929 0934 0944 0979-097A 098C 09D7 0A03 0A66-0A6F 0A72-0A73 0A81 0B0C 0B35 0B57 0B66-0B6F 0BD7 0BE6-0BEF 0C0C 0C31 0C55-0C56 0C66-0C6F 0C8C 0CB1 0CBC 0CC4 0CD5-0CD6 0D0C 0D29 0D66-0D6F 0D8E 0D9E 0DE6-0DEF 0E4E 0EDE-0EDF 108B-108D 1090-1099 10F7-10F8 1207 1287 12AF 12F8-12FF 130F 131F 1347 135A 135D-135F 179D-179E 17A9 17B2 17D7 1E02-1E0B 1E0E-1E11 1E14-1E17 1E1C-1E1F 1E22-1E23 1E26-1E29 1E2E-1E35 1E38-1E3B 1E40-1E41 1E4C-1E59 1E5C-1E61 1E64-1E6B 1E6E-1E6F 1E78-1E8B 1E8E-1E91 1E94-1E99 2D80-2D96 A7B9 AB01-AB06 AB09-AB0E)
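The set notation above is simple enough to parse mechanically. The sketch below is one hedged way to do so: `parse_code_point_set` and `to_unicode_set` are hypothetical helper names, and the output pattern uses backslash-u escapes inside square brackets on the assumption that the consuming tool accepts that form of set pattern.

```python
import re


def parse_code_point_set(line):
    """Parse 'Name(0114-0115 01F0 ...)' into (name, set of code points)."""
    match = re.fullmatch(r"\s*([\w:]+)\((.*)\)\s*", line)
    if match is None:
        raise ValueError(f"not a code point set: {line!r}")
    name, body = match.groups()
    code_points = set()
    for item in body.split():
        # An item is either a bare hex code or two hex codes joined by '-'.
        first, _, last = item.partition("-")
        start = int(first, 16)
        end = int(last, 16) if last else start
        code_points.update(range(start, end + 1))
    return name, code_points


def to_unicode_set(code_points):
    """Render code points as a bracketed pattern, collapsing adjacent runs."""
    def esc(cp):
        return f"\\u{cp:04X}" if cp <= 0xFFFF else f"\\U{cp:08X}"

    runs = []
    for cp in sorted(code_points):
        if runs and cp == runs[-1][1] + 1:
            runs[-1][1] = cp
        else:
            runs.append([cp, cp])
    parts = [esc(a) if a == b else f"{esc(a)}-{esc(b)}" for a, b in runs]
    return "[" + " ".join(parts) + "]"


name, technical = parse_code_point_set(
    "Proposed:Technical(064B-0652 0654-0655 0670-0671)")
print(name, len(technical), to_unicode_set(technical))
```

Parsing each of the lines above this way and summing the set sizes is a quick consistency check against the repertoire total.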
Repertoire
Repertoire Summary
Number of elements in repertoire | 432 |
---|---|
Longest code point sequence | 1 |
Repertoire by Code Point
The following table lists the repertoire by code point (or code point sequence). The data in the Script and Name column are extracted from the Unicode character database. Where a comment in the original LGR is equal to the character name, it has been suppressed.
See also the legend provided below the table.
Code Point | Glyph | Script | Name | Ref | Tags | Comment |
---|---|---|---|---|---|---|
U+0114 | Ĕ | Latin | LATIN CAPITAL LETTER E WITH BREVE | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+0115 | ĕ | Latin | LATIN SMALL LETTER E WITH BREVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+012C | Ĭ | Latin | LATIN CAPITAL LETTER I WITH BREVE | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+012D | ĭ | Latin | LATIN SMALL LETTER I WITH BREVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+014E | Ŏ | Latin | LATIN CAPITAL LETTER O WITH BREVE | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+014F | ŏ | Latin | LATIN SMALL LETTER O WITH BREVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+0156 | Ŗ | Latin | LATIN CAPITAL LETTER R WITH CEDILLA | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+0157 | ŗ | Latin | LATIN SMALL LETTER R WITH CEDILLA | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+0162 | Ţ | Latin | LATIN CAPITAL LETTER T WITH CEDILLA | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+0163 | ţ | Latin | LATIN SMALL LETTER T WITH CEDILLA | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+01D5 | Ǖ | Latin | LATIN CAPITAL LETTER U WITH DIAERESIS AND MACRON | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+01D6 | ǖ | Latin | LATIN SMALL LETTER U WITH DIAERESIS AND MACRON | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+01D7 | Ǘ | Latin | LATIN CAPITAL LETTER U WITH DIAERESIS AND ACUTE | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+01D8 | ǘ | Latin | LATIN SMALL LETTER U WITH DIAERESIS AND ACUTE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+01D9 | Ǚ | Latin | LATIN CAPITAL LETTER U WITH DIAERESIS AND CARON | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+01DA | ǚ | Latin | LATIN SMALL LETTER U WITH DIAERESIS AND CARON | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+01DB | Ǜ | Latin | LATIN CAPITAL LETTER U WITH DIAERESIS AND GRAVE | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+01DC | ǜ | Latin | LATIN SMALL LETTER U WITH DIAERESIS AND GRAVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+01DE | Ǟ | Latin | LATIN CAPITAL LETTER A WITH DIAERESIS AND MACRON | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+01DF | ǟ | Latin | LATIN SMALL LETTER A WITH DIAERESIS AND MACRON | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+01E0 | Ǡ | Latin | LATIN CAPITAL LETTER A WITH DOT ABOVE AND MACRON | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+01E1 | ǡ | Latin | LATIN SMALL LETTER A WITH DOT ABOVE AND MACRON | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+01E2 | Ǣ | Latin | LATIN CAPITAL LETTER AE WITH MACRON | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+01E3 | ǣ | Latin | LATIN SMALL LETTER AE WITH MACRON | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+01EA | Ǫ | Latin | LATIN CAPITAL LETTER O WITH OGONEK | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+01EB | ǫ | Latin | LATIN SMALL LETTER O WITH OGONEK | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+01EC | Ǭ | Latin | LATIN CAPITAL LETTER O WITH OGONEK AND MACRON | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+01ED | ǭ | Latin | LATIN SMALL LETTER O WITH OGONEK AND MACRON | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+01F0 | ǰ | Latin | LATIN SMALL LETTER J WITH CARON | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+01F4 | Ǵ | Latin | LATIN CAPITAL LETTER G WITH ACUTE | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+01F5 | ǵ | Latin | LATIN SMALL LETTER G WITH ACUTE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+01F8 | Ǹ | Latin | LATIN CAPITAL LETTER N WITH GRAVE | [ID-REC] | Recommended | |
U+01F9 | ǹ | Latin | LATIN SMALL LETTER N WITH GRAVE | [ID-REC], [MSR], [YO-1], [YO-2] | Recommended | |
U+01FA | Ǻ | Latin | LATIN CAPITAL LETTER A WITH RING ABOVE AND ACUTE | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+01FB | ǻ | Latin | LATIN SMALL LETTER A WITH RING ABOVE AND ACUTE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+01FC | Ǽ | Latin | LATIN CAPITAL LETTER AE WITH ACUTE | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+01FD | ǽ | Latin | LATIN SMALL LETTER AE WITH ACUTE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+01FE | Ǿ | Latin | LATIN CAPITAL LETTER O WITH STROKE AND ACUTE | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+01FF | ǿ | Latin | LATIN SMALL LETTER O WITH STROKE AND ACUTE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+021E | Ȟ | Latin | LATIN CAPITAL LETTER H WITH CARON | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+021F | ȟ | Latin | LATIN SMALL LETTER H WITH CARON | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+0226 | Ȧ | Latin | LATIN CAPITAL LETTER A WITH DOT ABOVE | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+0227 | ȧ | Latin | LATIN SMALL LETTER A WITH DOT ABOVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+0228 | Ȩ | Latin | LATIN CAPITAL LETTER E WITH CEDILLA | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+0229 | ȩ | Latin | LATIN SMALL LETTER E WITH CEDILLA | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+022A | Ȫ | Latin | LATIN CAPITAL LETTER O WITH DIAERESIS AND MACRON | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+022B | ȫ | Latin | LATIN SMALL LETTER O WITH DIAERESIS AND MACRON | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+022C | Ȭ | Latin | LATIN CAPITAL LETTER O WITH TILDE AND MACRON | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+022D | ȭ | Latin | LATIN SMALL LETTER O WITH TILDE AND MACRON | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+022E | Ȯ | Latin | LATIN CAPITAL LETTER O WITH DOT ABOVE | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+022F | ȯ | Latin | LATIN SMALL LETTER O WITH DOT ABOVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+0230 | Ȱ | Latin | LATIN CAPITAL LETTER O WITH DOT ABOVE AND MACRON | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+0231 | ȱ | Latin | LATIN SMALL LETTER O WITH DOT ABOVE AND MACRON | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+0232 | Ȳ | Latin | LATIN CAPITAL LETTER Y WITH MACRON | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+0233 | ȳ | Latin | LATIN SMALL LETTER Y WITH MACRON | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+0400 | Ѐ | Cyrillic | CYRILLIC CAPITAL LETTER IE WITH GRAVE | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+040D | Ѝ | Cyrillic | CYRILLIC CAPITAL LETTER I WITH GRAVE | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+0450 | ѐ | Cyrillic | CYRILLIC SMALL LETTER IE WITH GRAVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+045D | ѝ | Cyrillic | CYRILLIC SMALL LETTER I WITH GRAVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+04C1 | Ӂ | Cyrillic | CYRILLIC CAPITAL LETTER ZHE WITH BREVE | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+04C2 | ӂ | Cyrillic | CYRILLIC SMALL LETTER ZHE WITH BREVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+04CB | Ӌ | Cyrillic | CYRILLIC CAPITAL LETTER KHAKASSIAN CHE | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+04CC | ӌ | Cyrillic | CYRILLIC SMALL LETTER KHAKASSIAN CHE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+04DA | Ӛ | Cyrillic | CYRILLIC CAPITAL LETTER SCHWA WITH DIAERESIS | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+04DB | ӛ | Cyrillic | CYRILLIC SMALL LETTER SCHWA WITH DIAERESIS | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+04EA | Ӫ | Cyrillic | CYRILLIC CAPITAL LETTER BARRED O WITH DIAERESIS | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+04EB | ӫ | Cyrillic | CYRILLIC SMALL LETTER BARRED O WITH DIAERESIS | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+04EC | Ӭ | Cyrillic | CYRILLIC CAPITAL LETTER E WITH DIAERESIS | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+04ED | ӭ | Cyrillic | CYRILLIC SMALL LETTER E WITH DIAERESIS | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+05B4 | ִ | Hebrew | HEBREW POINT HIRIQ | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+05F0 | װ | Hebrew | HEBREW LIGATURE YIDDISH DOUBLE VAV | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+05F1 | ױ | Hebrew | HEBREW LIGATURE YIDDISH VAV YOD | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+05F2 | ײ | Hebrew | HEBREW LIGATURE YIDDISH DOUBLE YOD | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+064B | ً | Inherited | ARABIC FATHATAN | [ID-REC], [MSR] | ArabicCombining, Proposed:Technical | Status of Arabic Combining Mark requires separate review |
U+064C | ٌ | Inherited | ARABIC DAMMATAN | [ID-REC], [MSR] | ArabicCombining, Proposed:Technical | Status of Arabic Combining Mark requires separate review |
U+064D | ٍ | Inherited | ARABIC KASRATAN | [ID-REC], [MSR] | ArabicCombining, Proposed:Technical | Status of Arabic Combining Mark requires separate review |
U+064E | َ | Inherited | ARABIC FATHA | [AZ], [ID-REC], [MSR] | ArabicCombining, Proposed:Technical | Status of Arabic Combining Mark requires separate review |
U+064F | ُ | Inherited | ARABIC DAMMA | [ID-REC], [MSR] | ArabicCombining, Proposed:Technical | Status of Arabic Combining Mark requires separate review |
U+0650 | ِ | Inherited | ARABIC KASRA | [ID-REC], [MSR] | ArabicCombining, Proposed:Technical | Status of Arabic Combining Mark requires separate review |
U+0651 | ّ | Inherited | ARABIC SHADDA | [ID-REC], [MSR] | ArabicCombining, Proposed:Technical | Status of Arabic Combining Mark requires separate review |
U+0652 | ْ | Inherited | ARABIC SUKUN | [ID-REC], [MSR] | ArabicCombining, Proposed:Technical | Status of Arabic Combining Mark requires separate review |
U+0654 | ٔ | Inherited | ARABIC HAMZA ABOVE | [ID-REC], [MSR] | ArabicCombining, Proposed:Technical | Status of Arabic Combining Mark requires separate review |
U+0655 | ٕ | Inherited | ARABIC HAMZA BELOW | [ID-REC], [MSR] | ArabicCombining, Proposed:Technical | Status of Arabic Combining Mark requires separate review |
U+0670 | ٰ | Inherited | ARABIC LETTER SUPERSCRIPT ALEF | [ID-REC], [MSR] | ArabicCombining, Proposed:Technical | Status of Arabic Combining Mark requires separate review |
U+0671 | ٱ | Arabic | ARABIC LETTER ALEF WASLA | [ID-REC], [MSR] | Proposed:Technical | Should become Technical, based on Quranic use |
U+0674 | ٴ | Arabic | ARABIC LETTER HIGH HAMZA | [ID-REC], [KK], [MSR] | Recommended | Kazakh |
U+0682 | ڂ | Arabic | ARABIC LETTER HAH WITH TWO DOTS VERTICAL ABOVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+0690 | ڐ | Arabic | ARABIC LETTER DAL WITH FOUR DOTS ABOVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+0692 | ڒ | Arabic | ARABIC LETTER REH WITH SMALL V | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+0694 | ڔ | Arabic | ARABIC LETTER REH WITH DOT BELOW | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+069B | ڛ | Arabic | ARABIC LETTER SEEN WITH THREE DOTS BELOW | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+069C | ڜ | Arabic | ARABIC LETTER SEEN WITH THREE DOTS BELOW AND THREE DOTS ABOVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+069D | ڝ | Arabic | ARABIC LETTER SAD WITH TWO DOTS BELOW | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+069E | ڞ | Arabic | ARABIC LETTER SAD WITH THREE DOTS ABOVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+06A1 | ڡ | Arabic | ARABIC LETTER DOTLESS FEH | [ID-REC] | Proposed:Uncommon_Use | Not in documented common use |
U+06A3 | ڣ | Arabic | ARABIC LETTER FEH WITH DOT BELOW | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+06A5 | ڥ | Arabic | ARABIC LETTER FEH WITH THREE DOTS BELOW | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+06B2 | ڲ | Arabic | ARABIC LETTER GAF WITH TWO DOTS BELOW | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+06B4 | ڴ | Arabic | ARABIC LETTER GAF WITH THREE DOTS ABOVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+06B6 | ڶ | Arabic | ARABIC LETTER LAM WITH DOT ABOVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+06B7 | ڷ | Arabic | ARABIC LETTER LAM WITH THREE DOTS ABOVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+06B8 | ڸ | Arabic | ARABIC LETTER LAM WITH THREE DOTS BELOW | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+06B9 | ڹ | Arabic | ARABIC LETTER NOON WITH DOT BELOW | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+06BF | ڿ | Arabic | ARABIC LETTER TCHEH WITH DOT ABOVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+06C5 | ۅ | Arabic | ARABIC LETTER KIRGHIZ OE | [ID-REC], [KY], [MSR] | Recommended | Kyrgyz |
U+06C7 | ۇ | Arabic | ARABIC LETTER U | [ID-REC], [KK], [KY], [MSR], [UG] | Recommended | Kazakh, Kyrgyz, Uyghur |
U+06C8 | ۈ | Arabic | ARABIC LETTER YU | [AZ], [ID-REC], [MSR], [UG] | Recommended | Azerbaijani, Uyghur |
U+06C9 | ۉ | Arabic | ARABIC LETTER KIRGHIZ YU | [ID-REC], [KY], [MSR] | Recommended | Kyrgyz |
U+06CA | ۊ | Arabic | ARABIC LETTER WAW WITH TWO DOTS ABOVE | [ID-REC], [LRC], [MSR] | Recommended | Luri |
U+06D3 | ۓ | Arabic | ARABIC LETTER YEH BARREE WITH HAMZA ABOVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+06EE | ۮ | Arabic | ARABIC LETTER DAL WITH INVERTED V | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+06EF | ۯ | Arabic | ARABIC LETTER REH WITH INVERTED V | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+06FA | ۺ | Arabic | ARABIC LETTER SHEEN WITH DOT BELOW | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+06FB | ۻ | Arabic | ARABIC LETTER DAD WITH DOT BELOW | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+06FC | ۼ | Arabic | ARABIC LETTER GHAIN WITH DOT BELOW | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+06FF | ۿ | Arabic | ARABIC LETTER HEH WITH INVERTED V | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+0750 | ݐ | Arabic | ARABIC LETTER BEH WITH THREE DOTS HORIZONTALLY BELOW | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+0753 | ݓ | Arabic | ARABIC LETTER BEH WITH THREE DOTS POINTING UPWARDS BELOW AND TWO DOTS ABOVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+0754 | ݔ | Arabic | ARABIC LETTER BEH WITH TWO DOTS BELOW AND DOT ABOVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+0755 | ݕ | Arabic | ARABIC LETTER BEH WITH INVERTED SMALL V BELOW | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+0757 | ݗ | Arabic | ARABIC LETTER HAH WITH TWO DOTS ABOVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+0758 | ݘ | Arabic | ARABIC LETTER HAH WITH THREE DOTS POINTING UPWARDS BELOW | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+0759 | ݙ | Arabic | ARABIC LETTER DAL WITH TWO DOTS VERTICALLY BELOW AND SMALL TAH | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+075A | ݚ | Arabic | ARABIC LETTER DAL WITH INVERTED SMALL V BELOW | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+075B | ݛ | Arabic | ARABIC LETTER REH WITH STROKE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+075C | ݜ | Arabic | ARABIC LETTER SEEN WITH FOUR DOTS ABOVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+075D | ݝ | Arabic | ARABIC LETTER AIN WITH TWO DOTS ABOVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+075E | ݞ | Arabic | ARABIC LETTER AIN WITH THREE DOTS POINTING DOWNWARDS ABOVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+075F | ݟ | Arabic | ARABIC LETTER AIN WITH TWO DOTS VERTICALLY ABOVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+0761 | ݡ | Arabic | ARABIC LETTER FEH WITH THREE DOTS POINTING UPWARDS BELOW | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+0764 | ݤ | Arabic | ARABIC LETTER KEHEH WITH THREE DOTS POINTING UPWARDS BELOW | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+0765 | ݥ | Arabic | ARABIC LETTER MEEM WITH DOT ABOVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+0769 | ݩ | Arabic | ARABIC LETTER NOON WITH SMALL V | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+076B | ݫ | Arabic | ARABIC LETTER REH WITH TWO DOTS VERTICALLY ABOVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+076C | ݬ | Arabic | ARABIC LETTER REH WITH HAMZA ABOVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+076D | ݭ | Arabic | ARABIC LETTER SEEN WITH TWO DOTS VERTICALLY ABOVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+0772 | ݲ | Arabic | ARABIC LETTER HAH WITH SMALL ARABIC LETTER TAH ABOVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+0773 | ݳ | Arabic | ARABIC LETTER ALEF WITH EXTENDED ARABIC-INDIC DIGIT TWO ABOVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+0774 | ݴ | Arabic | ARABIC LETTER ALEF WITH EXTENDED ARABIC-INDIC DIGIT THREE ABOVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+0775 | ݵ | Arabic | ARABIC LETTER FARSI YEH WITH EXTENDED ARABIC-INDIC DIGIT TWO ABOVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+0776 | ݶ | Arabic | ARABIC LETTER FARSI YEH WITH EXTENDED ARABIC-INDIC DIGIT THREE ABOVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+0777 | ݷ | Arabic | ARABIC LETTER FARSI YEH WITH EXTENDED ARABIC-INDIC DIGIT FOUR BELOW | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+0778 | ݸ | Arabic | ARABIC LETTER WAW WITH EXTENDED ARABIC-INDIC DIGIT TWO ABOVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+0779 | ݹ | Arabic | ARABIC LETTER WAW WITH EXTENDED ARABIC-INDIC DIGIT THREE ABOVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+077A | ݺ | Arabic | ARABIC LETTER YEH BARREE WITH EXTENDED ARABIC-INDIC DIGIT TWO ABOVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+077B | ݻ | Arabic | ARABIC LETTER YEH BARREE WITH EXTENDED ARABIC-INDIC DIGIT THREE ABOVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+077C | ݼ | Arabic | ARABIC LETTER HAH WITH EXTENDED ARABIC-INDIC DIGIT FOUR BELOW | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+077D | ݽ | Arabic | ARABIC LETTER SEEN WITH EXTENDED ARABIC-INDIC DIGIT FOUR ABOVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+08A1 | ࢡ | Arabic | ARABIC LETTER BEH WITH HAMZA ABOVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+08AA | ࢪ | Arabic | ARABIC LETTER REH WITH LOOP | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+08AB | ࢫ | Arabic | ARABIC LETTER WAW WITH DOT WITHIN | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+08AC | ࢬ | Arabic | ARABIC LETTER ROHINGYA YEH | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+0904 | ऄ | Devanagari | DEVANAGARI LETTER SHORT A | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+090C | ऌ | Devanagari | DEVANAGARI LETTER VOCALIC L | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+0929 | ऩ | Devanagari | DEVANAGARI LETTER NNNA | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+0934 | ऴ | Devanagari | DEVANAGARI LETTER LLLA | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+0944 | ॄ | Devanagari | DEVANAGARI VOWEL SIGN VOCALIC RR | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+0979 | ॹ | Devanagari | DEVANAGARI LETTER ZHA | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+097A | ॺ | Devanagari | DEVANAGARI LETTER HEAVY YA | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+098C | ঌ | Bengali | BENGALI LETTER VOCALIC L | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+09D7 | ৗ | Bengali | BENGALI AU LENGTH MARK | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+0A03 | ਃ | Gurmukhi | GURMUKHI SIGN VISARGA | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+0A66 | ੦ | Gurmukhi | GURMUKHI DIGIT ZERO | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+0A67 | ੧ | Gurmukhi | GURMUKHI DIGIT ONE | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+0A68 | ੨ | Gurmukhi | GURMUKHI DIGIT TWO | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+0A69 | ੩ | Gurmukhi | GURMUKHI DIGIT THREE | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+0A6A | ੪ | Gurmukhi | GURMUKHI DIGIT FOUR | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+0A6B | ੫ | Gurmukhi | GURMUKHI DIGIT FIVE | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+0A6C | ੬ | Gurmukhi | GURMUKHI DIGIT SIX | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+0A6D | ੭ | Gurmukhi | GURMUKHI DIGIT SEVEN | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+0A6E | ੮ | Gurmukhi | GURMUKHI DIGIT EIGHT | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+0A6F | ੯ | Gurmukhi | GURMUKHI DIGIT NINE | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+0A72 | ੲ | Gurmukhi | GURMUKHI IRI | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+0A73 | ੳ | Gurmukhi | GURMUKHI URA | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+0A81 | ઁ | Gujarati | GUJARATI SIGN CANDRABINDU | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+0B0C | ଌ | Oriya | ORIYA LETTER VOCALIC L | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+0B35 | ଵ | Oriya | ORIYA LETTER VA | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+0B57 | ୗ | Oriya | ORIYA AU LENGTH MARK | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+0B66 | ୦ | Oriya | ORIYA DIGIT ZERO | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+0B67 | ୧ | Oriya | ORIYA DIGIT ONE | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+0B68 | ୨ | Oriya | ORIYA DIGIT TWO | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+0B69 | ୩ | Oriya | ORIYA DIGIT THREE | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+0B6A | ୪ | Oriya | ORIYA DIGIT FOUR | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+0B6B | ୫ | Oriya | ORIYA DIGIT FIVE | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+0B6C | ୬ | Oriya | ORIYA DIGIT SIX | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+0B6D | ୭ | Oriya | ORIYA DIGIT SEVEN | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+0B6E | ୮ | Oriya | ORIYA DIGIT EIGHT | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+0B6F | ୯ | Oriya | ORIYA DIGIT NINE | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+0BD7 | ௗ | Tamil | TAMIL AU LENGTH MARK | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+0BE6 | ௦ | Tamil | TAMIL DIGIT ZERO | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+0BE7 | ௧ | Tamil | TAMIL DIGIT ONE | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+0BE8 | ௨ | Tamil | TAMIL DIGIT TWO | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+0BE9 | ௩ | Tamil | TAMIL DIGIT THREE | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+0BEA | ௪ | Tamil | TAMIL DIGIT FOUR | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+0BEB | ௫ | Tamil | TAMIL DIGIT FIVE | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+0BEC | ௬ | Tamil | TAMIL DIGIT SIX | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+0BED | ௭ | Tamil | TAMIL DIGIT SEVEN | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+0BEE | ௮ | Tamil | TAMIL DIGIT EIGHT | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+0BEF | ௯ | Tamil | TAMIL DIGIT NINE | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+0C0C | ఌ | Telugu | TELUGU LETTER VOCALIC L | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+0C31 | ఱ | Telugu | TELUGU LETTER RRA | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+0C55 | ౕ | Telugu | TELUGU LENGTH MARK | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+0C56 | ౖ | Telugu | TELUGU AI LENGTH MARK | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+0C66 | ౦ | Telugu | TELUGU DIGIT ZERO | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+0C67 | ౧ | Telugu | TELUGU DIGIT ONE | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+0C68 | ౨ | Telugu | TELUGU DIGIT TWO | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+0C69 | ౩ | Telugu | TELUGU DIGIT THREE | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+0C6A | ౪ | Telugu | TELUGU DIGIT FOUR | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+0C6B | ౫ | Telugu | TELUGU DIGIT FIVE | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+0C6C | ౬ | Telugu | TELUGU DIGIT SIX | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+0C6D | ౭ | Telugu | TELUGU DIGIT SEVEN | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+0C6E | ౮ | Telugu | TELUGU DIGIT EIGHT | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+0C6F | ౯ | Telugu | TELUGU DIGIT NINE | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+0C8C | ಌ | Kannada | KANNADA LETTER VOCALIC L | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+0CB1 | ಱ | Kannada | KANNADA LETTER RRA | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+0CBC | ಼ | Kannada | KANNADA SIGN NUKTA | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+0CC4 | ೄ | Kannada | KANNADA VOWEL SIGN VOCALIC RR | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+0CD5 | ೕ | Kannada | KANNADA LENGTH MARK | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+0CD6 | ೖ | Kannada | KANNADA AI LENGTH MARK | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+0D0C | ഌ | Malayalam | MALAYALAM LETTER VOCALIC L | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+0D29 | ഩ | Malayalam | MALAYALAM LETTER NNNA | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+0D66 | ൦ | Malayalam | MALAYALAM DIGIT ZERO | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+0D67 | ൧ | Malayalam | MALAYALAM DIGIT ONE | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+0D68 | ൨ | Malayalam | MALAYALAM DIGIT TWO | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+0D69 | ൩ | Malayalam | MALAYALAM DIGIT THREE | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+0D6A | ൪ | Malayalam | MALAYALAM DIGIT FOUR | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+0D6B | ൫ | Malayalam | MALAYALAM DIGIT FIVE | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+0D6C | ൬ | Malayalam | MALAYALAM DIGIT SIX | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+0D6D | ൭ | Malayalam | MALAYALAM DIGIT SEVEN | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+0D6E | ൮ | Malayalam | MALAYALAM DIGIT EIGHT | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+0D6F | ൯ | Malayalam | MALAYALAM DIGIT NINE | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+0D8E | ඎ | Sinhala | SINHALA LETTER IRUUYANNA | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+0D9E | ඞ | Sinhala | SINHALA LETTER KANTAJA NAASIKYAYA | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+0DE6 | ෦ | Sinhala | SINHALA LITH DIGIT ZERO | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+0DE7 | ෧ | Sinhala | SINHALA LITH DIGIT ONE | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+0DE8 | ෨ | Sinhala | SINHALA LITH DIGIT TWO | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+0DE9 | ෩ | Sinhala | SINHALA LITH DIGIT THREE | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+0DEA | ෪ | Sinhala | SINHALA LITH DIGIT FOUR | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+0DEB | ෫ | Sinhala | SINHALA LITH DIGIT FIVE | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+0DEC | ෬ | Sinhala | SINHALA LITH DIGIT SIX | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+0DED | ෭ | Sinhala | SINHALA LITH DIGIT SEVEN | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+0DEE | ෮ | Sinhala | SINHALA LITH DIGIT EIGHT | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+0DEF | ෯ | Sinhala | SINHALA LITH DIGIT NINE | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+0E4E | ๎ | Thai | THAI CHARACTER YAMAKKAN | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+0EDE | ໞ | Lao | LAO LETTER KHMU GO | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+0EDF | ໟ | Lao | LAO LETTER KHMU NYO | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+108B | ႋ | Myanmar | MYANMAR SIGN SHAN COUNCIL TONE-2 | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+108C | ႌ | Myanmar | MYANMAR SIGN SHAN COUNCIL TONE-3 | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+108D | ႍ | Myanmar | MYANMAR SIGN SHAN COUNCIL EMPHATIC TONE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+1090 | ႐ | Myanmar | MYANMAR SHAN DIGIT ZERO | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+1091 | ႑ | Myanmar | MYANMAR SHAN DIGIT ONE | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+1092 | ႒ | Myanmar | MYANMAR SHAN DIGIT TWO | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+1093 | ႓ | Myanmar | MYANMAR SHAN DIGIT THREE | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+1094 | ႔ | Myanmar | MYANMAR SHAN DIGIT FOUR | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+1095 | ႕ | Myanmar | MYANMAR SHAN DIGIT FIVE | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+1096 | ႖ | Myanmar | MYANMAR SHAN DIGIT SIX | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+1097 | ႗ | Myanmar | MYANMAR SHAN DIGIT SEVEN | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+1098 | ႘ | Myanmar | MYANMAR SHAN DIGIT EIGHT | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+1099 | ႙ | Myanmar | MYANMAR SHAN DIGIT NINE | [ID-REC] | Proposed:Uncommon_Use | Native digits not in common use |
U+10F7 | ჷ | Georgian | GEORGIAN LETTER YN | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+10F8 | ჸ | Georgian | GEORGIAN LETTER ELIFI | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+1207 | ሇ | Ethiopic | ETHIOPIC SYLLABLE HOA | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+1287 | ኇ | Ethiopic | ETHIOPIC SYLLABLE XOA | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+12AF | ኯ | Ethiopic | ETHIOPIC SYLLABLE KOA | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+12F8 | ዸ | Ethiopic | ETHIOPIC SYLLABLE DDA | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+12F9 | ዹ | Ethiopic | ETHIOPIC SYLLABLE DDU | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+12FA | ዺ | Ethiopic | ETHIOPIC SYLLABLE DDI | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+12FB | ዻ | Ethiopic | ETHIOPIC SYLLABLE DDAA | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+12FC | ዼ | Ethiopic | ETHIOPIC SYLLABLE DDEE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+12FD | ዽ | Ethiopic | ETHIOPIC SYLLABLE DDE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+12FE | ዾ | Ethiopic | ETHIOPIC SYLLABLE DDO | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+12FF | ዿ | Ethiopic | ETHIOPIC SYLLABLE DDWA | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+130F | ጏ | Ethiopic | ETHIOPIC SYLLABLE GOA | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+131F | ጟ | Ethiopic | ETHIOPIC SYLLABLE GGWAA | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+1347 | ፇ | Ethiopic | ETHIOPIC SYLLABLE TZOA | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+135A | ፚ | Ethiopic | ETHIOPIC SYLLABLE FYA | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+135D | ፝ | Ethiopic | ETHIOPIC COMBINING GEMINATION AND VOWEL LENGTH MARK | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+135E | ፞ | Ethiopic | ETHIOPIC COMBINING VOWEL LENGTH MARK | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+135F | ፟ | Ethiopic | ETHIOPIC COMBINING GEMINATION MARK | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+179D | ឝ | Khmer | KHMER LETTER SHA | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+179E | ឞ | Khmer | KHMER LETTER SSO | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+17A9 | ឩ | Khmer | KHMER INDEPENDENT VOWEL QUU | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+17B2 | ឲ | Khmer | KHMER INDEPENDENT VOWEL QOO TYPE TWO | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+17D7 | ៗ | Khmer | KHMER SIGN LEK TOO | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+1E02 | Ḃ | Latin | LATIN CAPITAL LETTER B WITH DOT ABOVE | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+1E03 | ḃ | Latin | LATIN SMALL LETTER B WITH DOT ABOVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+1E04 | Ḅ | Latin | LATIN CAPITAL LETTER B WITH DOT BELOW | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+1E05 | ḅ | Latin | LATIN SMALL LETTER B WITH DOT BELOW | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+1E06 | Ḇ | Latin | LATIN CAPITAL LETTER B WITH LINE BELOW | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+1E07 | ḇ | Latin | LATIN SMALL LETTER B WITH LINE BELOW | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+1E08 | Ḉ | Latin | LATIN CAPITAL LETTER C WITH CEDILLA AND ACUTE | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+1E09 | ḉ | Latin | LATIN SMALL LETTER C WITH CEDILLA AND ACUTE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+1E0A | Ḋ | Latin | LATIN CAPITAL LETTER D WITH DOT ABOVE | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+1E0B | ḋ | Latin | LATIN SMALL LETTER D WITH DOT ABOVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+1E0C | Ḍ | Latin | LATIN CAPITAL LETTER D WITH DOT BELOW | [ID-REC] | Recommended | |
U+1E0D | ḍ | Latin | LATIN SMALL LETTER D WITH DOT BELOW | [ID-REC], [KAB], [MSR] | Recommended | |
U+1E0E | Ḏ | Latin | LATIN CAPITAL LETTER D WITH LINE BELOW | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+1E0F | ḏ | Latin | LATIN SMALL LETTER D WITH LINE BELOW | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+1E10 | Ḑ | Latin | LATIN CAPITAL LETTER D WITH CEDILLA | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+1E11 | ḑ | Latin | LATIN SMALL LETTER D WITH CEDILLA | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+1E14 | Ḕ | Latin | LATIN CAPITAL LETTER E WITH MACRON AND GRAVE | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+1E15 | ḕ | Latin | LATIN SMALL LETTER E WITH MACRON AND GRAVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+1E16 | Ḗ | Latin | LATIN CAPITAL LETTER E WITH MACRON AND ACUTE | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+1E17 | ḗ | Latin | LATIN SMALL LETTER E WITH MACRON AND ACUTE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+1E1C | Ḝ | Latin | LATIN CAPITAL LETTER E WITH CEDILLA AND BREVE | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+1E1D | ḝ | Latin | LATIN SMALL LETTER E WITH CEDILLA AND BREVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+1E1E | Ḟ | Latin | LATIN CAPITAL LETTER F WITH DOT ABOVE | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+1E1F | ḟ | Latin | LATIN SMALL LETTER F WITH DOT ABOVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+1E22 | Ḣ | Latin | LATIN CAPITAL LETTER H WITH DOT ABOVE | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+1E23 | ḣ | Latin | LATIN SMALL LETTER H WITH DOT ABOVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+1E24 | Ḥ | Latin | LATIN CAPITAL LETTER H WITH DOT BELOW | [ID-REC] | Recommended | |
U+1E25 | ḥ | Latin | LATIN SMALL LETTER H WITH DOT BELOW | [ID-REC], [KAB], [MSR] | Recommended | |
U+1E26 | Ḧ | Latin | LATIN CAPITAL LETTER H WITH DIAERESIS | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+1E27 | ḧ | Latin | LATIN SMALL LETTER H WITH DIAERESIS | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+1E28 | Ḩ | Latin | LATIN CAPITAL LETTER H WITH CEDILLA | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+1E29 | ḩ | Latin | LATIN SMALL LETTER H WITH CEDILLA | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+1E2E | Ḯ | Latin | LATIN CAPITAL LETTER I WITH DIAERESIS AND ACUTE | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+1E2F | ḯ | Latin | LATIN SMALL LETTER I WITH DIAERESIS AND ACUTE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+1E30 | Ḱ | Latin | LATIN CAPITAL LETTER K WITH ACUTE | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+1E31 | ḱ | Latin | LATIN SMALL LETTER K WITH ACUTE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+1E32 | Ḳ | Latin | LATIN CAPITAL LETTER K WITH DOT BELOW | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+1E33 | ḳ | Latin | LATIN SMALL LETTER K WITH DOT BELOW | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+1E34 | Ḵ | Latin | LATIN CAPITAL LETTER K WITH LINE BELOW | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+1E35 | ḵ | Latin | LATIN SMALL LETTER K WITH LINE BELOW | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+1E38 | Ḹ | Latin | LATIN CAPITAL LETTER L WITH DOT BELOW AND MACRON | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+1E39 | ḹ | Latin | LATIN SMALL LETTER L WITH DOT BELOW AND MACRON | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+1E3A | Ḻ | Latin | LATIN CAPITAL LETTER L WITH LINE BELOW | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+1E3B | ḻ | Latin | LATIN SMALL LETTER L WITH LINE BELOW | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+1E3E | Ḿ | Latin | LATIN CAPITAL LETTER M WITH ACUTE | [ID-REC] | Recommended | |
U+1E3F | ḿ | Latin | LATIN SMALL LETTER M WITH ACUTE | [ID-REC], [MSR], [YO-1], [YO-2] | Recommended | |
U+1E40 | Ṁ | Latin | LATIN CAPITAL LETTER M WITH DOT ABOVE | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+1E41 | ṁ | Latin | LATIN SMALL LETTER M WITH DOT ABOVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+1E4C | Ṍ | Latin | LATIN CAPITAL LETTER O WITH TILDE AND ACUTE | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+1E4D | ṍ | Latin | LATIN SMALL LETTER O WITH TILDE AND ACUTE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+1E4E | Ṏ | Latin | LATIN CAPITAL LETTER O WITH TILDE AND DIAERESIS | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+1E4F | ṏ | Latin | LATIN SMALL LETTER O WITH TILDE AND DIAERESIS | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+1E50 | Ṑ | Latin | LATIN CAPITAL LETTER O WITH MACRON AND GRAVE | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+1E51 | ṑ | Latin | LATIN SMALL LETTER O WITH MACRON AND GRAVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+1E52 | Ṓ | Latin | LATIN CAPITAL LETTER O WITH MACRON AND ACUTE | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+1E53 | ṓ | Latin | LATIN SMALL LETTER O WITH MACRON AND ACUTE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+1E54 | Ṕ | Latin | LATIN CAPITAL LETTER P WITH ACUTE | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+1E55 | ṕ | Latin | LATIN SMALL LETTER P WITH ACUTE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+1E56 | Ṗ | Latin | LATIN CAPITAL LETTER P WITH DOT ABOVE | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+1E57 | ṗ | Latin | LATIN SMALL LETTER P WITH DOT ABOVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+1E58 | Ṙ | Latin | LATIN CAPITAL LETTER R WITH DOT ABOVE | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+1E59 | ṙ | Latin | LATIN SMALL LETTER R WITH DOT ABOVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+1E5A | Ṛ | Latin | LATIN CAPITAL LETTER R WITH DOT BELOW | [ID-REC] | Recommended | |
U+1E5B | ṛ | Latin | LATIN SMALL LETTER R WITH DOT BELOW | [ID-REC], [KAB], [MSR] | Recommended | |
U+1E5C | Ṝ | Latin | LATIN CAPITAL LETTER R WITH DOT BELOW AND MACRON | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+1E5D | ṝ | Latin | LATIN SMALL LETTER R WITH DOT BELOW AND MACRON | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+1E5E | Ṟ | Latin | LATIN CAPITAL LETTER R WITH LINE BELOW | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+1E5F | ṟ | Latin | LATIN SMALL LETTER R WITH LINE BELOW | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+1E60 | Ṡ | Latin | LATIN CAPITAL LETTER S WITH DOT ABOVE | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+1E61 | ṡ | Latin | LATIN SMALL LETTER S WITH DOT ABOVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+1E64 | Ṥ | Latin | LATIN CAPITAL LETTER S WITH ACUTE AND DOT ABOVE | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+1E65 | ṥ | Latin | LATIN SMALL LETTER S WITH ACUTE AND DOT ABOVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+1E66 | Ṧ | Latin | LATIN CAPITAL LETTER S WITH CARON AND DOT ABOVE | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+1E67 | ṧ | Latin | LATIN SMALL LETTER S WITH CARON AND DOT ABOVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+1E68 | Ṩ | Latin | LATIN CAPITAL LETTER S WITH DOT BELOW AND DOT ABOVE | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+1E69 | ṩ | Latin | LATIN SMALL LETTER S WITH DOT BELOW AND DOT ABOVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+1E6A | Ṫ | Latin | LATIN CAPITAL LETTER T WITH DOT ABOVE | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+1E6B | ṫ | Latin | LATIN SMALL LETTER T WITH DOT ABOVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+1E6E | Ṯ | Latin | LATIN CAPITAL LETTER T WITH LINE BELOW | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+1E6F | ṯ | Latin | LATIN SMALL LETTER T WITH LINE BELOW | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+1E78 | Ṹ | Latin | LATIN CAPITAL LETTER U WITH TILDE AND ACUTE | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+1E79 | ṹ | Latin | LATIN SMALL LETTER U WITH TILDE AND ACUTE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+1E7A | Ṻ | Latin | LATIN CAPITAL LETTER U WITH MACRON AND DIAERESIS | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+1E7B | ṻ | Latin | LATIN SMALL LETTER U WITH MACRON AND DIAERESIS | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+1E7C | Ṽ | Latin | LATIN CAPITAL LETTER V WITH TILDE | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+1E7D | ṽ | Latin | LATIN SMALL LETTER V WITH TILDE | [ID-REC], [MSR], [MUA] | Proposed:Uncommon_Use | Not in documented common use |
U+1E7E | Ṿ | Latin | LATIN CAPITAL LETTER V WITH DOT BELOW | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+1E7F | ṿ | Latin | LATIN SMALL LETTER V WITH DOT BELOW | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+1E80 | Ẁ | Latin | LATIN CAPITAL LETTER W WITH GRAVE | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+1E81 | ẁ | Latin | LATIN SMALL LETTER W WITH GRAVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+1E82 | Ẃ | Latin | LATIN CAPITAL LETTER W WITH ACUTE | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+1E83 | ẃ | Latin | LATIN SMALL LETTER W WITH ACUTE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+1E84 | Ẅ | Latin | LATIN CAPITAL LETTER W WITH DIAERESIS | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+1E85 | ẅ | Latin | LATIN SMALL LETTER W WITH DIAERESIS | [ID-REC], [MSR], [NNH] | Proposed:Uncommon_Use | Not in documented common use |
U+1E86 | Ẇ | Latin | LATIN CAPITAL LETTER W WITH DOT ABOVE | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+1E87 | ẇ | Latin | LATIN SMALL LETTER W WITH DOT ABOVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+1E88 | Ẉ | Latin | LATIN CAPITAL LETTER W WITH DOT BELOW | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+1E89 | ẉ | Latin | LATIN SMALL LETTER W WITH DOT BELOW | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+1E8A | Ẋ | Latin | LATIN CAPITAL LETTER X WITH DOT ABOVE | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+1E8B | ẋ | Latin | LATIN SMALL LETTER X WITH DOT ABOVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+1E8E | Ẏ | Latin | LATIN CAPITAL LETTER Y WITH DOT ABOVE | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+1E8F | ẏ | Latin | LATIN SMALL LETTER Y WITH DOT ABOVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+1E90 | Ẑ | Latin | LATIN CAPITAL LETTER Z WITH CIRCUMFLEX | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+1E91 | ẑ | Latin | LATIN SMALL LETTER Z WITH CIRCUMFLEX | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+1E92 | Ẓ | Latin | LATIN CAPITAL LETTER Z WITH DOT BELOW | [ID-REC] | Recommended | |
U+1E93 | ẓ | Latin | LATIN SMALL LETTER Z WITH DOT BELOW | [ID-REC], [KAB], [MSR] | Recommended | |
U+1E94 | Ẕ | Latin | LATIN CAPITAL LETTER Z WITH LINE BELOW | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+1E95 | ẕ | Latin | LATIN SMALL LETTER Z WITH LINE BELOW | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+1E96 | ẖ | Latin | LATIN SMALL LETTER H WITH LINE BELOW | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+1E97 | ẗ | Latin | LATIN SMALL LETTER T WITH DIAERESIS | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+1E98 | ẘ | Latin | LATIN SMALL LETTER W WITH RING ABOVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+1E99 | ẙ | Latin | LATIN SMALL LETTER Y WITH RING ABOVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+2D80 | ⶀ | Ethiopic | ETHIOPIC SYLLABLE LOA | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+2D81 | ⶁ | Ethiopic | ETHIOPIC SYLLABLE MOA | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+2D82 | ⶂ | Ethiopic | ETHIOPIC SYLLABLE ROA | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+2D83 | ⶃ | Ethiopic | ETHIOPIC SYLLABLE SOA | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+2D84 | ⶄ | Ethiopic | ETHIOPIC SYLLABLE SHOA | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+2D85 | ⶅ | Ethiopic | ETHIOPIC SYLLABLE BOA | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+2D86 | ⶆ | Ethiopic | ETHIOPIC SYLLABLE TOA | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+2D87 | ⶇ | Ethiopic | ETHIOPIC SYLLABLE COA | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+2D88 | ⶈ | Ethiopic | ETHIOPIC SYLLABLE NOA | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+2D89 | ⶉ | Ethiopic | ETHIOPIC SYLLABLE NYOA | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+2D8A | ⶊ | Ethiopic | ETHIOPIC SYLLABLE GLOTTAL OA | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+2D8B | ⶋ | Ethiopic | ETHIOPIC SYLLABLE ZOA | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+2D8C | ⶌ | Ethiopic | ETHIOPIC SYLLABLE DOA | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+2D8D | ⶍ | Ethiopic | ETHIOPIC SYLLABLE DDOA | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+2D8E | ⶎ | Ethiopic | ETHIOPIC SYLLABLE JOA | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+2D8F | ⶏ | Ethiopic | ETHIOPIC SYLLABLE THOA | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+2D90 | ⶐ | Ethiopic | ETHIOPIC SYLLABLE CHOA | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+2D91 | ⶑ | Ethiopic | ETHIOPIC SYLLABLE PHOA | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+2D92 | ⶒ | Ethiopic | ETHIOPIC SYLLABLE POA | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+2D93 | ⶓ | Ethiopic | ETHIOPIC SYLLABLE GGWA | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+2D94 | ⶔ | Ethiopic | ETHIOPIC SYLLABLE GGWI | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+2D95 | ⶕ | Ethiopic | ETHIOPIC SYLLABLE GGWEE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+2D96 | ⶖ | Ethiopic | ETHIOPIC SYLLABLE GGWE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+A7B9 | ꞹ | Latin | LATIN SMALL LETTER U WITH STROKE | [ID-REC] | Proposed:Uncommon_Use | Not in documented common use |
U+AB01 | ꬁ | Ethiopic | ETHIOPIC SYLLABLE TTHU | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+AB02 | ꬂ | Ethiopic | ETHIOPIC SYLLABLE TTHI | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+AB03 | ꬃ | Ethiopic | ETHIOPIC SYLLABLE TTHAA | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+AB04 | ꬄ | Ethiopic | ETHIOPIC SYLLABLE TTHEE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+AB05 | ꬅ | Ethiopic | ETHIOPIC SYLLABLE TTHE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+AB06 | ꬆ | Ethiopic | ETHIOPIC SYLLABLE TTHO | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+AB09 | ꬉ | Ethiopic | ETHIOPIC SYLLABLE DDHU | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+AB0A | ꬊ | Ethiopic | ETHIOPIC SYLLABLE DDHI | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+AB0B | ꬋ | Ethiopic | ETHIOPIC SYLLABLE DDHAA | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+AB0C | ꬌ | Ethiopic | ETHIOPIC SYLLABLE DDHEE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+AB0D | ꬍ | Ethiopic | ETHIOPIC SYLLABLE DDHE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+AB0E | ꬎ | Ethiopic | ETHIOPIC SYLLABLE DDHO | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
Legend
- Code Point
- A code point or code point sequence.
- Glyph
- The shape displayed depends on the fonts available to your browser.
- Script
- Shows the script property value from the Unicode Character Database. Combining marks may have the value Inherited and code points used with more than one script may have the value Common.
- Name
- Shows the character or sequence name from the Unicode Character Database.
- Ref
- Links to the references associated with the code point or sequence, if any.
- Tags
- LGR-defined tag values. Any tags matching the Unicode script property are suppressed in this view.
- Comment
- The comment as given in the XML file. However, if the comment for this row consists only of the code point or sequence name, it is suppressed in this view. By convention, comments starting with “=” denote an alias. If present, the symbol ⍟ marks a default item shared among a set of LGRs.
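The columns described in the legend can be read back out of the flattened rows with a few lines of code. The following is a minimal sketch, not part of the LGR tooling: the names `Row`, `parse_row` and `lowercase_partner` are hypothetical, and the pipe-delimited layout is assumed to match the rows shown in this view. It tallies rows by tag and checks that each row commented "Uppercase of not in common use" has its lowercase counterpart in the same listing.

```python
from collections import Counter
from typing import NamedTuple, Optional, Tuple


class Row(NamedTuple):
    code_point: int
    glyph: str
    script: str
    name: str
    refs: Tuple[str, ...]
    tag: str
    comment: str


def parse_row(line: str) -> Row:
    """Split one pipe-delimited repertoire row into the seven legend columns."""
    cells = [cell.strip() for cell in line.split("|")]
    cells += [""] * (7 - len(cells))  # pad rows that lack a trailing comment
    cp, glyph, script, name, refs, tag, comment = cells[:7]
    return Row(
        code_point=int(cp[2:], 16),  # drop the "U+" prefix
        glyph=glyph,
        script=script,
        name=name,
        refs=tuple(r.strip(" []") for r in refs.split(",") if r.strip()),
        tag=tag,
        comment=comment,
    )


def lowercase_partner(code_point: int) -> Optional[int]:
    """Return the single-code-point lowercase form, or None if there is none."""
    lower = chr(code_point).lower()
    if len(lower) == 1 and ord(lower) != code_point:
        return ord(lower)
    return None


# Four rows copied from the listing above, used here as sample input.
SAMPLE = """
U+1E02 | Ḃ | Latin | LATIN CAPITAL LETTER B WITH DOT ABOVE | [ID-REC] | Proposed:Uncommon_Use | Uppercase of not in common use |
U+1E03 | ḃ | Latin | LATIN SMALL LETTER B WITH DOT ABOVE | [ID-REC], [MSR] | Proposed:Uncommon_Use | Not in documented common use |
U+1E0C | Ḍ | Latin | LATIN CAPITAL LETTER D WITH DOT BELOW | [ID-REC] | Recommended | |
U+1E0D | ḍ | Latin | LATIN SMALL LETTER D WITH DOT BELOW | [ID-REC], [KAB], [MSR] | Recommended | |
"""

rows = [parse_row(line) for line in SAMPLE.strip().splitlines()]
tally = Counter(row.tag for row in rows)

# Uppercase rows whose lowercase counterpart is missing from the parsed rows.
present = {row.code_point for row in rows}
unpaired = [
    row for row in rows
    if row.comment.startswith("Uppercase of")
    and lowercase_partner(row.code_point) not in present
]

print(dict(tally))
print("unpaired uppercase rows:", len(unpaired))
```

Run over the full listing, the same tally could be compared against the per-tag counts given in the class table; the sketch only demonstrates the idea on four rows.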
Variants
This LGR does not specify any variants.
Classes, Rules and Actions
Character Classes
Number of named classes | 2 |
---|---|
Implicit (except script) | 7 |
The following table lists all named and implicit classes with their definition and a list of their members intersected with the current repertoire (for larger classes, this list is elided).
Name | Definition | Count | Members or Ranges | Ref | Comment |
---|---|---|---|---|---|
Digits | Prop=gc:Nd | 760→70 | {0A66-0A6F 0B66-0B6F 0BE6-0BEF 0C66-0C6F 0D66-0D6F 0DE6-0DEF 1090-1099} | | Any character matching Unicode property General_Category:Decimal_Number |
Uppercase | Prop=gc:Lu | 1858→88 | {0114 012C 014E 0156 0162 01D5 01D7 01D9 01DB 01DE 01E0 01E2 01EA 01EC 01F4 01F8 01FA 01FC 01FE 021E 0226 0228 022A 022C 022E 0230 0232 0400 040D 04C1 04CB ...} | | Any character matching Unicode property General_Category:Uppercase_Letter |
implicit | Tag=ArabicCombining | 11 | {064B-0652 0654-0655 0670} | | Any character tagged as ArabicCombining |
implicit | Tag=Recommended | 18 | {01F8-01F9 0674 06C5 06C7-06CA 1E0C-1E0D 1E24-1E25 1E3E-1E3F 1E5A-1E5B 1E92-1E93} | | Any character tagged as Recommended |
implicit | Tag=RefLGR | 2332→0 | {} | | Any character tagged as RefLGR |
implicit | Tag=RefLGRBySequence | 13→0 | {} | | Any character tagged as RefLGRBySequence |
implicit | Tag=Review_Needed | 0 | {} | | Any character tagged as Review_Needed |
implicit | Tag=Proposed:Technical | 12 | {064B-0652 0654-0655 0670-0671} | | Any character tagged as Proposed:Technical |
implicit | Tag=Proposed:Uncommon_Use | 402 | {0114-0115 012C-012D 014E-014F 0156-0157 0162-0163 01D5-01DC 01DE-01E3 01EA-01ED 01F0 01F4-01F5 01FA-01FF 021E-021F 0226-0233 0400 040D 0450 045D 04C1-04C2 ...} | | Any character tagged as Proposed:Uncommon_Use |
Legend
- Members or Ranges
- Lists the members of the class as code points (xxx) or as ranges of code points (xxx-yyy). Any class too numerous to list in full is elided with "...".
- m→n
- Indicates a set for which only n of its m members fall inside the repertoire.
- Tag=ttt
- A named or implicit class defined by all code points that share the given tag value (ttt).
- Prop=ppp:vvv
- A named class defined by reference to value vvv of Unicode property ppp.
- Implicit
- An anonymous class implicitly defined based on tag value and for which there is no named equivalent.
Note: The following named classes are defined but not used in this LGR: Digits, Uppercase.
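The "Members or Ranges" notation and the m→n counts can be checked mechanically. The sketch below is illustrative only: `expand` and `property_size` are hypothetical helper names, and the property test relies on Python's built-in `unicodedata` module, whose Unicode version depends on the Python release and may differ from Unicode 16.0.0, so the computed m is not guaranteed to equal the 760 shown in the table.

```python
import unicodedata
from typing import List


def expand(members: str) -> List[int]:
    """Expand '{064B-0652 0654-0655 0670}' into individual code points.

    Only complete lists are supported; an elided list containing '...'
    raises ValueError because the missing members cannot be recovered.
    """
    code_points: List[int] = []
    for item in members.strip("{} ").split():
        first, _, last = item.partition("-")
        start = int(first, 16)
        end = int(last, 16) if last else start
        code_points.extend(range(start, end + 1))
    return code_points


def property_size(general_category: str) -> int:
    """Count all code points with the given General_Category (the 'm' in m->n)."""
    return sum(
        1 for cp in range(0x110000)
        if unicodedata.category(chr(cp)) == general_category
    )


# Member lists copied from the class table above.
ARABIC_COMBINING = "{064B-0652 0654-0655 0670}"
RECOMMENDED = ("{01F8-01F9 0674 06C5 06C7-06CA 1E0C-1E0D 1E24-1E25 "
               "1E3E-1E3F 1E5A-1E5B 1E92-1E93}")
TECHNICAL = "{064B-0652 0654-0655 0670-0671}"
DIGITS = "{0A66-0A6F 0B66-0B6F 0BE6-0BEF 0C66-0C6F 0D66-0D6F 0DE6-0DEF 1090-1099}"

digits = expand(DIGITS)

# Every listed member of the Digits class should satisfy Prop=gc:Nd.
all_nd = all(unicodedata.category(chr(cp)) == "Nd" for cp in digits)

# m is taken from the running Python's Unicode data, n from the listed members.
total_nd = property_size("Nd")
print(f"Digits: {total_nd}->{len(digits)}, all members are Nd: {all_nd}")
```

Expanding the listed ranges gives 11, 18, 12 and 70 members for the four classes above, which agrees with the Count column.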
Whole label evaluation and context rules
The LGR does not define any rules.
Actions
The LGR does not define any actions.
Table of References
The following lists the references cited for specific code points, variants, classes, rules or actions in this LGR.
[AZ] | Azerbaijani Arabic alphabet, Wikipedia: Azerbaijani alphabet, https://en.wikipedia.org/wiki/Azerbaijani_alphabet Note: cited as in use with the Southern Azerbaijani language in Iran |
[CY-1] | Omniglot, Welsh (Cymraeg), https://www.omniglot.com/writing/welsh.htm |
[CY-2] | Corpora Collection Leipzig, Wortschatz Leipzig, Welsh, https://wortschatz.uni-leipzig.de/en/download/Welsh#cym_wikipedia_2021 |
[EGIDS] | Lewis and Simons, “EGIDS: Expanded Graded Intergenerational Disruption Scale,” documented in [Glottolog] and summarized here: https://en.wikipedia.org/wiki/Expanded_Graded_Intergenerational_Disruption_Scale_(EGIDS) |
[Glottolog] | Glottolog 5.1 edited by Hammarström, Harald & Forkel, Robert & Haspelmath, Martin & Bank, Sebastian, https://glottolog.org |
[IAB] | IAB Statement on Identifiers and Unicode 7.0.0, https://datatracker.ietf.org/doc/statement-iab-statement-on-identifiers-and-unicode-7-0-0/01/pdf/ |
[ID-REC] | The Unicode Consortium: Identifier_Type property for Unicode Version 16.0.0, available as https://unicode.org/Public/security/16.0.0/IdentifierType.txt Code points cited have Identifier_Type Recommended for 16.0.0 |
[KAB] | Wikipedia, Kabyle language, https://en.wikipedia.org/wiki/Kabyle_language |
[KK] | Kazakh Arabic alphabet, Wikipedia: Kazakh alphabets, https://en.wikipedia.org/wiki/Kazakh_alphabets Note: this alphabet cited as in official use in the Ili Kazakh Autonomous Prefecture of the Xinjiang Uyghur Autonomous Region in China. |
[KY] | Kyrgyz Arabic alphabet, Wikipedia: Kyrgyz alphabets, https://en.wikipedia.org/wiki/Kyrgyz_alphabets Note: this alphabet cited as in official use in Afghanistan, Pakistan and the People's Republic of China (in the Kizilsu Kyrgyz Autonomous Prefecture and the Ili Kazakh Autonomous Prefecture of the Xinjiang Uyghur Autonomous Region). |
[LRC] | Northern & Southern Luri, Omniglot, Luri (لوری), https://www.omniglot.com/writing/luri.htm |
[MSR] | ICANN, “Maximal Starting Repertoire”, https://www.icann.org/resources/pages/msr-2015-06-21-en |
[MUA] | Wikipedia, Mundang language, https://en.wikipedia.org/wiki/Mundang_language |
[NNH] | Wikipedia, Ngiemboon language, https://en.wikipedia.org/wiki/Ngiemboon_language |
[Proposal-Arabic] | “Proposal for Arabic Script Root Zone LGR”, https://www.icann.org/en/system/files/files/arabic-lgr-proposal-18nov15-en.pdf |
[Proposal-Bengali] | Neo-Brahmi Generation Panel, “Proposal for a Bangla (Bengali) Script Root Zone Label Generation Rule-Set (LGR)”, 20 May 2020 (PDF), https://www.icann.org/en/system/files/files/proposal-bangla-lgr-20may20-en.pdf |
[Proposal-Devanagari] | Neo-Brahmi Generation Panel, “Proposal for a Devanagari Script Root Zone Label Generation Rule-Set (LGR)”, 22 April 2019, https://www.icann.org/en/system/files/files/proposal-devanagari-lgr-22apr19-en.pdf |
[Proposal-Ethiopic] | Ethiopic Generation Panel, “Proposal for Ethiopic Script Root Zone LGR”, 17 May 2017, https://www.icann.org/en/system/files/files/proposal-ethiopic-lgr-17may17-en.pdf |
[Proposal-Gurmukhi] | Neo-Brahmi Generation Panel, “Proposal for a Gurmukhi Script Root Zone Label Generation Ruleset (LGR)”, 22 April 2019, https://www.icann.org/en/system/files/files/proposal-gurmukhi-lgr-22apr19-en.pdf |
[Proposal-Hebrew] | Hebrew Generation Panel, “Proposal for a Hebrew Script Root Zone Label Generation Ruleset (LGR)”, Version 1.3, 24 April 2019, https://www.icann.org/en/system/files/files/proposal-hebrew-lgr-24apr19-en.pdf |
[Proposal-Kannada] | Neo-Brahmi Generation Panel, “Proposal for a Kannada Script Root Zone Label Generation Ruleset (LGR)”, 6 March 2019, https://www.icann.org/en/system/files/files/proposal-kannada-lgr-06mar19-en.pdf |
[Proposal-Khmer] | Khmer Generation Panel, “Proposal for Khmer Script Root Zone LGR”, 15 August 2016, https://www.icann.org/en/system/files/files/proposal-khmer-lgr-15aug16-en.pdf |
[Proposal-Oriya] | Neo-Brahmi Generation Panel, “Proposal for an Oriya Script Root Zone Label Generation Rule-set”, 6 March 2019, https://www.icann.org/en/system/files/files/proposal-oriya-lgr-06mar19-en.pdf |
[Proposal-Sinhala] | Sinhala Generation Panel, “Proposal for a Sinhala Script Root Zone Label Generation Ruleset (LGR)”, 22 April 2019, https://www.icann.org/en/system/files/files/proposal-sinhala-lgr-22apr19-en.pdf |
[Proposal-Telugu] | Neo-Brahmi Generation Panel, “Proposal for a Telugu Script Root Zone Label Generation Ruleset (LGR)”, 7 June 2019, https://www.icann.org/en/system/files/files/proposal-telugu-lgr-07Jun19-en.pdf |
[RefLGR] | ICANN, “Second-Level Reference Label Generation Rules”, https://www.icann.org/resources/pages/second-level-lgr-2015-06-21-en |
[RefLGR-Overview] | ICANN, “Reference Label Generation Rules (LGR) for the Second Level — Overview and Summary”, https://www.icann.org/sites/default/files/packages/lgr/lgr-second-level-overview-summary-25oct24-en.pdf |
[RZ-LGR] | ICANN, “Root Zone Label Generation Rules”, https://www.icann.org/resources/pages/root-zone-lgr-2015-06-21-en |
[UG] | Uyghur Arabic alphabet, Wikipedia: Uyghur alphabets, https://en.wikipedia.org/wiki/Uyghur_alphabets Note: this alphabet cited as official and in widespread use in Xinjiang province of China. |
[YO-1] | Wikipedia, Yoruba alphabet, https://en.wikipedia.org/wiki/Yoruba_alphabet |
[YO-2] | Corpora Collection Leipzig, Wortschatz Leipzig, Yoruba, https://wortschatz.uni-leipzig.de/en/download/Yoruba#yor_web_2019 |