
Submitting Character Proposals

General Information

The Unicode Consortium accepts proposals for inclusion of new characters and scripts in the Unicode Standard. Those considering submitting a proposal should first determine whether a particular script or character has already been proposed. Please see the Proposed New Characters -- Pipeline Table page for information on additions to the Unicode Standard that are already under consideration. General guidelines for the preparation of a proposal appear below.

The Unicode Standard definition of character is stated in the Glossary of Unicode Terms. Before preparing a proposal, sponsors should note in particular the distinction between the terms character and glyph as therein defined. Because of this distinction, graphics such as ligatures, conjunct consonants, minor variant written forms, or abbreviations of longer forms are generally not acceptable as Unicode characters. Also see Where is my Character?

Proposal Guidelines

The sponsor(s) proposing the addition of a new character to the Unicode Standard should follow these guidelines.

Proposals for new emoji need to meet different criteria, however. To propose new emoji, follow the Guidelines for Submitting Unicode Emoji Proposals instead of the rest of this section.

Before proceeding, determine that each proposed addition is a character according to the definition given in the Unicode Standard and that the proposed addition does not already exist in the Standard. Consult the Proposed New Characters page to see if the character is already on track to be encoded, and the Archive of Nonapproval Notices to see if the character has already been considered but was disapproved for some reason.

Often a proposed character can be expressed as a sequence of one or more existing Unicode characters. Encoding it separately would create a duplicate representation, so such a character is not suitable for encoding. (In any event, the proposed character would disappear when normalized.) For example, a g-umlaut character is not suitable for encoding, since it can already be expressed with the sequence <g, combining diaeresis>. For further information on such sequences, see Where is my Character? and the FAQ page Characters, Combining Marks.
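The g-umlaut case can be checked with a short sketch using Python's standard unicodedata module. The helper name has_precomposed_form is hypothetical, and results reflect whichever Unicode version is bundled with the Python build in use. The sketch only shows that the sequence is already a valid representation; it is not part of the proposal process.

```python
import unicodedata

def has_precomposed_form(sequence: str) -> bool:
    """Return True if NFC normalization composes the sequence into one code point."""
    return len(unicodedata.normalize("NFC", sequence)) == 1

# <u, combining diaeresis> composes to the existing character U+00FC.
u_umlaut = "u\u0308"
# <g, combining diaeresis> has no precomposed form; the sequence itself
# is the representation, so a separate g-umlaut would be a duplicate.
g_umlaut = "g\u0308"

print(has_precomposed_form(u_umlaut))
print(has_precomposed_form(g_umlaut))
```

A sequence that stays as two code points under NFC is still fully usable in text; rendering it as a single glyph is the job of the font and the rendering system.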

Ensure that documentation supporting the proposal states whether any Unicode characters were examined as possible equivalents for the proposed character and, if so, why each was rejected. Consult the Unicode Character Encoding Stability Policy to make sure that any associated change to existing characters is in accordance with Consortium policies.

Determine and list the proposed (or recommended) character properties for each character being proposed, especially when proposing entire scripts for encoding. See the Unicode Properties in Character Proposals for guidelines about character properties and a list of questions to help make determinations about appropriate property values. See also Chapter 4, Character Properties of The Unicode Standard. Even a partial list of properties will be helpful in the initial proposal.
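When drafting a property list, it can help to look at the properties of already-encoded characters that behave similarly. The following is a minimal sketch using Python's standard unicodedata module, which exposes only a small subset of the properties a proposal would cover. The helper name describe and the choice of sample characters are illustrative assumptions.

```python
import unicodedata

def describe(ch: str) -> dict:
    """Collect a few basic properties of one character from the bundled Unicode data."""
    return {
        "code_point": f"U+{ord(ch):04X}",
        "name": unicodedata.name(ch, "<unnamed>"),
        "general_category": unicodedata.category(ch),
        "bidi_class": unicodedata.bidirectional(ch),
        "combining_class": unicodedata.combining(ch),
        "decomposition": unicodedata.decomposition(ch),
    }

# Sample characters: a Latin letter, a combining mark, and a right-to-left letter.
for ch in ("A", "\u0308", "\u05D0"):
    print(describe(ch))
```

Properties such as script, line breaking, or joining behavior are not available from this module and would need to be taken from the Unicode Character Database files directly.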

Proposals to include entire scripts (Egyptian hieroglyphs, for example) must cite modern, definitive sources of information regarding such scripts. Sponsorship by the relevant academic bodies (such as the International Association of Egyptologists) may be helpful in determining the proper scope for encoding of characters in such cases. Before submitting full script proposals, sponsors should also determine that a proposal does not already exist for that script, for example by consulting the Roadmaps.

If a proposed character is part of a dead language or obsolete/rare script that is already encoded, cite the most important modern sources of information on the script and the proposed additions. Names and academic affiliations of researchers in the relevant field are welcome.

If the proposed characters exhibit shaping behavior (contextual shaping, ligatures, conjuncts, or stacking), provide a description of that behavior, preferably with glyph examples. The description should be detailed enough for software engineers to produce a minimally acceptable rendering of the characters.

If the proposed characters are symbols, consult the Criteria for Encoding Symbols to gain familiarity with some of the criteria that the UTC will consider when determining whether new symbols are appropriate for encoding. Research other already-encoded blocks of symbols in the standard to check that the types of symbols in the proposal have precedents. Also, because symbols often vary widely in appearance, check carefully that the symbol(s) in the proposal are not merely font-specific variant shapes of symbols already encoded in the standard.

Information about the sorting order of proposed characters should also be provided, where known. For general information about sorting, see Collation. In particular, consider the UCA Default Table Criteria for New Characters, which specifies the criteria the UTC uses for making initial determinations about collation weights for newly encoded characters.

The Unicode Consortium works closely with the relevant committee responsible for ISO/IEC 10646, namely JTC1/SC2/WG2, in proposing additions as well as monitoring the status of proposals by various national bodies. Therefore, proposals may eventually be formulated as ISO/IEC documents and significant detailed information will be required.

The standardized form "ISO/IEC JTC 1/SC 2/WG 2 PROPOSAL SUMMARY FORM TO ACCOMPANY SUBMISSIONS FOR ADDITIONS TO THE REPERTOIRE OF ISO/IEC 10646" has been designed for the purpose of obtaining detailed information for ISO purposes and for the Unicode Technical Committee. Use of this form is required for all proposals. It is available at the following URL:

https://www.unicode.org/L2/summary.html

To complete the Proposal Summary Form, sponsors may wish to refer to the WG2 Principles and Procedures document, also accessible from that URL. That document contains context and explanations about the various questions on the Proposal Summary Form.

Before additions receive final approval, the Consortium requires a font with an appropriate license for printing the standard (see Font Submissions Policy). Even if approved, additions will not be published in a version of the standard unless suitable fonts are available.

Requirements of Proposal Form

The proposal summary form requires the following information (paraphrased):

  • the repertoire, including proposed character names;
  • the name and contact information for a company or individual who would agree to provide a computerized font (TrueType or PostScript) for publication of the standard;
  • references to dictionaries and descriptive texts establishing authoritative information;
  • names and addresses of appropriate contacts within national body or user organizations;
  • the context within which the proposed characters are used (for example, current, historical, and so on);
  • especially for sporadic additions, what similarities or relationships the proposed characters bear to existing characters already encoded in the standard.

All proposals (whether successful or not) and related materials will be retained by the Unicode Consortium as a matter of record and may be used for any purpose.

Proposal Review Process

The international standardization of entire scripts requires a significant effort on the sponsor's part. It frequently takes years to move from an initial draft to final standardization, particularly because of the requirements to synchronize proposals with the work done in the ISO committee responsible for the development of ISO/IEC 10646.

Experience has shown that it is often helpful to discuss preliminary proposals before submitting a detailed proposal. One option is to become a member of the Unicode Consortium and submit the proposal to the members-only email list. Alternatively, sponsors can contact UC Berkeley's Script Encoding Initiative for initial review.

Each proposal received will be evaluated initially by technical officers of the Unicode Consortium and the result of this initial evaluation will be communicated to the sponsor(s) of the proposal. Once a proposal passes this initial screening, it will be reviewed by the Unicode Technical Committee.

Sponsors, particularly of entire scripts, should be prepared to become involved at various times throughout the process: perhaps revising their proposals more than once, collecting further detailed information, organizing online discussions or meetings to dispel controversy, or answering questions posed by committees or national bodies. Without such involvement, any proposal of more than a few characters is unlikely to be successful in the long run.

Sponsors can monitor the further progress of their proposals via the public UTC minutes as well as the Proposed New Characters -- Pipeline Table page.

Examples

Many good proposals can be found in the UTC document register. Thesaurus Linguae Graecae has prepared a number of successful proposals.

For people interested in proposing a single symbol or a small set of symbols for encoding, there are also many successful proposals in the UTC document register. For example, see the proposal for power symbols.

Interim Solutions

There are ways for programmers and scholarly organizations to make use of Unicode character encoding, even if the script they want to use or transmit is not yet (or may never be) part of the Unicode Standard. Individual groups that make use of rare scripts or special characters can reach a private agreement about interchange and set aside part of the Private Use Area to encode their private set of characters. Individuals with interests in rare scripts or materials relating to them may sometimes be contacted through an electronic mailing list that the Consortium maintains. For information about these mailing lists, please contact the Unicode office.
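A private agreement of this kind amounts to a shared table that maps each private character to a code point in a Private Use range. The sketch below is only an illustration: the names in PRIVATE_AGREEMENT and the helper is_private_use are hypothetical, and the assigned code points carry meaning only among the parties that share the table.

```python
import unicodedata

# Private Use ranges: the BMP Private Use Area plus the two
# supplementary private use planes (planes 15 and 16).
PRIVATE_USE_RANGES = (
    (0xE000, 0xF8FF),
    (0xF0000, 0xFFFFD),
    (0x100000, 0x10FFFD),
)

def is_private_use(ch: str) -> bool:
    """Return True if the character lies in one of the Private Use ranges."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in PRIVATE_USE_RANGES)

# Hypothetical agreement between two groups exchanging an unencoded script.
PRIVATE_AGREEMENT = {
    "EXAMPLE SIGN ONE": "\uE000",
    "EXAMPLE SIGN TWO": "\uE001",
}

for name, ch in PRIVATE_AGREEMENT.items():
    # Private use code points have General_Category Co.
    print(name, f"U+{ord(ch):04X}", is_private_use(ch), unicodedata.category(ch))
```

Text using such assignments is interpretable only with the agreement table and a matching font, which is why this remains an interim solution rather than a substitute for encoding.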

Sending Proposals

To send completed proposals or to make further inquiries, please see the Document Submission Details page.

All proposals are required to be in one of the following forms:

  • PDF format (preferred)
  • HTML along with any needed GIF or JPEG images (a ZIP file or TAR archive should be made, including all of the required files)
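For the HTML option, the required files can be gathered into a single ZIP archive with a few lines of script. This is a minimal sketch, assuming the images sit in the same folder as the HTML file; the function name bundle_html_proposal and the file names in the usage comment are hypothetical.

```python
import zipfile
from pathlib import Path
from typing import List

ALLOWED_IMAGE_SUFFIXES = {".gif", ".jpg", ".jpeg"}

def bundle_html_proposal(html_file: Path, archive: Path) -> List[str]:
    """Zip an HTML proposal together with the GIF/JPEG images found beside it."""
    folder = html_file.parent
    # Collect the HTML file first, then images in a stable, sorted order.
    members = [html_file] + sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in ALLOWED_IMAGE_SUFFIXES
    )
    with zipfile.ZipFile(archive, "w", zipfile.ZIP_DEFLATED) as zf:
        for member in members:
            # Store files flat so relative image links in the HTML still resolve.
            zf.write(member, arcname=member.name)
    return [m.name for m in members]

# Example usage (paths are placeholders):
# bundle_html_proposal(Path("proposal/proposal.html"), Path("proposal.zip"))
```

The sketch includes every GIF or JPEG in the folder, so any image the proposal does not use should be removed first.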