
ISO 15924 - Frequently Asked Questions (FAQ)

  1. What is a script?
  2. What is the difference between a script and a language?
  3. When should I use script codes or names?
  4. What is the ISO 15924 standard?
  5. How was the ISO 15924 code list developed?
  6. Who uses the ISO 15924 codes and why?
  7. What is the relationship between the Internet RFC 3066 (and its predecessor RFC 1766) and the ISO 15924 standards?
  8. Are the script codes intended to be used as abbreviations for the script?
  9. Who is the registration authority for the ISO 15924 standard?
  10. What is the function of the registration authority for the ISO 15924 standard?
  11. What is the Joint Advisory Committee (JAC) for the ISO 15924 standard?
  12. Are there any electronic discussion lists for the ISO 15924 script codes?
  13. How does one request new ISO 15924 script codes?
  14. What are the criteria used to define new ISO 15924 script codes?
  15. Are separate script codes defined for variants of scripts?
  16. Are separate script codes defined for different orthographies?
  17. What are collective script codes?
  18. What is the timeline used for approving new ISO 15924 script codes?
  19. Can ISO 15924 script codes be changed after they had been initially created?
  20. Are the ISO 15924 codes case sensitive?
  21. How does one indicate the script variation used in a particular country?
  22. How does one make distinctions between traditional and simplified Chinese characters using the ISO 15924 script codes?
  23. How does one distinguish between Cantonese and Mandarin variations of Chinese?
  24. How does one code undetermined scripts using the ISO 15924 script codes?
  25. Is there a mechanism for using locally defined codes?
  26. What is the difference between a script code (ISO 15924) and a language code (ISO 639) and a country code (ISO 3166)?


  1. What is the ISO 15924 standard?

    ISO 15924 provides a set of script codes – a four-letter code set and a three-digit code set – for the representation of names of scripts.
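    As a rough illustration, the two code forms described above can be told apart by shape alone. The sketch below uses hypothetical names (`code_kind`, and the labels "alpha-4" and "numeric-3") that do not come from the standard. It checks only the format of a candidate code, not whether the code is actually registered.

```python
import re

# Shapes of the two ISO 15924 code sets described above.
# These patterns check form only; they do not consult the code list.
ALPHA4 = re.compile(r"[A-Za-z]{4}")   # four-letter code
NUMERIC3 = re.compile(r"[0-9]{3}")    # three-digit code


def code_kind(code: str) -> str:
    """Return 'alpha-4', 'numeric-3', or 'invalid' for a candidate code."""
    if ALPHA4.fullmatch(code):
        return "alpha-4"
    if NUMERIC3.fullmatch(code):
        return "numeric-3"
    return "invalid"
```

    For example, `code_kind("Latn")` yields "alpha-4" and `code_kind("123")` yields "numeric-3", while a two-letter string is reported as "invalid".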

    Back to Questions
  2. How was the ISO 15924 code list developed?

    ISO 15924: Codes for the representation of names of scripts was developed by ISO TC46/WG3 for use in terminology, lexicography, linguistics, and various computer applications. It was devised to represent most of the major scripts of the world that are found in the world's literature.

    For more information about the development of the ISO 15924 codes, please see:
    www.unicode.org/iso15924/develop.html

    Back to Questions

  3. Who uses the ISO 15924 codes and why?

    The ISO 15924 codes were devised for use by libraries, information services, and publishers to indicate script in the exchange of information, especially in computerized systems. The codes are expected to be widely used in the library community and may also be adopted by terminologists and lexicographers for any application requiring the expression of script in coded form. The codes are also expected to be used in Internet applications (see the question on RFC 3066).

    Back to Questions

  4. What is the relationship between the Internet RFC 3066 (and its predecessor RFC 1766) and the ISO 15924 standards?

    The Internet RFC 3066 (Tags for the Identification of Languages), which replaces RFC 1766, describes a language tag for use where the language of an information object needs to be indicated, how to register values for use in this language tag, and a construct for matching such language tags. It is an Internet Best Current Practice and gives guidance for the use of ISO 15924 codes.

    For the language part of a tag, RFC 3066 specifies use of a 2-character code from ISO 639 when it exists; when a language does not have a 2-character code assigned, the 3-character code is used. Although it states that the 3-character terminology code is used where no 2-character code exists, this situation will not occur, since the only variant codes in ISO 639 are for languages that already have a 2-character code.

    The RFC also specifies the use of optional subtags (e.g. a country code from ISO 3166) and how to register dialect or variant information with IANA when there is no available ISO 15924 code.

    Back to Questions

  5. Are the script codes intended to be used as abbreviations for the script?

    The script codes in ISO 15924 were developed to identify a script or collection of scripts. They were NOT intended to serve as abbreviations or short forms for scripts, but rather as codes that identify a script name. Some codes in the list consist of letters that are used in some form of the script name, but this has not been possible in all situations, and often one would need to know the English form of the script name to recognize a relationship. In some cases the codes selected diverge from the script name. Systems that use the script codes generally display to users the script name represented by the code and not the code itself. It is therefore irrelevant whether the code is "123", "Wxyz", "Latn" or anything else.

    See section 4.1 of ISO 15924 for the criteria for the selection of the script code.

    Back to Questions

  6. Who is the registration authority for the ISO 15924 standard?

    The Registration Authority for the ISO 15924 codes is:

    The Unicode Consortium
    Box 391476
    Mountain View, CA 94039-1476
    U.S.A.
    E-mail: iso15924@unicode.org

    The registrar for the ISO 15924 codes is:

    Evertype
    48B Gleann na Carraige
    Cill Fhionntain
    Baile Átha Cliath 2
    Éire/Ireland

    Back to Questions

  7. What is the function of the registration authority for the ISO 15924 standard?

    The registration authority for the ISO 15924 standard receives and reviews requests for new script codes and for changes to existing ones, according to the criteria set out in the standard.

    The registration authority maintains accurate lists of the information associated with registered script codes.

    It also processes and distributes updates of the codes on a regular basis to subscribers and other parties.

    For more information about the registration authority's duties, please see: www.unicode.org/iso15924/annexa.html#function.

    Back to Questions

  8. What is the Joint Advisory Committee (JAC) for the ISO 15924 standard?

    The Joint Advisory Committee ISO 15924/RA-JAC was established to advise the ISO 15924 registration authority and to guide the application of the coding rules (as laid down in the ISO 15924 documentation). It consists of six individuals representing ISO member bodies, plus the rotating chairs of the registration authorities, as well as up to six observers. The JAC considers applications for new script codes and votes on whether they will be included.

    More information about the Joint Advisory Committee and its activities can be found at: www.unicode.org/iso15924/annexa.html.

    Back to Questions

  9. Are there any electronic discussion lists for the ISO 15924 script codes?

    Yes. For general discussion about the ISO 15924 script codes, please write to: iso15924@dkuug.dk. There is also a discussion list about the IETF RFCs on language coding at: ietf-languages@eikenes.alvestrand.no.

    Back to Questions

  10. How does one request new ISO 15924 script codes?

    To request new codes in the ISO 15924 standard, please fill out the online form at: www.unicode.org/iso15924/iso15924form.html.

    Before submitting your requests, please review the criteria used to define new codes. Appropriate documentation must be provided with the request.

    Back to Questions

  11. What are the criteria used to define new ISO 15924 script codes?

    The criteria used to define new codes in the ISO 15924 standard are:

    Relation to ISO 15924. Since ISO 15924 is to remain a subset of ISO 15924, it must first satisfy the requirements for ISO 15924. In addition it must satisfy the following.

    Documentation

    • a significant body of existing documents (specialized texts, such as college or university textbooks, technical documentation manuals, specialized journals, subject-field related books, etc.) written in specialized languages
    • a number of existing terminologies in various subject fields (e.g. technical dictionaries, specialized glossaries, vocabularies, etc. in printed or electronic form)

    Recommendation. A recommendation and support of a specialized authority (such as a standards organization, governmental body, linguistic institution, or cultural organization)

    Other considerations

    • the number of speakers of the language community
    • the recognized status of the language in one or more countries
    • the support of the request by one or more official bodies

    Collective codes. ISO 15924 does not use collective codes. If these are necessary the alpha-3 code will be used.

    The criteria used when defining new codes in the ISO 15924 standard are:

    Number of documents. The request for a new language code should include evidence that one agency holds 50 different documents in the language or that five agencies hold a total of 50 different documents among them in the language. Documents include all forms of material and are not limited to text.
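    The document-count rule above can be sketched as a small check. This is only one reading of the rule and not an official procedure: it assumes that "five agencies" means any group of up to five agencies, and that "different documents" means distinct documents once the holdings are pooled. The function name, constants, and data layout are hypothetical.

```python
from itertools import combinations
from typing import Dict, Set

THRESHOLD = 50     # distinct documents required, per the text above
MAX_AGENCIES = 5   # assumed: holdings of up to five agencies may be pooled


def meets_document_criterion(holdings: Dict[str, Set[str]]) -> bool:
    """holdings maps an agency name to the set of document IDs it holds."""
    # Case 1: a single agency holds enough documents on its own.
    if any(len(docs) >= THRESHOLD for docs in holdings.values()):
        return True
    # Case 2: pool distinct documents across groups of agencies.
    # Checking the largest allowed group size is enough, because adding
    # an agency to a group can never reduce the pooled total.
    agencies = list(holdings)
    size = min(MAX_AGENCIES, len(agencies))
    for group in combinations(agencies, size):
        pooled = set().union(*(holdings[a] for a in group))
        if len(pooled) >= THRESHOLD:
            return True
    return False
```

    Pooling with sets means a document held by several agencies is counted once, which matches the "different documents among them" wording under the assumption stated above.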

    Collective codes. If the criteria above are not met, the language may use a collective language code. The words "languages" or "Other" as part of a language name indicate that a language code is a collective one. See also the question on collective script codes.

    More information about the selection process for ISO 15924 codes can be found at: www.unicode.org/iso15924/iso15924jac_n3r.html.

    Back to Questions

  12. Are separate script codes defined for variants of scripts?

    A dialect of a language is usually represented by the same language code as that used for the language. If the language is assigned to a collective language code, the dialect is assigned to the same collective language code. Generally, dialects are not given different codes, but determining the difference between dialects and languages will be decided on a case-by-case basis.

    Back to Questions

  13. Are separate script codes defined for different orthographies?

    A language using more than one orthography is not given multiple script codes.

    Back to Questions

  14. What are collective script codes?

    Collective script codes are language groups that are used if the criteria for assigning a separate language code are not met. The words "languages" or "(Other)" indicate that a language code is a collective one.

    ISO 15924 does not use collective codes, but ISO 15924 does. References from separate language names to the collective code used for that language are not included in the ISO 15924 standard, but may be found in the MARC Code List for Languages.

    Back to Questions

  15. What is the timeline used for approving new ISO 15924 script codes?

    After a request for a new, deleted, or changed code is submitted to the registration authority, the registration authority determines whether or not the request meets the relevant criteria.

    The registration authority then informs the requester of the process, generally within two weeks of the submission. If the request meets the criteria, the registration authority determines an appropriate code and consults the ISO 15924/RA-JAC. If the first vote is not unanimous, a second round of voting is conducted.

    The original requester will be informed of the JAC decision in six weeks to two months from submission of the original request.

    Results of the JAC decisions will be publicized in a change notice available on the Web.

    Back to Questions

  16. Can ISO 15924 script codes be changed after they had initially been created?

    ISO 15924 script codes are usually not changed in order to ensure continuity and stability of online retrieval from large databases built over many years. However, when language names associated with codes have been changed, variant forms of a language name may be included in the entry, separated by a semicolon in the code lists.

    Obsolete codes are generally not reassigned when they have been changed or discontinued.

    A list of codes that have been changed or added is available at: www.unicode.org/iso15924/codechanges.html.

    To request a change to an already defined name, please see: www.unicode.org/iso15924/iso15924chform.html.

    Back to Questions

  17. Are the ISO 15924 codes case sensitive?

    ISO 15924 recommends writing the script codes with an initial capital letter followed by three lowercase letters, but they should be considered case-insensitive and are unique codes regardless of case.
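    In practice this means a four-letter code can be normalized to the recommended form before it is stored or compared. The following is a minimal sketch with hypothetical helper names. It handles only the four-letter codes and does not check that a code is registered.

```python
def normalize_script_code(code: str) -> str:
    """Return the recommended form: initial capital, then three lowercase letters."""
    if len(code) != 4 or not code.isascii() or not code.isalpha():
        raise ValueError(f"not a four-letter script code: {code!r}")
    return code.capitalize()


def same_script_code(a: str, b: str) -> bool:
    """Compare two four-letter codes without regard to case."""
    return normalize_script_code(a) == normalize_script_code(b)
```

    With these helpers, "LATN", "latn", and "Latn" all normalize to "Latn" and compare as equal.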

    Back to Questions

  18. How does one indicate the script variation used in a particular country?

    The ISO 15924 standards (and RFC 3066) allow for combining the language code with a country code from ISO 3166 to denote the area in which a term, phrase, or language is used. For instance, English as spoken in the U.S. may be indicated with the following:

    eng-US
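    A tag of this shape can be split back into its parts with a few lines of code. The sketch below is illustrative only: the `LanguageTag` type and `split_tag` function are hypothetical names. It relies on the RFC 3066 convention that a two-letter second subtag is a country code from ISO 3166, and it ignores any further subtags.

```python
from typing import NamedTuple, Optional


class LanguageTag(NamedTuple):
    language: str            # primary subtag, e.g. "eng"
    country: Optional[str]   # two-letter country code, if present


def split_tag(tag: str) -> LanguageTag:
    """Split a tag such as 'eng-US' into its language and country parts."""
    parts = tag.split("-")
    language = parts[0].lower()
    country = None
    # Only a two-letter second subtag is treated as a country code.
    if len(parts) > 1 and len(parts[1]) == 2 and parts[1].isalpha():
        country = parts[1].upper()
    return LanguageTag(language, country)
```

    Here `split_tag("eng-US")` gives the language "eng" and the country "US". A tag whose second subtag is longer than two letters is returned with no country.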

    Back to Questions

  19. How does one make distinctions between traditional and simplified Chinese characters using the ISO 15924 script codes?

    The differences between traditional and simplified Chinese characters cannot be represented using ISO 639 language codes because these are distinctions in script. They can be coded using ISO 15924 (Codes for the representation of names of scripts) script codes.

    Back to Questions

  20. How does one distinguish between Cantonese and Mandarin variations of Chinese?

    The standard was intended for written languages primarily, and since Chinese is the same in its written form for Cantonese and Mandarin, no distinction was made in the code list. There are two possible methods for making this distinction using ISO 15924 codes.

    • Use the code for Chinese and add the country code to designate which type of Chinese you are indicating if distinguishing on the basis of country. This is documented in ISO 15924 in section 4.4 and a similar instruction is in ISO 15924:

      zh-CN (as spoken in China)
      zh-TW (as spoken in Taiwan)

    • Use a subtag with the 2-character language code as specified in RFC 3066. Subtags are registered with the Internet Assigned Numbers Authority (IANA).

      zh-mandarin

    Back to Questions

  21. How does one code undetermined scripts using the ISO 15924 script codes?

    There are two possibilities for coding undetermined languages.

    The first is to use three blanks (Space or blank, which is Hex 20). This implies that a language code is not applicable because there is no sung, spoken, or written textual content.

    The second possibility is to use a code that is available in the ISO 15924 list:

    Und (Undetermined)

    This code is used if the language associated with an item cannot be determined or specified.

    It is not recommended to use nulls, since they may cause problems because of special use in programming as a control character.

    Back to Questions

  22. Is there a mechanism for using locally defined codes?

    If a user wishes to use locally defined codes for scripts not covered by ISO 15924, codes Qaaa through Qabx are reserved for local use, including for local treatment of variants. These codes may only be used locally, and may not be exchanged internationally.

    Back to Questions

  23. What is the difference between a script code (ISO 15924) and a language code (ISO 639) and a country code (ISO 3166)?

    ISO 15924 provides four-letter and three-digit codes for representing names of scripts. ISO 639 provides two- and three-character codes for representing names of languages. ISO 3166 provides two- and three-character codes for representing names of countries. These standards were developed independently, and there was no attempt to use the same code for a language as that for the country in which it is spoken. One should use codes from each list independently.

    The language code and country code may be used together to indicate a language variation spoken in a particular country (see the question on indicating the script variation used in a particular country).

    Back to Questions


Copyright © 2004 ISO, Unicode, Inc., & Evertype. All Rights Reserved