
Useful Resources

About this List

This page contains a number of resources pertaining to Unicode and internationalization. These references are provided for informational purposes only. In particular, the Unicode Consortium has not taken any steps to evaluate or verify the usefulness or accuracy of the information provided.

For additional information, refer to the FAQ, Articles on Unicode, and Books on Unicode, or to Unicode Enabled Products for a sample list of products that are reported to be fully (or partially) Unicode-enabled.

If you have any updates to this list, please contact the Unicode office. Your message should include:

  • Category (from the above)
  • URL 
  • Brief description of the page/site content

Fonts and Keyboards

Adding Fonts to Java
Shows how to enable the use of Unicode fonts (such as Arial Unicode MS) with Java
Adhuna Keyboard [BornoSoft Keyboard Interface Package] 
Unicode-compliant Bengali/Bangla keyboard driver that uses only lowercase phonetic English key input for Bengali output.
Aksharamala
A standards-based software tool that automatically transliterates English input into a target Indian language.
ALT Keyboard Combinations
ALT Keyboard Combinations (in Dutch)
A convenient and handy list of ALT keyboard combinations
Assamese Phonetic Keyboard 
Assamese Online Dictionary
Avro Keyboard - UNICODE Compliant Free Bangla Typing Software
Full-featured, free Bangla typing software with Unicode support and the most popular Bangla keyboard layouts from Bangladesh and India.
Burkina Faso (In French only)
Keyboards and dictionaries for languages of Burkina Faso
Bangla
Word processor, Unicode converter, text to speech, customizable keyboard, translator, calendar and more for Bangla with Unicode support.
Bangla
Freeware Bangla Unicode Typing Interface, with a choice of keyboard layouts
Bangla Unicode Fonts
Free Unicode-compliant OpenType Bangla fonts.
Free Bangla Fonts Project
Releases of four GPL-licensed Bangla (Bengali) OpenType fonts with full Unicode support
Celtic (Michael Everson)
Everson Mono and other fonts
Chinese (Los Angeles Chinese Learning Center)
A guide to Chinese character input methods
Code2000 (James Kass)
A shareware Unicode font
Devanagari Editor
Displaying and Typing Japanese Characters
A guide to displaying and typing Japanese characters using Unicode
Dvorak and QWERTY keyboard drivers
For Windows. Support a large and growing number of scripts. Most of these keyboards furnish all of the characters in the relevant Unicode ranges.
Edward Trager's index of free/libre fonts
Europe Keyboard
An international keyboard layout based on the German standard layout, intended for practically all languages written in the Latin script, including Vietnamese and Yorùbá. (The website is in German.)
Every Known Font site
This site is down until further notice. Check Typesource by Proxy
Fingertipsoft Cyrillic Character Set and Keyboard Information
Fontboard
Free Arabic, Cyrillic, Esperanto, Hebrew, Maltese and Yiddish keyboard layouts and a small selection of special fonts for linguists and card players.
Fonts for Kurdish Language
Site contains 33 free Unicode fonts for the Kurdish language with support for Farsi and Arabic, as well as the necessary keyboard installer for Windows.
Fonts, Keyboards and Browsers Setup (Alan Wood)
Fonts By Range
Information on the Unicode fonts available for each Unicode range
Gallery of Unicode Fonts
Hundreds of free Unicode fonts, with sample images from each
Gaelic for US hardware & UK hardware (Ciarán Ó Duibhín)
Downloadable free software keyboard layouts for Windows 2000/XP which offer a number of accented and other characters from the Unicode character-set via dead keys for use on US and UK keyboard hardware respectively.
They are primarily intended to support characters found in the Gaelic language, but can be useful more generally.
Fonts in Cyberspace
Links to many sources of multilingual fonts, encoded either with Unicode or with a custom codepage
Free Dictionary Project
Hindi Conversion Tools
Hindi Editor (Matthew Blackwell)
Indic Support for Linux
Unicode Devanagari font in Mozilla
International Phonetic Alphabet in Unicode
International Phonetic Alphabet (linguiste.org)
For Windows 95 and 98; does not support Unicode.
JLG Latin Extended Keyboard (For US Keyboards)
Freeware which allows users to input over 1000 Unicode characters.
Junicode (Peter S. Baker)
A Unicode-based font for medievalists. 
Keyboard Layout Editor and Generator
Works with Windows 9X, Windows NT 4.0 and Windows 2000. Supports Unicode (when installed on Windows NT 4.0 or Windows 2000).
Keyboard Wizard
A keyboard management utility for Windows® applications that lets users enter text in any language supported by Unicode.
KeyPixy 
Unicode keyboard for the SuperWaba virtual machine on PDAs
Language Culture Type: International Type Design in the Age of Unicode
Language samples in UTF-8  (Michael Kaplan)
Microsoft Keyboard Layout Creator (MSKLC)
Myanmar in Unicode
Nepali Conversion Tools
Quick Key 5.1
For people who need to input foreign characters or mathematical symbols quickly but do not want to spend time learning a new keyboard layout. Inserts any Unicode character with a single click.
Romanian Academic v2.0 by Sorin Paliga
Romanian Academic is a set of keyboard layouts for the Mac OS, which includes a Unicode keyboard compliant with the Romanian national standard SR-13392/1998.
Romanian Keyboard by Cristian Secara (links to a website in Romanian)
Romanian keyboards for Windows 9x/Me, Windows NT4.0, Windows 2000 and Windows XP.
Romanian UF by Florin Neumann
Romanian UF is a Romanian Unicode keyboard layout for Mac OS X v10.2 or later, designed for users of US QWERTY keyboards.
South Asia Language Resource Center (University of Chicago)
Tavultesoft Keyman 6.2
Unicode keyboard input mapping system for Microsoft Windows. Keyboard layouts for use with Keyman are available for many languages, and custom keyboard layouts for both simple and highly complex scripts can be quickly designed and packaged for distribution with Keyman Developer. Supports Unicode input in Windows 95, 98, Me, NT4, 2000, XP, Server 2003 and Vista.
Type It
A set of online Unicode-based text editors designed for typing national characters in a variety of languages, as well as symbols of the International Phonetic Alphabet. The editors include buttons and keyboard shortcuts to enter characters without memorizing Alt codes or installing any software, and can be used from any computer with Web access.
Unicode Character Map
Free and fast online method to select Unicode characters to paste into forms or other apps
Unicode implementations for uncommon scripts on Mac OS X
Includes Burmese, Cherokee, Inuktitut, Kannada, Malayalam, Telugu, Tibetan, and soon-to-be-released Limbu support
Unicode keyboard layout generator for the Mac OS (Alex Eulenberg)
Generates 'uchr' keyboard resources, which can be used in Mac OS 9 (with Unicode script) or OS X.
TrueType Explorer
Displays the Unicode ranges supported by a font, and all the glyphs for a given range. It also displays Panose classification, Name strings, Kerning pairs and supported code pages.
Unicode Character Pickers by Richard Ishida
HTML-based. Quickly create phrases in a script by clicking on Unicode characters arranged in a way that aids their identification. Covers Arabic, Armenian, Bengali, Devanagari, Basic Ethiopic, Gurmukhi, Hebrew, IPA, Latin & diacritics, Malayalam, Tamil, Thai, Tifinagh.
Unicode Code Converter by Richard Ishida
This dynamic HTML app helps you convert between Unicode character numbers, characters, UTF-8 and UTF-16 code units in hex, percent escapes, and Numeric Character References (hex and decimal).
UniView by Richard Ishida
HTML-based. Look up characters and character blocks, paste in and identify unknown characters, store your own notes about characters, search on character names, do hex/dec/NCR conversions, highlight character types, etc. Supports Unicode 5.0.
UTF-8 Sampler
A plain-text web page in UTF-8 including brief texts in many scripts in both proportional and fixed fonts.
VietIME 1.0
A cross-platform Vietnamese input method editor (IME). Enables input of Vietnamese Unicode text in Java's AWT (TextArea and TextField) and Swing text components.
Vietnamese Fonts on Mac OS X
Describes how to enable the Vietnamese Unicode font which is standard in Mac OS X
Website Tips for Designers: Fonts.
3-D Keyboard

Linguistics and Script Specialty Sites

Akkadian
Database of Latin letters and various other scripts and languages
Interactive query engine for a database of extended Latin letters as well as other scripts used by various languages. It is maintained by Indrik Hein of Estonian Standards.
Digital Dictionaries of South Asia
Digital South Asia Library
The Humanities Computing Laboratory (formerly Duke Humanities Computing Facility)/WinCALIS
Encyclopedia of Typography and Electronic Communications
Contains information on using Unicode and ISO 10646, Internet technology, multilingual script terminology, font architecture, character set and encoding standards, and visual human language communication arts. It is edited and maintained by John M. Fiscella.
FarsiWeb Project
Persian script in Unicode and other standards.
HK-SCS
HKSCS Compatibility Issues Migrating to Unicode 5.0
Khmer
A web page relating to Khmer in Unicode compiled by Maurice Bauhahn
Middle English Manuscripts represented on the Web with UTF-8
Multilingual Computing Resources (Mark Leisher)
Multilingual Project Gutenberg
MyMyanmar.net
A new Myanmar Linguistic, Unicode and Research information website
Southeast Asian Computing and Linguistics (Doug Cooper)
TITUS
Thesaurus Indogermanischer Text- und Sprachmaterialien
Specialized site dealing with historical linguistic work and ancient scripts
Uyghur Computer Science Association (UCSA)
Provides free Uyghur Unicode fonts, software, technical support for Uyghur language processing, and more.
Uyghur Unicode Based Website
Online Uyghur input techniques, fonts and keyboard layout, FAQ on Uyghur Unicode, UKY to traditional Uyghur converter and other multidirectional converters, Uyghur software information, and more.
Vietnamese-Unicode FAQs
Zvon character search
Finds properties of Unicode characters and determines their usage in XML documents. It also supports searching HTML and MathML entities and searching by visual similarity. Browsing by Scripts, Blocks, or Digits is also available.

Organizations and Other Standards

Banking Symbols Reference
Character Set Tables (Frank da Cruz)
Chinese GB 18030 Encoding Standard
An extensive summary with enough information to describe the encoding structure and to prepare an implementation
Danish UNIX System User Group Website (DKUUG)
Site includes link to extensive ftp archives, as well as information about standardization and Unix systems
DKUUG ftp archive of documents related to I18N
DKUUG ftp archive of ISO documents related to character standardization
European Union
Official website of the European Union (EU), available in 20 different languages
Hong Kong Supplementary Character Set (HKSCS)
A character set developed by the Government of the Hong Kong Special Administrative Region. It contains Chinese characters specific to Hong Kong and supplements Unicode/ISO 10646 and Big-5.
IANA (Internet Assigned Numbers Authority)
email to: iana@iana.org
IANA Character Set Registry 
IETF Home Page
Explains internet standards
Internet Drafts
ISO Addresses
ISO International Register of coded character sets
ISO 8859 Character Sets
email to: Roman Czyborra
ISO 10646 Online
ISO 8601 (DateTime Standard)
ISO 639-1 Registration Authority
ISO 639-3 Registration Authority
ISO 639 language codes
ISO 3166 Maintenance Agency
ISO 3166 country codes
ITTF's Publicly available Standards and Technical Reports
LISA
Consisting of over 200 corporate clients and their globalization solutions partners, LISA provides best practices, business guidelines, and multilingual information management standards for making enterprise globalization a reality.
Localization Research Centre
The LRC, based at the University of Limerick in Ireland, is the focal point and the research and educational centre for localisation.
Michael Everson
New script proposals, other character encoding and internationalization standards documents, font information, and more.
Markus Kuhn's FAQ on international standards
NISO Home Page
The whole ANSI/NISO Z39.53 list (last revised 7/19/93) is at:
http://lcweb.loc.gov/marc/languages/
Updates are at: http://lcweb.loc.gov/marc/langann.html
Postal Address Guide (Frank da Cruz)
This page also includes the names of countries in native script encoded in UTF-8.
Postal Codes and Address Resource List (Graham Rhind)
A very comprehensive guide to postal address conventions and codes around the world.
TC46 Transliteration Links
UNGEGN Working Group on Romanization Systems
MARC 21 Specifications (including mapping tables to UCS/Unicode)
Thai IT Standards National Electronics and Computer Technology Center (NECTEC)
RFCs and Internet Standards
SIL Ethnologue
W3C (World Wide Web Consortium) Internationalization Activity
XML Schema Datatypes
Date/time information

Training and Courses

Certification in Localization at California State University in Chico, CA.

Using Unicode

ANSI C implementation of UTF-8
Converts UTF-8 into UCS4 and vice versa. Source code is BSD licensed.
Bangla Unicode Converter
Posted by the Hunger Project-Bangladesh
C++ Unicode Implementation
Chinese Tools (Erik Peterson)
Big5/GB/Hz to UTF-8 Converter and Chinese Character Dictionary in UTF-8
Chinese/Norwegian (Ingar Holst)
Online dictionaries between Chinese and Riksmål Norwegian. Under construction.
Chinese ideographs and their Cantonese pronunciation
cpDetector (Achim Westermann)
Sorts large collections of documents by their codepage, transforms large collections of documents to a target codepage (e.g. UTF-8), and can be used as a Java library for codepage detection in third-party software (search engines, file sharing software, browsers, or any software that accesses textual data over a network).
CPG-Nuke 8.2, CPG-Dragonfly CMS
Open Source CMS/Portal
Data on Languages
A useful web interface to a database of Unicode characters, and codepage conversions.
Decode Unicode
Delphi Graphics and Unicode Center (Mike Lischke)
A resource for Borland Delphi users who need to deal with more than just plain ANSI text. Download a Unicode core library that contains code for wide string manipulation as well as search engines for tuned Boyer-Moore and Unicode regular expressions. The Unicode Syntax Edit control is particularly important for those who want to include wide string scripting support in their applications. Free for non-commercial use and comes with full source code.
EarthWords
Multilingual web pages developed using some 20 different Unicode ranges including Greek, Cyrillic, Hebrew, Arabic, Devanagari, Bengali, Gurmukhi, Gujarati, Tamil, Thai, Lao, Georgian, Ethiopic and Unified Canadian Aboriginal Syllabics.
EDICODE
Allows entering Unicode characters into a text field by simply pressing buttons labeled with the corresponding glyphs. There are button sets for some European languages as well as others covering a few Unicode blocks.
FAIRY
A Web server extension that makes it possible to show text with any formatting of any language in all browsers on all platforms.
Ethiopic - Sadiss 1.0
Unicode text editor for Ethiopic for Java 1.4.0 or above. Includes an Ethiopic font, a keyboard layout for Ethiopic, Unicode character entry, text search, UTF-8, locale support, help, samples, and more.
Field Guide to Chinese Characters (Clopper Almon)
An index to over 3,600 common characters with links to Unihan. The indexing is by the new RADICODE system combining English names of 100 meaningful radicals and stroke codes (not counts) for the rest of the character. Said to be faster, easier, and less arbitrary than the traditional system with 214 radicals and stroke counts.
French (Phillipe Deschamp)
Contains material regarding French language standards and cross-postings for other standards bodies
German
List of countries, with full names, codes, adjectival forms for country and inhabitants, and capitals.
Hanzi 3.0
CD-ROM software for the Macintosh, MS-Windows, and DOS operating systems. Wenlin tackles the most frustrating obstacles for students, scholars, and speakers of Chinese with its versatile and easy-to-use interface.
Hebrew (Jony Rosenne)
IBM developerWorks™
Section of the site discussing Unicode related issues
I18ngurus
Open internationalization resources directory. The site currently provides 900+ links on internationalization and Unicode resources on the internet classified by category and actively monitored.
Internationalization (Alex Schonfeld)
using Java and other resources
Internationalization (Tex Texin)
The Benefits of Unicode, The number of characters in the Unicode Standard, and An Introduction to the Business Example for Unicode are just a few of many interesting topics addressed in this website.
ITworld.com
IT articles that refer to Unicode
Japanese, Chinese and Korean (Ken Lunde)
CJK resources and information, Japanese code manipulation tools, etc.
Mellel by Redlex
Mellel is an affordable word processor for the Mac OS X platform, with extensive Unicode support and specialized features for Farsi, Arabic, Hebrew, and other languages.
Multilingual Support in Internet/IT Applications
NameMap Class
A bidirectional map between Unicode 3.0 character names and the corresponding values for use in Java programs.
Polish Resources
Useful information about Unicode for Polish Users
Polyglot 3000
Automatically recognizes over 400 languages. Comes with full Unicode support.
Private Use Areas
How they are used by Software Vendors
Punjabi Computing Resource Centre
Includes a free Unicode 4.0 GPL font called Saab for Gurmukhi (the first of its kind). Also has a conversion utility (Gurmukhi Unicode Conversion Application) to convert from font-based Gurmukhi to Unicode.
Roland Hentschel utilities
SC Unipad editor
A Unicode text editor for Windows operating systems from Sharmahd Computing
Special Characters Mapping Tool for Web Designers
Sun's Solaris 7 & 8 operating environments and Unicode
Unicode, multilingual computing, and software internationalization. Technical concerns of Unicode in an internationalized application. Codeset conversions.
Testing Tamil and Telugu
The Unicode HOWTO
Bruno Haible's guide to using Unicode/UTF-8 with GNU/Linux systems.
The Unicode Workflow
Johan van Mol's guideline for web developers on how to use UTF-8 in their web projects. The article covers the whole process of creating a website: forms, XML, databases, the encoding of script and source files, HTML output and e-mail output.
Tommy's Unicode Library 1.2
Troll Tech's About Unicode
uni2ascii/ascii2uni
A pair of programs that convert between UTF-8 and a variety of pure ASCII representations.
Unibook
A small utility for offline viewing of the character charts and character properties for The Unicode Standard
UnicodeChecker version 1.5
Lets you browse the complete Unicode character set and view extensive information on every code point. The information is taken from various Unicode 3.2 data files in the UNIDATA directory. It can also convert strings directly to the four Unicode Normalization Forms. Runs on Mac OS X 10.1 and later.
Unicode Database
Characters in HTML/XML ordered by block, category, bidi class and additional properties. The version of each code point is shown.
Unicode Glyph Lookup by hexadecimal code
unidesc/uniname/unihist/ExplicateUTF8/utf8lookup
A set of programs for finding out what is in a Unicode file.
Unicode Data Browser
A browser for the UnicodeData.txt file
Unicode Input Tool/Converter Firefox Extension
View Unicode characters, values, and character descriptions in a chart and optionally output them to a textbox. Also converts among character references (hex or decimal), HTML entities, and Unicode. Several preferences allow a great degree of customization, including adding one's own DTD for use in entity conversions.
UTF-8 and Unicode FAQ for Unix/Linux
Markus Kuhn's comprehensive information resource on how to use Unicode/UTF-8 on POSIX systems.
UTF-8 encoding table and Unicode characters
Reference site displaying Unicode code points with their UTF-8 encoding in a number of numerical formats.
UTF-8 Search
Alta Vista introducing UTF-8 as one of their search encodings
UTF-8 Test
A good source of test pages and listings of alphabets in UTF-8.
VARTALAAP
A multilingual, interactive communication system
WebTide
A freeware HTML editor, written in Java, which integrates support for Unicode
XML and Unicode

Websites extensively using Unicode

Unicode Project in German Wikipedia
A presentation of the Unicode blocks and a translation of the Unicode descriptions into German
Wikipedia
A multilingual project to create a complete and accurate free-content encyclopedia that makes extensive use of Unicode.