Useful Resources
About this List
This page contains a number of resources pertaining to Unicode
and Internationalization. These references are provided for
informational purposes only. In particular, the Unicode
Consortium has not taken any steps to evaluate or verify the
usefulness or accuracy of the information provided.
For additional information, refer to FAQ,
Articles on
Unicode, Books
on Unicode or to
Unicode Enabled Products for a sample list of products
that are reported to be fully (or partially) Unicode-enabled.
If you have any updates to this list, please
contact the Unicode office.
This should include:
- Category (from the above)
- URL
- Brief description of the page/site content
Fonts and Keyboards
-
Adding Fonts to Java
- Shows how to enable the use of Unicode fonts (such as Arial
Unicode MS) with Java
- Adhuna
Keyboard [BornoSoft Keyboard Interface Package]
- UNICODE compliant Bengali/Bangla keyboard driver that uses only lower case phonetic English key input for Bengali output.
- Aksharamala
- A standards-based software tool which automatically
transliterates English input into a target Indian language.
- ALT
Keyboard Combinations

- ALT keyboard
Combinations (in Dutch)

- A convenient and handy list of ALT keyboard combinations
-
Assamese Phonetic Keyboard
- Assamese Online Dictionary
- Avro Keyboard - UNICODE Compliant Free Bangla Typing Software
- A full featured UNICODE supported FREE Bangla typing software with most popular Bangla keyboard layouts from Bangladesh and India.
-
Burkina Faso
(In French only)
- Keyboards and dictionaries
for languages of Burkina Faso
- Bangla
- Word processor, Unicode converter, text to speech, customizable keyboard, translator, calendar and more for Bangla with Unicode support.
- Bangla
- Freeware Bangla Unicode Typing Interface, with a choice of
keyboard layouts
-
Bangla Unicode Fonts
- Free Unicode compliant OpenType Bangla
fonts.
- Free Bangla
Fonts Project
- Releases of four GPL-licensed Bangla (Bengali) OpenType fonts with full
Unicode support
- Celtic (Michael
Everson)
- Everson Mono and other fonts
-
Chinese (Los
Angeles Chinese Learning Center)
- A guide on Chinese characters input methods
- Code2000 (James
Kass)
- A shareware Unicode font
-
Devanagari Editor
-
Displaying and Typing Japanese Characters
- A guide on displaying and typing Japanese characters using Unicode
-
Dvorak and QWERTY keyboard drivers
- For Windows. Supports a large and growing number of
scripts. Most of these keyboards furnish all of the characters
in the relevant Unicode ranges.
-
Edward Trager's index of free/libre fonts
- Europe Keyboard
- An international keyboard layout based on the German standard layout, intended for practically
all languages written in the Latin script, including Vietnamese and Yorùbá.
(The website is in German.)
- Every Known Font site
- This site is down until further notice. Check
Typesource
by Proxy
-
Fingertipsoft Cyrillic Character Set and Keyboard Information
- Fontboard
- Free Arabic, Cyrillic, Esperanto, Hebrew, Maltese and Yiddish keyboard
layouts and a small selection of special fonts for linguists and card players.
- Fonts for Kurdish
Language
- Site contains 33 free Unicode fonts for the Kurdish language,
with support for Farsi and Arabic, as well as the necessary keyboard
installer for Windows.
- Fonts,
Keyboards and Browsers Setup (Alan Wood)
Fonts By Range
- Information on the Unicode fonts available for each Unicode
range
- Gallery of
Unicode Fonts
- Hundreds of free Unicode fonts, with sample images from each
-
Gaelic for US hardware &
UK hardware (Ciarán Ó Duibhín)
- Downloadable free software keyboard layouts for Windows
2000/XP which offer a number of accented and other characters
from the Unicode character-set via dead keys for use on US and
UK keyboard hardware respectively.
They are primarily intended to support characters found in the
Gaelic language, but can be useful more generally.
- Fonts in
Cyberspace
- Links to many sources of multilingual fonts, encoded with either
Unicode or a custom codepage
-
Free Dictionary Project
-
Hindi Conversion Tools
-
Hindi Editor (Matthew Blackwell)
- Indic Support for Linux
- Unicode Devanagari font in Mozilla
-
International Phonetic Alphabet in Unicode
-
International Phonetic Alphabet (linguiste.org)

- For Windows 95 and 98; does not support Unicode.
-
JLG Latin Extended Keyboard (For US Keyboards)
- Freeware which allows users to input over 1000 Unicode characters.
-
Junicode (Peter S. Baker)
- A Unicode-based font for medievalists.
- Keyboard Layout
Editor and Generator
- Works with Windows 9X, Windows NT 4.0 and Windows 2000.
Supports Unicode (when installed in Windows NT 4.0. or Windows
2000)
- Keyboard Wizard
- A keyboard management utility for Windows® applications, which
allows text entry in any language supported by Unicode.
- KeyPixy
- Unicode
keyboard for the SuperWaba virtual machine on PDAs
-
Language Culture Type: International Type Design in the Age of Unicode
-
Language samples in UTF-8 (Michael Kaplan)
- Microsoft Keyboard Layout Creator (MSKLC)
- Myanmar in Unicode
- Nepali Conversion Tools
- Quick Key 5.1
- The ideal solution for people who need to input foreign characters or mathematical symbols quickly,
but do not want to spend time learning a new keyboard layout. Inserts any Unicode character with a single click
- Romanian Academic v2.0 by Sorin Paliga
- Romanian Academic is a set of keyboard layouts for the Mac OS, which includes a Unicode
keyboard compliant with the Romanian national standard SR-13392/1998.
- Romanian Keyboard
by Cristian Secara (links to a website in Romanian)
- Romanian keyboards for Windows 9x/Me, Windows NT4.0, Windows
2000 and Windows XP.
- Romanian UF by Florin Neumann
- Romanian UF is a Romanian Unicode keyboard layout for Mac OS X v10.2 or later,
designed for users of US QWERTY keyboards.
-
South Asia Language Resource Center (University of Chicago)
- Tavultesoft Keyman
6.2
- Unicode keyboard input mapping system for Microsoft Windows.
Keyboard layouts for use with Keyman are available for many languages, and custom keyboard
layouts for both simple and highly complex scripts can be quickly designed and packaged
for distribution with Keyman Developer. Supports Unicode input in Windows 95, 98, Me,
NT4, 2000, XP, Server 2003 and Vista.
- Type It
- A set of online Unicode-based text editors designed for
typing national characters in a variety of languages, as well as
symbols of the International Phonetic Alphabet. The editors
include buttons and keyboard shortcuts to enter characters
without memorizing Alt codes or installing any software, and can
be used from any computer with Web access.
- Unicode Character Map
- Free and fast online method to select Unicode characters to paste into forms or other apps
- Unicode implementations for uncommon scripts on Mac OS X
- Includes Burmese, Cherokee, Inuktitut, Kannada, Malayalam, Telugu, Tibetan, and soon-to-be-released Limbu support
- Unicode keyboard layout generator for the Mac OS (Alex Eulenberg)
- Generates 'uchr' keyboard resources, which can be used in Mac OS 9 (with Unicode
script) or OS X.
- TrueType Explorer
- Displays the Unicode ranges supported by a font, and all the glyphs for a given range. It also displays Panose classification, Name strings,
Kerning pairs and supported code pages.
-
Unicode Character Pickers by Richard Ishida
- HTML-based. Quickly create phrases in a script by clicking on Unicode characters arranged in a way that aids their identification.
Covers Arabic, Armenian, Bengali, Devanagari, Basic Ethiopic, Gurmukhi, Hebrew, IPA, Latin & diacritics, Malayalam, Tamil, Thai, Tifinagh.
-
Unicode Code Converter by Richard Ishida
- This dynamic HTML app helps you convert between Unicode character numbers, characters, UTF-8 and UTF-16 code units
in hex, percent escapes, and Numeric Character References (hex and decimal).
-
UniView by Richard Ishida
- HTML-based. Look up characters and character blocks, paste in and identify unknown characters, store your own notes about characters,
search on character names, do hex/dec/NCR conversions, highlight
character types, etc. Supports Unicode 5.0
- UTF-8
Sampler
- A plain-text web page in UTF-8 including brief texts in many
scripts in both proportional and fixed fonts.
- VietIME 1.0
- A cross-platform Vietnamese input method editor (IME). Enables input of
Vietnamese Unicode text in Java's AWT (TextArea and TextField) and Swing text components.
- Vietnamese Fonts on Mac OS X
- Describes how to enable the Vietnamese Unicode font which is standard in Mac OS X
- Website Tips for
Designers: Fonts.
- 3-D
Keyboard
Linguistics and Script Specialty Sites
-
Akkadian
- Database of Latin letters
and various other scripts and languages
- Interactive query engine for a database of extended Latin
letters as well as other scripts used by various languages. It is
maintained by Indrik Hein of Estonian Standards.
- Digital
Dictionaries of South Asia
Digital South Asia
Library
- The Humanities Computing
Laboratory (formerly Duke Humanities Computing Facility)/WinCALIS
-
Encyclopedia of Typography and Electronic communications
- Contains information on using Unicode and ISO 10646, Internet
technology, multilingual script terminology, font architecture,
character set and encoding standards, and visual human language
communication arts. It is edited and maintained by John M.
Fiscella.
- FarsiWeb Project
- Persian script in Unicode and other standards.
-
HK-SCS

- HKSCS Compatibility Issues Migrating to Unicode 5.0
-
Khmer
- A web page relating to Khmer in Unicode compiled by
Maurice Bauhahn
-
Middle English Manuscripts represented on the Web with UTF-8
- Multilingual
Computing Resources (Mark
Leisher)
-
Multilingual Project Gutenberg
- MyMyanmar.net
- A new Myanmar Linguistic, Unicode and Research information website
- Southeast Asian
Computing and Linguistics
- (Doug Cooper)
- TITUS
- Thesaurus Indogermanischer Text- und Sprachmaterialien
- Specialized site dealing with historical linguistic work and
ancient scripts
- Uyghur Computer Science Association (UCSA)
- Provides free Uyghur Unicode fonts, software, technical
support for Uyghur language processing and more.
- Uyghur Unicode Based Website
- Online Uyghur input techniques, fonts and keyboard layout, FAQ on Uyghur Unicode,
UKY to traditional Uyghur converter and other multidirectional converters, Uyghur software
information, and more.
- Vietnamese-Unicode FAQs
- Zvon
character search
- Finds properties of Unicode characters and determines their
usage in XML documents. It also supports searching HTML and MathML
entities and searching by visual similarity. Browsing by
Scripts, Blocks, or Digits is also available.
Organizations and Other Standards
-
Banking Symbols Reference
- Character Set Tables
(Frank da Cruz)
-
Chinese GB 18030 Encoding Standard
- An extensive summary with enough information to describe the
encoding structure and to prepare an implementation
- Danish UNIX System
User Group Website (DKUUG)
- Site includes link to extensive ftp archives, as well as
information about standardization and Unix systems
- DKUUG ftp archive of
documents related to I18N
DKUUG ftp archive of ISO
documents related to character standardization
- European Union
- Official website of the European Union (EU), available in 20
different languages
-
Hong Kong Supplementary Character Set (HKSCS)
- A character set developed by the Government of Hong Kong
Special Administrative Region. It contains Chinese characters
specific to Hong Kong and supplements Unicode/ISO 10646 and Big-5.
- IANA (Internet Assigned
Numbers Authority)
- email to: iana@iana.org
-
IANA Character Set Registry
- IETF Home Page
- Explains internet standards
- Internet Drafts
ISO
Addresses
- ISO
International Register of coded character sets
ISO 8859 Character Sets
- email to: Roman
Czyborra
-
ISO 10646 Online
- ISO 8601 (DateTime
Standard)
-
ISO 639-1 Registration Authority
- ISO 639-3 Registration Authority
-
ISO 639 language codes
-
ISO 3166 Maintenance Agency
-
ISO 3166 country codes
-
ITTF's Publicly available Standards and Technical Reports
- LISA
- Consisting of over 200 corporate clients and their globalization solutions
partners, LISA provides best practices, business guidelines and
multilingual information management standards for making enterprise
globalization a reality.
- Localization Research Centre
- The LRC, based at the University of Limerick in Ireland, is the focal point and the research and educational centre for localisation.
- Michael Everson
- New script proposals, other character encoding and internationalization standards documents,
font information, and more.
-
Markus Kuhn's FAQ on international standards
NISO Home Page
- the whole ANSI/NISO Z39.53 list (last revised 7/19/93) is at:
http://lcweb.loc.gov/marc/languages/
updates are at:
http://lcweb.loc.gov/marc/langann.html
- Postal Address Guide
(Frank da Cruz)
- This page also includes the names of countries in native script encoded in UTF-8.
- Postal Codes and Address
Resource List (Graham Rhind)
- A very comprehensive guide to postal address conventions and
codes around the world.
- TC46
Transliteration Links
- UNGEGN Working Group on Romanization Systems
- MARC 21 Specifications
(including mapping tables to UCS/Unicode)
- Thai IT Standards
National
Electronics and Computer Technology Center (NECTEC)
- RFC's and
Internet Standards
- SIL
Ethnologue
- W3C
(World Wide Web Consortium) Internationalization Activity
- XML
Schema Datatypes
- Date /time information
Training and Courses
- Certification in Localization at California State University in Chico, CA.
Using Unicode
ANSI C implementation of UTF-8
Converts UTF-8 into UCS4 and vice
versa. Source code is BSD licensed.
Bangla Unicode Converter
Posted by the Hunger Project-Bangladesh
C++ Unicode Implementation
Chinese Tools (Erik
Peterson)
Big5/GB/Hz to UTF-8 Converter and Chinese Character Dictionary
in UTF-8
Chinese/Norwegian (Ingar Holst)
Online dictionaries between Chinese and Riksmål Norwegian.
Under construction.
Chinese ideographs
and their Cantonese pronunciation
cpDetector (Achim
Westermann)
Sorts large collections of documents by their codepage,
transforms large collections of documents to a target codepage
(e.g. into UTF-8), and can be used as a Java library for codepage
detection in third-party software (search engines,
file sharing software, browsers, any software that accesses textual
data over a network)
CPG-Nuke 8.2,
CPG-Dragonfly CMS
Open Source CMS/Portal
Data on Languages
A useful web interface to a database of Unicode characters,
and codepage conversions.
Decode Unicode
Delphi Graphics and
Unicode Center (Mike Lischke)
A unique resource for all Borland Delphi users who need to
deal with more than just plain ANSI text. Download a must-have
Unicode core library which contains code for wide string
manipulation as well as search engines for tuned Boyer-Moore and
Unicode regular expression. The Unicode Syntax Edit control is
particularly important for all those who want to include wide
string scripting support in their application. Free for
non-commercial use and comes with full source code.
EarthWords
Multilingual web pages developed using some 20 different Unicode ranges including Greek, Cyrillic,
Hebrew, Arabic, Devanagari, Bengali, Gurmukhi, Gujarati, Tamil, Thai, Lao, Georgian, Ethiopic and
Unified Canadian Aboriginal Syllabics.
EDICODE
Allows entering Unicode characters into a text field by simply pressing buttons labeled with corresponding glyphs. There are button sets for some
European languages as well as others covering a few Unicode blocks
FAIRY
A Web server extension that makes it possible to show text
with any formatting of any language in all browsers on all
platforms.
Ethiopic - Sadiss 1.0
Unicode text editor for Ethiopic for Java 1.4.0 or above. Includes Ethiopic
font, keyboard layout for Ethiopic, Unicode character entry, text search, UTF-8,
locale support, help, samples,...
Field Guide to Chinese Characters (Clopper Almon)
An index to over 3,600 common characters with links to Unihan.
The indexing is by the new RADICODE system combining English
names of 100 meaningful radicals and stroke codes (not counts)
for the rest of the character. Said to be faster, easier, and
less arbitrary than the traditional system with 214 radicals
and stroke counts.
French (Phillipe Deschamp)
Contains material regarding French language standards and
cross-postings for other standards bodies
German
List of countries, with full names, codes, adjectival forms
for country and inhabitants, and capitals.
Hanzi 3.0
CD-ROM software for the Macintosh, MS-Windows, and DOS
operating systems. Wenlin tackles the most frustrating obstacles
for students, scholars, and speakers of Chinese with its versatile
and easy-to-use interface.
Hebrew
Jony Rosenne
IBM developerWorks™
Section of the site discussing Unicode related issues
I18ngurus
Open internationalization resources directory. The site currently provides 900+
links on internationalization and Unicode resources on the internet classified
by category and actively monitored.
Internationalization
(Alex Schonfeld)
using Java and other resources
Internationalization (Tex Texin)
The Benefits of Unicode, The number of characters in the Unicode Standard,
and An Introduction to the Business Example for Unicode are just a few of many
interesting topics addressed in this website.
ITworld.com
IT articles that refer to Unicode
Japanese,
Chinese and Korean (Ken Lunde)
CJK resources and information, Japanese code manipulation
tools, etc.
Mellel by Redlex
Mellel is an affordable word processor for the Mac OS X platform, with extensive Unicode support
and specialized features for Farsi, Arabic, Hebrew, and other languages.
Multilingual Support
in Internet/IT Applications
NameMap Class
A bidirectional map between Unicode 3.0 character names and
the corresponding values for use in Java programs.
Polish Resources
Useful information about Unicode for Polish Users
Polyglot 3000
Automatically recognizes over 400 languages. Comes with full
Unicode support.
Private Use
Areas
How they are used by Software Vendors
Punjabi Computing
Resource Centre
Includes a free Unicode 4.0 GPL font called Saab for Gurmukhi
(the first of its kind). Also has a conversion utility (Gurmukhi
Unicode Conversion Application) to convert from font-based Gurmukhi
to Unicode.
Roland Hentschel utilities
SC Unipad editor
A Unicode text editor for Windows operating systems from Sharmahd Computing
Special
Characters Mapping Tool for Web Designers
Sun's Solaris 7 & 8 operating environments and Unicode
Unicode, multilingual computing, and software
internationalization. Technical concerns of Unicode in an
internationalized application. Codeset conversions.
Testing Tamil and
Telugu
The Unicode HOWTO
Bruno Haible's guide to using Unicode/UTF-8 with GNU/Linux systems.
The
Unicode Workflow
Johan van Mol's guideline for web developers on how to use UTF-8 in their web
projects. The article covers the whole process of creating a
website: forms, XML, databases, the encoding of script and source
files, HTML output and e-mail output.
Tommy's
Unicode Library 1.2
Troll Tech's About
Unicode
uni2ascii/ascii2uni
A pair of programs that convert between UTF-8 and a variety of pure ASCII representations.
Unibook
A small utility for offline viewing of the character charts
and character properties for The Unicode Standard
UnicodeChecker version 1.5
Lets you browse the complete Unicode character set and view extensive
information on every code point. The info is taken from various Unicode 3.2
data files in the UNIDATA directory. Furthermore, it can directly convert
strings to the four Unicode Normalization Forms. Runs on Mac OS X 10.1 and later.
Unicode
Database
Characters in HTML/XML ordered by block, category, bidi class
and additional properties. The version of each code point is shown.
Unicode
Glyph Lookup by hexadecimal code
unidesc/uniname/unihist/ExplicateUTF8/utf8lookup
A set of programs for finding out what is in a Unicode file.
Unicode
Data Browser
A browser for the UnicodeData.txt file
Unicode
Input Tool/Converter Firefox Extension
View Unicode characters, values, and character descriptions in chart and optionally output to a textbox.
Also converts among character references (hex or decimal), HTML entities, and Unicode. Several preferences
allow a great degree of customization including adding one's own DTD for use in entity conversions.
UTF-8 and Unicode FAQ for Unix/Linux
Markus Kuhn's comprehensive information resource on how to use Unicode/UTF-8 on POSIX systems.
UTF-8 encoding table and Unicode characters
Reference site displaying Unicode code points with their UTF-8 encoding in a number of numerical formats.
UTF-8 Search
Alta Vista introducing UTF-8 as one of their search encodings
UTF-8 Test
A good source of test pages and listings of alphabets in UTF-8.
VARTALAAP
A multilingual, interactive communication system
WebTide
A freeware HTML editor, written in Java, which integrates support for Unicode
XML
and Unicode
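Several of the tools listed in this section convert between characters, code points, UTF-8 bytes, numeric character references, and normalization forms. As a rough illustration of what such conversions involve, here is a minimal sketch using only Python's standard library. The function names (`utf8_to_code_points`, `code_points_to_utf8`, `describe`) are hypothetical and are not taken from any of the listed tools; the character names returned depend on the Unicode version bundled with the Python interpreter.

```python
import unicodedata


def utf8_to_code_points(data):
    """Decode UTF-8 bytes into a list of integer code points."""
    return [ord(ch) for ch in data.decode("utf-8")]


def code_points_to_utf8(code_points):
    """Encode a list of integer code points as UTF-8 bytes."""
    return "".join(chr(cp) for cp in code_points).encode("utf-8")


def describe(text):
    """Return (code point, UTF-8 hex bytes, hex NCR, name) for each character."""
    rows = []
    for ch in text:
        utf8_hex = " ".join(f"{b:02X}" for b in ch.encode("utf-8"))
        rows.append((
            f"U+{ord(ch):04X}",                  # code point in U+ notation
            utf8_hex,                            # UTF-8 code units in hex
            f"&#x{ord(ch):X};",                  # hexadecimal character reference
            unicodedata.name(ch, "<unnamed>"),   # name from the bundled Unicode data
        ))
    return rows


if __name__ == "__main__":
    for row in describe("é€"):
        print(row)
    # Canonical decomposition (NFD) splits "é" into "e" plus a combining accent.
    print([f"U+{ord(c):04X}" for c in unicodedata.normalize("NFD", "é")])
```

The round trip through `str` relies on Python's built-in UTF-8 codec, which rejects malformed input with a `UnicodeDecodeError` rather than silently repairing it.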
Websites extensively using Unicode
- Unicode Project in
German Wikipedia

- A presentation of the Unicode blocks and a translation of the Unicode
descriptions in German
- Wikipedia
- A multilingual project to create a complete and accurate free content encyclopedia making pretty extensive use of Unicode.