From: Jukka K. Korpela (jkorpela@cs.tut.fi)
Date: Fri Aug 12 2005 - 12:30:04 CDT
On Fri, 12 Aug 2005, Jon Hanna wrote:
> There is only one Unicode set.
... at a given moment of time; characters can be added to Unicode in new
versions.
However, perhaps the question referred to "Unicode set" as "subset of the
repertoire of Unicode characters, needed to write a particular language".
Alternatively (and this is what I guess the question really means),
the question might have referred to different language selection menus in
programs. Such menus often appear in (too) close connection to choices
related to character encoding.
> Different nations using the same language often used different character sets
> to support different currencies, different frequencies of loan words and so
> on. This is one of the things Unicode saves us from worrying about.
It does, but the question "which characters does a particular language
need?" is still relevant and difficult - it isn't even a well-defined
question before you spend quite some time on it. This question, in turn,
affects keyboard design, font choices, input checks, text scanning, etc.
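
As a rough illustration of the "input checks" case, here is a minimal
Python sketch. The function name and the sample repertoire are made up
for the example; the set is deliberately incomplete and is not taken
from CLDR or any other authoritative source.

```python
import unicodedata

# Hypothetical, incomplete repertoire used only for illustration.
# It is NOT real CLDR data; a real check would load an agreed-upon set.
SAMPLE_REPERTOIRE = set("abcdefghijklmnopqrstuvwxyzáâãàçéêíóôõú")

def outside_repertoire(text, repertoire=SAMPLE_REPERTOIRE):
    """Return the sorted letters of `text` that are not in `repertoire`."""
    # Normalize to NFC so that 'a' followed by a combining tilde
    # compares equal to the precomposed character 'ã'.
    normalized = unicodedata.normalize("NFC", text).lower()
    return sorted({ch for ch in normalized
                   if ch.isalpha() and ch not in repertoire})
```

Calling outside_repertoire("Jyväskylä") with this sample set reports
the letter "ä". Deciding what belongs in the set is the hard part; the
check itself is trivial once that has been settled.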
There doesn't seem to be any difference between the two versions of
Portuguese as regards the character repertoire, as judged by
the current CLDR data:
http://www.unicode.org/cldr/data/diff/by_type/characters.html
There _could_ be a difference, though.
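
If one wanted to check this mechanically, comparing two repertoires is
a plain set difference. The sketch below assumes the two character
lists have already been extracted somehow; the strings shown are
placeholders, not actual locale data, and the function name is
hypothetical.

```python
def repertoire_diff(a, b):
    """Return (only_in_a, only_in_b) as two sorted lists of characters."""
    set_a, set_b = set(a), set(b)
    return sorted(set_a - set_b), sorted(set_b - set_a)

# Made-up placeholder repertoires, not data for any real locale.
variant_x = "abcçdeé"
variant_y = "abcçdeéü"

only_x, only_y = repertoire_diff(variant_x, variant_y)
print("only in x:", only_x)
print("only in y:", only_y)
```

Two empty lists would mean the repertoires are identical, which is what
the current data suggests for the two versions of Portuguese.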
Language selection menus often contain country-specific variants of
languages for no good reason: the choice between them usually has no
effect. (The language forms could be different, but not in a manner that
affects the behavior of programs.) Spelling checks are probably the most
common (potential) area where e.g. the difference between two versions of
Portuguese might matter.
-- Jukka "Yucca" Korpela, http://www.cs.tut.fi/~jkorpela/