From: Jukka K. Korpela (jkorpela@cs.tut.fi)
Date: Tue May 16 2006 - 02:32:40 CDT
On Tue, 16 May 2006, Balasankar wrote:
> Whether the union of Exemplar & auxiliary exemplar character set should
> contain all the possible characters used in the particular language?
No. It is impossible to list all the characters used in a language; the
set is very fuzzy, with membership ranging from core characters (such as
"a" in English) through marginal characters (like "é", i.e. "e" with
acute, in English) to characters that may appear in special words, typically
borrowings, perhaps _very_ rarely. Moreover, these sets are currently
supposed to list _letters_ only. The two sets make it possible to
give a rather rough description of the letters used in a language, and the
choices made are often rather debatable.
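
To make the fuzziness concrete, here is a minimal sketch in Python. The two
sets below are hand-picked assumptions for illustration, not the actual data
for English, and the function name classify is hypothetical. It merely sorts
the letters of a text into "exemplar", "auxiliary" or "neither":

```python
import unicodedata

# Hypothetical, hand-picked sets for illustration only; NOT actual locale data.
EXEMPLAR = set("abcdefghijklmnopqrstuvwxyz")
AUXILIARY = set(unicodedata.normalize("NFC", "áàâäçéèêëíïñóôöúü"))

def classify(text):
    """Group the letters of text by which set (if any) contains them."""
    groups = {"exemplar": set(), "auxiliary": set(), "neither": set()}
    # Normalize to NFC so that "e" + combining acute is compared as "é".
    for ch in unicodedata.normalize("NFC", text).lower():
        if not ch.isalpha():
            continue  # the sets list letters only, so skip everything else
        if ch in EXEMPLAR:
            groups["exemplar"].add(ch)
        elif ch in AUXILIARY:
            groups["auxiliary"].add(ch)
        else:
            groups["neither"].add(ch)
    return groups
```

With these assumed sets, a text containing "café" and "smørrebrød" puts "é"
in the auxiliary group and "ø" in neither, although both words can occur in
English text. Where to draw such lines is exactly the debatable part.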
It isn't even clear what the intended _use_ of the sets is, or what the
actual use will be. There is a large number of imaginable uses, each with
its own implications for what the grounds for defining the sets should
really be. I'm afraid the (mostly implicit) criteria applied now make the
sets incommensurable across languages.
-- Jukka "Yucca" Korpela, http://www.cs.tut.fi/~jkorpela/