Keld> As I see it, the language identifier is actually part of a
Keld> greater range of information that you need for a text, such
Keld> as how numbers are represented, date formats, etc. This is
Keld> also known as the locale in C and POSIX terms. There is a
Keld> general need to know which locale any text should be
Keld> understood by. This information can be given out-of-band or
Keld> in-stream. What I would propose is a standardized way to
Keld> invoke a locale in-stream to solve the problem.
Thanks for making this clear. I was assuming (when I shouldn't have)
that it was understood that a language id could imply more than simply
"a language" and that it could occur out-of-band or in-stream.
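To make the locale point concrete, here is a minimal sketch of how a
locale name pulls in number and date conventions along with it. It is
written in Python, whose standard locale module wraps the C/POSIX
setlocale() and localeconv() calls. The helper name describe_locale and
the sample date are my own inventions for illustration, and only the
"C" locale is used because it is the one locale guaranteed to exist
everywhere.

```python
import locale
import time


def describe_locale(name, when):
    """Return the decimal point and date rendering implied by a locale.

    Hypothetical helper for illustration; not part of any proposal.
    """
    # Remember the current setting so the process-wide locale is restored.
    previous = locale.setlocale(locale.LC_ALL)
    try:
        locale.setlocale(locale.LC_ALL, name)
        conventions = locale.localeconv()
        return {
            "decimal_point": conventions["decimal_point"],
            # %x is the locale's preferred date representation.
            "date": time.strftime("%x", when),
        }
    finally:
        locale.setlocale(locale.LC_ALL, previous)


# Arbitrary sample date: Friday, 31 December 1999.
sample = time.struct_time((1999, 12, 31, 0, 0, 0, 4, 365, 0))
info = describe_locale("C", sample)
print(info)
```

In the "C" locale the date comes out month first. A different locale
name, if installed on the system, may give a different field order and
a different decimal point, which is the extra information a bare
language id would have to imply.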
Keld> As noted above, this capability is also needed outside
Keld> UNICODE/10646, and thus I think the UNICODE/10646 encoding
Keld> is not the right place to standardize it.
We haven't spent much time looking at other possible standard
protocols that could be used to implement this on top of
Unicode/10646. Before proceeding any further, we wanted to make sure
we didn't miss anything that would render the approach unfit for use.
Since it seems unlikely that a language id approach would be adopted
in a codeset standard, we will probably attempt to reconcile our
approach with some standard protocol. Maybe one will work out well
enough to make a worthwhile proposal.
-----------------------------------------------------------------------------
mleisher@crl.nmsu.edu
Mark Leisher
Computing Research Lab
New Mexico State University
Box 30001, Dept. 3CRL
Las Cruces, NM 88003

"The trick is not gaining the knowledge,
 but surviving the lessons."
    -- "Svaha," Charles de Lint