Re: Generic Tagging: A Modest Proposal

From: Markus G. Kuhn (kuhn@cs.purdue.edu)
Date: Tue Jul 15 1997 - 15:44:27 EDT


Kenneth Whistler wrote:
> The currently
> active proposal is called the "Plane 14" proposal,
> and has been in active discussion between the UTC
> and members of the IETF.

I strongly hope that these language tags will not become part of
ISO 10646-1 itself, but will be described in a separate document, as
this clearly sounds like a non-charset issue to me.

Many systems already have their own language tagging mechanism and do
not need an additional one from Unicode. For instance, in HTML 4.0
<http://www.w3.org/TR/WD-html40/>, you can write things like

  <P LANG=de>Dies ist ein Absatz in Deutsch, in den wir
  etwas <Q LANG=en>English text</Q> eingebettet haben.</P>

In this example, a user agent can use the language information to
switch between German and English hyphenation rules.

See <http://www.w3.org/TR/WD-html40/struct/dirlang.html> for further
details. You can specify the language in HTML 4.0 with the LANG
attribute on almost any HTML element. This is much more convenient
than handling new Unicode control characters. I expect that
Netscape 5.0 will allow you to select fonts per language.
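
To illustrate how software reading such markup might recover the
language of each piece of text, here is a minimal sketch in Python. It
is only an illustration under stated assumptions, not something taken
from the HTML 4.0 draft or from any browser: the names LangRuns and
language_runs are made up for this example, and the code assumes
properly nested elements that all have explicit end tags.

```python
from html.parser import HTMLParser


class LangRuns(HTMLParser):
    """Collect (language, text) runs; hypothetical helper for illustration.

    An element without its own LANG attribute inherits the language of
    the element that encloses it.
    """

    def __init__(self, default_lang="unknown"):
        super().__init__()
        # One entry per open element: the language in effect inside it.
        self.stack = [default_lang]
        self.runs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases attribute names, so LANG arrives as "lang".
        lang = dict(attrs).get("lang") or self.stack[-1]
        self.stack.append(lang)

    def handle_endtag(self, tag):
        # Assumption: elements are properly nested and explicitly closed.
        if len(self.stack) > 1:
            self.stack.pop()

    def handle_data(self, data):
        text = " ".join(data.split())  # collapse line breaks and spaces
        if text:
            self.runs.append((self.stack[-1], text))


def language_runs(html, default_lang="unknown"):
    """Return a list of (language, text) pairs for the given markup."""
    parser = LangRuns(default_lang)
    parser.feed(html)
    parser.close()
    return parser.runs
```

Calling language_runs() on the paragraph above should yield three runs:
the German text before the quotation tagged "de", the quotation tagged
"en", and the rest of the sentence tagged "de" again. A hyphenation
routine or font selector could then be chosen per run.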

Markus

-- 
Markus G. Kuhn, Computer Science grad student, Purdue
University, Indiana, USA -- email: kuhn@cs.purdue.edu


