> Chris Newman (Chris.Newman@innosoft.com) wrote:
> Since Unicode can support multiple languages, can you give an example
> where language tagging is necessary *and* there is only plain text
> present?

How about the Eastern Arabic-Indic digits four, six and seven (U+06F4,
U+06F6 and U+06F7)? Their glyphs differ depending on whether the language
is Persian or Urdu. See Version 1.0 of the standard. (Version 2.0 has the
same comment but shows only the Persian glyphs.)
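
To make the point concrete, here is a minimal sketch in Python (my choice of
language for illustration, nothing the standard prescribes). It only shows that
each of these digits is a single code point with a single name and numeric
value, so the plain text carries nothing that tells a renderer whether the
Persian or the Urdu glyph is wanted. The helper name describe is made up for
this example; the character database calls these "EXTENDED ARABIC-INDIC"
digits.

```python
import unicodedata

# The three digits named above. Each is exactly one code point, whatever
# the language of the surrounding text.
DIGITS = ["\u06F4", "\u06F6", "\u06F7"]


def describe(ch: str) -> str:
    """Return 'U+XXXX NAME (value N)' for a single digit character."""
    code = f"U+{ord(ch):04X}"
    name = unicodedata.name(ch)
    value = unicodedata.digit(ch)
    return f"{code} {name} (value {value})"


if __name__ == "__main__":
    for ch in DIGITS:
        # The character properties hold no Persian/Urdu distinction;
        # glyph selection has to come from information outside the text.
        print(describe(ch))
```

Running it lists one name and one value per digit, which is all a plain-text
consumer can learn from the code point alone.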

On the language tagging mechanism, Martin Duerst suggested adding a couple of
characters to indicate tagging. I agree with his point of view that the tags
should be at the character level and not just in the UTF-8 format.

How about using escape sequences? These have a defined syntax, can contain
variable data, and some are even reserved for private use. In addition,
software may already be processing these (at least to ignore them) because
a sequence identifying ISO 10646 may appear in the text stream.
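
A rough sketch of what I have in mind, in Python. It assumes the ISO 2022
shape of an escape sequence: ESC, then zero or more intermediate bytes
(0x20-0x2F), then one final byte (0x30-0x7E), where final bytes 0x30-0x3F are
the private-use range. The two language assignments in LANGUAGE_TAGS are
invented purely for illustration and are not registered anywhere; the function
names and the "und" default for untagged text are likewise my own assumptions.

```python
import re

ESC = "\x1b"

# ESC, zero or more intermediate bytes (0x20-0x2F), one final byte (0x30-0x7E).
ESCAPE_SEQUENCE = re.compile(r"\x1b[\x20-\x2f]*[\x30-\x7e]")

# HYPOTHETICAL private-use sequences (final byte in 0x30-0x3F).
# These assignments are made up for this example only.
LANGUAGE_TAGS = {
    ESC + "0": "fa",  # Persian
    ESC + "1": "ur",  # Urdu
}


def split_tagged(text, default="und"):
    """Split text into (language, run) pairs.

    Known tag sequences switch the current language; any other escape
    sequence is skipped without changing it.
    """
    runs = []
    language = default
    pos = 0
    for match in ESCAPE_SEQUENCE.finditer(text):
        if match.start() > pos:
            runs.append((language, text[pos:match.start()]))
        language = LANGUAGE_TAGS.get(match.group(), language)
        pos = match.end()
    if pos < len(text):
        runs.append((language, text[pos:]))
    return runs


def strip_escapes(text):
    """What software unaware of the tags would do: drop every sequence."""
    return ESCAPE_SEQUENCE.sub("", text)
```

For example, a stream that opens with ESC % G (as far as I know, the sequence
that announces UTF-8) followed by tagged digits splits into one run per
language, while strip_escapes leaves only the characters themselves. Software
that already skips the ISO 10646 identifying sequence would skip the tags the
same way.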
Tim
-- Tim Partridge. Any opinions expressed are mine only and not those of my employer
This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:20:34 EDT