Re: Tagging orthographic systems (was: (iso639.186) the

From: Mark Davis (markdavis@ispchannel.com)
Date: Fri Sep 15 2000 - 10:35:28 EDT


I share the concern about combinatorial explosions. Look at Spanish, Arabic or
English, for example:
http://oss.software.ibm.com/developerworks/opensource/icu/localeexplorer/

I agree that de-*-sp1996 makes more sense. For us, the variant should go before
the country only if the variant is -- in general -- more significant to
specifying the behavior of processing than the country is.
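
To make the ordering point concrete, here is a minimal sketch. It assumes
that software falls back by dropping subtags from the right, in the spirit
of RFC 1766 prefix matching. The function name fallback_chain is
hypothetical, not taken from any library.

```python
def fallback_chain(tag):
    """Return the tag followed by progressively shorter prefixes.

    Assumption: fallback works by stripping trailing subtags, so the
    last subtag is the first piece of information to be discarded.
    """
    subtags = tag.split("-")
    return ["-".join(subtags[:n]) for n in range(len(subtags), 0, -1)]


# Country before variant: the spelling variant is dropped first.
print(fallback_chain("de-AT-sp1996"))  # ['de-AT-sp1996', 'de-AT', 'de']

# Variant before country: the country is dropped first.
print(fallback_chain("de-sp1996-AT"))  # ['de-sp1996-AT', 'de-sp1996', 'de']
```

Under that assumption, whichever subtag comes last is treated as the least
significant one, which is why the order should follow significance.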

Mark

Harald Alvestrand wrote:

> At 16:18 13/09/2000 -0600, Otto Stolz wrote:
> >Hence, I plead for a tagging system that can represent these differences.
> >Currently, all of my WWW pages contain the line:
> > <HTML LANG=de><!--neue Rechtschreibung-->
> >I would prefer to incorporate the comment in the tag, as in
> >the hypothetical:
> > <HTML LANG=de-sp1996>
> >and likewise for other languages, and other applications.
>
> If you think this is needed, why not write up an RFC 1766 registration for
> it, and let us base the discussion on a registration request?
>
> >Note that this issue is orthogonal to the country code of RFC 1766.
> >E. g., each of de-AT, de-CH and de-DE could be spelled either the 1902
> >or the 1996 way. Hence, the spelling subtag and the country subtag
> >should be optional, independent of each other.
>
> This is troublesome. Representing orthogonal aspects as subtags of a linear
> string gives us a combinatorial explosion. We can't require end user
> software to match de-sp1996, de-AT-sp1996 and de-sp1996-AT in any sensible
> way, when the software starts out by not caring about spelling variants.
>
> Registering the 4 variants of de-*-sp1996 makes more sense to me.
>
> Harald
>
> --
> Harald Tveit Alvestrand, alvestrand@cisco.com
> +47 41 44 29 94
> Personal email: Harald@Alvestrand.no
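
As an illustration of the matching problem Harald describes, the following
is a rough sketch of what order-insensitive matching would require. It
assumes a purely hypothetical convention in which a two-letter subtag is a
country code and any other subtag is a variant. RFC 1766 does not define
such a rule, and the names parse_tag and matches are made up for this
example.

```python
def parse_tag(tag):
    """Split a tag into (language, country, variant), ignoring subtag order.

    Assumption (not from RFC 1766): a two-letter alphabetic subtag is a
    country code; any other subtag is treated as a variant.
    """
    primary, *rest = tag.split("-")
    country = None
    variant = None
    for subtag in rest:
        if len(subtag) == 2 and subtag.isalpha():
            country = subtag.upper()
        else:
            variant = subtag.lower()
    return primary.lower(), country, variant


def matches(requested, available, care_about_variant=False):
    """Check whether an available tag satisfies a requested tag."""
    r_lang, r_country, r_variant = parse_tag(requested)
    a_lang, a_country, a_variant = parse_tag(available)
    if r_lang != a_lang:
        return False
    if r_country is not None and r_country != a_country:
        return False
    if care_about_variant and r_variant is not None and r_variant != a_variant:
        return False
    return True


# Both orderings parse to the same triple.
print(parse_tag("de-AT-sp1996") == parse_tag("de-sp1996-AT"))  # True

# Software that ignores spelling variants can still match on country.
print(matches("de-AT", "de-sp1996-AT"))  # True
```

Plain prefix matching has no such classification step, so it would treat
de-AT-sp1996 and de-sp1996-AT as unrelated beyond the shared "de" prefix.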


