Philippe may have overlooked the fact that this has been tried (years
ago) in the Unicode Standard. See: language tags.
http://www.unicode.org/versions/Unicode7.0.0/ch23.pdf#G26419
The syntax for those even goes beyond just ISO 639-2/3 to incorporate
the full range of BCP 47 tags, in principle.
But the catch is that the language tag characters ended up *deprecated*,
precisely because attempting to do this kind of thing in plain text is the
wrong thing to do -- it interferes with the level-appropriate language
tagging mechanisms available in markup.
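
For readers who have not seen the mechanism: a language tag in plain text
was spelled as U+E0001 LANGUAGE TAG followed by tag characters from the
range U+E0020..U+E007E, each one mirroring the ASCII character 0xE0000
below it. The Python sketch below illustrates only that encoding. The
function names are invented for this example, and the input check is a
rough character filter, not BCP 47 validation.

```python
LANGUAGE_TAG = 0xE0001  # U+E0001 LANGUAGE TAG, introduces the tag
TAG_OFFSET = 0xE0000    # tag characters mirror ASCII at this offset
TAG_FIRST, TAG_LAST = 0xE0020, 0xE007E


def encode_language_tag(tag: str) -> str:
    """Spell a tag such as 'fr-CA' with tag characters (hypothetical helper)."""
    # Rough filter only: ASCII letters, digits, and hyphens.
    if not tag or not all(c.isascii() and (c.isalnum() or c == "-") for c in tag):
        raise ValueError(f"unsupported characters in tag: {tag!r}")
    return chr(LANGUAGE_TAG) + "".join(chr(TAG_OFFSET + ord(c)) for c in tag)


def decode_language_tag(text: str) -> str:
    """Read the tag at the start of text, stopping at the first non-tag character."""
    if not text or ord(text[0]) != LANGUAGE_TAG:
        raise ValueError("text does not start with U+E0001 LANGUAGE TAG")
    letters = []
    for ch in text[1:]:
        cp = ord(ch)
        if not (TAG_FIRST <= cp <= TAG_LAST):
            break
        letters.append(chr(cp - TAG_OFFSET))
    return "".join(letters)


if __name__ == "__main__":
    tagged = encode_language_tag("fr-CA") + "bonjour"
    print([f"U+{ord(c):04X}" for c in tagged[:6]])
    print(decode_language_tag(tagged))
```

Note that the tag characters are invisible and sit in the middle of the
text stream, so every process that handles the string has to know to skip
or preserve them. Markup keeps that information out of the character data.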
I see no point in speculating about reinventing this particular broken
wheel one more time for the Unicode Standard.
--Ken
On 2/12/2015 9:22 PM, Philippe Verdy wrote:
> Another solution is also to not extend the scope of use of RIS
> characters (leave them as they are, for ISO 3166-1 based codes only),
> but define a separate set of "Language Indicator Symbols" (LIS)
> working the same way, but based on ISO 639-2 or -3 (3-letter codes,
> also accepting the language family codes encoded on 3 letters, as
> well as all -3 macrolanguages such as "zho" for Chinese or "que" for
> Quechua).
>
>
> In no way would that mean that Unicode defines what is or is not a
> valid language. All well-formed triplets are valid, and users are free
> to use 3-code sequences of LIS to do what they want as long as this
> respects the known ISO 639 standard (or its history, including retired
> codes). ...
>
>
Received on Fri Feb 13 2015 - 10:14:04 CST