Silliness (was RE: UTF-8S (was: Re: ISO vs Unicode UTF-8))

From: Edward Cherlin (edward.cherlin.sy.67@aya.yale.edu)
Date: Fri Jun 01 2001 - 13:57:03 EDT


At 4:44 AM -0600 6/1/01, Bill Kurmey wrote:
>Kenneth Whistler wrote:
>> Plane 14 PUA usage description tags? Naaah, nobody would suggest such
>> a bizarre thing, would they?
>
>Marco Cimarosti wrote:
>>The three words "PUA usage description" are redundant, methinks. Removing
>>them leaves a more concise and dramatic example of a weird proposal.
>
>Extensions to Plane 14 to include subsets for all other 'alphabetic'
>scripts, then encoding of ISO 639 and 3166 as separate code points so that
>folks using 'non-alphabetic' scripts may identify languages and countries
>using their own language at which point the "script for representing
>language tags" created in 3.1 would remove the perception of using
>"English-centric" and "ASCII-centric" scripts for language tagging within,
>for example, the EUC which requires 'official' documentation in 12
>languages and the UN which (used to, may still?) require 'official'
>documentation in 5 languages.

I demand separate codes for Korean with and without Hanja, since one
is alphabetic (theoretically, anyway) and one is "large character
set". Also for English words written in Katakana, and Sanskrit
written in Tibetan or Chinese. Also a code to be used exclusively for
Bertrand Russell's diary, written in Greek script and English
language, with, of course, his own orthography. Let's see...there are
more than 2^8 writing systems, and well over 2^12 languages, so if we
go for it--Yes! We can fill all of Unicode just with language/script
tags!!

NB. :-)
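
For what it's worth, the arithmetic behind the joke can be checked with a quick sketch. It assumes one tag per (writing system, language) pair, treats the 2^8 and 2^12 figures above as lower bounds rather than real counts, and compares the result with the 17-plane code space (U+0000..U+10FFFF). The names in it are illustrative only.

```python
# Lower bounds quoted above (assumed for illustration, not measured).
WRITING_SYSTEMS = 2 ** 8
LANGUAGES = 2 ** 12

# Unicode code space: 17 planes of 65,536 code points (U+0000..U+10FFFF).
CODE_SPACE = 17 * 2 ** 16


def tag_count(scripts: int, languages: int) -> int:
    """One hypothetical tag per (writing system, language) pair."""
    return scripts * languages


pairs = tag_count(WRITING_SYSTEMS, LANGUAGES)
print(pairs, CODE_SPACE, pairs / CODE_SPACE)
```

At the lower bounds the pairs already cover 16 of the 17 planes, so "more than" and "well over" only have to supply the last one.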

>Then there are all the numerous languages where the glyph associated with
>the "abstract character" code point might be replaced by "sound-generating"
>representations for those folks with only an oral tradition and no writing
>system.

Your suggestion is unsound. That's just a glyph variation on IPA. :-)
:-) What you want is an XML tag.

>Maybe "UTF-8S" should be reserved now as an acronym for encoding code
>points for "Sound-Glyphs"? ("Sound-bytes" might ignite the 'bits and
>bytes' thread again.:-)

Actually, with surrogates, they would be Sound-Words.

>Bill Kurmey, Edmonton, AB, Canada

-- 

Edward Cherlin
Generalist
"A knot!" exclaimed Alice. "Oh, do let me help to undo it."
Alice in Wonderland


