From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Wed Dec 03 2003 - 12:55:53 EST
> From: Jungshik Shin [mailto:jshin@mailaps.org]
> Note that Korean syllables in Unicode are NOT "LVT?" as you
> seem to think
I did not say that...
> BUT "L+V+T*", where '+', '*' and '?' have their usual RE meaning.
I said this:
( ((L* V* VT T*) - (L* V+ T)) | X )*
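For illustration, the simpler quoted pattern "L+V+T*" can be written as a regular expression over jamo classes. This is only a sketch: the code point ranges are an assumption taken from the current Hangul_Syllable_Type values for the U+1100 block (the thread itself does not list them), and the name split_syllables is hypothetical. It does not implement the longer expression given above.

```python
import re

# Jamo classes by code point range. Assumption: ranges follow the current
# Unicode Hangul_Syllable_Type property for the U+1100 block.
L = "\u1100-\u115F"  # leading consonants, including U+115F CHOSEONG FILLER
V = "\u1160-\u11A7"  # vowels, including U+1160 JUNGSEONG FILLER
T = "\u11A8-\u11FF"  # trailing consonants

# "L+V+T*": one or more L, one or more V, zero or more T.
SYLLABLE = re.compile(f"[{L}]+[{V}]+[{T}]*")

def split_syllables(text):
    """Return the substrings of text that match L+V+T* (hypothetical helper)."""
    return SYLLABLE.findall(text)

# Two syllables written with conjoining jamos: <L,V,T><L,V,T>
print(len(split_syllables("\u1112\u1161\u11AB\u1100\u1173\u11AF")))  # 2
# A vowel-initial syllable written with the choseong filler: <Lf,V,T>
print(len(split_syllables("\u115F\u1161\u11AB")))                    # 1
# The same vowel and trailing consonant without any L do not match.
print(len(split_syllables("\u1161\u11AB")))                          # 0
```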
> Who said that? 11,172 precomposed syllables are both *redundant*
> (should have never been encoded) and *incomplete* even for modern Korean
> text. I prefer to use Korean letters (in U+1100 block) for every single
> syllable of Korean, modern or not. We do need U+115F followed by 'V+T*'
> in modern Korean text in dictionaries, grammar books and linguistics texts.
OK, this choseong filler makes sense for vowel-initial syllables, to make
them appear as if they had the L+V+T form. I still doubt that this is really
needed, unless the intent is to detach the vowel from a possible preceding
trailing consonant in <L0,V0,T0>, so that it does not form a ligature with
it, where <L0,V0,T0,V1,T1> would be composed as <L0+V0>,<T0+V1+T1> with T0
converted to a leading consonant.
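As a side check, at the level of canonical composition (NFC) the trailing consonant T0 stays attached to its own syllable whether or not a filler follows. The sketch below uses Python's standard unicodedata module; the sample jamos are arbitrary choices for illustration, and rendering or ligation behaviour in fonts is a separate matter that this does not show.

```python
import unicodedata

def code_points(s):
    """Format a string as a list of U+XXXX code points."""
    return " ".join(f"U+{ord(c):04X}" for c in s)

L0, V0, T0 = "\u1112", "\u1161", "\u11AB"  # sample leading consonant, vowel, trailing consonant
V1, T1 = "\u1161", "\u11AB"                # sample vowel and trailing consonant of the next syllable
LF = "\u115F"                              # CHOSEONG FILLER

# <L0,V0,T0,V1,T1>: NFC composes <L0,V0,T0> and leaves <V1,T1> alone.
without_filler = unicodedata.normalize("NFC", L0 + V0 + T0 + V1 + T1)
print(code_points(without_filler))  # U+D55C U+1161 U+11AB

# <L0,V0,T0,Lf,V1,T1>: same first syllable; the filler sequence is untouched.
with_filler = unicodedata.normalize("NFC", L0 + V0 + T0 + LF + V1 + T1)
print(code_points(with_filler))     # U+D55C U+115F U+1161 U+11AB
```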
> Come on!!! We do not want to encode any more precomposed syllables.
> Encoding 11,172 of them already ranks top in the list of things we'd
> have done differently. Adding 567 more would NEVER NEVER happen even if
> there's room for them.
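For reference, the figure 11,172 follows from the arithmetic layout of the precomposed block: 19 leading consonants, 21 vowels and 28 trailing slots (27 consonants plus "none"). The sketch below reproduces that count and the standard composition formula. Reading 567 as 21 vowels times 27 trailing consonants is an assumption about where the quoted number comes from; the thread does not say so.

```python
S_BASE = 0xAC00                         # first precomposed Hangul syllable
L_COUNT, V_COUNT, T_COUNT = 19, 21, 28  # T_COUNT includes "no trailing consonant"

def compose(l_index, v_index, t_index=0):
    """Map zero-based jamo indices to a precomposed syllable."""
    return chr(S_BASE + (l_index * V_COUNT + v_index) * T_COUNT + t_index)

print(L_COUNT * V_COUNT * T_COUNT)       # 11172 precomposed syllables
print(V_COUNT * (T_COUNT - 1))           # 567 (assumed: vowel + trailing, no leading)
print(f"U+{ord(compose(18, 0, 4)):04X}") # U+D55C
```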
What about the existing "compatibility Hangul syllables" starting with
vowels? Are they really distinct from the jamos that compose them, as
if they were decomposed to a leading choseong filler, a vowel and a
consonant? What would happen if a compressor chose to compress
occurrences of <Lf,V,T> to these compatibility vowel-initial syllables
by using a mapping to an internal charset, and then reversed the
compression back to separate Lf, V, T in Unicode?
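A minimal sketch of such a round trip, under the assumption that the internal charset uses codes that cannot collide with anything else in the text (modelled here as Python tuples rather than real code points). The names compress and decompress are hypothetical, and the jamo ranges are assumed from the current U+1100 block layout. With a private, reversible mapping like this the original sequence comes back unchanged; the question above only arises if the internal code is an existing Unicode character.

```python
CHOSEONG_FILLER = "\u115F"

def is_v(ch):
    return "\u1160" <= ch <= "\u11A7"  # assumed vowel range

def is_t(ch):
    return "\u11A8" <= ch <= "\u11FF"  # assumed trailing consonant range

def compress(text):
    """Replace each <Lf,V,T> triple with one internal token (a tuple)."""
    tokens, i = [], 0
    while i < len(text):
        if (text[i] == CHOSEONG_FILLER and i + 2 < len(text)
                and is_v(text[i + 1]) and is_t(text[i + 2])):
            tokens.append((text[i + 1], text[i + 2]))  # internal code for <Lf,V,T>
            i += 3
        else:
            tokens.append(text[i])                     # everything else passes through
            i += 1
    return tokens

def decompress(tokens):
    """Expand internal tokens back to separate Lf, V, T."""
    return "".join(
        CHOSEONG_FILLER + t[0] + t[1] if isinstance(t, tuple) else t
        for t in tokens
    )

sample = "\u1112\u1161\u11AB\u115F\u1161\u11AB"
print(decompress(compress(sample)) == sample)  # True
```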
I've just read the interesting Bytext.org approach, and what I proposed
seems to have also been considered by them in their 8-bit encoding (which
does not preserve strict Unicode canonical equivalence, but seems to
have been created to preserve the Hangul script structure).
Converting a Hangul text coded with the Bytext.org encoding to Unicode
would certainly face a design choice in the mapper: whether or not
to use compatibility Hangul syllables...