From: Asmus Freytag (asmusf@ix.netcom.com)
Date: Mon Mar 27 2006 - 03:32:45 CST
On 3/25/2006 6:09 PM, Richard Wordingham wrote:
> At 00:15 +0000 2006-03-26, Richard Wordingham wrote:
>
>>> Does anyone care to expound the theory of variation selectors? There 
>>> may be words in white in the TUS saying 'only for unifying CJK 
>>> variants that the Chinese (or Japanese, especially with surnames) 
>>> insist are different.'
>  
> I have [read TUS]; or at least, I have read TUS 4.0 Section 15.6 
> 'Variation Selectors'. Several times.  (I can find no indication that 
> it is different to TUS 4.1 Section 15.6.)  I have the nagging feeling 
> that I have missed something.
>
> Richard.
>
>
I don't know what you mean by theory of variation selectors. However, I 
think it might be useful to summarize some of the facts that can be 
gathered from reading TUS (and not only Section 15.6), and to add some 
observations along the way:
Variation selectors work best when you have two shapes that can clearly 
be substituted for each other in the majority of cases, but where there 
are some (non-predictable) instances in which it is required to use only 
one of them to the exclusion of the other.
Variation selectors are best considered a solution of last resort. It 
would be inappropriate to have them occur very frequently, not just 
because of the space they take up, but also because there will always 
be implementations of processes that do not handle them correctly 
(i.e. that fail to ignore them when they should).
So far, variation sequences have been *standardized* for Math and 
Mongolian (apparently an "M" is required at the start of the name of the 
writing system ;-).
For math, the variations allowed us to claim that certain minor shape 
variations are not semantically meaningful, without having to prove that 
proposition rigorously (by fully unifying the characters). [Rigorously 
establishing unifications in math can verge on the impossible, because 
the writing system is fundamentally open-ended.] At the same time, the 
variation sequences allow mapping to existing entity sets and character 
sets. So, in a way, they were primarily used to avoid creating 
compatibility characters and the need to map between them. Instead, if 
you just ignore the variation selector, the two base characters are 
already the same character - no cross mapping needed.
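A minimal Python sketch of that last point, assuming the generic variation selectors are the code points U+FE00..U+FE0F (VS1-VS16) and U+E0100..U+E01EF (VS17-VS256). The function names are made up for illustration, and the sample uses U+2268 followed by VS1 as an example of a math variation sequence:

```python
def is_variation_selector(ch):
    """True for the generic variation selectors VS1..VS256."""
    cp = ord(ch)
    return 0xFE00 <= cp <= 0xFE0F or 0xE0100 <= cp <= 0xE01EF


def strip_variation_selectors(text):
    """Drop every generic variation selector, keeping the base characters."""
    return "".join(ch for ch in text if not is_variation_selector(ch))


plain = "\u2268"          # LESS-THAN BUT NOT EQUAL TO
variant = "\u2268\uFE00"  # same base character followed by VS1

# The raw strings differ, but once the selector is ignored the base
# characters are identical, so no cross mapping table is needed.
print(plain == variant)
print(strip_variation_selectors(variant) == plain)
```

A process that has no use for the variant shape can apply this kind of filter (or simply pass the selector through) and still compare, search, or sort on the base character alone.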
For Mongolian, the FVS are needed to override the shaping mechanism in 
unusual cases. Think of them as super ZWJ/ZWNJ, just as Mongolian shaping 
is Arabic-style shaping on steroids. By making the FVS script-specific, 
we give additional context: Mongolian layout engines need to consider 
them, practically all other processes ignore them (or let them pass 
through).
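As a rough illustration of that split, the Python sketch below tells the script-specific Mongolian selectors apart from the generic ones. It assumes FVS1-FVS3 are U+180B..U+180D (later versions of the standard may define more), and the helper names are hypothetical, not taken from any real layout engine:

```python
# Mongolian free variation selectors FVS1..FVS3 (assumed set; see lead-in).
MONGOLIAN_FVS = {0x180B, 0x180C, 0x180D}


def selector_kind(ch):
    """Classify a character as a Mongolian FVS, a generic VS, or neither."""
    cp = ord(ch)
    if cp in MONGOLIAN_FVS:
        return "mongolian-fvs"
    if 0xFE00 <= cp <= 0xFE0F or 0xE0100 <= cp <= 0xE01EF:
        return "generic-vs"
    return None


def fvs_positions(text):
    """List (index, FVS number) pairs; only a Mongolian shaper needs these."""
    return [
        (i, ord(ch) - 0x180B + 1)
        for i, ch in enumerate(text)
        if ord(ch) in MONGOLIAN_FVS
    ]


sample = "\u1820\u180B"  # MONGOLIAN LETTER A followed by FVS1
print(selector_kind(sample[1]))
print(fvs_positions(sample))
```

Because the selector's code point already identifies the script, a process that is not doing Mongolian layout can decide to leave it alone without looking at the surrounding text.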
The role of variants in the CJK system is a particularly well-understood 
one, and the variation selector mechanism models that understanding 
directly, which, in a sense, can be considered a good thing. As there 
may be many variants for each character, a major issue in the CJK 
environment is cataloging. We eventually came to the conclusion that, 
for CJK, standardizing individual variation sequences along the model 
outlined in Section 15.6 is a futile exercise. Instead, UTS #37 provides 
a way to register sets of variants.
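For illustration only, here is a Python sketch that picks out candidate ideographic variation sequences in a string. It assumes such a sequence is a unified ideograph followed by one of VS17..VS256 (U+E0100..U+E01EF), and it simplifies "unified ideograph" to the basic block U+4E00..U+9FFF. Whether a given sequence is actually registered can only be determined from the registry's data files, which this sketch does not consult:

```python
def find_ivs_candidates(text):
    """Return (base, VS number) pairs for ideograph + VS17..VS256 pairs."""
    found = []
    for base, sel in zip(text, text[1:]):
        is_ideograph = 0x4E00 <= ord(base) <= 0x9FFF  # simplification
        is_ivs_selector = 0xE0100 <= ord(sel) <= 0xE01EF
        if is_ideograph and is_ivs_selector:
            # U+E0100 is VS17, U+E0101 is VS18, and so on.
            found.append((base, ord(sel) - 0xE0100 + 17))
    return found


sample = "\u845B\U000E0100 and \u845B\U000E0101"
for base, vs_number in find_ivs_candidates(sample):
    print(f"U+{ord(base):04X} + VS{vs_number}")
```

The cataloging problem mentioned above shows up here as the missing lookup step: the code can recognize the shape of a sequence, but which glyph it selects is a matter for the registered collection.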
UTC and WG2 have left open the future use of variation selectors. If 
equally compelling needs arise (compared to the ones I've summarized 
above) then variation selectors could once again be part of the 
solution. All things being equal, any solution that does not require 
them will automatically be preferred.
Hope you find this useful,
A./