Otfried Cheong scripsit:
> (1) The situation of LONG-S versus ROUND-S is quite parallel to that
>     between FINAL-KAF and KAF.
Indeed.
> (2) Therefore, FINAL-KAF and LONG-S need to be encoded.  Not, as has
>     been hinted, because they come from an ancient legacy encoding,
>     but because they are necessary, here and now.
For both reasons.
> (3) There still remains the question why LONG-S has a compatibility
>     decomposition to S, while FINAL-KAF doesn't.
This involves a subtle point (meaning that I myself only figured it out
a short while ago :-) ).  To give a character a compatibility decomposition
asserts more than that it is a variant form of one or more other
characters.  It further asserts that the character is itself a
compatibility character: i.e. it was encoded solely for compatibility
with something, typically another encoding.
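
The difference between the two characters can be inspected directly. The following is a minimal sketch, assuming Python's standard `unicodedata` module (which reflects whatever Unicode version ships with the interpreter, not necessarily the one under discussion here); the helper name `describe` is made up for illustration.

```python
import unicodedata

LONG_S = "\u017f"     # LATIN SMALL LETTER LONG S
FINAL_KAF = "\u05da"  # HEBREW LETTER FINAL KAF

def describe(ch):
    """Return (name, raw decomposition field, NFKC form) for one character."""
    return (
        unicodedata.name(ch),
        unicodedata.decomposition(ch),      # empty string if there is none
        unicodedata.normalize("NFKC", ch),  # applies compatibility mappings
    )

for ch in (LONG_S, FINAL_KAF):
    name, decomp, nfkc = describe(ch)
    print(f"{name}: decomposition={decomp!r}, NFKC=U+{ord(nfkc):04X}")
```

LONG S carries a `<compat>` mapping to U+0073 and so folds to "s" under NFKC, while FINAL KAF has no decomposition and passes through unchanged.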
Thus (until I bitched about it recently), ASCII ^ had a compatibility
decomposition of SPACE followed by COMBINING CIRCUMFLEX.  This was
a blunder, simply because ASCII ^ is not a compatibility character;
it has acquired many functions of its own, particularly in computer
languages.  (The specific problem was that ASCII files would cease
to be pre-normalized in Normalization Forms KC and KD.)
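
The pre-normalization property can be checked mechanically. This is a rough sketch, assuming Python's `unicodedata` module and its bundled character data (which postdates the correction described above); `is_stable` is a hypothetical helper name.

```python
import unicodedata

def is_stable(text, form):
    """True if normalizing `text` to `form` leaves it unchanged."""
    return unicodedata.normalize(form, text) == text

# Every ASCII code point, U+0000 through U+007F, including '^'.
ascii_text = "".join(chr(cp) for cp in range(0x80))

for form in ("NFC", "NFD", "NFKC", "NFKD"):
    print(form, is_stable(ascii_text, form))

# A decomposition of '^' into SPACE + COMBINING CIRCUMFLEX ACCENT would
# have made the KC and KD checks fail for any text containing '^',
# such as this expression.
print(is_stable("x ^ y", "NFKD"))
```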
>     When you search for a string in a word-processor, I would like "s"
>     to match all of "s", "S", and "long-s".  How is this in Hebrew?
>     Would you want to find a match with FINAL-KAF if you typed a KAF
>     in the search pattern?
Plain-text search in Hebrew isn't very useful, because of the
overlapping morpheme structure.
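
For the Latin half of the question, one possible approach to loose matching is to compare strings after compatibility normalization and case folding. The sketch below assumes Python's `unicodedata` module; `search_key` and `loose_find` are hypothetical names, and this is not a description of how any particular word processor searches. It does nothing about the Hebrew morphology issue.

```python
import unicodedata

def search_key(text):
    """Fold compatibility variants and case before comparing."""
    return unicodedata.normalize("NFKC", text).casefold()

def loose_find(needle, haystack):
    """True if `needle` occurs in `haystack` after folding both."""
    return search_key(needle) in search_key(haystack)

print(loose_find("s", "Congre\u017fs"))  # long s in the text
print(loose_find("S", "\u017f"))         # capital S against long s
print(loose_find("\u05db", "\u05da"))    # KAF against FINAL KAF
```

Under this folding "s", "S", and long s all compare equal, whereas KAF and FINAL KAF stay distinct, because FINAL KAF has no compatibility decomposition to fold away. Matching those two would need a separate, explicitly chosen mapping.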
-- 
John Cowan                                   cowan@ccil.org
       I am a member of a civilization. --David Brin
This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:20:54 EDT