From: John Hudson (john@tiro.ca)
Date: Thu May 18 2006 - 12:17:59 CDT
theiling@absint.com wrote:
> While programming a compatibility decomposition plus case folding (two
> things in one step), I noticed that
>
> U+0345 COMBINING GREEK YPOGEGRAMMENI
> is converted to
> U+03B9 GREEK SMALL LETTER IOTA
>
> but that code positions like
>
> U+0363 COMBINING LATIN SMALL LETTER A
> is not converted to
> U+0061 LATIN SMALL LETTER A
>
> And some similar combining chars accordingly.
The former is a peculiarity of the Greek writing system, not a general rule for combining
letter-like marks. The ypogegrammeni is written as a full iota when it follows an
uppercase letter.
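The asymmetry can be reproduced with a minimal sketch, assuming Python's standard `unicodedata` module and `str.casefold()` as stand-ins for the combined decomposition and case-folding step in the question. The helper name `fold_compat` is hypothetical. Note that U+0345 has no decomposition mapping of its own, so in this sketch the change to iota comes from the case-folding half of the operation.

```python
import unicodedata

def fold_compat(s: str) -> str:
    """Compatibility-decompose, then case-fold (hypothetical helper)."""
    return unicodedata.normalize("NFKD", s).casefold()

YPOGEGRAMMENI = "\u0345"  # COMBINING GREEK YPOGEGRAMMENI
COMBINING_A = "\u0363"    # COMBINING LATIN SMALL LETTER A

# Case folding turns the ypogegrammeni into a full Greek small iota...
print(unicodedata.name(fold_compat(YPOGEGRAMMENI)))
# ...while the combining Latin letter is left unchanged.
print(unicodedata.name(fold_compat(COMBINING_A)))
```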
> Is there a reason for it? This would then result in some letter-like
> chars not being found when searching for them as a letter.
But they are not letters, they are combining marks that happen to be based on letters:
their function is not alphabetical. In general, you don't want them to be confusable with
or decomposed to letter characters. The Greek ypogegrammeni is an exception, and I believe
there are case roundtripping issues as a result.
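The roundtripping problem can be illustrated with a short sketch, assuming Python's built-in `str.upper()` and `str.lower()` follow the Unicode case mappings (variable names are illustrative only). Uppercasing the combining mark yields a capital iota, and lowercasing that yields a spacing small iota rather than the original mark.

```python
import unicodedata

ypogegrammeni = "\u0345"       # COMBINING GREEK YPOGEGRAMMENI
upper = ypogegrammeni.upper()  # becomes a full capital letter
back = upper.lower()           # lowercases to a spacing letter, not the mark

for label, ch in (("original", ypogegrammeni), ("upper", upper), ("back", back)):
    print(f"{label}: U+{ord(ch):04X} {unicodedata.name(ch)}")

# The mark does not survive an upper/lower round trip.
round_trips = back == ypogegrammeni

# Both marks from the question are nonspacing combining marks, not letters.
categories = {unicodedata.category(c) for c in ("\u0345", "\u0363")}
```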
John Hudson
--
Tiro Typeworks        www.tiro.com
Vancouver, BC        john@tiro.ca

I am not yet so lost in lexicography, as to forget
that words are the daughters of earth, and that things
are the sons of heaven. - Samuel Johnson