From: jcowan@reutershealth.com
Date: Tue May 25 2004 - 16:57:43 CDT
Rick McGowan scripsit:
> The Unicode Technical Committee has posted a new issue for public
> review and comment. Details are on the following web page:
>
> http://www.unicode.org/review/
I have prepared a draft DiacriticFolding.txt file for this issue; it is
temporarily available at http://www.ccil.org/~cowan/DiacriticFolding.txt .
This was prepared by looking for lines in UnicodeData that matched
the regex '(GREEK|LATIN|CYRILLIC|HEBREW).*WITH'. (I added Hebrew to the
set of scripts specified by the current draft of #30.)
Characters with decompositions were mapped into the base character of the
decomposition; characters without decompositions were mapped by name.
The file http://www.ccil.org/~cowan/DiacriticFoldingExceptions.txt contains
a list of 32 characters matching the pattern which did not seem to me
to be suitable for diacritic folding.
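
The procedure described above can be sketched roughly as follows. This is only an illustration, not the script used to build the draft file. It assumes Python's bundled unicodedata module (whose Unicode version will differ from the 2004 UnicodeData), and the function name fold_candidate is hypothetical. It also assumes that "base character of the decomposition" means following the first code point of each decomposition until none remains, and that "mapped by name" means dropping everything from " WITH " onward and looking up the remaining name. The exceptions list is not reproduced here.

```python
import re
import unicodedata

# The regex quoted in the message, applied to character names.
PATTERN = re.compile(r"(GREEK|LATIN|CYRILLIC|HEBREW).*WITH")


def fold_candidate(ch):
    """Return a candidate folded character for ch, or None if ch is not matched.

    Hypothetical helper: a sketch of the described procedure, not the
    author's actual tooling.
    """
    name = unicodedata.name(ch, "")
    if not PATTERN.search(name):
        return None

    decomp = unicodedata.decomposition(ch)
    if decomp:
        base = ch
        # Assumption: follow the first code point of each decomposition
        # until we reach a character that has no decomposition.
        while decomp:
            # Skip a compatibility tag such as "<compat>" if present.
            parts = [p for p in decomp.split() if not p.startswith("<")]
            base = chr(int(parts[0], 16))
            decomp = unicodedata.decomposition(base)
        return base

    # No decomposition: map by name, dropping " WITH ..." and looking up
    # whatever character carries the remaining name (assumption).
    base_name = name.split(" WITH ", 1)[0]
    try:
        return unicodedata.lookup(base_name)
    except KeyError:
        return None


if __name__ == "__main__":
    for ch in ["\u00e9", "\u00f8", "\u1ea5", "\ufb2a", "a"]:
        print("U+%04X" % ord(ch), unicodedata.name(ch), "->", fold_candidate(ch))
```

A character such as U+00F8 (LATIN SMALL LETTER O WITH STROKE) has no decomposition, so it takes the by-name branch; a character such as U+00E9 takes the decomposition branch. Any entry from the exceptions file would have to be filtered out separately.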
I have posted a short version of this note to the Unicode comment form.
Comments?
--
A rabbi whose congregation doesn't want         John Cowan
to drive him out of town isn't a rabbi,         http://www.ccil.org/~cowan
and a rabbi who lets them do it                 jcowan@reutershealth.com
isn't a man.    --Jewish saying                 http://www.reutershealth.com