From: Janusz S. Bień (jsbien@mimuw.edu.pl)
Date: Fri Apr 08 2011 - 23:51:40 CDT
On Wed, 06 Apr 2011 jsbien@mimuw.edu.pl (Janusz S. Bień) wrote:
> I need to provide the decomposition mappings for some PUA
> characters. I would like to use the same format to override standard
> compatibility decomposition (at the moment I would like just to block
> the conversion of long s to the standard one).
>
> Do you have any suggestion for a format to store and maintain such
> data?
For the archive:
Jakub Wilk suggested the format of NormalizationCorrections.txt:
http://unicode.org/Public/UNIDATA/NormalizationCorrections.txt
# Interpretation of the fields:
# Field 0: Unicode code point
# Field 1: Original (erroneous) decomposition
# Field 2: Corrected decomposition
# Field 3: Version of Unicode for which the correction was
# entered into UnicodeData.txt, in n.n.n format.
# Comment: Indicates the Unicode Corrigendum which documents
# the correction
I intend to leave Field 1 empty, to use Field 3 for the character
origin (e.g. MUFI) instead of a Unicode version, and to add a Field 4
for the character name and possibly other comments.
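A minimal sketch of how such a file might be read and applied, assuming semicolon-separated fields and `#` comments as in NormalizationCorrections.txt. The sample lines, the origin labels, the PUA code point E000 with its mapping, and the function names `parse_mappings` and `decompose` are all made up for illustration; only the fact that the long s (U+017F) has a standard compatibility decomposition to "s" comes from the Unicode data.

```python
import unicodedata

# Hypothetical sample data in the layout described above:
# code point; (empty); decomposition; origin; name and comments
SAMPLE = """\
# Field 1 is left empty on purpose
017F;;017F;override;LATIN SMALL LETTER LONG S (kept, not mapped to s)
E000;;0061 0301;example;HYPOTHETICAL PUA LETTER (made-up mapping)
"""

def parse_mappings(text):
    """Return {char: (decomposition, origin, name)} from the mapping lines."""
    table = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        fields = [f.strip() for f in line.split(";")]
        if len(fields) < 3:
            raise ValueError("expected at least 3 fields: " + raw)
        fields += [""] * (5 - len(fields))  # origin and name are optional
        code, _unused, decomp, origin, name = fields[:5]
        char = chr(int(code, 16))
        target = "".join(chr(int(cp, 16)) for cp in decomp.split())
        table[char] = (target, origin, name)
    return table

def decompose(text, table):
    """Simplified per-character NFKD that consults the table first.

    Characters listed in the table get the table's decomposition;
    everything else falls back to the standard NFKD mapping.
    Canonical reordering across characters is not handled here.
    """
    out = []
    for ch in text:
        if ch in table:
            out.append(table[ch][0])
        else:
            out.append(unicodedata.normalize("NFKD", ch))
    return "".join(out)

table = parse_mappings(SAMPLE)
```

With this table, `decompose("\u017f", table)` leaves the long s unchanged, while plain `unicodedata.normalize("NFKD", "\u017f")` converts it to "s". Mapping a character to itself is just one possible way to express "block the standard decomposition" in this layout.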
Regards
JSB
--
Prof. dr hab. Janusz S. Bien - Uniwersytet Warszawski (Katedra Lingwistyki Formalnej)
Prof. Janusz S. Bien - Warsaw University (Department of Formal Linguistics)
jsbien@uw.edu.pl, jsbien@mimuw.edu.pl, http://fleksem.klf.uw.edu.pl/~jsbien/