Re: Microsoft input method, 950, and Unicode mapping

From: Kevin Bracey (kevin.bracey@pace.co.uk)
Date: Thu Dec 20 2001 - 05:08:17 EST


In message <4.2.0.58.20011219110105.01b1a960@popd.ix.netcom.com>
          Asmus Freytag <asmusf@ix.netcom.com> wrote:

> Because of this, you get better interoperation among CJK code sets with
> using CIRCLED PLUS instead of EARTH, but at the cost of having obscured
> the semantics (i.e. compromised interoperation with Unicode-based
> systems).

I see. In constructing my tables, I was trying to identify semantics by
comparing each character with its neighbours and with others in its group,
so Earth/Sun was my choice.

> > I was able to come up with a good Big5 mapping by taking the best ideas
> > from various Big5 and CNS11643 tables on the net, then making sure each
> > of those Unicode compatibility characters was used once, AND IN THE ORDER
> > THEY APPEAR IN UNICODE.
>
> That's not always a good idea. Unicode order often does not follow any
> standard, even when characters are intended to map.

But in this case, the correlation seems too close to be coincidental.
U+FE30 to U+FE4E can very plausibly be found in order in CNS11643/Big5.
U+FE4F is out of order - the only exception. In the next group, U+FE50 to
U+FE6B again appear to be in order. I would love to have this confirmed by
whoever placed the characters in Unicode. Here's my deduced correlation for
Big5:

0xA14A 0xFE30 # PRESENTATION FORM FOR VERTICAL TWO DOT LEADER
0xA155 0xFE31 # PRESENTATION FORM FOR VERTICAL EM DASH
0xA157 0xFE32 # PRESENTATION FORM FOR VERTICAL EN DASH
0xA159 0xFE33 # PRESENTATION FORM FOR VERTICAL LOW LINE
0xA15B 0xFE34 # PRESENTATION FORM FOR VERTICAL WAVY LOW LINE
0xA15C 0xFE4F # WAVY LOW LINE
0xA15F 0xFE35 # PRESENTATION FORM FOR VERTICAL LEFT PARENTHESIS
0xA160 0xFE36 # PRESENTATION FORM FOR VERTICAL RIGHT PARENTHESIS
0xA163 0xFE37 # PRESENTATION FORM FOR VERTICAL LEFT CURLY BRACKET
0xA164 0xFE38 # PRESENTATION FORM FOR VERTICAL RIGHT CURLY BRACKET
0xA167 0xFE39 # PRESENTATION FORM FOR VERTICAL LEFT TORTOISE SHELL BRACKET
0xA168 0xFE3A # PRESENTATION FORM FOR VERTICAL RIGHT TORTOISE SHELL BRACKET
0xA16B 0xFE3B # PRESENTATION FORM FOR VERTICAL LEFT BLACK LENTICULAR BRACKET
0xA16C 0xFE3C # PRESENTATION FORM FOR VERTICAL RIGHT BLACK LENTICULAR BRACKET
0xA16F 0xFE3D # PRESENTATION FORM FOR VERTICAL LEFT DOUBLE ANGLE BRACKET
0xA170 0xFE3E # PRESENTATION FORM FOR VERTICAL RIGHT DOUBLE ANGLE BRACKET
0xA173 0xFE3F # PRESENTATION FORM FOR VERTICAL LEFT ANGLE BRACKET
0xA174 0xFE40 # PRESENTATION FORM FOR VERTICAL RIGHT ANGLE BRACKET
0xA177 0xFE41 # PRESENTATION FORM FOR VERTICAL LEFT CORNER BRACKET
0xA178 0xFE42 # PRESENTATION FORM FOR VERTICAL RIGHT CORNER BRACKET
0xA17B 0xFE43 # PRESENTATION FORM FOR VERTICAL LEFT WHITE CORNER BRACKET
0xA17C 0xFE44 # PRESENTATION FORM FOR VERTICAL RIGHT WHITE CORNER BRACKET
0xA1C6 0xFE49 # DASHED OVERLINE
0xA1C7 0xFE4A # CENTRELINE OVERLINE
0xA1C8 0xFE4D # DASHED LOW LINE
0xA1C9 0xFE4E # CENTRELINE LOW LINE
0xA1CA 0xFE4B # WAVY OVERLINE
0xA1CB 0xFE4C # DOUBLE WAVY OVERLINE

0xA14D 0xFE50 # SMALL COMMA
0xA14E 0xFE51 # SMALL IDEOGRAPHIC COMMA
0xA14F 0xFE52 # SMALL FULL STOP
0xA151 0xFE54 # SMALL SEMICOLON
0xA152 0xFE55 # SMALL COLON
0xA153 0xFE56 # SMALL QUESTION MARK
0xA154 0xFE57 # SMALL EXCLAMATION MARK
0xA15A 0xFE58 # SMALL EM DASH
0xA17D 0xFE59 # SMALL LEFT PARENTHESIS
0xA17E 0xFE5A # SMALL RIGHT PARENTHESIS
0xA1A1 0xFE5B # SMALL LEFT CURLY BRACKET
0xA1A2 0xFE5C # SMALL RIGHT CURLY BRACKET
0xA1A3 0xFE5D # SMALL LEFT TORTOISE SHELL BRACKET
0xA1A4 0xFE5E # SMALL RIGHT TORTOISE SHELL BRACKET
0xA1CC 0xFE5F # SMALL NUMBER SIGN
0xA1CD 0xFE60 # SMALL AMPERSAND
0xA1CE 0xFE61 # SMALL ASTERISK
0xA1DE 0xFE62 # SMALL PLUS SIGN
0xA1DF 0xFE63 # SMALL HYPHEN-MINUS
0xA1E0 0xFE64 # SMALL LESS-THAN SIGN
0xA1E1 0xFE65 # SMALL GREATER-THAN SIGN
0xA1E2 0xFE66 # SMALL EQUALS SIGN
0xA242 0xFE68 # SMALL REVERSE SOLIDUS
0xA24C 0xFE69 # SMALL DOLLAR SIGN
0xA24D 0xFE6A # SMALL PERCENT SIGN
0xA24E 0xFE6B # SMALL COMMERCIAL AT
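
For anyone who wants to test the ordering claim mechanically, here is a
minimal Python sketch. It is only an illustration, not how the table above
was built. The names parse_mapping and order_exceptions are hypothetical, and
"out of order" is assumed to mean "not part of the longest ascending run of
Unicode values once the rows are sorted by Big5 code". The sample rows are
copied from the table above.

```python
import re

# Four rows copied from the table above, around the U+FE4F entry.
SAMPLE = """\
0xA15B 0xFE34 # PRESENTATION FORM FOR VERTICAL WAVY LOW LINE
0xA15C 0xFE4F # WAVY LOW LINE
0xA15F 0xFE35 # PRESENTATION FORM FOR VERTICAL LEFT PARENTHESIS
0xA160 0xFE36 # PRESENTATION FORM FOR VERTICAL RIGHT PARENTHESIS
"""

# Matches "0xBBBB 0xUUUU"; anything after the two codes is ignored.
LINE_RE = re.compile(r"^\s*0x([0-9A-Fa-f]+)\s+0x([0-9A-Fa-f]+)")


def parse_mapping(text):
    """Return (big5, unicode) integer pairs from '0xBBBB 0xUUUU # NAME' lines."""
    pairs = []
    for line in text.splitlines():
        m = LINE_RE.match(line)
        if m:
            pairs.append((int(m.group(1), 16), int(m.group(2), 16)))
    return pairs


def order_exceptions(pairs):
    """Return the Unicode values left out of the longest ascending run.

    Rows are sorted by Big5 code first. Assumption: a row counts as an
    exception if it is not in the longest strictly ascending subsequence.
    """
    ordered = [u for _, u in sorted(pairs)]
    n = len(ordered)
    if n == 0:
        return []
    best = [1] * n    # best[i]: length of the longest ascending run ending at i
    prev = [-1] * n   # prev[i]: index of the previous element in that run
    for i in range(n):
        for j in range(i):
            if ordered[j] < ordered[i] and best[j] + 1 > best[i]:
                best[i] = best[j] + 1
                prev[i] = j
    # Walk back from the end of the longest run to collect its members.
    i = max(range(n), key=best.__getitem__)
    keep = set()
    while i != -1:
        keep.add(i)
        i = prev[i]
    return [ordered[k] for k in range(n) if k not in keep]


if __name__ == "__main__":
    print([hex(u) for u in order_exceptions(parse_mapping(SAMPLE))])
    # prints ['0xfe4f']
```

Feeding one whole group of the table to parse_mapping instead of SAMPLE would
list every row in that group that breaks the ascending order. Where several
ascending runs of equal length exist, which rows get reported depends on the
tie-breaking in this sketch.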

-- 
Kevin Bracey, Principal Software Engineer
Pace Micro Technology plc                     Tel: +44 (0) 1223 518566
645 Newmarket Road                            Fax: +44 (0) 1223 518526
Cambridge, CB5 8PB, United Kingdom            WWW: http://www.pace.co.uk/


