Dear all,
This is my first post to this list.
I am Masahiko Maedera, a Japanese software engineer
at Lotus Development Japan.
I have been developing a Unicode text editor for Windows 9x/NT
for three years (since 1996).
My problem is that
the range 0x00110000-0x7FFFFFFF of UCS-4 cannot be represented in UTF-16,
because surrogate pairs reach only as far as 0x0010FFFF.
I think this range is probably not in use right now.
But if it comes into use in the future,
we will have serious conversion and compatibility problems.
In particular, ISO 10646-1 lets us use the Private Use Areas
(0x00E00000-0x00FFFFFF, 0x60000000-0x7FFFFFFF),
and nothing prohibits using them now.
When we meet a plain text, encoded in UTF-8, that contains code points from these areas,
we must give up processing it as UTF-16,
even though it is correct ISO 10646-1 text.
That is unfortunate indeed.
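
To make the mismatch concrete, here is a minimal Python sketch, not part of the proposal itself. It assumes the 1-to-6-octet form of UTF-8 described in RFC 2279, which covers all of UCS-4 up to 0x7FFFFFFF. The names utf8_encode_ucs4 and fits_in_utf16 are hypothetical and chosen only for illustration.

```python
def utf8_encode_ucs4(cp: int) -> bytes:
    """Encode a UCS-4 value (0..0x7FFFFFFF) as 1 to 6 octets, RFC 2279 style."""
    if not 0 <= cp <= 0x7FFFFFFF:
        raise ValueError("outside the 31-bit UCS-4 range")
    if cp < 0x80:
        return bytes([cp])
    # (exclusive upper limit, lead-octet marker, number of continuation octets)
    forms = (
        (0x800, 0xC0, 1),
        (0x10000, 0xE0, 2),
        (0x200000, 0xF0, 3),
        (0x4000000, 0xF8, 4),
        (0x80000000, 0xFC, 5),
    )
    for limit, lead, n in forms:
        if cp < limit:
            out = [lead | (cp >> (6 * n))]
            for i in range(n - 1, -1, -1):
                out.append(0x80 | ((cp >> (6 * i)) & 0x3F))
            return bytes(out)
    raise AssertionError("unreachable")


def fits_in_utf16(cp: int) -> bool:
    """True if standard UTF-16 (BMP plus surrogate pairs) can represent cp."""
    return 0 <= cp <= 0x10FFFF


# A code point in the private use groups: UTF-8 can carry it, UTF-16 cannot.
private_use = 0x60000000
print(utf8_encode_ucs4(private_use).hex(), fits_in_utf16(private_use))
```

The last line shows a 6-octet UTF-8 sequence for a value that standard UTF-16 has no way to express.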
Therefore, I offer a proposal to solve this problem.
If you can offer a better proposal than mine,
I will gladly accept it.
But I cannot happily accept a situation
in which there is no conversion rule from UCS-4 to UTF-16.
My proposal is:
-----
First, in binary notation.
UCS-4(binary expression):
0wxxxxxx-xxxxyyyy-yyyyyyzz-zzzzzzzz
Extended UTF-16(binary expression):
11011011-0111110w, 110111xx-xxxxxxxx,
11011011-01111110, 110111yy-yyyyyyyy,
11011011-01111111, 110111zz-zzzzzzzz
Next, in hexadecimal notation.
UCS-4 range:
0x00110000-0x3FFFFFFF
Extended UTF-16 expression:
U+DB7C + low surrogate + U+DB7E + low surrogate + U+DB7F + low surrogate
UCS-4 range:
0x40000000-0x7FFFFFFF
Extended UTF-16 expression:
U+DB7D + low surrogate + U+DB7E + low surrogate + U+DB7F + low surrogate
-----
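
The bit layout above can be sketched in Python as follows. This is only an illustration under stated assumptions, not a definitive implementation: the function names are hypothetical, the input is a list of 16-bit code units, and the decoder rejects values below 0x00110000, which the proposal itself does not specify.

```python
def encode_extended_utf16(cp: int) -> list[int]:
    """Encode a UCS-4 value in 0x110000..0x7FFFFFFF as six 16-bit code units."""
    if not 0x110000 <= cp <= 0x7FFFFFFF:
        raise ValueError("value is outside the extended range")
    w = cp >> 30             # 1 bit, selects DB7C or DB7D
    x = (cp >> 20) & 0x3FF   # next 10 bits
    y = (cp >> 10) & 0x3FF   # next 10 bits
    z = cp & 0x3FF           # low 10 bits
    return [0xDB7C | w, 0xDC00 | x,
            0xDB7E, 0xDC00 | y,
            0xDB7F, 0xDC00 | z]


def decode_extended_utf16(units: list[int]) -> int:
    """Decode six code units produced by encode_extended_utf16."""
    if len(units) != 6:
        raise ValueError("expected exactly six code units")
    h1, l1, h2, l2, h3, l3 = units
    if h1 not in (0xDB7C, 0xDB7D) or h2 != 0xDB7E or h3 != 0xDB7F:
        raise ValueError("unexpected high surrogate")
    if any(not 0xDC00 <= low <= 0xDFFF for low in (l1, l2, l3)):
        raise ValueError("unexpected low surrogate")
    cp = ((h1 & 1) << 30) | ((l1 & 0x3FF) << 20) \
        | ((l2 & 0x3FF) << 10) | (l3 & 0x3FF)
    # Assumption: values that plain UTF-16 already covers are rejected,
    # so that every code point keeps a single representation.
    if cp < 0x110000:
        raise ValueError("value must use ordinary UTF-16")
    return cp


units = encode_extended_utf16(0x60000000)
print([f"{u:04X}" for u in units])
# prints ['DB7D', 'DE00', 'DB7E', 'DC00', 'DB7F', 'DC00']
```

Each code unit is still a well-formed high or low surrogate, so software that only understands ordinary surrogate pairs would see three private-looking Plane 14 characters rather than malformed text.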
You may be concerned that it takes 12 octets to represent a code point in this range.
But I think this is a minor cost,
because it is important to preserve the current efficiency
of surrogate pairs, and I rarely process this range.
As a consequence of this conversion,
some code points (0x000EF000-0x000EFFFF) in Plane 14 must be reserved.
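
The reserved range follows from the ordinary surrogate-pair arithmetic applied to the four high surrogates DB7C-DB7F. A short Python check, with a hypothetical helper name:

```python
def surrogate_pair_to_code_point(high: int, low: int) -> int:
    """Standard UTF-16 decoding of one surrogate pair."""
    return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00)


# Lowest and highest code points reachable with high surrogates DB7C..DB7F.
first = surrogate_pair_to_code_point(0xDB7C, 0xDC00)
last = surrogate_pair_to_code_point(0xDB7F, 0xDFFF)
print(f"{first:08X}-{last:08X}, plane {first >> 16}")
# prints 000EF000-000EFFFF, plane 14
```

These 4096 code points are the ones ordinary UTF-16 would otherwise decode from those pairs, which is why they could no longer be assigned to characters.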
Best regards,
Masahiko Maedera.
-- 1999/04/12 Masahiko Maedera<Masahiko_Maedera@lotus.co.jp> Lotus Development Japan.