Re: Unicode 3.0 Release

From: Masahiko Maedera (Masahiko_Maedera@notesgw2.lotus.co.jp)
Date: Mon Sep 13 1999 - 22:07:37 EDT


Dear Mr. Mark Davis,

I have found what appears to be an error in Technical Report 17:

http://www.unicode.org/unicode/reports/tr17/

> UTF-8 provides a good example:
> ...
> 0x80..0x3FF ---> 2 bytes
> 0x400..0xD7FF, 0xE000..0xFFFF ---> 3 bytes
> ...

However, RFC 2279 (UTF-8) gives the ranges as follows:

> 0000 0080-0000 07FF   110xxxxx 10xxxxxx
> 0000 0800-0000 FFFF   1110xxxx 10xxxxxx 10xxxxxx (excluding surrogates)

Should the report be corrected as follows?

> 0x80..0x7FF ---> 2 bytes
> 0x800..0xD7FF, 0xE000..0xFFFF ---> 3 bytes
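The boundaries proposed above can be cross-checked with a short sketch. This is only an illustration, not text from TR17 or the RFC: the function name utf8_length is hypothetical, and it covers only the 16-bit range discussed here. The payload bits explain the limits: a 2-byte sequence (110xxxxx 10xxxxxx) carries 11 bits, so it reaches 0x7FF, and a 3-byte sequence carries 16 bits, so it reaches 0xFFFF.

```python
def utf8_length(code_point: int) -> int:
    """Return the number of UTF-8 bytes for a code point up to 0xFFFF.

    Hypothetical helper for illustration; limited to the 16-bit range.
    """
    if not 0 <= code_point <= 0xFFFF:
        raise ValueError("outside the 16-bit range discussed here")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogate code points are excluded")
    if code_point < 0x80:    # 7 payload bits:  0xxxxxxx
        return 1
    if code_point < 0x800:   # 11 payload bits: 110xxxxx 10xxxxxx
        return 2
    return 3                 # 16 payload bits: 1110xxxx 10xxxxxx 10xxxxxx


# Compare each boundary value with Python's built-in UTF-8 encoder.
for cp in (0x7F, 0x80, 0x3FF, 0x400, 0x7FF, 0x800, 0xD7FF, 0xE000, 0xFFFF):
    assert utf8_length(cp) == len(chr(cp).encode("utf-8"))
```

In particular, 0x400 through 0x7FF come out as 2 bytes, which agrees with the RFC table rather than with the ranges currently printed in the report.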

Best regards,
  Masahiko


