From: Doug Ewell (dewell@adelphia.net)
Date: Tue Feb 04 2003 - 01:49:13 EST
Keyur Shroff <keyur_shroff at yahoo dot com> wrote:
> Can you please explain what is the best practice to handle unassigned
> code points so that applications can easily become forward compatible?
> If we just ignore unassigned code points, then will it make for
> application easier to migrate to later version of Unicode?
I should probably wait for someone like Ken to come by and provide an
authoritative answer, but until then:
The basic rule is that unassigned code points cannot be interpreted or
modified in any way.  In particular, they cannot simply be thrown away,
or converted to an assigned code point such as U+003F or U+FFFD.
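A minimal sketch of that rule in Python, assuming the standard `unicodedata` module, whose general category `Cn` covers unassigned code points (and noncharacters) for whatever Unicode version the interpreter ships with. The function names `is_unassigned` and `strip_controls` are made up for this illustration, and U+0378 is assumed to be unassigned in that version.

```python
import unicodedata


def is_unassigned(ch: str) -> bool:
    """Return True if ch is category Cn in this interpreter's Unicode data."""
    # Cn also includes noncharacters such as U+FFFF, not only reserved code points.
    return unicodedata.category(ch) == "Cn"


def strip_controls(text: str) -> str:
    """Hypothetical cleanup step: drop Cc control characters, keep all else.

    Unassigned code points are passed through untouched rather than being
    deleted or replaced with U+003F or U+FFFD.
    """
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cc")


# U+0378 is assumed to be unassigned in the bundled Unicode version.
sample = "a\u0378b\x00"
cleaned = strip_controls(sample)
```

The point of the design is that the filter names the categories it removes, instead of keeping only what it recognizes, so a code point assigned in a later Unicode version survives a round trip through older software.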
That said, there are certain conventions for certain ranges of code
points.  For example, the range from U+0590 through U+08FF is marked in
the Roadmap as being reserved for right-to-left scripts, and IIRC there
are ranges reserved for invisible formatting and control characters
(U+206x and U+FFFx).  But I really don't know how advisable it is to,
say, render a string of unassigned code points like ࠁࠂࠃ as RTL just
because it falls within the "RTL block."
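To make the range convention concrete, here is a small Python sketch that only reports whether a code point falls inside the U+0590 through U+08FF range mentioned above, alongside what the bundled Unicode data says about it. The names `RTL_ROADMAP_RANGE`, `in_rtl_roadmap_range`, and `describe` are hypothetical, and nothing here settles whether a renderer should act on the range.

```python
import unicodedata

# U+0590..U+08FF inclusive, the range cited above (an assumption taken from the text).
RTL_ROADMAP_RANGE = range(0x0590, 0x0900)


def in_rtl_roadmap_range(ch: str) -> bool:
    """Return True if ch lies in the range reserved for right-to-left scripts."""
    return ord(ch) in RTL_ROADMAP_RANGE


def describe(ch: str) -> tuple:
    """Return (code point label, general category, bidi class, in-range flag)."""
    return (
        f"U+{ord(ch):04X}",
        unicodedata.category(ch),
        # Empty string when the bundled data defines no bidi class for ch.
        unicodedata.bidirectional(ch),
        in_rtl_roadmap_range(ch),
    )
```

Calling `describe` on an assigned Hebrew letter and on a code point that is still unassigned in the range shows the difference between what the character data states and what the range merely suggests.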
Better wait for the experts.
-Doug Ewell
 Fullerton, California