From: Peter Kirk (peter.r.kirk@ntlworld.com)
Date: Mon Aug 18 2003 - 12:43:53 EDT
On 18/08/2003 09:06, Jim Allan wrote:
> Jill Ramonsky posted:
>
>> I would really like it if these, and
>> every single other character which is "only there for reasons of
>> round trip
>> compatibility" with something else, were explicitly marked in the
>> machine-readable charts with something meaning "Don't introduce this
>> character, at all, ever. Don't try to interpret it. Just preserve it, in
>> case it ever gets turned back to its original character set".
>
>
> That would probably be too strong.
>
> If characters are available then some people will use them. :-(
>
> See section 2.3 at http://www.unicode.org/versions/Unicode4.0.0/ch02.pdf
>
> Unicode 3.0 contained under section D21 on compatibility characters:
>
> << Their use is discouraged other than for legacy data. >>
>
> I don't know whether this statement was intentionally removed or was
> accidentally dropped in the changes in 4.0 which distinguish
> "compatibility character" from "compatibility composite character".
>
> In any case people can't be prevented from doing things that are
> officially discouraged, especially as for some particular use it might
> be wrong to discourage them. So if you are handling Roman numerals in
> an application and wish your handling to be complete then
> unfortunately you do have to take the compatibility Roman numerals
> into account.
Yes, but people can be clearly discouraged from using them, and that is
not currently happening. It seems that currently, if you come across a
character by browsing through the charts and want to discover whether use
of it is officially discouraged, you have to wade through huge databases
and hundreds of pages of text to find out whether a particular set of
properties implies that use is discouraged. Well, even that won't tell me
definitively, for I read, "The compatibility decomposable characters are
precisely defined in the Unicode Character Database, whereas the
compatibility characters in the more inclusive sense are not." (from
section 2.3) - and it is the latter whose use is discouraged. But is it
in fact safe to assume that the list of such characters includes, but is
not limited to, those which have defined compatibility mappings?
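To illustrate the narrower, machine-checkable half of that question, here is a minimal sketch, assuming Python's standard unicodedata module as a view onto the Unicode Character Database. The helper name compatibility_tag is hypothetical. It reports only whether a character has a compatibility decomposition mapping (one whose mapping carries a tag such as <compat> or <font>); it cannot identify "compatibility characters in the more inclusive sense", since, as section 2.3 says, those are not defined in the database.

```python
import unicodedata
from typing import Optional


def compatibility_tag(ch: str) -> Optional[str]:
    """Return the tag of a character's compatibility decomposition mapping
    (for example 'compat' or 'font'), or None if the character has no
    decomposition or only a canonical one.

    Hypothetical helper for illustration; not part of any Unicode API.
    """
    mapping = unicodedata.decomposition(ch)
    # Compatibility mappings start with a tag in angle brackets,
    # e.g. "<compat> 0049". Canonical mappings have no tag.
    if mapping.startswith("<"):
        return mapping[1:mapping.index(">")]
    return None


# U+2160 ROMAN NUMERAL ONE, U+00E9 (canonical decomposition only), plain "A"
for ch in ("\u2160", "\u00e9", "A"):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: {compatibility_tag(ch)}")
```

A result of None here says nothing about whether use of the character is discouraged; it only means the database records no compatibility mapping for it.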
It would be much simpler if each such character were clearly labelled
"DO NOT USE!" in the code charts etc., with its glyph presented on a grey
background or in some other way to indicate its special status.
-- Peter Kirk peter@qaya.org (personal) peterkirk@qaya.org (work) http://www.qaya.org/