From: John D. Burger (john@mitre.org)
Date: Thu Sep 20 2007 - 12:55:31 CDT
Asmus Freytag wrote:
> IDN still operates on a restricted domain of characters, many
> characters that are part of ordinary text are disallowed from the
> get-go (I haven't checked where that subset is at recently, but
> that's the general idea). At the minimum, the transformations that
> are designed into IDN would need to be modified or extended to
> handle such characters. Because of that alone, the normalization
> and folding aspect of IDN is unlikely to be suitable for general
> text. There are likely additional issues.
>
> If you suggest that any scheme in which you can't represent the
> word "can't" is suitable for the class of applications that the
> original poster represents, then I fail to follow you.
But that's due to IDN's restricted domain, yes? I guess my thought
was that, if IDN's transformations could be applied to a larger
domain of characters, then IDN might provide a fifth normalization
form appropriate for a broad class of applications.
But my hope for a cookie-cutter solution appears to be forlorn. :)
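For concreteness, here is a minimal sketch of what a combined
normalize-and-fold form over unrestricted text could look like. It is
an assumption-laden illustration, not IDN's actual algorithm: it uses
Python's unicodedata module and str.casefold(), which follow the
Unicode version shipped with the interpreter, and it skips nameprep's
prohibited-character and bidi checks entirely. The function name
fold_for_matching is hypothetical.

```python
import unicodedata


def fold_for_matching(text: str) -> str:
    """Hypothetical 'fifth form': compatibility normalization plus case folding.

    Assumption: NFKC followed by full case folding, then NFKC again
    (case folding can produce sequences that are no longer normalized).
    Unlike IDN's nameprep, nothing is rejected, so ordinary punctuation
    such as the apostrophe in "can't" passes through unchanged.
    """
    normalized = unicodedata.normalize("NFKC", text)
    folded = normalized.casefold()
    return unicodedata.normalize("NFKC", folded)


if __name__ == "__main__":
    # Fullwidth letters, a ligature, and a sharp s all fold to plain forms.
    for sample in ["Can't", "\uff26\uff35\uff2c\uff2c", "\ufb01le", "Stra\u00dfe"]:
        print(repr(sample), "->", repr(fold_for_matching(sample)))
```

Because nothing is prohibited, this sidesteps the "can't" problem, but
it also drops whatever guarantees IDN gets from its restricted domain,
which may be part of the "additional issues" mentioned above.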
- John D. Burger
MITRE