From: Bjoern Hoehrmann (derhoermi@gmx.net)
Date: Mon Oct 04 2010 - 22:59:19 CDT
Hi,
Every now and then I need a tool that takes a Unicode string and gives
me all the strings that are not identical but equivalent under one of
the four normalization forms defined in UAX #15. Now I do have a couple
of hacks that get me by, but is there any tool or paper that has a more
complete solution? Last year I worked a bit in this general direction
(http://lists.w3.org/Archives/Public/www-archive/2009Feb/0071.html), but
I ran out of time after proving that, for each of the normal forms, the
set of strings in that form is a regular language. Writing a
denormalizer was not the goal then anyway.
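To make the request concrete, here is a rough brute-force sketch of the
output such a tool would produce. It is only an illustration, not one of
the hacks mentioned above: the function name `equivalents` and its
parameters are made up for this example, and it assumes Python's standard
`unicodedata` module. By default it only tries code points that occur in
the input or in one of its four normal forms, so it can miss equivalents
that use other precomposed characters unless a larger alphabet is passed
in, and its running time is exponential in the length of the string.

```python
import itertools
import unicodedata


def equivalents(s, form="NFC", alphabet=None):
    """Return strings != s that normalize to the same result as s.

    Hypothetical helper: exhaustive search over a small alphabet, so it
    is only practical for very short inputs.
    """
    target = unicodedata.normalize(form, s)

    if alphabet is None:
        # Assumption: restrict candidates to code points seen in the
        # input or in any of its four normal forms.
        alphabet = set(s)
        for f in ("NFC", "NFD", "NFKC", "NFKD"):
            alphabet.update(unicodedata.normalize(f, s))

    # Decomposition never reduces the number of code points, so no
    # equivalent string is longer than the fully decomposed form.
    decomposed = "NFKD" if form.startswith("NFK") else "NFD"
    max_len = len(unicodedata.normalize(decomposed, s))

    found = set()
    for n in range(1, max_len + 1):
        for combo in itertools.product(sorted(alphabet), repeat=n):
            candidate = "".join(combo)
            if candidate != s and unicodedata.normalize(form, candidate) == target:
                found.add(candidate)
    return found
```

For example, `equivalents("\u212b")` (ANGSTROM SIGN) yields U+00C5 and
"A" followed by U+030A, while `equivalents("\ufb01", "NFKC")` yields the
two-letter string "fi".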
Thanks,
-- 
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Am Badedeich 7 · Telefon: +49(0)160/4415681 · http://www.bjoernsworld.de
25899 Dagebüll · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/