You will have to normalize the strings to a single Unicode normalization
form (NFC or NFD, for example) before you process them, and you need to
make sure it is done the same way every time. Check out ICU for this
purpose.
http://oss.software.ibm.com/icu/
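
To make the point concrete, here is a minimal sketch of the idea. It uses
Python's standard unicodedata module rather than ICU, only because it
needs nothing installed; ICU provides equivalent normalization services.
The helper names normalize_name and same_name are made up for
illustration, and the choice of NFC as the default form is an assumption.
What matters is that both sides of a comparison go through the same form.

```python
import unicodedata


def normalize_name(name: str, form: str = "NFC") -> str:
    """Return name converted to one Unicode normalization form."""
    return unicodedata.normalize(form, name)


def same_name(a: str, b: str, form: str = "NFC") -> bool:
    """Compare two strings after normalizing both to the same form."""
    return normalize_name(a, form) == normalize_name(b, form)


# The same visible text, encoded two ways:
precomposed = "caf\u00e9"   # e with acute as a single code point (U+00E9)
decomposed = "cafe\u0301"   # plain e followed by combining acute (U+0301)

# A raw comparison sees two different code point sequences.
print(precomposed == decomposed)

# After normalizing both sides, the strings compare equal.
print(same_name(precomposed, decomposed))
```

Normalizing at the boundary (when a filename or other string comes in)
and again before any comparison keeps the two encodings from leaking
into matching code.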
Dave
--- "Theodore H. Smith" <delete@softhome.net> wrote:
> What is going to be done about the confusion generated from
> having multiple ways to encode the same character?
>
> For example, for filenames, OSX will encode an accented Roman
> letter one way, while for filenames Windows will encode it the
> other way. This kind of confusion is totally expected, if
> Unicode will allow more than one way to encode the same
> character.
>
> This means that matching algorithms won't work, because the
> characters are different!
>
> Will there be some kind of recommendation of which to avoid?
> Will the Unicode consortium make a standard to say that one of
> these encodings is strongly not recommended, and in fact
> deprecated?
>
> And what about the OS that uses this encoding? How will the
> Unicode consortium make the newly-offending OS change its ways?
>
> And what about the hordes of apps that expect one format but
> don't expect the other? And the hordes of OS-independent apps
> (Java? Perl?) that might generate conflicting versions?
>
>
=====
Dave Possin
Globalization Consultant
www.Welocalize.com
http://groups.yahoo.com/group/locales/
This archive was generated by hypermail 2.1.2 : Mon Jul 08 2002 - 15:27:05 EDT