From: Jon Hanna (jon@hackcraft.net)
Date: Thu Nov 12 2009 - 04:13:32 CST
Wunna Ko wrote:
> While I am working on Unicode strings with MS Database 2005, I got
> primary key validation error when I input Burmese characters ကု "Ku"
> while ကို "Ko" is in the database.
>
> ကု "Ku" is U1000 + U102F and ကို "Ko" is U1000 + U102D + U102F
>
> It is really a separate string but MS Database treat as a same string. :(
It's more accurate to say that the database is treating it as an
equivalent string. Which are considered equivalent depends on the
collation. The code:
SELECT
CASE WHEN
NCHAR(0x1000) + NCHAR(0x102F) COLLATE Latin1_General_BIN =
NCHAR(0x1000) + NCHAR(0x102D) + NCHAR(0x102F)
THEN 'Equivalent' ELSE 'Distinct'
END,
CASE WHEN
NCHAR(0x1000) + NCHAR(0x102F) COLLATE Latin1_General_CI_AS =
NCHAR(0x1000) + NCHAR(0x102D) + NCHAR(0x102F)
THEN 'Equivalent' ELSE 'Distinct'
END
gives results of "Distinct, Equivalent" because the first collation
considers these strings distinct while the second doesn't.
Without a COLLATE clause the default collation for your database will be
used.
Similarly, if a column is a primary key, then the collation used for
that column will affect which strings are considered equivalent for the
purpose of that key.
SELECT * FROM fn_helpcollations() will list collations available on your
system.
Those found with SELECT * FROM fn_helpcollations() WHERE description
LIKE '%bin%' are most likely to find these two strings to be distinct,
but may go too far in this regard (being sensitive to all concerns of
case, accent, etc).
This archive was generated by hypermail 2.1.5 : Thu Nov 12 2009 - 04:19:00 CST