RE: Support for non-BMP characters from Marc Durdin on 2012-04-25 (Unicode Mail List Archive)

From: Marc Durdin <marc.durdin_at_tavultesoft.com>
Date: Wed, 25 Apr 2012 09:09:12 +0000

Probably the most egregious example I know of is JavaScript. As far as I know, JavaScript still only groks UCS-2. I'd love to be wrong.
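
For anyone who wants to see the UCS-2 versus UTF-16 distinction concretely, here is a minimal sketch in Python (3.9 or later is assumed; the helper names are hypothetical and come from no library). It shows that a single non-BMP code point occupies two 16-bit code units, which is what an API that only counts 16-bit units would report as a length of 2.

```python
# Minimal sketch: code points vs. UTF-16 code units.
# Helper names are hypothetical, chosen only for illustration.

def utf16_code_units(text: str) -> list[int]:
    """Return the 16-bit code units of text when encoded as UTF-16."""
    data = text.encode("utf-16-be")  # big-endian, no byte order mark
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]

def is_bmp_only(text: str) -> bool:
    """True if every code point fits in the Basic Multilingual Plane."""
    return all(ord(ch) <= 0xFFFF for ch in text)

gothic_ahsa = "\U00010330"  # GOTHIC LETTER AHSA, a non-BMP character
units = utf16_code_units(gothic_ahsa)

print(len(gothic_ahsa))           # one code point
print([hex(u) for u in units])    # a high and a low surrogate
print(is_bmp_only(gothic_ahsa))   # False
```

Software that treats each 16-bit unit as a character sees the two surrogates separately, so indexing or truncating such a string can split the pair.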

Marc

-----Original Message-----
From: unicode-bounce_at_unicode.org [mailto:unicode-bounce_at_unicode.org] On Behalf Of David Starner
Sent: Wednesday, 25 April 2012 6:32 PM
To: Unicode Mailing List
Subject: Support for non-BMP characters

It's been ten years since the first non-BMP characters were encoded. How are they working in your neck of the woods? There are a lot of places where they work just fine, but I have been dealing with MySQL's support. MySQL has long supported UCS-2 and a UTF-8 limited to the BMP; now MySQL 5.5 adds utf16, utf32 and utf8mb4. (MySQL 5.1 and 5.5 are the current stable releases.) But there are enough warnings about incompatibilities with utf8mb4 to make me pause before switching my private database to it, and I think the net will see MySQL databases with utf8 instead of utf8mb4 for as long as MySQL exists, unless the developers decide to push people over to it.
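
To illustrate why a UTF-8 limited to the BMP cannot hold these characters, here is a rough Python sketch (helper names are hypothetical). It relies only on the fact that BMP code points need at most three bytes in UTF-8 while non-BMP code points need four, which is the difference between MySQL's utf8 and utf8mb4. It does not talk to a database; it is a pre-check one might run on text before storing it.

```python
# Rough sketch: which strings fit in a three-byte-per-character UTF-8?
# Helper names are hypothetical and are not part of any MySQL client library.

def utf8_length(ch: str) -> int:
    """Number of bytes a single character occupies in UTF-8."""
    return len(ch.encode("utf-8"))

def fits_three_byte_utf8(text: str) -> bool:
    """True if no character in text needs more than three UTF-8 bytes."""
    return all(utf8_length(ch) <= 3 for ch in text)

euro = "\u20ac"              # EURO SIGN, in the BMP
grinning = "\U0001F600"      # GRINNING FACE, outside the BMP

print(utf8_length(euro))                        # 3
print(utf8_length(grinning))                    # 4
print(fits_three_byte_utf8("price: " + euro))   # True
print(fits_three_byte_utf8("hi " + grinning))   # False
```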

(Ada's an issue too, though not one most people will have to deal with. While Ada 2005 added a UTF-32 string type, it left the UCS-2 string type as is. Again, I suspect a lot of nominally Unicode Ada programs are going to be BMP-only. Of course, UTF-8 as an ASCII superset is used, stuffed into strings labeled Latin-1; it's technically not conformant with the Ada standard, but it works so long as you don't need much string processing.)

In any case, is the use of non-BMP characters still problematic in your corner of the computing world or is everything looking fine from where you are?

--
Kie ekzistas vivo, ekzistas espero.

Received on Wed Apr 25 2012 - 04:11:53 CDT
