Re: Problem with accented characters

From: Doug Ewell (dewell@adelphia.net)
Date: Mon Aug 23 2004 - 13:51:32 CDT

Next message: Deborah Goldsmith: "Re: Problem with accented characters"

Previous message: Tay, William: "Problem with accented characters"
In reply to: Tay, William: "Problem with accented characters"
Next in thread: Deborah Goldsmith: "Re: Problem with accented characters"
Reply: Deborah Goldsmith: "Re: Problem with accented characters"

William Tay wrote:

> Can anyone explain why an accented character is sometimes represented
> as a base character plus its accent? For example, the utf-8
> representation for é is 65 CC 81, which is the utf-8 representation
> for e and the accent, instead of C3 A9? I find that this is how MacOS
> X represents accented characters.

The two characters U+0065 and U+0301 (é) are canonically equivalent to
the single character U+00E9 (é). That is, the two-character combining
sequence is supposed to be considered equivalent to the single
precomposed character. Apparently MacOS X, or at least one application
running under it, does use the combining sequence.
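
To see where the two byte sequences come from, here is a minimal sketch of a UTF-8 encoder in C. The function name utf8_encode is made up for this illustration, and it only handles code points up to U+FFFF, which is enough for the characters discussed here.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative helper (hypothetical name): encode one code point as UTF-8.
 * Only U+0000..U+FFFF is handled, to keep the sketch short.
 * Returns the number of bytes written to out (1 to 3), or 0 if the
 * code point is a surrogate or lies outside the supported range. */
size_t utf8_encode(uint32_t cp, unsigned char *out)
{
    if (cp < 0x80) {                      /* 1 byte: 0xxxxxxx */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                     /* 2 bytes: 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                   /* 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                     /* surrogates are not valid in UTF-8 */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    return 0;                             /* beyond the scope of this sketch */
}
```

Encoding U+00E9 yields C3 A9, while encoding U+0065 followed by U+0301 yields 65 and then CC 81. A byte-wise comparison such as memcmp() therefore treats the two forms as different strings, even though they are canonically equivalent.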

> How can a C application that receives such utf-8 encoded characters
> handle them correctly? Appreciate your comments.

It must understand normalization. See The Unicode Standard (TUS) 4.0,
section 5.6, for more information.
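
As a rough illustration of the idea only, the C sketch below composes a base letter and a following combining mark using a tiny hand-written table. This is NOT full normalization: the table, the names demo_pairs, demo_compose and demo_compose_in_place, and the assumption that the input is already decoded into code points are all made up for this example. Real normalization also has to deal with decomposition, combining classes and reordering, and uses the complete data from the Unicode Character Database, so an application would more likely rely on an existing library.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, deliberately tiny composition table. A real
 * implementation uses the full Unicode composition data. */
struct demo_pair { uint32_t base, mark, composed; };

static const struct demo_pair demo_pairs[] = {
    { 0x0061, 0x0301, 0x00E1 },  /* a + combining acute -> U+00E1 */
    { 0x0065, 0x0301, 0x00E9 },  /* e + combining acute -> U+00E9 */
    { 0x0065, 0x0300, 0x00E8 },  /* e + combining grave -> U+00E8 */
};

/* Return the precomposed code point for (base, mark), or 0 if the
 * pair is not in the demo table. */
static uint32_t demo_compose(uint32_t base, uint32_t mark)
{
    size_t i;
    for (i = 0; i < sizeof demo_pairs / sizeof demo_pairs[0]; i++) {
        if (demo_pairs[i].base == base && demo_pairs[i].mark == mark)
            return demo_pairs[i].composed;
    }
    return 0;
}

/* Compose known base+mark pairs in an array of n code points, in place.
 * Returns the new length. Sequences not in the table are left as-is. */
size_t demo_compose_in_place(uint32_t *cps, size_t n)
{
    size_t out = 0, i;
    for (i = 0; i < n; i++) {
        uint32_t c = (out > 0) ? demo_compose(cps[out - 1], cps[i]) : 0;
        if (c != 0)
            cps[out - 1] = c;      /* replace the base with the composed form */
        else
            cps[out++] = cps[i];   /* copy through unchanged */
    }
    return out;
}
```

With this sketch, the sequence U+0065 U+0301 becomes the single code point U+00E9, so text arriving in either form can be compared after both sides have been brought to the same form.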

-Doug Ewell
Fullerton, California
http://users.adelphia.net/~dewell/


This archive was generated by hypermail 2.1.5 : Mon Aug 23 2004 - 13:52:37 CDT