Re: Thank you for all the good information, sUTF32ToUTF8 function

From: Markus Scherer (markus.scherer@jtcsv.com)
Date: Fri Nov 09 2001 - 13:05:28 EST

In reply to: Peter_Constable@sil.org: "Re: Thank you for all the good information, sUTF32ToUTF8 function"

> Any suggestions on what the right way to deal with "surrogate" codepoints
> in this algorithm? They should not occur in the data, but what if they do?

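Either encode them as 3-byte UTF-8 sequences, or treat them as an error (throw an exception, return an error code, etc.).
Note that the ISO 10646 definition of UTF-8 forbids encoding them at all, and it looks like the Unicode definition of UTF-8 is heading in the same direction.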
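
To make the two options concrete, here is a rough sketch in C of a single-code-point encoder with a switch between them. The names utf32_to_utf8_one, surrogate_policy, SURROGATE_REJECT and SURROGATE_ENCODE_3BYTE are made up for illustration and are not taken from the sUTF32ToUTF8 function under discussion; the return-0-on-error convention is likewise just an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical policy switch; not part of the code discussed in the thread. */
enum surrogate_policy {
    SURROGATE_REJECT,       /* treat U+D800..U+DFFF as an error */
    SURROGATE_ENCODE_3BYTE  /* emit the generic 3-byte form anyway */
};

/*
 * Encode one UTF-32 code point into out, which must have room for 4 bytes.
 * Returns the number of bytes written, or 0 if the code point is rejected
 * (a surrogate under SURROGATE_REJECT, or a value above U+10FFFF).
 */
size_t utf32_to_utf8_one(uint32_t cp, unsigned char *out,
                         enum surrogate_policy policy)
{
    assert(out != NULL);

    if (cp < 0x80) {                       /* 1 byte: 0xxxxxxx */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                      /* 2 bytes: 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                    /* 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx */
        /* Surrogate code points fall in this range; apply the policy. */
        if (cp >= 0xD800 && cp <= 0xDFFF && policy == SURROGATE_REJECT)
            return 0;
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                  /* 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                              /* beyond the Unicode code space */
}
```

With SURROGATE_ENCODE_3BYTE, U+D800 comes out as ED A0 80; with SURROGATE_REJECT the caller gets 0 and can decide whether to abort, skip the value, or substitute U+FFFD.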

markus

This archive was generated by hypermail 2.1.2 : Fri Nov 09 2001 - 14:10:26 EST