UTF8 vs. Unicode (UTF16) in code

From: Allan Chau (achau@rsasecurity.com)
Date: Wed Mar 07 2001 - 19:45:58 EST


We've got an English-language only product which makes use of
single-byte character strings throughout the code. For our next
release, we'd like to internationalize it (Unicode) & be able to store
data in UTF8 format (a requirement for data exchange).

We're deciding between using UTF8 within the code and changing our
code to use wide characters. I'm wondering what experiences others have
had that can help with our decision. I'm thinking that using UTF8
internally may mean less rewriting initially, but we'd have to check
carefully for code that makes assumptions about character boundaries.
Because of this, I think that it'd be more complicated for developers to
have to work with UTF8 in code. Unicode (UTF16) internally would be
easier to manage since most characters will essentially be fixed width,
but there'd be a lot of code to rewrite. Also, I've heard of problems
with the wide character type (wchar_t) having different definitions
depending on platform (we're running on NT & Sun Solaris). Many of our
product APIs would also be affected.
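
To make the character-boundary concern concrete, here is a minimal
sketch in C of the kind of check that byte-oriented code would need.
It assumes the input is already valid UTF8, and the function names
(utf8_is_continuation, utf8_count_code_points, utf8_safe_prefix_len)
are hypothetical helpers made up for illustration, not part of any
library or of our product.

```c
#include <assert.h>
#include <stddef.h>

/* A UTF-8 continuation byte has the bit pattern 10xxxxxx. */
static int utf8_is_continuation(unsigned char b)
{
    return (b & 0xC0) == 0x80;
}

/* Count code points in a NUL-terminated string by counting every
   byte that is NOT a continuation byte. Assumes valid UTF-8. */
size_t utf8_count_code_points(const char *s)
{
    size_t n = 0;
    assert(s != NULL);
    for (; *s != '\0'; ++s) {
        if (!utf8_is_continuation((unsigned char)*s))
            ++n;
    }
    return n;
}

/* Return the longest prefix length <= max_bytes that does not split
   a multi-byte character. len is the byte length of s.
   Assumes valid UTF-8. */
size_t utf8_safe_prefix_len(const char *s, size_t len, size_t max_bytes)
{
    size_t cut;
    assert(s != NULL);
    if (len <= max_bytes)
        return len;
    cut = max_bytes;
    /* If the byte at the cut point is a continuation byte, the cut
       would land inside a character, so back up to its lead byte. */
    while (cut > 0 && utf8_is_continuation((unsigned char)s[cut]))
        --cut;
    return cut;
}
```

For example, with the 6-byte string "na\xC3\xAF" "ve" (the word
"naive" with a diaeresis on the i), a naive truncation to 3 bytes
would cut the two-byte character in half, while
utf8_safe_prefix_len(s, 6, 3) backs up to 2. Code that only copies or
compares whole strings would not need this, which is part of why the
initial rewrite might be smaller with UTF8.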

Can others offer their insights, suggestions?

Thanks,
-allan



This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:21:20 EDT