I have numbers for text size and conversion performance of BOCU-1 and SCSU relative to UTF-8.
Quick summary:
For Latin text, UTF-8 is best.
For CJK, BOCU-1 and SCSU produce smaller text, at some cost in conversion speed.
For other scripts, BOCU-1 and SCSU are much better than UTF-8 in both speed and size.
Note that because BOCU-1 preserves control characters and spaces, BOCU-1-encoded text could be used directly in email, in CVS, etc.
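To make the size side of the summary concrete, here is a rough Python sketch of the UTF-8 baseline only. The sample strings and the helper name utf8_bytes_per_char are my own illustrative assumptions, not the texts or code used for the measurements. The Python standard library has no BOCU-1 or SCSU codec, so the sketch does not measure those; as I understand their design, both aim for roughly one byte per character in small alphabets and roughly two in CJK, which is where the savings over UTF-8 would come from.

```python
# Illustrative samples only (assumed for this sketch, not the benchmark texts).
SAMPLES = {
    "Latin": "Hello, world",
    "Greek": "αβγδ",
    "Cyrillic": "Привет",
    "CJK": "日本語",
}


def utf8_bytes_per_char(text: str) -> float:
    """Return the average number of UTF-8 bytes per code point in text."""
    return len(text.encode("utf-8")) / len(text)


if __name__ == "__main__":
    # Print one line per script with its UTF-8 cost per character.
    for script, text in SAMPLES.items():
        print(f"{script:9s} {utf8_bytes_per_char(text):.2f} bytes/char")
```

With these samples, UTF-8 comes out at 1 byte per character for the Latin string, 2 for Greek and Cyrillic, and 3 for CJK. Real text mixes in spaces and punctuation, so measured ratios will differ.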
Please see http://oss.software.ibm.com/icu/dropbox/bocuperf.html
Best regards,
markus