From: Doug Ewell (dewell@adelphia.net)
Date: Sun May 28 2006 - 18:26:28 CDT
Cristian Secară <orice at secarica dot ro> wrote:
>> This sounds like an application for SCSU! The Romanian performance
>> will take a slight hit from the distinction of comma below and
>> cedilla in the Unicode glyph standard, [...]
>
> What does this have to do with the discussion here?
> I am discussing the GSM character set here. It happens to have a few
> Western Latin characters in it.
Richard was suggesting that SCSU would have been a more appropriate
encoding for SMS than the GSM character set. It allows access to the
full Unicode repertoire, and it encodes most Latin-based orthographies,
including Romanian, much more efficiently than the 16-bit fallback that
SMS uses as soon as a character falls outside the GSM set.
> For example, a message written with some accented characters for
> the French language (like à, è or similar) will always fall in the GSM
> character set, so the message will consist only of 7-bit characters,
> 160 characters per message.
> When I enter something particular to Romanian (î for example,
> that is U+00EE), the whole message will turn to 16 bits per character /
> 70 characters per message, even if the remaining 99% of my message has
> only pure ASCII characters. Using UTF-8 would have made great sense
> here, but for some reason this option has been left out.
That was exactly Richard's point: this would not happen if SCSU were
used. SCSU does have a fallback to 16-bit "Unicode mode," but primarily
for Han, Yi, and Hangul, which generally need 16 bits anyway.
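
To put rough numbers on this, here is a minimal Python sketch of the
arithmetic. It is an illustration under stated assumptions, not a real
encoder: the names sms_cost and scsu_latin1_octets are made up for this
example, the 140-octet figure is the usual single-message payload, and
the SCSU estimate covers only text that stays within ASCII plus
U+0080..U+00FF, where SCSU's initial state needs one byte per character
and no window switching.

```python
# GSM 03.38 default alphabet (128 entries) and its escape-extension table.
GSM_BASIC = (
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞ\x1bÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
GSM_EXTENSION = "^{}\\[~]|€\f"  # each costs two septets (escape + code)

PAYLOAD_OCTETS = 140  # assumed payload of a single, unconcatenated SMS


def sms_cost(text):
    """Return (encoding, octets) under the GSM-or-16-bit rule (hypothetical helper)."""
    septets = 0
    for ch in text:
        if ch in GSM_BASIC:
            septets += 1
        elif ch in GSM_EXTENSION:
            septets += 2
        else:
            # One character outside the GSM set forces 16 bits for all of them.
            return ("UCS-2", len(text.encode("utf-16-be")))
    # Septets are packed, so round the bit count up to whole octets.
    return ("GSM 7-bit", (septets * 7 + 7) // 8)


def scsu_latin1_octets(text):
    """Octets in SCSU's initial single-byte state (hypothetical helper).

    Only valid for tab, LF, CR, U+0020..U+007F and U+0080..U+00FF, which
    the default active window at U+0080 covers without any tag bytes.
    """
    for ch in text:
        cp = ord(ch)
        if not (cp in (0x09, 0x0A, 0x0D) or 0x20 <= cp <= 0xFF):
            raise ValueError("outside the range this sketch handles: %r" % ch)
    return len(text)
```

With these definitions, a message of 99 ASCII letters plus one î costs
200 octets under the 16-bit rule, which no longer fits in one message,
while the SCSU estimate is 100 octets. Romanian letters outside Latin-1
(such as the comma-below forms) would add a few tag bytes for window
changes, which this sketch deliberately does not model.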
-- Doug Ewell Fullerton, California, USA http://users.adelphia.net/~dewell/