Re: Unicode, SMS, PDA/cellphones

From: Theodore H. Smith (delete@elfdata.com)
Date: Sun May 28 2006 - 09:44:48 CDT

    > On Sun, 28 May 2006 16:31:06 +0800, Donald Z. Osborn wrote:
    >
    >> * Message length being rather shorter in Unicode SMS than with 7 or 8 bit
    >
    > Ordinary [Latin] SMS messages use the 7-bit GSM character set; only a
    > few additional characters require an escape character.
    > (ref.: http://www.csoft.co.uk/sms/character_sets/gsm.htm )
    > A single SMS message written solely with characters from the 7-bit GSM
    > character set can hold at most 160 characters. If a single non-GSM
    > character is entered during composition, the whole message switches
    > to a double-byte encoding, which limits a single message to at most
    > 70 characters. I don't know whether each transmitted character is a
    > direct 2-byte BMP code point or a UTF-16 encoding.
    >
    > Every time I try to send an SMS message that includes accented
    > characters for my language (Romanian), I can't help blaming those who
    > established the SMS technical standard, because a fixed 2-byte
    > encoding for Latin characters is a pure waste of space (and money :).

    BOCU would have been more sensible. It can usually encode code points
    above 256 in one byte per character, and it can represent every code
    point.
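
    The 160 and 70 character limits quoted above both come from the same
    140-byte SMS payload: 140 * 8 / 7 = 160 septets, or 140 / 2 = 70
    16-bit units. Below is a minimal Python sketch of that arithmetic. The
    function name sms_units is made up for illustration, the character
    table is my transcription of the GSM default alphabet, and the sketch
    assumes no national shift tables and no concatenation headers, so
    treat it as an approximation rather than a reference implementation.

```python
# GSM 7-bit default alphabet (127 characters; the escape code itself is omitted).
GSM_BASIC = frozenset(
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
# Characters reached through the escape code; each costs two septets.
GSM_EXTENDED = frozenset("^{}\\[~]|€\f")

PAYLOAD_BYTES = 140  # user-data size of a single, unconcatenated SMS


def sms_units(text):
    """Return (encoding, units_used, units_available) for one SMS.

    Hypothetical helper: picks the 7-bit alphabet when every character
    fits, otherwise falls back to 16-bit units for the whole message.
    """
    septets = 0
    for ch in text:
        if ch in GSM_BASIC:
            septets += 1
        elif ch in GSM_EXTENDED:
            septets += 2
        else:
            # One non-GSM character forces the whole message to 16-bit units.
            # Assumption: count UTF-16 code units, which equals the UCS-2
            # length for text inside the BMP.
            units = len(text.encode("utf-16-be")) // 2
            return ("UCS-2", units, PAYLOAD_BYTES // 2)
    return ("GSM-7", septets, PAYLOAD_BYTES * 8 // 7)


if __name__ == "__main__":
    print(sms_units("Buna ziua"))  # plain ASCII stays in the 7-bit alphabet
    print(sms_units("Bună ziua"))  # Romanian "ă" is not in the GSM alphabet
```

    The Romanian letter "ă" is outside the 7-bit table, so a single
    occurrence cuts the capacity of the message from 160 to 70 units,
    which is the cost complained about in the quoted text.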
