Re: FW: Subj: Amount of Space Unicode Takes

From: Asmus Freytag (asmusf@ix.netcom.com)
Date: Mon Jul 16 2007 - 13:56:44 CDT

Next message: Addison Phillips: "Re: FW: Subj: Amount of Space Unicode Takes"

Previous message: Jukka K. Korpela: "Re: FW: Subj: Amount of Space Unicode Takes"
In reply to: Magda Danish (Unicode): "FW: Subj: Amount of Space Unicode Takes"
Next in thread: Addison Phillips: "Re: FW: Subj: Amount of Space Unicode Takes"

On 7/16/2007 10:24 AM, Magda Danish (Unicode) wrote:
> Daniel,
> I am forwarding your question to the Unicode mailing list http://www.unicode.org/consortium/distlist.html for possible help from list subscribers.
> Regards,
>
> ---------------------------
> Magda Danish
> Sr. Administrative Director
> The Unicode Consortium
> 650-693-3921
> magda@unicode.org
>
>
>
> -----Original Message-----
> Date/Time: Fri Jul 13 12:58:18 CDT 2007
> Contact: dbjohnson88@hotmail.com
> Name: Daniel Johnson
> Report Type: Other Question, Problem, or Feedback
> Opt Subject: Amount of Space Unicode Takes
>
> I have a question about how much space Unicode takes up. I am working on a HTML project in multiple languages. Each of these web pages have to be stored on a chip with limited space. Is there any way to "compact" the HTML scripts in order to save space on the chip? Or is there a different call number for a character which will take up less space in hex? It would be greatly appreciated if the email was answered.
>
If your project allows you to insert your own decompression layer
between the on-chip storage and the HTML, then you can use SCSU, the
Standard Compression Scheme for Unicode. SCSU is intended for
applications where the goal of compression is to arrive at about the
same size as a traditional 8-bit encoding for the *same* text. It also
preserves ASCII, so it only compresses the text data in your HTML, not
the syntax characters or element names, etc. If needed, you can delay
the decoding until the time you actually need to display a given bit of
text, since you can parse the HTML syntax without decoding.
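
To make the size target concrete, here is a small Python sketch that
measures one word in UTF-8, UTF-16 and a traditional 8-bit encoding.
The sample word, the choice of ISO 8859-5 as the 8-bit encoding, and
the helper name encoded_sizes are my own assumptions for illustration.

```python
def encoded_sizes(text, encodings):
    """Return the encoded length of `text`, in bytes, for each encoding."""
    return {name: len(text.encode(name)) for name in encodings}


# Six Cyrillic letters; ISO 8859-5 stands in for "a traditional 8-bit encoding".
sample = "Москва"
sizes = encoded_sizes(sample, ["utf-8", "utf-16-be", "iso8859_5"])

for name, size in sizes.items():
    print(f"{name:10} {size} bytes")
```

For this word, UTF-8 and UTF-16 each need 12 bytes and ISO 8859-5 needs
6. The SCSU form is 7 bytes: one tag byte to select the Cyrillic window,
then one byte per letter.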

See http://www.unicode.org/reports/tr6/ for the full details.

Decoders are very small and easy to write yourself; encoders can be more
complex, as there are sometimes multiple ways to compress a string,
yielding different lengths. If you try for the absolute 'best' case,
your encoder can get tricky, but, as experience has shown, a
'reasonable' effort will deliver good results at very moderate code
complexity.
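
To give an idea of how small a decoder can be, here is a sketch in
Python. It is not a complete implementation: it assumes the input stays
in single-byte mode and rejects the SCU (switch to Unicode mode) and SDX
(extended window) tags, and it does no checking for truncated input. The
tables follow UTS #6; the names scsu_decode and window_offset are mine.

```python
# Sketch of an SCSU decoder, single-byte mode only (tables per UTS #6).
STATIC_OFFSETS = [0x0000, 0x0080, 0x0100, 0x0300,
                  0x2000, 0x2080, 0x2100, 0x3000]
INITIAL_DYNAMIC_OFFSETS = [0x0080, 0x00C0, 0x0400, 0x0600,
                           0x0900, 0x3040, 0x30A0, 0xFF00]
SPECIAL_OFFSETS = {0xF9: 0x00C0, 0xFA: 0x0250, 0xFB: 0x0370, 0xFC: 0x0530,
                   0xFD: 0x3040, 0xFE: 0x30A0, 0xFF: 0xFF60}


def window_offset(index):
    """Map the argument byte of an SDn tag to a window start offset."""
    if 0x01 <= index <= 0x67:
        return index * 0x80
    if 0x68 <= index <= 0xA7:
        return index * 0x80 + 0xAC00
    if index in SPECIAL_OFFSETS:
        return SPECIAL_OFFSETS[index]
    raise ValueError(f"reserved window offset index 0x{index:02X}")


def scsu_decode(data):
    """Decode SCSU bytes that never leave single-byte mode."""
    dynamic = list(INITIAL_DYNAMIC_OFFSETS)  # dynamic windows can be redefined
    active = 0                               # dynamic window 0 is active at start
    out = []
    i = 0
    while i < len(data):
        b = data[i]
        i += 1
        if b >= 0x80:
            # Upper half: a position in the active dynamic window.
            out.append(chr(dynamic[active] + b - 0x80))
        elif b >= 0x20 or b in (0x00, 0x09, 0x0A, 0x0D):
            # Printable ASCII plus NUL, TAB, LF, CR pass through unchanged.
            out.append(chr(b))
        elif 0x01 <= b <= 0x08:
            # SQn: quote a single character from window n.
            window = b - 0x01
            c = data[i]
            i += 1
            if c < 0x80:
                out.append(chr(STATIC_OFFSETS[window] + c))
            else:
                out.append(chr(dynamic[window] + c - 0x80))
        elif b == 0x0E:
            # SQU: quote one 16-bit unit, big-endian (BMP characters only here).
            out.append(chr((data[i] << 8) | data[i + 1]))
            i += 2
        elif 0x10 <= b <= 0x17:
            # SCn: make dynamic window n the active one.
            active = b - 0x10
        elif 0x18 <= b <= 0x1F:
            # SDn: redefine dynamic window n and make it active.
            active = b - 0x18
            dynamic[active] = window_offset(data[i])
            i += 1
        else:
            # 0x0B (SDX), 0x0C (reserved), 0x0F (SCU) are outside this sketch.
            raise ValueError(f"tag 0x{b:02X} not supported by this sketch")
    return "".join(out)
```

For example, scsu_decode(bytes.fromhex("129CBEC1BAB2B0")) returns
"Москва", while markup such as b"<p>" comes back unchanged.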

Sample code exists or can be found easily for a number of approaches,
and you don't need to use an external library, which makes SCSU well
suited to on-chip solutions.
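
As one possible shape for a 'reasonable effort' encoder, here is a
sketch in Python. It is deliberately simple and makes assumptions of my
own: it only uses the eight default dynamic windows, never redefines a
window, never enters Unicode mode, and rejects surrogates and characters
beyond the BMP. The name scsu_encode_simple is hypothetical.

```python
# Sketch of a simple SCSU encoder: single-byte mode, default windows only.
DEFAULT_DYNAMIC_OFFSETS = [0x0080, 0x00C0, 0x0400, 0x0600,
                           0x0900, 0x3040, 0x30A0, 0xFF00]
SQ0, SQU, SC0 = 0x01, 0x0E, 0x10  # tag values from UTS #6


def scsu_encode_simple(text):
    """Encode BMP text to SCSU without redefining any window."""
    active = 0  # dynamic window 0 is active at the start of a stream
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp > 0xFFFF or 0xD800 <= cp <= 0xDFFF:
            raise ValueError("this sketch handles BMP scalar values only")
        if cp < 0x80:
            if cp >= 0x20 or cp in (0x00, 0x09, 0x0A, 0x0D):
                out.append(cp)  # ASCII passes through as a single byte
            else:
                # Other control codes collide with tag bytes, so quote them
                # from static window 0.
                out += bytes([SQ0, cp])
            continue
        start = DEFAULT_DYNAMIC_OFFSETS[active]
        if start <= cp < start + 0x80:
            out.append(cp - start + 0x80)  # fits the active window
            continue
        for window, start in enumerate(DEFAULT_DYNAMIC_OFFSETS):
            if start <= cp < start + 0x80:
                # Switch to a window that covers the character, then emit it.
                active = window
                out.append(SC0 + window)
                out.append(cp - start + 0x80)
                break
        else:
            # No default window covers it: quote the 16-bit value directly.
            out += bytes([SQU, cp >> 8, cp & 0xFF])
    return bytes(out)
```

With this, "Москва" encodes to the seven bytes 12 9C BE C1 BA B2 B0. It
also shows where a smarter encoder could do better: the euro sign
U+20AC comes out here as three bytes (0E 20 AC), while quoting it from
static window 5 would take only two (06 2C).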

A./
> Thank you
>
> Daniel Johnson
>
> -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- -- (End of Report)


This archive was generated by hypermail 2.1.5 : Mon Jul 16 2007 - 13:57:56 CDT