Re: Fonts across platforms....

From: Otto Stolz (Otto.Stolz@uni-konstanz.de)
Date: Fri Jun 05 2009 - 05:11:15 CDT


    Damon Anderson wrote:
    > I'm not sure what you mean by "quoted-printable encoding".

    Cf. <http://en.wikipedia.org/wiki/Quoted-printable>;
    for an example,
    cf. <http://de.wikipedia.org/wiki/Quoted-printable#Beispiel>.

    > But it seems
    > in both cases, either Notepad or Email that I am choosing which encoding
    > to save/send the file in... with the result being that it is possible
    > the application is converting the original content created by Unikey.

    I do not expect either Notepad or Thunderbird to apply normalization
    (cf. <http://www.unicode.org/faq/normalization.html>, and
    <http://www.unicode.org/reports/tr15/>) to the data. Rather, I guess
    that they simply apply the desired encoding to the data.

    In Notepad, you would choose the "Unicode Big Endian" (i. e. UTF-16BE)
    encoding to store the unaltered data, as delivered from the keyboard
    driver.

    In Thunderbird, you would choose the "Unicode (UTF-8)" encoding, and
    "quoted-printable". I am using the German Thunderbird version, so
    I can only guess the menu items you will have to use:
    the encoding is under settings/encoding; the quoted-printable option
    is under options/settings/general (or something similar). With these
    settings, Thunderbird will convert the text into UTF-8, and then apply
    the MIME quoted-printable encoding.
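
    The two steps described above (convert to UTF-8, then apply
    quoted-printable) can be sketched as follows. This is only an
    illustration, assuming Python 3 and its standard quopri module;
    the sample word is made up and is not what Thunderbird does
    internally.

```python
import quopri

# Illustrative sample text (an assumption, not taken from the thread).
text = "Gr\u00fc\u00dfe"  # "Grüße"

# Step 1: convert the characters to UTF-8 bytes.
utf8_bytes = text.encode("utf-8")

# Step 2: apply MIME quoted-printable to those bytes.
qp_bytes = quopri.encodestring(utf8_bytes)

print(utf8_bytes.hex(" ").upper())
print(qp_bytes.decode("ascii"))
```

    Each non-ASCII byte of the UTF-8 form shows up as one "=XX" triplet,
    while plain ASCII letters pass through unchanged.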

    > By the way when I get my Hex dump how do I match that to the Unicode chart?

    The Unicode charts show the Unicode Scalar Values in hex;
    for characters in the Basic Multilingual Plane, each 16-bit unit in
    a hex dump of UTF-16BE data compares directly to the charts.
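
    To see the correspondence, here is a small sketch, assuming
    Python 3; the helper name utf16be_units is hypothetical. It prints
    each 16-bit unit of the UTF-16BE form next to the code point. The
    pairing by position only holds for characters in the Basic
    Multilingual Plane; characters beyond it occupy two units
    (a surrogate pair).

```python
import unicodedata

def utf16be_units(text):
    """Return the UTF-16BE encoding of text as a list of 4-digit hex units."""
    data = text.encode("utf-16-be")
    return [data[i:i + 2].hex().upper() for i in range(0, len(data), 2)]

# Illustrative sample: u with diaeresis, and a Vietnamese letter.
sample = "\u00fc\u1ebf"

# Pairing by position assumes BMP-only text (one unit per character).
for ch, unit in zip(sample, utf16be_units(sample)):
    print(unit, f"U+{ord(ch):04X}", unicodedata.name(ch))
```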

    To compare UTF-8 data to the charts, you would have to reverse
    the UTF-8 encoding first; cf.
    <http://www.systems.uni-konstanz.de/Otto/Vortrag/Charset/UTF-8_Magic_Pocket_Encoder.pdf>,
    or <http://skew.org/cumped/>.
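
    Reversing the UTF-8 encoding by hand amounts to stripping the
    marker bits from each byte and concatenating the remaining payload
    bits. A minimal sketch, assuming Python 3; the function name
    decode_utf8_sequence is hypothetical, and the sketch does not
    reject overlong forms or surrogates as a full decoder would.

```python
def decode_utf8_sequence(seq: bytes) -> int:
    """Decode the bytes of ONE UTF-8 encoded character to its code point."""
    first = seq[0]
    if first < 0x80:                 # 0xxxxxxx: single byte (ASCII)
        length, cp = 1, first
    elif first >> 5 == 0b110:        # 110xxxxx: two-byte sequence
        length, cp = 2, first & 0x1F
    elif first >> 4 == 0b1110:       # 1110xxxx: three-byte sequence
        length, cp = 3, first & 0x0F
    elif first >> 3 == 0b11110:      # 11110xxx: four-byte sequence
        length, cp = 4, first & 0x07
    else:
        raise ValueError("not a valid UTF-8 lead byte")
    if len(seq) != length:
        raise ValueError("wrong number of bytes for this lead byte")
    for b in seq[1:]:
        if b >> 6 != 0b10:           # continuation bytes are 10xxxxxx
            raise ValueError("invalid continuation byte")
        cp = (cp << 6) | (b & 0x3F)  # append the 6 payload bits
    return cp

# Bytes as read from a hex dump; the result is what you look up in the charts.
print(f"U+{decode_utf8_sequence(bytes.fromhex('C3BC')):04X}")
```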

    In quoted-printable, you get the hex value of each non-ASCII byte
    in three characters, e. g. "=FC" for the byte FC (hex).
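
    For illustration, the "=XX" triplets can be turned back into raw
    bytes with Python's standard quopri module (a sketch, assuming
    Python 3). Which character a byte stands for depends on the charset
    declared for the message; the ISO-8859-1 and UTF-8 interpretations
    below are assumptions chosen for the example.

```python
import quopri

# "=FC" is one quoted-printable triplet standing for the single byte FC.
raw = quopri.decodestring(b"=FC")
print(raw.hex().upper())

# Assumption: if the message were declared as ISO-8859-1, byte FC is U+00FC.
latin1_char = raw.decode("iso-8859-1")
print(f"U+{ord(latin1_char):04X}")

# In a UTF-8 message the same character takes two bytes, hence two triplets.
utf8_char = quopri.decodestring(b"=C3=BC").decode("utf-8")
print(f"U+{ord(utf8_char):04X}")
```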

    Good luck,
       Otto Stolz


