From: Dean Harding (dean.harding@dload.com.au)
Date: Wed Jun 01 2005 - 02:29:48 CDT
I've been trying this out myself today, and from what I can gather hotmail,
yahoo and gmail all accept utf-8 quoted-printable mails (as in, they appear
to decode them properly at least). The problem with both hotmail and yahoo
is that they report to the browser that the webpage is encoded as ISO
8859-1, so any UTF-8 characters will be garbage. Gmail, on the other hand,
is OK since it reports an encoding of UTF-8 to the browser, so any
non-US-ASCII characters look right on gmail.
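To make the failure mode concrete, here is a minimal Python sketch of what
happens when UTF-8 bytes are interpreted as ISO 8859-1. The sample word is
my own choice for illustration; it is not taken from any actual webmail page.

```python
# Minimal sketch: UTF-8 bytes read back under the wrong declared charset.
# The sample text is an illustrative assumption.
urdu = "اردو"

# What the mail system holds after decoding the message: UTF-8 bytes.
raw = urdu.encode("utf-8")

# What a browser does when the page is declared as ISO 8859-1:
# every byte becomes one Latin-1 character, so the text turns to garbage.
wrong = raw.decode("iso-8859-1")

# What a browser does when the page is declared as UTF-8.
right = raw.decode("utf-8")

# Manually switching the page encoding to UTF-8 amounts to undoing the
# wrong interpretation and redoing it correctly.
recovered = wrong.encode("iso-8859-1").decode("utf-8")

print(len(urdu), len(wrong))
print(right == urdu, recovered == urdu)
```

The bytes themselves are intact in both cases; only the declared page
encoding differs, which is why switching the browser's encoding by hand
recovers the text.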
Perhaps the POP3 interface to Hotmail/Yahoo will work properly, but I
haven't bothered to sign up for it to test :)
Faraz, my suggestion to you is that you continue sending emails with the
UTF-8 charset and quoted-printable (since this seems the most
widely-supported combination) and also recommend that your users NOT
use Hotmail or Yahoo to view Urdu emails. I don't believe there's anything
you can do that will get them to display properly, without them having to
manually change the page's encoding to UTF-8. You can get your users to
sign up for gmail (it's still invite-only, but you can always forward
yourself an invite via http://isnoop.net/gmail/), since gmail works fine.
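For reference, a minimal sketch of building such a message with Python's
standard email package. The addresses, subject and body are placeholders I
made up, and build_urdu_message is a hypothetical helper name; only the
UTF-8 charset and quoted-printable transfer encoding come from the
suggestion above.

```python
from email.message import EmailMessage


def build_urdu_message(sender, recipient, subject, body):
    """Build a text/plain message in UTF-8 with quoted-printable encoding.

    Hypothetical helper for illustration; the names are assumptions.
    """
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    # charset and cte select the combination recommended in this thread.
    msg.set_content(body, charset="utf-8", cte="quoted-printable")
    return msg


body = "اردو زبان\n"
msg = build_urdu_message(
    "sender@example.com",      # placeholder address
    "reader@example.com",      # placeholder address
    "اردو",
    body,
)

# The serialized message contains only 7-bit bytes, so it can pass through
# servers that reject 8-bit content.
wire = msg.as_bytes()
print(msg["Content-Type"])
print(msg["Content-Transfer-Encoding"])
```

Whether the recipient then sees readable text still depends on how the
webmail page declares its encoding, as described above.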
Dean.
-----Original Message-----
From: unicode-bounce@unicode.org [mailto:unicode-bounce@unicode.org] On
Behalf Of Philippe Verdy
Sent: Wednesday, 1 June 2005 3:12 am
To: paul@sustainableGIS.com; unicode@unicode.org
Subject: Re: browser encoding settings
True also in France: some servers were initially configured to accept emails
using 8-bit encodings, and later they were reverted to only accept 7-bit
encodings.
Many mail servers in Japan only support 7-bit emails, because they were
tweaked locally long ago to support Shift-JIS, and were never reconfigured
to support other 8-bit encodings by any means other than the ugly MIME
7-bit transfer encoding syntaxes.
As for Indian charsets, there is no better-supported encoding (ISCII is
rarely supported by browsers, mail agents or webmail servers), so the only
choice that remains is UTF-7.
But some webmail servers support only UTF-8 and not UTF-7, so users reading
their emails online will be disappointed to see messages garbled with
unreadable sequences like "+AO7-"... I think that Urdu readers need to use
POP3 email agents, or choose a webmail service that does support the
decoding of UTF-7 (in addition to UTF-8).
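As a rough illustration of what such a reader sees, here is a small Python
sketch using the standard "utf-7" codec. The sample word is an assumption of
mine, not text from a real message.

```python
# Minimal sketch: how UTF-7 text looks to software that does not decode it.
# The sample text is an illustrative assumption.
text = "اردو"

# UTF-7 represents non-ASCII characters as a "+...-" run of base64 symbols,
# so the encoded message contains 7-bit bytes only.
utf7_bytes = text.encode("utf-7")

# A webmail service without UTF-7 support shows these bytes as-is.
as_seen = utf7_bytes.decode("ascii")

# A mail agent that supports UTF-7 recovers the original text.
decoded = utf7_bytes.decode("utf-7")

print(as_seen)
print(decoded == text)
```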
The alternative then is to use UTF-8 with a MIME 7-bit transfer encoding
syntax (quoted printable). Note that Base64 would probably be more
efficient, but some mail servers reject all Base64-encoded emails, because
they think they only contain binary attachments which are thought to be
undesirable for security (simply because Base64 is used most often for those
binary attachments).
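The size difference between the two transfer encodings can be checked with
a short Python sketch using the standard quopri and base64 modules. The
sample phrase is my own; real figures depend on how much of the text is
plain ASCII.

```python
import base64
import quopri

# Illustrative sample; mostly non-ASCII, as Urdu text would be.
body = "اردو زبان".encode("utf-8")

# Quoted-printable writes each non-ASCII byte as "=XX" (three characters).
qp = quopri.encodestring(body)

# Base64 writes every three bytes as four characters.
b64 = base64.b64encode(body)

print(len(body), len(qp), len(b64))

# Both encodings are lossless.
print(quopri.decodestring(qp) == body, base64.b64decode(b64) == body)
```

For text that is mostly ASCII the comparison reverses, since
quoted-printable leaves ASCII characters untouched.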
If I had to send Urdu emails, I would choose UTF-8 with Quoted-Printable...
Ugly because this is an inefficient encoding (so emails are larger), but at
least it works on most platforms. Now the recipients need a browser or email
agent capable of displaying Urdu texts (this is a separate issue: if your
email is in Urdu, you can expect that users capable of reading this language
have set up an environment with fonts and renderers suitable for the
extended Arabic script, and Bidi rendering).
A more efficient encoding would use the ISO-8859 Latin+Arabic charset also
with Quoted-Printable (but here again, the Latin-Arabic charset is not
commonly supported by many webmail agents).
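A small Python sketch of that size comparison, assuming the charset meant
here is ISO-8859-6 (Python's "iso-8859-6" codec). The sample words are my
own, and fits_iso_8859_6 is a hypothetical helper. Note that ISO-8859-6
covers the basic Arabic letters only, so text using Urdu-specific letters
such as U+0679 cannot be encoded in it at all.

```python
import quopri


def fits_iso_8859_6(text):
    """Return True if every character exists in ISO-8859-6.

    Hypothetical helper for illustration.
    """
    try:
        text.encode("iso-8859-6")
        return True
    except UnicodeEncodeError:
        return False


# Illustrative sample that uses basic Arabic letters only.
text = "اردو"

# One byte per letter in ISO-8859-6, two bytes per letter in UTF-8,
# and quoted-printable triples each of those bytes.
qp_arabic = quopri.encodestring(text.encode("iso-8859-6"))
qp_utf8 = quopri.encodestring(text.encode("utf-8"))
print(len(qp_arabic), len(qp_utf8))

# A word containing U+0679, which the 8-bit charset lacks.
print(fits_iso_8859_6(text), fits_iso_8859_6("ٹوپی"))
```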
BiDi text rendering is also an issue: if your email is plain text, not all
email agents will render it properly (and BiDi override controls defined in
Unicode are too often ignored in console applications, as they
have no equivalent in legacy Arabic charsets). If you use HTML instead, you
could alternatively use a "visual" encoding order for characters, using the
<BDO> HTML override. This will complicate the composition of your email
text...
----- Original Message -----
From: "Paul Hastings" <paul@sustainableGIS.com>
To: <unicode@unicode.org>
Sent: Tuesday, May 31, 2005 7:36 AM
Subject: Re: browser encoding settings
> Dean Harding wrote:
>> Like most character set conversions, they probably convert it from
>> whatever the source encoding is to some form of Unicode (usually whatever
>> is most convenient for the platform), and then into whatever output
>> encoding they wanted (in this case UTF-8).
>
> i'm not sure that's true for yahoo. we've had numerous headaches sending
> utf-8 mail to their users. from what we were able to tease out of their
> html it looks like the encoding is dependent on "where" the yahoo mail
> server is. some "US" servers don't seem to have any html encoding hints at
> all, "Chinese" servers seem to use GB2312, etc. users have had to manually
> swap their browser's encoding, usually messing up the rest of the yahoo
> content around the email. we couldn't find any official yahoo docs on this
> (though maybe we didn't look hard enough or in the right places). we more
> or less gave up on it and included an idiotic "if you can't read this
> email...." tag.
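The two-step conversion described in the quoted text (source encoding to
Unicode, then Unicode to the output encoding) looks roughly like this in
Python. This is only a sketch of the general technique, not a description of
what Yahoo or any other service actually does; transcode is a hypothetical
helper and the GB2312 sample is my own.

```python
def transcode(data, source_encoding, target_encoding="utf-8"):
    """Convert bytes between encodings, using Unicode as the pivot.

    Hypothetical helper for illustration.
    """
    # Step 1: source encoding -> Unicode text.
    text = data.decode(source_encoding)
    # Step 2: Unicode text -> output encoding.
    return text.encode(target_encoding)


# Illustrative input: Chinese text stored as GB2312 bytes.
original = "中文"
gb_bytes = original.encode("gb2312")

utf8_bytes = transcode(gb_bytes, "gb2312")
print(utf8_bytes.decode("utf-8") == original)
```

The conversion itself is the easy part; the problems reported above come
from the page declaring one encoding while carrying text in another.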