From: Jungshik Shin (jshin@mailaps.org)
Date: Tue Nov 04 2003 - 23:44:47 EST
Markus Scherer wrote:
> YTang0648@aol.com wrote:
>
>> We are talking about charset value for the internet protocol here. It
>> is a special narrow field of charset name. The value used by Internet
>> protocol are defined by a well defined process-
>> http://www.faqs.org/rfcs/rfc2278.html RFC 2278 - IANA Charset
>> Registration Procedures
>
> "well defined process" is a stretch. By the way, RFC 2978 replaced 2278
> a few years ago.

I agree with Markus on all of his points. It's regrettable but true
that all sorts of garbage (in addition to well-defined, useful
character encodings) were thrown into the registry and accepted almost
blindly. In other words, there's a serious quality control issue.

> The problem with the IANA charset _list_ is that it lists not only
> useful charset names but also
> - names that are illegal for charsets by its own rules
> - names for things that are not (verifiably) charsets at all

Markus may have something different in mind, but I'd add to this
category coded character set names like ks_c_5601-1987 that are not
suitable for use as a MIME charset.

> - names for charsets that cannot be implemented reliably
> because there is no online, machine-readable specification for them,
> and even for the ones where there is one, it is not usually
> a mapping to/from Unicode

Related to the last point is that some charsets are not properly
annotated.

Jungshik