Re: Charsets + encoding + codesets

From: Martin J. Dürst (mduerst@ifi.unizh.ch)
Date: Thu Oct 09 1997 - 10:14:18 EDT


On Wed, 8 Oct 1997, Kenneth Whistler wrote:

>
> Martin asked, in response to my example of what I would
> like to see in a consistent registry of encoded character
> sets:
>
> > Where did you get your short tags from? The largest and most widely
> > used collection of tags in this area is the IANA "charset" registry.
> > At least three of four of your short tags are wrong in this respect;
> > it is iso-8859-1, utf-8, and utf-16.
>
> I made them up. The whole point was to have a *short*, consistently
> constructed tag for identification purposes within this table,
> rather than that IANA tag, for several reasons:
>
> 1. This is an excerpt from a large spreadsheet of such things,
> and many entries in the table have no IANA registration, and
> thus no IANA tag.

You can easily use an x-... tag for your internal purposes, or
register them if you think it's necessary.

> 2. The IANA tags are whatever they got registered as, which means
> they are not consistently generated, and are not always short.
> (My fave is: "Extended_UNIX_Code_Packed_Format_for_Japanese",
> but I also dislike the years appended to all the 8859 parts:
> "ISO_8859-9:1989", etc. MIME substitutes "ISO-8859-9".)
>
> Think of the short tags as another set of aliases for the IANA
> registry, if you will.

We don't need to complicate things by creating another set of
aliases. The IANA registry clearly indicates which names/aliases
are preferred by MIME. When I say IANA tags or MIME tags, that's
what I mean. And these are reasonably short. I see absolutely
no reason, for example, why you make 'UTF8' out of the official
'UTF-8'. It only creates confusion.
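
As a rough illustration of the point about preferred names, the sketch
below maps incoming tags onto a single preferred MIME name and passes
x- tags through untouched. The table and the function name
preferred_name are hypothetical; the few entries come from the names
mentioned in this thread and are in no way a copy of the IANA registry.
Treating 'UTF8' as an alias is an assumption made only for this sketch.

```python
# Hypothetical alias table: lowercase tag -> preferred MIME name.
# Only a handful of entries, for illustration; not the IANA registry.
PREFERRED_MIME_NAME = {
    "iso_8859-9:1989": "ISO-8859-9",
    "iso-8859-9": "ISO-8859-9",
    "extended_unix_code_packed_format_for_japanese": "EUC-JP",
    "euc-jp": "EUC-JP",
    "utf-8": "UTF-8",
    "utf8": "UTF-8",  # assumption: locally invented short tag, accepted here
}


def preferred_name(tag: str) -> str:
    """Return the preferred name for a charset tag (case-insensitive)."""
    key = tag.strip().lower()
    if key.startswith("x-"):
        # Private-use tag: no registry entry to consult, pass it through.
        return key
    try:
        return PREFERRED_MIME_NAME[key]
    except KeyError:
        raise LookupError(f"unknown charset tag: {tag!r}") from None


if __name__ == "__main__":
    for tag in ("UTF8", "ISO_8859-9:1989", "x-internal-table-17"):
        print(tag, "->", preferred_name(tag))
```

With one lookup of this kind, every alias resolves to the same output
name, so a private short tag never has to appear in anything that is
sent out.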

Regards, Martin.


