Re: Detecting encoding in Plain text

From: Peter Kirk (peterkirk@qaya.org)
Date: Wed Jan 14 2004 - 07:33:34 EST

Next message: Peter Kirk: "Re: New MS Mac Office and Unicode?"

Previous message: Peter Kirk: "Re: German characters not correct in output webform"
In reply to: D. Starner: "Re: Detecting encoding in Plain text"
Next in thread: D. Starner: "Re: Detecting encoding in Plain text"

On 13/01/2004 18:05, D. Starner wrote:

>Peter Kirk writes:
>
>
>>I agree that heuristics should be adjusted for Thai. But problems may
>>arise if they have to be adjusted individually, and without regression
>>errors, for all 6000+ world languages.
>>
>>
>
>Thai is hard because of the writing system. But most writing systems weren't
>encoded pre-Unicode, so if they were typed into a computer, it was with
>a Latin (or Cyrillic?) transliteration that probably used spaces and new lines,
>and in fact was probably ASCII.
>
>More cynically, those who use obscure character sets or font encodings have
>trouble viewing them; that is one of the reasons for Unicode. That this tool
>may to some extent be an example of that problem is a simple fact of life,
>and doesn't call for it to be thrown out.
>
>

Either you are confused or I am. I was not referring to pre-Unicode
legacy encodings. I was referring to Unicode plain text data which may
(when Unicode includes all the necessary characters) be in any one of
6000+ languages, some of which have a variety of scripts and spelling
conventions. The problem is not that people are using obscure legacy
encodings, but that they are not adequately identifying which UTF their
text is in.

-- 
Peter Kirk
peter@qaya.org (personal)
peterkirk@qaya.org (work)
http://www.qaya.org/

This archive was generated by hypermail 2.1.5 : Wed Jan 14 2004 - 08:22:06 EST