RE: texteditors that can process and save in different encodings

From: Dreiheller, Albrecht <albrecht.dreiheller_at_siemens.com>
Date: Tue, 9 Oct 2012 12:48:52 +0200

One "editor" tool worth mentioning in this context is MS Word, using the "plain text" format.
That may sound unbelievable, but let me explain.

Whenever someone asks how to convert from one encoding to another
but is not willing to download (or learn how to use) a new tool or editor,
I suggest using MS Word.
It's useful to activate the (little-known) "Confirm conversion at Open" option.

The conversion dialog appears both when reading and when saving, and it offers all installed codepages
including some (not all) UTF encodings.
Upon saving, the line break style (CR/LF) can be chosen as well.
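
For those who would rather script this, the same read-convert-save step can be sketched in a few lines of Python. This is only an illustration, not what Word does internally; the function name transcode and the file names in the usage note are made up for the example.

```python
def transcode(data: bytes, src: str, dst: str, newline: str = "\r\n") -> bytes:
    """Decode bytes from `src`, apply one line break style, re-encode as `dst`."""
    # Raises UnicodeDecodeError if the input is not valid in `src`.
    text = data.decode(src)
    # Normalize every line break to "\n" first, then apply the chosen style.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    text = text.replace("\n", newline)
    # Raises UnicodeEncodeError if a character does not exist in `dst`.
    return text.encode(dst)
```

With hypothetical file names, usage would look like: read the bytes of "in.txt" in binary mode, call transcode(data, "cp1252", "utf-8", newline="\n"), and write the result to "out.txt" in binary mode. Working on bytes avoids any implicit line break translation by the I/O layer.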

When reading, the preview is quite useful to guess the unknown encoding of text files.
When saving, the preview highlights characters that would be lost if the chosen encoding were actually used.
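
Both previews can be approximated in a script. The following is a rough Python sketch under my own assumptions (the helper names decode_candidates and unencodable are invented here): the first function tries a list of candidate encodings so the results can be inspected by eye, the second lists the characters that a target encoding cannot represent.

```python
def decode_candidates(data: bytes, candidates) -> dict:
    """Map each candidate encoding that decodes without error to the decoded text."""
    results = {}
    for enc in candidates:
        try:
            results[enc] = data.decode(enc)
        except UnicodeDecodeError:
            pass  # the bytes are not valid in this encoding
    return results


def unencodable(text: str, encoding: str) -> list:
    """Return (index, character) pairs that `encoding` cannot represent."""
    lost = []
    for i, ch in enumerate(text):
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            lost.append((i, ch))
    return lost
```

Note that a successful decode does not prove the guess is right: UTF-8 bytes usually also decode as a single-byte codepage, just with the wrong characters, which is exactly why a visual preview helps.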

Limitations: When saving as UTF-8, a BOM is always written. Some encodings come with
confusing names, such as "GB2312" (which means GBK) vs. "GB2312-80" (which is indeed GB2312-1980).
Files may have to be renamed to a *.txt suffix to force the conversion dialog to appear.
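
Both limitations are easy to check or work around afterwards. A small Python sketch, with invented helper names (strip_utf8_bom, fits): the first removes a leading UTF-8 BOM from saved bytes, the second tests whether a text fits into a given encoding. Note that Python's codec names differ from Word's labels: in Python, "gb2312" is the strict GB2312 set and "gbk" is the extension.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Remove a leading UTF-8 BOM (EF BB BF) if one is present."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def fits(text: str, encoding: str) -> bool:
    """True if every character of `text` can be encoded in `encoding`."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# Python's "utf-8-sig" codec writes a BOM, plain "utf-8" does not.
with_bom = "abc".encode("utf-8-sig")
without_bom = strip_utf8_bom(with_bom)

# U+6771 is a traditional character: present in GBK, absent from strict GB2312.
in_gbk = fits("\u6771", "gbk")
in_gb2312 = fits("\u6771", "gb2312")
```

A character like U+6771 is a quick way to find out whether a tool's "GB2312" label really means GB2312 or silently means GBK.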


As an additional hint, I want to mention a tool named "WinMerge", which is quite useful
for comparing text files by content. The encoding can be chosen separately for each of the two files, and
the line break conventions may differ, too.

Albrecht

________________________________
From: unicode-bounce_at_unicode.org [mailto:unicode-bounce_at_unicode.org] On Behalf Of Stephan Stiller
Sent: Thursday, October 04, 2012 6:59 AM
To: unicode_at_unicode.org
Subject: texteditors that can process and save in different encodings

Dear all,

In your experience, what are the best (plaintext) texteditors or word processors for Linux / Mac OS X / Windows that have the ability to save in many different encodings?

This question is more specific than asking which editors have the best knowledge of conversion tables for codepages (including their different versions), which I'm interested in as well. There are a number of programs that appear to be able to read many different encodings, though I prefer the type that actually tells me where the format errors are when a file is loaded. Then, many editors that claim to be able to read all those encodings cannot display them; as for that, I don't care about font choice and the aesthetics of display, as I'm only interested in plaintext.

Some things I have seen that are no good:

 * the editor not telling me about the encoding and line breaks it has detected and not letting me choose
 * the editor displaying a BOM in hex mode even if there is none (a version of UltraEdit I worked with at some point)
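
To make the two points above concrete, here is a minimal Python sketch of what a tool could report about a file: whether a BOM is really present, and which line break styles occur. The function names detect_bom and count_line_breaks are hypothetical, and this is not how any particular editor implements it.

```python
import codecs

# Longer BOMs are checked first: the UTF-32 LE BOM (FF FE 00 00)
# begins with the UTF-16 LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def detect_bom(data: bytes):
    """Return the encoding implied by a leading BOM, or None if there is none."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None


def count_line_breaks(text: str) -> dict:
    """Count CRLF, lone CR, and lone LF line breaks in decoded text."""
    crlf = text.count("\r\n")
    return {
        "CRLF": crlf,
        "CR": text.count("\r") - crlf,
        "LF": text.count("\n") - crlf,
    }
```

A file with mixed counts is a case where the user should be asked rather than have one convention silently applied.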

Stephan

Received on Tue Oct 09 2012 - 05:55:06 CDT
