Montgomery Securities scripsit:
> How does a software package written on an operating system that supports
> ASCII as well as Unicode (Windows NT) identify the encoding scheme that a
> text file on disk uses? Is there any special marking at the front of a
> Unicode file that helps distinguish it from an 8 bit file?
The specific answer for Notepad on Windows NT 4.0 is that if the first
two bytes are FF FE, then the file is assumed to be (little-endian)
Unicode; those two bytes are the byte order mark U+FEFF stored low byte
first. Otherwise, the file is assumed to be in the current system code
page, typically CP1252.
It is possible for a Latin-1 file to begin with y-diaeresis (byte FF)
followed by thorn (byte FE) by sheer bad luck, in which case it would be
misread as Unicode, but that is most unlikely.
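For illustration only, here is a rough Python sketch of the rule as
described above. It is not Notepad's actual code; the function names
guess_encoding and decode_text and the default "cp1252" argument are my
own assumptions standing in for "the current system code page".

```python
import codecs


def guess_encoding(data: bytes, system_code_page: str = "cp1252") -> str:
    """Guess the encoding of a file's bytes with the FF FE rule.

    Hypothetical helper: `system_code_page` is an assumed stand-in for
    whatever code page the system is currently using.
    """
    # codecs.BOM_UTF16_LE is the two bytes FF FE.
    if data[:2] == codecs.BOM_UTF16_LE:
        return "utf-16-le"
    return system_code_page


def decode_text(data: bytes, system_code_page: str = "cp1252") -> str:
    """Decode file bytes using the guess above, dropping the leading mark."""
    encoding = guess_encoding(data, system_code_page)
    if encoding == "utf-16-le":
        # Skip the two marker bytes so they do not appear in the text.
        data = data[2:]
    # Undecodable bytes are replaced rather than raising an error.
    return data.decode(encoding, errors="replace")
```

A Latin-1 file that really does start with y-diaeresis and thorn has the
same first two bytes, so this sketch would take it for Unicode too; that
is the bad-luck case mentioned above.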
-- John Cowan cowan@ccil.org I am a member of a civilization. --David Brin