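Look for the Byte Order Mark (BOM, U+FEFF) or the byte-swapped BOM as the
first 2 bytes of a file with Unicode text: the bytes FE FF indicate
big-endian order, and FF FE indicate byte-swapped (little-endian) order.
Note, however, that Unicode files are not required to have the BOM at the
beginning, so the absence of a BOM does not prove that a file is plain ASCII.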
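
For a Java application, the check described above might be sketched as
follows. This is only an illustration, not a complete detector: the class
name BomSniffer, the Guess labels, and the fallback rules for files without
a BOM (every byte in the range 0x01-0x7F is reported as ASCII, anything else
as unknown) are assumptions made for this example, and it handles only the
2-byte BOM.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper for illustration; the names are not from the original message.
final class BomSniffer {

    enum Guess { UTF16_BIG_ENDIAN, UTF16_LITTLE_ENDIAN, ASCII, UNKNOWN }

    // Classify a file's contents by looking at the first 2 bytes for a BOM.
    static Guess guess(byte[] data) {
        if (data.length >= 2) {
            int b0 = data[0] & 0xFF; // mask because Java bytes are signed
            int b1 = data[1] & 0xFF;
            if (b0 == 0xFE && b1 == 0xFF) {
                return Guess.UTF16_BIG_ENDIAN;    // BOM in big-endian order
            }
            if (b0 == 0xFF && b1 == 0xFE) {
                return Guess.UTF16_LITTLE_ENDIAN; // byte-swapped BOM
            }
        }
        // No BOM found. Assumption for this sketch: a plain ASCII text file
        // contains no NUL bytes and no bytes above 0x7F. Unicode text without
        // a BOM usually fails one of these tests and is reported as UNKNOWN.
        for (byte b : data) {
            if (b <= 0) {                         // 0x00, or 0x80-0xFF as signed
                return Guess.UNKNOWN;
            }
        }
        return Guess.ASCII;
    }

    // Convenience overload that reads the whole file into memory first.
    static Guess guess(Path file) throws IOException {
        return guess(Files.readAllBytes(file));
    }
}
```

A caller could then pick a decoder based on the result, for example the
"UTF-16BE" or "UTF-16LE" charset after skipping the 2 BOM bytes, and
"US-ASCII" otherwise. An UNKNOWN result needs some other source of
information, such as asking the user.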
Ed Hart
Edwin F. Hart
Applied Physics Laboratory
11100 Johns Hopkins Road
Laurel, MD 20723-6099
+1-240-228-6926 (from Washington, DC area)
+1-443-778-6926 (from Baltimore area)
+1-240-228-1093 (fax)
edwin.hart@jhuapl.edu
-----Original Message-----
From: Gnanesh Gujulva
Sent: 26 March, 1999 12:54
To: Unicode List
Subject: FW: Algorithm
I am working on a Java application which should handle both ASCII
text files and Unicode files. Is there a generalised algorithm to
detect the type of character set being used? I need to detect whether
the character set is plain ASCII or Unicode.
> Regards
> Gnanesh