From: eflarup@yahoo.com
Date: Wed Aug 10 2005 - 11:56:17 CDT
Maybe the new CharsetDetector in ICU 3.4 would be
useful for this situation:
http://icu.sourceforge.net/apiref/icu4j/com/ibm/icu/text/CharsetDetector.html
--- Ritesh <ritesh.h.patel@gmail.com> wrote:
> We now have a few users who upload files that can
> contain English and
> other-language characters (here, Arabic).
>
> These files can come in the following combinations:
> 1. File is UTF-8 and has English and Arabic
> characters.
> 2. File is UTF-16 (LE) and has English and Arabic
> characters.
> 3. File is UTF-8 and has only Arabic characters.
> 4. File is UTF-8 and has only English characters.
> 5. File is UTF-16 and has only Arabic characters.
> 6. File is UTF-16 and has only English characters.
> 7. File can be in ASCII format.
>
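
Several of the combinations quoted above can be told apart with the
standard library alone, before handing the bytes to a statistical
detector. What follows is only a rough sketch in plain Java, not ICU's
CharsetDetector and not anything from the original poster's code:
EncodingSniffer and sniff are hypothetical names. It checks for a byte
order mark first, then guesses BOM-less UTF-16 from the positions of NUL
bytes, then tries 7-bit ASCII and a strict UTF-8 decode. The NUL-position
guess is an assumption that only holds when the text contains some
ASCII-range characters; for BOM-less files with none, a real detector is
the safer choice.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of ICU or the JDK.
final class EncodingSniffer {

    /** Returns a best-guess charset name, or "unknown" if nothing fits. */
    static String sniff(byte[] data) {
        // 1. A byte order mark, when present, is the strongest signal.
        if (startsWith(data, 0xEF, 0xBB, 0xBF)) return "UTF-8";
        if (startsWith(data, 0xFF, 0xFE)) return "UTF-16LE";
        if (startsWith(data, 0xFE, 0xFF)) return "UTF-16BE";

        // 2. No BOM: count NUL bytes by position and note any 8-bit bytes.
        int evenNuls = 0;
        int oddNuls = 0;
        boolean sevenBit = true;
        for (int i = 0; i < data.length; i++) {
            if (data[i] == 0) {
                if (i % 2 == 0) evenNuls++; else oddNuls++;
            }
            if (data[i] < 0) sevenBit = false; // byte value >= 0x80
        }

        // ASCII-range characters in UTF-16 carry a NUL as their high byte:
        // at odd offsets for little-endian, at even offsets for big-endian.
        if (data.length % 2 == 0 && evenNuls + oddNuls > 0) {
            if (evenNuls == 0) return "UTF-16LE";
            if (oddNuls == 0) return "UTF-16BE";
        }

        // 3. Pure 7-bit data is ASCII (and therefore also valid UTF-8).
        if (sevenBit) return "US-ASCII";

        // 4. Otherwise accept UTF-8 only if the bytes decode without errors.
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(data));
            return "UTF-8";
        } catch (CharacterCodingException e) {
            return "unknown";
        }
    }

    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            if ((data[i] & 0xFF) != prefix[i]) return false;
        }
        return true;
    }
}
```

Calling EncodingSniffer.sniff(bytes) on the uploaded bytes gives a charset
name that can be passed to new String(bytes, name) once any BOM has been
skipped. An "unknown" result is the point at which falling back to
CharsetDetector would make sense.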