For the purpose specified, isLatin1 should just test for <= 0xFF. After all,
one would not want to exclude TAB, CR or LF ☺
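A minimal sketch of that test in Python, assuming the string has already been decoded into code points (the original question concerned UCS-2 code units, for which the comparison is the same). The name is_latin1 is illustrative only:

```python
def is_latin1(text: str) -> bool:
    """Return True if every code point in text is at most U+00FF.

    This is the permissive reading: C0 controls (TAB, CR, LF, ...) and
    C1 controls are accepted along with the printable Latin-1 characters.
    """
    return all(ord(ch) <= 0xFF for ch in text)


if __name__ == "__main__":
    print(is_latin1("caf\u00e9\t\r\n"))  # e-acute plus TAB, CR, LF
    print(is_latin1("\u0141\u00f3d\u017a"))  # contains Latin Extended-A letters
```

An empty string passes trivially, since there is no character above 0xFF to reject.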
Mark
----- Original Message -----
From: "John Cowan" <jcowan@reutershealth.com>
To: "Unicode List" <unicode@unicode.org>
Sent: Thursday, October 05, 2000 10:33
Subject: Re: Correct definition for an "isLatin1()" function
> "Rogers, Paul" wrote:
>
> > We're whipping up a little function named isLatin1() that returns true if
> > the (UCS-2) string in question is "all Latin1".
>
> [snip]
>
> > In other words, should we exclude the C0, C1, and Latin Extended code
> > values?
>
> Including or excluding C0 and C1 is a matter of taste. If you mean
> "strictly containing characters in ISO 8859-1", then they're out.
> If you mean "representable in typical Latin-1 text files", then at least
> C0 is in, and C1 will do no great harm. (Provided your Unicode
> characters don't originate from incorrect transcoding from CP 1252.)
>
> The Latin Extended blocks are definitely out.
>
> --
> There is / one art || John Cowan <jcowan@reutershealth.com>
> no more / no less || http://www.reutershealth.com
> to do / all things || http://www.ccil.org/~cowan
> with art- / lessness \\ -- Piet Hein
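
The stricter readings described in the quoted message could be sketched as follows. This is an illustration under assumptions, not anything from the thread itself: the function name and the allow_c0 parameter are hypothetical, and the ranges used are the graphic characters of ISO 8859-1 (0x20-0x7E and 0xA0-0xFF).

```python
def is_strict_latin1(text: str, allow_c0: bool = False) -> bool:
    """Return True if text contains only ISO 8859-1 graphic characters.

    With allow_c0=True, the C0 controls (0x00-0x1F, which include TAB,
    CR and LF) are accepted as well. C1 controls (0x80-0x9F) and DEL
    (0x7F) are always rejected in this sketch.
    """
    def ok(cp: int) -> bool:
        if 0x20 <= cp <= 0x7E or 0xA0 <= cp <= 0xFF:
            return True  # graphic characters defined by ISO 8859-1
        return allow_c0 and cp <= 0x1F  # optionally accept C0 controls

    return all(ok(ord(ch)) for ch in text)


if __name__ == "__main__":
    print(is_strict_latin1("a\tb"))                 # TAB is a C0 control
    print(is_strict_latin1("a\tb", allow_c0=True))  # C0 permitted here
```

Anything in the Latin Extended blocks (U+0100 and up) fails under either setting.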