Re: Is there a UTF that allows ISO 8859-1 (latin-1)?

From: Gianni Mariani (gianni@corp.webtv.net)
Date: Wed Aug 26 1998 - 14:14:14 EDT

Next message: Kenneth Whistler: "Re: Is there a UTF that allows ISO 8859-1 (latin-1)?"
Previous message: Kenneth Whistler: "Re: Is there a UTF that allows ISO 8859-1?"
Maybe in reply to: Yung-Fong Tang: "Re: Is there a UTF that allows ISO 8859-1 (latin-1)?"
Next in thread: Kenneth Whistler: "Re: Is there a UTF that allows ISO 8859-1 (latin-1)?"

John Cowan wrote:
>
>
> In addition, in some applications those processing inefficiencies are
> not present, thanks to the self-segregating nature of UTF-8. For
> example, the Plan 9 "fgrep" program (which searches a stream of text
> for the presence of one or more of a list of strings) need never convert
> to UCS format at all; the strings are UTF-8 and so is the text, and
> in fact the program looks the same as the corresponding 8-bit program.
>

This is not completely true. To be Unicode compliant, fgrep must
deal correctly with combining characters, e.g.

è (LATIN SMALL LETTER E WITH GRAVE, <U+00E8>) is canonically
equivalent to

<U+0065><U+0300> (LATIN SMALL LETTER E followed by COMBINING GRAVE ACCENT)

So, grep should match <U+00E8> with <U+0065><U+0300> to be truly
Unicode compliant.
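
A rough sketch of the difference, in Python. The function names and the
sample word are made up for illustration, and the normalization call
(unicodedata.normalize with the NFC form) is just one possible way to make
the two spellings compare equal; it is not how Plan 9 fgrep works.

```python
import unicodedata

def bytewise_contains(text: str, pattern: str) -> bool:
    """fgrep-style search: compare the raw UTF-8 bytes, nothing else."""
    return pattern.encode("utf-8") in text.encode("utf-8")

def normalized_contains(text: str, pattern: str) -> bool:
    """Bring both sides to the same normalization form (NFC) before searching."""
    def nfc(s: str) -> str:
        return unicodedata.normalize("NFC", s)
    return nfc(pattern) in nfc(text)

precomposed = "\u00e8"    # e with grave as a single code point
decomposed = "e\u0300"    # e followed by COMBINING GRAVE ACCENT

# Hypothetical input line, spelled with the combining sequence.
line = "cr" + decomposed + "me"
# Hypothetical search string, spelled with the precomposed character.
pattern = "cr" + precomposed + "me"

print(bytewise_contains(line, pattern))    # False: the byte sequences differ
print(normalized_contains(line, pattern))  # True: equal after normalization
```

The byte-wise search stays as simple as the 8-bit program, but it treats
the two spellings as different strings. The normalizing version pays for
an extra pass over the text.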

See section 2.5 of "The Unicode Standard 2.0" !

Not to say it isn't a good start with fgrep ...


This archive was generated by hypermail 2.1.2 : Tue Jul 10 2001 - 17:20:41 EDT