On Fri, Feb 15, 2002 at 02:57:46PM +0000, David Hopwood wrote:
> Not having to add a few more lines of code to grep and sed is a good
> trade-off for a 50% penalty in encoding efficiency for Indic & Southeast
> Asian scripts, Katakana, Hiragana and a few others? I don't think so.
Not complicating every program that does searching on the system? Having
every program just work, instead of having bug reports show up every so
often for the next few years? I happen to agree with you, but I think
the consequences are more important than you imply.
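
For reference, the 50% figure is easy to check. This is a minimal Python sketch, assuming the comparison is UTF-8 against a two-byte-per-character form such as UTF-16; the sample strings and the helper name `size_penalty` are illustrative only and do not come from the thread.

```python
# Compare encoded sizes for scripts in the U+0800..U+FFFF range,
# where UTF-8 needs 3 bytes per character and UTF-16 needs 2.
samples = {
    "Katakana": "カタカナ",
    "Hiragana": "ひらがな",
    "Devanagari": "हिन्दी",
    "Thai": "ภาษาไทย",
}

def size_penalty(text):
    """Return (utf8_bytes, utf16_bytes, relative_penalty) for text."""
    utf8 = len(text.encode("utf-8"))
    utf16 = len(text.encode("utf-16-le"))  # "-le" avoids counting a BOM
    return utf8, utf16, (utf8 - utf16) / utf16

for name, text in samples.items():
    utf8, utf16, penalty = size_penalty(text)
    print(f"{name}: UTF-8 {utf8} bytes, UTF-16 {utf16} bytes, penalty {penalty:.0%}")
```

Text that mixes in ASCII (spaces, digits, markup) would show a smaller penalty, since those characters take one byte in UTF-8 and two in UTF-16.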
> In another post, you wrote:
> > [...]
>
> If "foo" is a US-ASCII string, "grep foo file" will work fine with any
> US-ASCII-superset charset for which non-ASCII characters do not use
> bytes < 0x80, including the hypothetical one I described, with no
> possibility of a false match.
But this is a different context. It won't work if grep tries to stick a
BOM on the output.
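
To make the distinction concrete, here is a minimal Python sketch of both cases. It assumes a grep that compares raw bytes with no charset knowledge, and it reads "won't work" as "downstream byte-wise matching at the start of output fails"; the helper `byte_search` and the sample strings are hypothetical. Shift_JIS stands in as an example of an ASCII superset that does reuse bytes below 0x80.

```python
import codecs

def byte_search(needle: bytes, haystack: bytes) -> bool:
    """Naive byte-wise substring search, as a charset-unaware grep would do."""
    return needle in haystack

# UTF-8: every byte of a non-ASCII character is >= 0x80, so an ASCII
# needle can only match real ASCII characters.
utf8_text = "アイウ".encode("utf-8")
print(byte_search(b"A", utf8_text))   # False

# Shift_JIS: trail bytes may fall below 0x80. U+30A2 encodes as 0x83 0x41,
# and 0x41 is ASCII "A", so the same search reports a false match.
sjis_text = "アイウ".encode("shift_jis")
print(byte_search(b"A", sjis_text))   # True

# A BOM prepended to output: the first line no longer begins with the
# bytes a downstream byte-wise tool expects.
line = "foo bar\n".encode("utf-8")
with_bom = codecs.BOM_UTF8 + line
print(line.startswith(b"foo"))        # True
print(with_bom.startswith(b"foo"))    # False
```

The substring search itself still finds "foo" after the BOM; it is anchored matches and byte-exact comparisons of the output that break.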
-- 
David Starner / Давид Старнэр - starner@okstate.edu
Pointless website: http://dvdeug.dhis.org
What we've got is a blue-light special on truth. It's the hottest thing
with the youth. -- Information Society, "Peace and Love, Inc."