From: Gregg Reynolds (unicode@arabink.com)
Date: Fri Aug 19 2005 - 09:23:00 CDT
Doug Ewell wrote:
> Gregg Reynolds <unicode at arabink dot com> wrote:
>
>
>>Anyway the secret purpose is to spread the *nix environment to
>>windows via cygwin. ;)
>
>
> I hope that winky-smiley was for real, because advising a user to change
> his operating environment -- overtly or covertly -- in order to make a
> basic function of Unicode work will only serve to give the wrong
> impression about Unicode.
>
The wrong impression being that, uh, the user couldn't find what he
needed at the Unicode site? ;) Anyway, who advised anybody to "change
his operating environment"? I just recommended a toolset.
Unicode is as complicated or as simple as it needs to be. I don't think
recommending a toolset implies in any way that said toolset is necessary
to "make a basic function of Unicode work". I happen to think the POSIX
toolset (which is what cygwin is, not an "operating environment", unless
you consider the tools in the toolbox to constitute such an environment)
offers the "best" approach. Best tool for the job, that's all. And
iconv isn't the only reason to look into cygwin for Unicode text
management. There are dozens of other Unicode-enabled tools that make
life much easier, and that Windows programmers tend not to be aware of.
Simple example from this week: take a delimited file of 17K lines of
Arabic text in CP1256, examine the text in a few columns and remove any
personal names. With POSIX tools it is trivial to convert to UTF-8, cut
out the columns, break into words, sort, and remove duplicates; examine
the results by eye to pull out the personal names into a separate file;
then the next time use the names file to match against the data and pull
out names. No programming, just stringing together a few commands.
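A rough sketch of that pipeline, assuming a tab-delimited input file and the standard iconv, cut, tr, sed, sort, and grep. The function names, file names, and column numbers are hypothetical, made up for illustration; the two pipelines are wrapped in functions only so they can be reused.

```shell
# Sketch only: assumes tab-delimited input encoded in CP1256.
# extract_words and find_names are hypothetical helper names.

# usage: extract_words FILE COLUMNS   (COLUMNS as for cut -f, e.g. 2,4)
# Convert to UTF-8, keep the chosen columns, put one word per line,
# drop empty lines, then sort and remove duplicates.
extract_words() {
  iconv -f CP1256 -t UTF-8 "$1" |
    cut -f"$2" |
    tr -s '[:space:]' '\n' |
    sed '/^$/d' |
    LC_ALL=C sort -u
}

# usage: find_names FILE NAMES_FILE   (NAMES_FILE: UTF-8, one name per line)
# Print every line of the data that contains one of the listed names.
# This is a plain substring match, so short names can match inside longer words.
find_names() {
  iconv -f CP1256 -t UTF-8 "$1" | grep -F -f "$2"
}
```

Typical use would be something like `extract_words data.txt 2,4 > words.txt`, then picking the personal names out of words.txt by eye into names.txt, and on later runs `find_names data.txt names.txt` to see which lines contain them.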
-gregg