On 24 Apr 2002, at 14:38, Jungshik Shin wrote:
> We don't expect text tools
> to work on files in UTF-16 the same way as we would expect them to work
> on files in UTF-8 or other ASCII-compatible encodings.
   But it might well be desirable to have UNIX-like tools that work on UTF-16  
files, in a way analogous to the way that the existing tools work with ASCII. 
The underlying philosophy of the UNIX toolset can clearly be applied with equal 
success in a world where "plain text" is UTF-16 everywhere:
      cat16 f1 f2 f3 f4 | sort16 | uniq16 | sed16 '....' > f5
   As we see, we need different versions of all the text tools. This is 
inconvenient, but not an insurmountable problem. (Maybe they could even be 
derived from the same source code as the 8-bit varieties. Maybe some future 
system will have *only* 16-bit text tools.)
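   But a BOM in every UTF-16 plain text file would make this completely 
hopeless: a naive cat16 would leave a BOM at every file boundary in its 
output, and sort16 would then scatter those BOMs through the stream. If we 
ever think we might want to do UNIX-style text processing on UTF-16, we have 
to resist that!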
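   To make the BOM problem concrete, here is a minimal Python sketch. It is 
not an implementation of the hypothetical 16-bit tools above: the helper 
names with_bom and naive_cat16 are invented for illustration, and it assumes 
little-endian UTF-16 and a cat that simply concatenates bytes, as the 8-bit 
cat does.

```python
import codecs


def with_bom(text: str) -> bytes:
    """Encode text as little-endian UTF-16 with a leading BOM (hypothetical helper)."""
    return codecs.BOM_UTF16_LE + text.encode("utf-16-le")


def naive_cat16(*files: bytes) -> bytes:
    """Byte-level concatenation, the way an 8-bit-style cat would do it (hypothetical helper)."""
    return b"".join(files)


f1 = with_bom("pear\n")
f2 = with_bom("apple\n")

joined = naive_cat16(f1, f2)

# The "utf-16" codec consumes only the BOM at the very start of the data;
# the BOM that began f2 survives as a U+FEFF character in mid-stream.
decoded = joined.decode("utf-16")
stray_boms = decoded.count("\ufeff")

# A line-oriented tool now sees the stray U+FEFF as part of the second line,
# so a code-point sort no longer puts "apple" before "pear".
lines = decoded.splitlines()
sorted_lines = sorted(lines)
```

   Here the second line starts with the leftover U+FEFF, so it sorts after 
"pear" instead of before it. Every tool in the pipeline would have to strip 
and re-emit BOMs to avoid this.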
        /|
 o o o (_|/
        /|
       (_/
This archive was generated by hypermail 2.1.2 : Wed Apr 24 2002 - 20:57:53 EDT