From: D. Starner (shalesller@writeme.com)
Date: Thu May 13 2004 - 01:58:18 CDT
> > If the input is in
> > multiple (Indic) scripts, and let's assume that the audience
> > (which may be a single person just asking for a sorted list
> > of his/her files) can read the Indic scripts used, it may be
> > helpful to interleave. (But I will not push this.)
>
> Now let's assume that person can't read all the scripts. Then they
> get lots of unintelligible garbage in their sort. This, and the upside is
> "may be helpful". Which side did you say you're making the case for?
Garbage in, garbage out. If you didn't want unintelligible garbage in the
output, you shouldn't have put it in the input, and no sort procedure is
going to remove it. The user that can't read all the scripts is not an
interesting person here, because it doesn't really matter to them if the
garbage is interfiled or at the end.
What's the actual usage pattern for multi-lingual sorts? Possibly the most
common case, IMO, is a collection of Serbian or Tibetan or Sanskrit or Hebrew
data in mixed scripts; the most convenient thing to do there is to interfile.
Another common case is computer directory listings in English & some other
language, which should probably be separate; but that's Latin, which is out
of the scope of this discussion. Again, a Serbian user would probably like
Latin and Cyrillic interfiled, and someone working on paleo-Hebrew or Sanskrit
would probably like their characters interfiled. I've never seen a multi-script
index; is there any real legacy behavior here, besides computer programs which
were forced to do something?
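
To make the two behaviors concrete, here is a minimal sketch of the Serbian case, assuming a toy Cyrillic-to-Latin table and plain code point order on the transliterated key. The table, the sample words, and the function names are all hypothetical; real Serbian collation (digraphs such as lj and nj, diacritics) would need a proper tailoring.

```python
import unicodedata

# Hypothetical, partial Serbian Cyrillic-to-Latin table (assumption:
# just enough letters for the sample words, no digraph letters).
CYR_TO_LAT = str.maketrans({
    "а": "a", "б": "b", "в": "v", "г": "g", "д": "d", "е": "e",
    "з": "z", "и": "i", "к": "k", "л": "l", "м": "m", "н": "n",
    "о": "o", "п": "p", "р": "r", "с": "s", "т": "t", "у": "u",
})

def interfile_key(word):
    # Map both scripts onto one Latin key so entries interleave.
    return word.lower().translate(CYR_TO_LAT)

def script_of(word):
    # Crude script detection from the Unicode name of the first character.
    name = unicodedata.name(word[0], "")
    return "Cyrillic" if name.startswith("CYRILLIC") else "Latin"

def grouped_key(word):
    # Latin entries first, then Cyrillic; alphabetical within each script.
    return (script_of(word) != "Latin", interfile_key(word))

words = ["voda", "брод", "dan", "град", "more", "зима"]

interfiled = sorted(words, key=interfile_key)
grouped = sorted(words, key=grouped_key)

print(interfiled)
print(grouped)
```

With the interfiled key, a reader of both scripts finds each word where its spelling would place it regardless of script; with the grouped key, all Cyrillic entries land after the Latin ones.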