From: Mark Davis (mark.davis@icu-project.org)
Date: Thu Jun 07 2007 - 10:39:35 CDT
For #1, you can use the ICU implementation of the BIDI algorithm. Markus
Scherer can tell you more about it as well.
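
To make the output requested in #1 concrete (a list of indexes that remaps logical order to display order), here is a minimal sketch. It is an assumption-laden illustration, not ICU itself: it uses the JDK's java.text.Bidi so that it runs without extra libraries. ICU exposes comparable functionality (for example ubidi_getVisualMap in ICU4C), but the class name BidiVisualOrder, its helper methods, and the sample string are hypothetical and do not come from this thread.

```java
import java.text.Bidi;
import java.util.Arrays;

// Hypothetical helper: uses the JDK's java.text.Bidi as a stand-in for ICU.
class BidiVisualOrder {
    // For each visual position, returns the logical index of the UTF-16
    // code unit displayed there. Mirroring and surrogate pairs are NOT
    // handled; this only applies the level-based reordering step.
    static int[] visualMap(String logical) {
        int n = logical.length();
        if (n == 0) {
            return new int[0];
        }
        // Base direction comes from the first strong character, default LTR.
        Bidi bidi = new Bidi(logical, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT);
        byte[] levels = new byte[n];
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) {
            levels[i] = (byte) bidi.getLevelAt(i);
            order[i] = i;
        }
        // Reorders the index objects in place according to the levels.
        Bidi.reorderVisually(levels, 0, order, 0, n);
        int[] map = new int[n];
        for (int i = 0; i < n; i++) {
            map[i] = order[i];
        }
        return map;
    }

    // The characters themselves in display order (left to right).
    static String visualString(String logical) {
        StringBuilder sb = new StringBuilder();
        for (int idx : visualMap(logical)) {
            sb.append(logical.charAt(idx));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Latin, then three Hebrew letters (alef, bet, gimel), then Latin.
        String logical = "abc \u05D0\u05D1\u05D2 def";
        // Prints [0, 1, 2, 3, 6, 5, 4, 7, 8, 9, 10]
        System.out.println(Arrays.toString(visualMap(logical)));
    }
}
```

Both forms asked for in #1 fall out of the same map: the index list directly, and the display-order character list by reading the input through it. Hand-verified input/output pairs of the kind described in #2 could be checked against either form.
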
Mark
On 6/7/07, Harald Alvestrand <harald@alvestrand.no> wrote:
>
> Having failed to find anything, I appeal to this list...
>
> as part of the (slowly moving) investigation into the requirements for
> using RTL scripts in domain names, I have been checking out the
> properties of the Unicode BIDI algorithm.
>
> One problem I have is that there seems to be a dearth of test datasets
> to test an implementation against; my investigation of the Unicode
> "reference" implementation has revealed that the C++ and C
> implementations are basically toys, fit for verifying an algorithm, but
> totally useless for real data; they assign random directional properties
> to the ASCII characters and use that for testing the algorithm.
>
> (I have not looked at the Java one).
>
> Can anyone point me at:
>
> 1) An implementation of the Unicode BIDI algorithm that can take real
> Unicode data and return something that I can verify (either the list of
> characters in display order or the list of indexes to which I should
> remap the characters)?
>
> 2) A test dataset of "real" text (linguistically sensible, not just
> random characters) that has been verified by hand to display as expected
> after running through the Bidi algorithm? (Ideal would be input/output
> pairs for the implementation above, of course)
>
> Any hints are greatly appreciated!
>
> Harald