From: Philippe Verdy (verdy_p@wanadoo.fr)
Date: Sun May 22 2005 - 13:18:48 CDT
You almost certainly used a legacy font that mapped Hebrew letter glyphs
onto symbols or ISO-8859-1 characters, all of which have a strong or weak
LTR directionality. For this reason, the BiDi algorithm never reordered the
text in these old documents.
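
To see the difference in directionality for yourself, here is a minimal Python sketch (Python is my choice for illustration only; the same check could be done in a Word macro). It uses the standard `unicodedata` module to print the BiDi class of a few characters. The sample dictionary and the `bidi_classes` helper are hypothetical names, not part of any tool mentioned in this thread.

```python
import unicodedata

# A few sample characters: what a legacy font would have stored (Latin-1)
# versus a real Hebrew letter.
samples = {
    "Latin letter a": "a",
    "Latin-1 letter e-acute": "\u00e9",
    "exclamation mark": "!",
    "digit 1": "1",
    "Hebrew letter alef": "\u05d0",
}

def bidi_classes(chars):
    """Map each sample name to its Unicode bidirectional class."""
    return {name: unicodedata.bidirectional(ch) for name, ch in chars.items()}

for name, cls in bidi_classes(samples).items():
    print(f"{name}: {cls}")
```

Letters from the old encoding report class L (strong left-to-right), while the Hebrew letter reports R, which is what triggers the reordering.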
When you replace the codepoints with normal Hebrew codepoints, Word
consistently applies the BiDi algorithm to render them, and so the visual
order is now reversed. In other words, your text is effectively encoded in
"visual" order instead of the "logical" order that Unicode expects.
So you will effectively have to undo what the BiDi algorithm now does:
- This means not only reversing the Hebrew letters,
- but also handling characters with weak directionality (like
punctuation), which are now swapped as well,
- and possibly mirrored.
To know exactly what to do, you have to study what the BiDi algorithm does,
and then adapt the encoding so that the standard BiDi reordering (and
mirroring) generates the correct visual order and character orientation.
The conversion will sometimes require inserting BiDi controls to prevent
these characters with weak directionality from being reordered or mirrored.
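
As one illustration of such a control, here is a rough Python sketch that appends a RIGHT-TO-LEFT MARK (U+200F) after trailing punctuation, so that the punctuation stays attached to the Hebrew run even when the paragraph direction is LTR. The function name `pin_trailing_neutrals` and the set of classes it checks are my own assumptions; whether you need this at all depends on the paragraph direction Word applies.

```python
import unicodedata

RLM = "\u200f"  # RIGHT-TO-LEFT MARK, an invisible strong RTL character

# Bidi classes treated here as "weak or neutral punctuation" (an assumption
# made for this sketch, not an exhaustive list).
WEAK_OR_NEUTRAL = {"ON", "CS", "ES", "ET"}

def pin_trailing_neutrals(text, mark=RLM):
    """Append `mark` if the text ends with weak/neutral punctuation."""
    if text and unicodedata.bidirectional(text[-1]) in WEAK_OR_NEUTRAL:
        return text + mark
    return text

# "shalom!" in logical order; the "!" is followed by an RLM afterwards.
pinned = pin_trailing_neutrals("\u05e9\u05dc\u05d5\u05dd!")
print([f"U+{ord(c):04X}" for c in pinned])
```

The mark is invisible, but it places the punctuation between two strong RTL characters, so the algorithm resolves it as RTL.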
(Be careful about the effect of mirroring: the BiDi algorithm changes the
orientation of some characters like parentheses, so if you just swap
characters, the parentheses may look incorrect: you'll have to change their
orientation by replacing each such codepoint with its mirrored
counterpart.)
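
The steps above can be sketched roughly as follows, in Python for illustration. This is only a simplified approximation under stated assumptions, not an inverse of the full BiDi algorithm: it assumes each line belongs to an RTL paragraph, the `MIRROR` table is a small hand-written subset of the mirrored pairs, and `visual_to_logical` is a hypothetical name. It reverses the line, swaps mirrored punctuation, then puts runs of Latin letters and digits back in their original order. Numbers with separators (such as 3.14) and multi-word Latin phrases would need more care.

```python
import unicodedata

# Hand-written subset of mirrored pairs (an assumption for this sketch).
MIRROR = {"(": ")", ")": "(", "[": "]", "]": "[",
          "{": "}", "}": "{", "<": ">", ">": "<"}

def visual_to_logical(line):
    """Approximate visual-to-logical conversion for one RTL line."""
    # 1. Reverse the whole line, 2. swap mirrored punctuation.
    chars = [MIRROR.get(c, c) for c in reversed(line)]
    # 3. Runs of strong-LTR letters and European digits were reversed too,
    #    so reverse each such run back.
    out, run = [], []
    for c in chars:
        if unicodedata.bidirectional(c) in ("L", "EN"):
            run.append(c)
        else:
            out.extend(reversed(run))
            run = []
            out.append(c)
    out.extend(reversed(run))
    return "".join(out)

# Visual order: "(" final-mem vav lamed shin ")" space "123"
visual = "(\u05dd\u05d5\u05dc\u05e9) 123"
logical = visual_to_logical(visual)
print([f"U+{ord(c):04X}" for c in logical])
```

In a Word macro the same idea would apply paragraph by paragraph, which may also help with tables if each cell is converted separately.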
There are tools that do exactly this, notably for Hebrew and Arabic: that
is, they convert text from visual to logical encoding order. But I don't
know of one that works with Word documents, so you may need to write a
conversion macro...
----- Original Message ----- 
From: Raymond Mercier
To: unicode@unicode.org
Sent: Sunday, May 22, 2005 6:45 PM
Subject: hebrew font conversion
[This is really a question for the Hebrew Computing Forum, but I have tried 
there and drew a blank.]
The problem is that I composed many documents in Word using an ad hoc Hebrew 
font, and wish to convert to Unicode.
When I run a macro that exchanges the old codepoints for the U+Hebrew 
points, the characters in each word are reversed. I have tried to cure this 
by writing another macro using StrReverse(). Sometimes this works, but
there are problems - especially with tables.
Does anyone have experience of this, and/or a solution?
I will have the same problem with Arabic Word docs.