On 04/26/2008 02:00:01 PM Sat, Albrecht Dreß wrote:
Hi Peter!
[ snip ]
BTW, if I interpret the HTML 4.01 Transitional standard correctly, the "dir" attribute should /not/ be necessary, as the HTML renderer is supposed to determine and display the proper direction automatically for Unicode characters (see <http://www.w3.org/TR/html401/struct/dirlang.html#h-8.2>).
The algorithm is clear for strings with a single direction. The issue is with lines that mix directions (bidi). Their runs are ordered according to the base direction of the context, which in HTML defaults to ltr. So in the second line of the sample text, the first Persian string is placed at the start of the line, on the left, followed by the ASCII string, and finally by the second Persian string.
Pango, on the other hand, looks at the first strongly directional character in the line, in this case an rtl character, and uses its direction as the base direction. So the line is built up from the right instead of the left, which is, I believe, what the author intended. As far as I can tell, the only way to get a browser to do that is with an explicit "dir=rtl" attribute.
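
To make the "first strong character" idea concrete, here is a rough Python sketch. It is not Pango's code, only an approximation of the first-strong rule using the bidi classes from the standard unicodedata module; it ignores embeddings and isolates. The names first_strong_direction and to_html_paragraph are made up for this illustration.

```python
import html
import unicodedata


def first_strong_direction(text, default="ltr"):
    """Guess the base direction from the first strongly directional character.

    Bidi classes "L" (left-to-right), "R" and "AL" (right-to-left, Arabic
    letter) are the strong ones; digits, spaces and punctuation are skipped.
    """
    for ch in text:
        bidi_class = unicodedata.bidirectional(ch)
        if bidi_class == "L":
            return "ltr"
        if bidi_class in ("R", "AL"):
            return "rtl"
    # No strong character found: fall back to the context default.
    return default


def to_html_paragraph(line):
    """Wrap a line in a <p> with an explicit dir attribute (hypothetical helper)."""
    direction = first_strong_direction(line)
    return '<p dir="{}">{}</p>'.format(direction, html.escape(line))


if __name__ == "__main__":
    # A Persian word followed by an ASCII word: the first strong char is rtl.
    print(to_html_paragraph("\u0633\u0644\u0627\u0645 GTK"))
    # An ASCII word first: the base direction stays ltr.
    print(to_html_paragraph("GTK \u0633\u0644\u0627\u0645"))
```

Emitting the attribute per paragraph this way would let a browser lay out each line from the side that the first-strong rule picks, instead of always starting from the left.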
Best, Peter