Directionality of Arabic combining characters




Hi Dov,

In fribidi_tables.i, you have:

  {0x0640, 0x064A, FRIBIDI_TYPE_RTL },
  {0x064B, 0x0652, FRIBIDI_TYPE_ON },
  {0x0660, 0x0669, FRIBIDI_TYPE_AN },

The range U+064B - U+0652 covers the Arabic vowel marks, which are
typed here as ON. I believe this is incorrect:

 - The Unicode standard has them as TYPE_RTL
 - It produces the wrong behavior. If the base direction
   is LTR, then the sequence:

 english text ARABIC <vowel_mark> more english text

will get reordered as:

 english text CIBARA <vowel_mark> more english text

instead of the proper:

 english text <vowel_mark> CIBARA more english text
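
The difference can be sketched in a few lines of C. This is only a rough model, not FriBidi's code: the names T_LTR, T_RTL, T_AN, T_ON, TypeRange, lookup_type and reorder_ltr_base are made up for illustration, any character outside the table excerpt is assumed to be LTR, and the "reordering" just reverses each maximal run of RTL characters under an LTR base direction (a neutral between an RTL run and LTR text is treated as taking the base direction). It is not the full bidirectional algorithm.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the FRIBIDI_TYPE_* constants. */
typedef enum { T_LTR, T_RTL, T_AN, T_ON } CharType;

typedef struct {
    unsigned first, last;   /* inclusive code point range */
    CharType type;
} TypeRange;

/* Table excerpt as quoted above: vowel marks typed as ON. */
static const TypeRange table_current[] = {
    {0x0640, 0x064A, T_RTL},
    {0x064B, 0x0652, T_ON},
    {0x0660, 0x0669, T_AN},
};

/* Suggested change: vowel marks typed as RTL. */
static const TypeRange table_proposed[] = {
    {0x0640, 0x064A, T_RTL},
    {0x064B, 0x0652, T_RTL},
    {0x0660, 0x0669, T_AN},
};

/* Linear search of the range table. */
static CharType lookup_type(const TypeRange *table, size_t n, unsigned ch)
{
    for (size_t i = 0; i < n; i++)
        if (ch >= table[i].first && ch <= table[i].last)
            return table[i].type;
    return T_LTR;   /* assumption for this sketch only */
}

/* Simplified reordering for an LTR base direction: reverse each maximal
   run of T_RTL characters in place and leave everything else alone. */
static void reorder_ltr_base(const TypeRange *table, size_t n,
                             unsigned *text, size_t len)
{
    size_t i = 0;
    while (i < len) {
        if (lookup_type(table, n, text[i]) != T_RTL) {
            i++;
            continue;
        }
        size_t start = i;
        while (i < len && lookup_type(table, n, text[i]) == T_RTL)
            i++;
        /* Reverse text[start .. i-1]. */
        for (size_t lo = start, hi = i - 1; lo < hi; lo++, hi--) {
            unsigned tmp = text[lo];
            text[lo] = text[hi];
            text[hi] = tmp;
        }
    }
}
```

Feeding it the logical sequence 'a', U+0644, U+0645, U+064E, 'b' leaves the vowel mark U+064E after the reversed letters with table_current, and moves it in front of them with table_proposed, which matches the two orderings shown above.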


The other possible way to handle this would be to
treat each combining mark as a block together with its
base character, but the description of bidirectional
behavior in the standard seems to imply the simple
treatment above.
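
For comparison, the block-style alternative could look roughly like the following C sketch. Again, this is an illustration under assumptions, not FriBidi code: is_combining_mark, reverse_span and reverse_run_keep_clusters are hypothetical names, and only the Arabic vowel marks U+064B - U+0652 are treated as combining. The idea is to reverse an RTL run as usual and then flip each "marks followed by base" group back, so every mark still follows its base character in the output.

```c
#include <stddef.h>

/* Assumption for this sketch: only the Arabic vowel marks are combining. */
static int is_combining_mark(unsigned ch)
{
    return ch >= 0x064B && ch <= 0x0652;
}

/* Reverse text[lo .. hi] in place (both bounds inclusive). */
static void reverse_span(unsigned *text, size_t lo, size_t hi)
{
    while (lo < hi) {
        unsigned tmp = text[lo];
        text[lo] = text[hi];
        text[hi] = tmp;
        lo++;
        hi--;
    }
}

/* Reverse one RTL run, keeping each base character and its marks
   together with the marks after the base. */
static void reverse_run_keep_clusters(unsigned *text, size_t len)
{
    if (len == 0)
        return;

    /* Step 1: plain reversal; marks now precede their base. */
    reverse_span(text, 0, len - 1);

    /* Step 2: flip every "marks + base" group back into base + marks. */
    size_t i = 0;
    while (i < len) {
        size_t start = i;
        while (i < len && is_combining_mark(text[i]))
            i++;
        /* text[start .. i-1] are marks, text[i] is their base (if any). */
        if (i < len && i > start)
            reverse_span(text, start, i);
        i++;
    }
}
```

With the logical run U+0644, U+0645, U+064E this yields U+0645, U+064E, U+0644, so the mark stays attached to U+0645 instead of being moved as an independent RTL character.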

(I suppose the same applies to the Hebrew points and
cantillation marks in U+0590-U+05C4. The Unicode standard
simply has the entire range U+0590-U+065F as RTL.)

Regards,
                                        Owen


