Syllable separation in Indic module



Hi all,

This has been discussed here, but I couldn't find a good solution,
hence this post.

In Sinhala, when two consonants are separated by a virama, they form
two separate syllables, unless the virama is followed by a ZWJ, in
which case they are grouped into a single syllable.

    consonant, virama, consonant, ... => (consonant, virama), (consonant, ...)

    consonant, virama, ZWJ, consonant, ... => (consonant, virama, ZWJ, consonant, ...)
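
The rule above can be sketched in C roughly as follows. This is only an
illustration of the intended segmentation, not code from Pango: the
char_class enum and the function sinhala_syllable_end are hypothetical
names, and the "..." in the examples is assumed to stand for dependent
signs that stay attached to the preceding consonant.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical character classes; these are NOT the names used in Pango. */
typedef enum { CLS_CONSONANT, CLS_VIRAMA, CLS_ZWJ, CLS_OTHER } char_class;

/*
 * Return the index one past the end of the syllable that starts at
 * classes[start].  A virama closes the syllable unless it is followed
 * by ZWJ and then a consonant, in which case the cluster continues.
 */
size_t sinhala_syllable_end(const char_class *classes, size_t len, size_t start)
{
    size_t i = start;

    assert(classes != NULL);
    if (i >= len)
        return len;

    /* Anything that is not a consonant is treated as a one-element syllable. */
    if (classes[i] != CLS_CONSONANT)
        return i + 1;
    i++;

    while (i < len && classes[i] == CLS_VIRAMA) {
        if (i + 2 < len &&
            classes[i + 1] == CLS_ZWJ &&
            classes[i + 2] == CLS_CONSONANT) {
            i += 3;          /* virama + ZWJ + consonant: same syllable */
            continue;
        }
        return i + 1;        /* bare virama: syllable ends right after it */
    }

    /* Assumption: trailing dependent signs ("...") stay with the syllable. */
    while (i < len && classes[i] == CLS_OTHER)
        i++;
    return i;
}
```

Calling it repeatedly, each time starting at the previously returned
index, walks over the text one syllable at a time.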

However, in indic-ot-class-tables.c, the state machine always
continues to the next consonant after the virama, irrespective of the
presence of ZWJ.

We tried to work around this by marking the virama character as a mark,
but then other things break, such as vatu.

What's the best approach to solve this problem?

        Anuradha

-- 

http://www.linux.lk/~anuradha/
http://www.gnu.org/philosophy/no-word-attachments.html


