Fixing RTL search in Evince


There's a long-standing bug [1] regarding searching for RTL (Arabic, Hebrew) strings in Evince. The cause is that text in a PDF is laid out "visually" rather than "logically", and the visual order is what poppler searches. This is indeed a poppler limitation, but as the discussion [2] explains, it's fiendishly hard to fix correctly (essentially, it means reconstructing logical text from visual text).

A simple workaround that would help Evince users is to change the search string itself from logical to visual before searching. This has no effect on LTR text but flips RTL text so it is searched correctly. It's not the ideal solution but it solves 98% of the problem, and is infinitely better than the user performing the logical-to-visual translation manually when typing in the search string. I can point out some of the remaining issues if necessary.
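To make the logical/visual distinction concrete, here is a toy sketch in C. It is not the proposed patch and not a bidi implementation: it assumes valid UTF-8 and a search string that is entirely LTR or entirely RTL, and it treats U+0590..U+07FF as "RTL", which is only an approximation. The name naive_visual_order() is made up for this illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Byte length of the UTF-8 sequence starting at s (s[0] must not be NUL):
 * the lead byte plus any continuation bytes (10xxxxxx) that follow it. */
static size_t utf8_seq_len(const char *s)
{
    size_t len = 1;
    while (((unsigned char) s[len] & 0xC0) == 0x80)
        len++;
    return len;
}

/* Rough approximation: a two-byte sequence encoding U+0590..U+07FF
 * (Hebrew, Arabic and neighbouring blocks) counts as RTL. */
static int is_rtl_seq(const char *s, size_t len)
{
    unsigned int cp;
    if (len != 2 || ((unsigned char) s[0] & 0xE0) != 0xC0)
        return 0;
    cp = (((unsigned char) s[0] & 0x1Fu) << 6) | ((unsigned char) s[1] & 0x3Fu);
    return cp >= 0x0590 && cp <= 0x07FF;
}

/* Hypothetical helper: returns a malloc'ed copy of text. If the text
 * contains any RTL code point, the copy is reversed code point by code
 * point; otherwise it is returned unchanged. Caller frees with free(). */
char *naive_visual_order(const char *text)
{
    size_t n = strlen(text), i, len, w;
    int has_rtl = 0;
    char *out = malloc(n + 1);
    if (out == NULL)
        return NULL;

    for (i = 0; i < n; i += len) {
        len = utf8_seq_len(text + i);
        if (is_rtl_seq(text + i, len))
            has_rtl = 1;
    }
    if (!has_rtl) {
        memcpy(out, text, n + 1);   /* LTR text: no change */
        return out;
    }

    /* Copy each sequence to its mirrored position, keeping bytes
     * within a sequence in their original order. */
    w = n;
    for (i = 0; i < n; i += len) {
        len = utf8_seq_len(text + i);
        w -= len;
        memcpy(out + w, text + i, len);
    }
    out[n] = '\0';
    return out;
}
```

With this, "hello" comes back unchanged, while a Hebrew word comes back with its letters in the opposite order, which is the order a visually laid out PDF would store. Mixed LTR/RTL strings, digits, and punctuation need the real bidirectional algorithm, which is why a library is used in the actual proposal.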

I have a working prototype and can finish it and send a patch if this solution is acceptable. It generally amounts to replacing the line

job->text = g_strdup(text);

in the function ev_job_find_new() with something like:

job->text = ev_bidi_reorder(text);

where ev_bidi_reorder() is a small helper function that performs the logical-to-visual conversion using libfribidi.



