Re: Searching for text in PDF files is wrong



Hi;

This list is for the development of applications with GTK. Your question
relates to Poppler, so you should ask on a Poppler-related mailing list or
developer forum, e.g.
https://lists.freedesktop.org/mailman/listinfo/poppler

Ciao,
 Emmanuele.

On Fri, 30 Nov 2018 at 20:56, Радомир Хаџић via gtk-app-devel-list <
gtk-app-devel-list gnome org> wrote:

Hi.

I use poppler_page_find_text() to find text in PDF files. This returns a
GList of pointers to PopplerRectangle structures. Then I use
poppler_page_render_selection() to mark the found text.

What is wrong is that PopplerRectangles returned by
poppler_page_find_text() are incompatible with those that
poppler_page_render_selection() requests, which is why the wrong text
is selected.

I have found that to make those two compatible, I have to do the
following to PopplerRectangles returned by poppler_page_find_text():
1) SWAP(rectangle.x1, rectangle.x2);
2) SWAP(rectangle.y1, rectangle.y2);
3) rectangle.y1 = page_height - rectangle.y1;
4) rectangle.y2 = page_height - rectangle.y2;
But this does not solve the problem because the marked text cycles
between right and wrong again while resizing the window.
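
For readers following along, the four steps above can be sketched as a
small C helper. This is only an illustration of the workaround as described,
not a confirmed fix: SketchRect and sketch_apply_workaround() are
hypothetical names that stand in for PopplerRectangle (which has the same
x1, y1, x2, y2 fields) so that the snippet compiles without the Poppler
headers, and page_height is assumed to be expressed in the same units as the
rectangle coordinates.

```c
/* Hypothetical stand-in for PopplerRectangle (same four double fields). */
typedef struct {
    double x1;
    double y1;
    double x2;
    double y2;
} SketchRect;

/* Exchange two doubles in place. */
static void sketch_swap(double *a, double *b)
{
    double tmp = *a;
    *a = *b;
    *b = tmp;
}

/*
 * Apply the workaround from the post, step by step:
 *   1) swap x1 and x2
 *   2) swap y1 and y2
 *   3) y1 = page_height - y1
 *   4) y2 = page_height - y2
 * Assumption: page_height uses the same units as the rectangle.
 */
static void sketch_apply_workaround(SketchRect *r, double page_height)
{
    sketch_swap(&r->x1, &r->x2);
    sketch_swap(&r->y1, &r->y2);
    r->y1 = page_height - r->y1;
    r->y2 = page_height - r->y2;
}
```

In a real program the same arithmetic would be applied to each
PopplerRectangle in the returned GList, with the height presumably taken
from poppler_page_get_size() rather than from the widget size.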

I have created a small program that illustrates the problem. Here it
is: https://pastebin.com/h3F56Yv7 (I've also sent an attachment but
last time you didn't get it so this paste is a fallback in case you
don't get it again.)
You ought to supply two arguments when running the program: the
absolute path to a PDF file and the text you want to search for,
respectively. Pay attention to the selected text with and without
lines 54-57.

How can I get the found text marked properly? This "workaround"
does not work reliably, and it is an ugly solution anyway.
_______________________________________________
gtk-app-devel-list mailing list
gtk-app-devel-list gnome org
https://mail.gnome.org/mailman/listinfo/gtk-app-devel-list



-- 
https://www.bassi.io
[@] ebassi [@gmail.com]

