Searching for text in PDF files is wrong
- From: Радомир Хаџић <radomirhadzic46 gmail com>
- To: gtk-app-devel-list <gtk-app-devel-list gnome org>
- Subject: Searching for text in PDF files is wrong
- Date: Fri, 30 Nov 2018 21:56:12 +0100
Hi.
I use poppler_page_find_text() to find text in PDF files. It returns a
GList of pointers to PopplerRectangles. I then use
poppler_page_render_selection() to mark the found text.
The problem is that the PopplerRectangles returned by
poppler_page_find_text() are incompatible with those that
poppler_page_render_selection() expects, so the wrong text is
selected.
I have found that to make those two compatible, I have to do the
following to PopplerRectangles returned by poppler_page_find_text():
1) SWAP(rectangle.x1, rectangle.x2);
2) SWAP(rectangle.y1, rectangle.y2);
3) rectangle.y1 = page_height - rectangle.y1;
4) rectangle.y2 = page_height - rectangle.y2;
But this does not fully solve the problem: while I resize the window,
the marked text keeps alternating between right and wrong.
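
To make the four steps above concrete, here is a rough sketch of the
conversion as a single helper. Rect is only a stand-in with the same four
fields as PopplerRectangle so that the snippet compiles without Poppler,
and the names swap_double() and find_rect_to_selection_rect() are made up
for illustration; page_height is assumed to be the height reported by
poppler_page_get_size().

```c
/* Stand-in for PopplerRectangle (same four double fields), used here
 * only so the sketch builds without the Poppler headers. */
typedef struct {
    double x1;
    double y1;
    double x2;
    double y2;
} Rect;

/* Exchange two doubles; this is the SWAP() from steps 1 and 2. */
void swap_double(double *a, double *b)
{
    double tmp = *a;
    *a = *b;
    *b = tmp;
}

/* Apply steps 1-4 to a rectangle as returned by the text search and
 * return the rectangle that is then passed on for rendering.
 * page_height is assumed to be in the same units as the rectangle. */
Rect find_rect_to_selection_rect(Rect r, double page_height)
{
    swap_double(&r.x1, &r.x2);      /* step 1 */
    swap_double(&r.y1, &r.y2);      /* step 2 */
    r.y1 = page_height - r.y1;      /* step 3: flip the y axis */
    r.y2 = page_height - r.y2;      /* step 4 */
    return r;
}
```

The net effect is that the x coordinates trade places and both y
coordinates are mirrored about the page height, which is what you would
expect if the two functions disagree about where the origin of the page
is. That reading is my assumption, not something I have confirmed.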
I have created a small program that illustrates the problem. Here it
is: https://pastebin.com/h3F56Yv7 (I've also sent an attachment but
last time you didn't get it so this paste is a fallback in case you
don't get it again.)
You need to supply two arguments when running the program: the
absolute path to a PDF file and the text you want to search for, in
that order. Compare the selected text with and without
lines 54-57.
How can I get the found text marked properly? This "workaround"
does not work very well, and it is an ugly solution anyway.