beagle-extract-content question: PDF docs
- From: Stephan Hegel <stephan hegel gmx de>
- To: dashboard-hackers gnome org
- Subject: beagle-extract-content question: PDF docs
- Date: Sun, 11 Nov 2007 05:36:39 +0100
Hi all,
I've got a few PDF docs in which beagle cannot find any
content:
> beagle-extract-content gebackene.zucchini.pdf
Debug: Loaded 52 filters from /usr/lib/beagle/Filters/Filters.dll
Filter: Beagle.Filters.FilterPdf (determined in .12s)
MimeType: application/pdf
Properties:
Timestamp = 2007-11-11 04:21:32 (Utc)
dc:appname = ESP Ghostscript 8.15
fixme:page-count = 1
Content:
(no content)
HotContent:
(no hot content)
Text extracted in .02s
The files were created by printing a web page to a PostScript file
and converting it to PDF with the ps2pdf utility. Xpdf and the
Acrobat Reader can display these PDFs fine.
And no, it is not just an embedded image as the file size is just
40k for one A4 page (with one image in it) and I'm able to select
parts of the text in the Acrobat Reader.
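A quick way to cross-check whether a text layer can be pulled out of such a file independently of beagle is sketched below. It assumes the pdftotext command-line tool (shipped with xpdf or poppler) is installed and on the PATH; whether beagle's PDF filter relies on the same tool internally is not confirmed here. The helper names are made up for illustration.

```python
import shutil
import subprocess
import sys


def has_extractable_text(text: str) -> bool:
    """Return True if the extracted text holds any non-whitespace character."""
    # pdftotext separates pages with form feeds; str.strip() treats them
    # as whitespace, so a PDF yielding only page breaks counts as empty.
    return bool(text.strip())


def extract_text(pdf_path: str) -> str:
    """Run pdftotext on pdf_path and return what it writes to stdout."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found on PATH")
    # A "-" as the output file name tells pdftotext to write to stdout.
    result = subprocess.run(
        ["pdftotext", pdf_path, "-"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__" and len(sys.argv) > 1:
    # Example: python check_pdf_text.py gebackene.zucchini.pdf
    text = extract_text(sys.argv[1])
    if has_extractable_text(text):
        print("text layer found, first characters:", repr(text.strip()[:60]))
    else:
        print("no extractable text")
```

If this also reports no text, the problem would seem to lie in how ps2pdf encoded the fonts rather than in beagle itself; if it does print text, the beagle filter would be the more likely suspect.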
Is this a known issue or limitation?
Beagle is 0.2.18 from openSUSE 10.3.
Kind regards,
Stephan.