Re: Will Beagle index PDFs?



Hi -

> I think that files that have been "scanned" and converted to PDF without
> being "OCRed" are like big images inserted in pdf headers, so I don't
> know how is that indexed in Lucene.
> 
> Anyone?
> 

Some of the OCR software I've used lets you save the result as a PDF
that preserves most of the visual elements (like graphics or
watermarks) but still converts the text elements to actual text.

-- Matt




