Re: [orca-list] Anyone able to OCR a PDF file?

From: Jason White <jason jasonjgw net>
To: orca-list gnome org
Subject: Re: [orca-list] Anyone able to OCR a PDF file?
Date: Wed, 4 Jan 2012 12:13:03 +1100

Janina Sajka <janina rednote net> wrote:

I know people do this on other OS's. Has anyone suggestions on how to do
this in Linux?


I haven't tried it, but here's an outline of a procedure that could work (with
suitable modifications).

Step 1: Use pdfimages (part of poppler-utils) to extract the images from the PDF file.

Step 2: If necessary, use convert (part of the ImageMagick package) to convert
the images into a format that your OCR tool accepts.

Step 3: Run your favourite OCR tool on each image.

Step 4: Write a shell script to automate the above.
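
As a rough, untested sketch of what such a script might look like: it assumes
tesseract as the OCR tool (substitute your own), TIFF as the intermediate
format, and a scanned PDF in which each page is stored as an embedded image.
The function name ocr_pdf and the default output file name are made up for
illustration.

```shell
#!/bin/sh
# Sketch only: OCR a scanned PDF by extracting its embedded images and
# running an OCR engine on each one.
# Assumptions (not from the original post): pdfimages comes from
# poppler-utils, convert from ImageMagick, and tesseract is the OCR tool.

ocr_pdf() {
    pdf=$1
    out=${2:-output.txt}            # hypothetical default output name
    workdir=$(mktemp -d) || return 1

    # Step 1: extract the images. pdfimages names them
    # <root>-000.ppm, <root>-001.pbm, and so on.
    pdfimages "$pdf" "$workdir/page" || return 1

    : > "$out"                      # start with an empty output file
    for img in "$workdir"/page-*; do
        [ -e "$img" ] || continue   # the PDF contained no images
        base=${img%.*}              # path without the file extension

        # Step 2: convert the image to TIFF.
        convert "$img" "$base.tif" || return 1

        # Step 3: run the OCR tool; tesseract writes <base>.txt.
        tesseract "$base.tif" "$base" >/dev/null 2>&1 || return 1
        cat "$base.txt" >> "$out"
    done

    rm -rf "$workdir"
}

# Run only when arguments are given: script.sh input.pdf [output.txt]
if [ "$#" -gt 0 ]; then
    ocr_pdf "$@"
fi
```

Saved as, say, ocrpdf.sh, it would be run as "sh ocrpdf.sh scan.pdf scan.txt".
If a step fails, the temporary directory is left in place so that the
intermediate files can be inspected.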

Follow-Ups:
- Re: [orca-list] Anyone able to OCR a PDF file?
  - From: Halim Sahin

References:
- [orca-list] Anyone able to OCR a PDF file?
  - From: Janina Sajka
