[orca-list] New a11y tool "ocrpdf"

From: chrys87 <chrys87 web de>
To: orca-list <orca-list gnome org>
Subject: [orca-list] New a11y tool "ocrpdf"
Date: Thu, 12 Nov 2015 22:22:50 +0100

Hi List,

I have been working on a small new tool that gives quick access to "image" PDF files (such as scans).

It is in an early state for now.

It can (try to) detect the layout and page orientation (thanks to tesseract and ocrad :) ), cut the PDF file into images, and OCR every image via tesseract. It works multithreaded to be more efficient. It should also be able to OCR any other picture file (not just PDF).
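
To illustrate the "OCR every image in parallel" step, here is a minimal sketch. It is not the actual ocrpdf code (which, going by the dependency list, uses python-tesserwrap); it assumes the page images already exist on disk and that the tesseract command line tool is installed. The function names ocr_image and ocr_pages are made up for this example.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def ocr_image(path, lang="eng"):
    """OCR one image by calling the tesseract command line tool."""
    # Passing "stdout" as the output base makes tesseract print the
    # recognized text instead of writing it to a .txt file.
    result = subprocess.run(
        ["tesseract", path, "stdout", "-l", lang],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def ocr_pages(paths, lang="eng", workers=4, ocr=ocr_image):
    """OCR several page images in parallel; results keep the input order."""
    # Threads are enough here: each worker mostly waits on an external
    # tesseract process, so the interpreter lock is not a bottleneck.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda p: ocr(p, lang), paths))
```

With two scanned pages saved as page-1.png and page-2.png (hypothetical file names), ocr_pages(["page-1.png", "page-2.png"], lang="deu") would return a list with the text of each page in order. The ocr parameter only exists so the OCR backend can be swapped out.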


Example:
ocrpdf -f /path/to/your/file.pdf -l deu

ocrpdf -h

A small window in ocrdesktop style pops up with the content.

AUR (a little out of date, I will update it soon):
https://aur.archlinux.org/packages/ocrpdf/
GIT:
https://github.com/chrys87/ocrpdf

Dependencies:
GTK3
pythonmagick
python-pillow
python-tesserwrap
tesseract
tesseract-yourlanguage

My girlfriend has been using it very successfully for a month now, so I decided to make it public.

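You could also add it to Nemo as an action, for example, so that you can run OCR straight from the context menu. Save something like the following as a .nemo_action file (for example in ~/.local/share/nemo/actions/):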


[Nemo Action]
Active=true
Name=OCR Datei %N
Comment=OCR File
Exec=ocrpdf -l eng -f %F
Quote=double
Selection=S
Extensions=pdf;jpg;tiff;png;jpeg

cheers chrys
