Re: Subtitle extraction via OCR



Hi Liam.

Liam R E Quin, 31.07.2007 02:03:
On Tue, 2007-07-31 at 01:13 +0200, Mathias Brodala wrote:
I've yet to use any open source OCR package that has been less effort than
rekeying -- commercial OCR software is workable though.
Hm, is libgocr that bad? (As an example.)
Yes.

There's one from Google that may be slightly better,
tesseract, as it's based on what was originally proprietary
code written in the 1980s.
http://code.google.com/p/tesseract-ocr/

Yeah, I heard of that one before.

If you have some samples I can run them through ABBYY FineReader and also
gocr (and maybe tesseract) if it is of use to you.

Yes, please. It'd be interesting to know how well these tools perform. Take this
example:

http://download.noctus.net/mov/shny06.avi

It's one of the simpler parts.

If you have only one font, you might be able to do well with some
pre-processing, and by training the software.

Yep, SubRip seems to be able to do this. But no GNU/Linux app ...
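
For what it's worth, here is a rough sketch of the kind of pre-processing
meant above, not what SubRip or any OCR package actually does. It assumes
the frame is already available as rows of 0-255 grayscale values, that the
subtitles are bright text near the bottom of the picture, and that a fixed
threshold is good enough; the names binarize and crop_bottom and the default
values are made up for illustration.

```python
def crop_bottom(frame, fraction=0.25):
    """Keep only the bottom `fraction` of rows (assumed subtitle area)."""
    keep = max(1, int(len(frame) * fraction))
    return frame[-keep:]


def binarize(frame, threshold=200):
    """Turn bright pixels (assumed to be subtitle text) into black (0)
    and everything else into white (255), i.e. black text on white."""
    return [[0 if px >= threshold else 255 for px in row] for row in frame]


# Tiny made-up 4x4 "frame": bright pixels appear in the second and last rows.
frame = [
    [10, 20, 30, 40],
    [10, 250, 240, 40],
    [0, 0, 0, 0],
    [30, 255, 30, 210],
]

# Crop to the bottom half first, then binarize what is left.
subtitle_area = binarize(crop_bottom(frame, fraction=0.5))
```

The cropped, black-on-white result is what would then be handed to gocr or
tesseract; with a single subtitle font the threshold should only need to be
tuned once.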


Regards, Mathias

-- 
debian/rules


