Re: [Gimp-user] approaches used for language detection on images ...

On Tue, 2020-01-28 at 18:21 +0100, JWein wrote:
I worked on corpora research and text cleansing can be done
straightforwardly. The problem is with images, images containing
texts, which language, ...
Could you point me in the right direction? (I am a mathematician, so
Math is not a problem for me at all)
Thank you

You need (1) feature extraction, finding the writing, (2) OCR of some
sort, to turn pictures of letters into letters, and then (3) the
linguistic analysis.
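
Step (3) is the easiest of the three to sketch. The following is only a
minimal illustration, assuming the OCR step has already produced a plain
string (for example from an OCR engine such as Tesseract). It counts hits
against small stopword lists and picks the best-scoring language. The
stopword lists and the name guess_language are made up for this sketch;
serious language identification usually relies on character n-gram
statistics trained on a real corpus.

```python
import re

# Tiny hand-picked stopword lists. These are an illustrative assumption,
# not a trained model; real systems use far larger statistics.
STOPWORDS = {
    "en": {"the", "and", "of", "to", "is", "in", "that", "it"},
    "de": {"der", "die", "und", "das", "ist", "nicht", "ein", "zu"},
    "fr": {"le", "la", "et", "les", "des", "est", "un", "une"},
}


def guess_language(text):
    """Return the code of the language whose stopwords occur most often.

    Returns None when no stopword from any list appears in the text.
    """
    # Split into lowercase runs of letters (Unicode-aware, no digits).
    words = re.findall(r"[^\W\d_]+", text.lower())
    # Score each language by the number of words found in its stopword list.
    scores = {
        lang: sum(word in stopwords for word in words)
        for lang, stopwords in STOPWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None
```

OCR output from short signs or labels may contain very few words, so a
None result (or a near tie between scores) is worth treating as "unknown"
rather than forcing a guess.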

However, many images contain metadata in plain text (OK, XML or
whatever) that may include language and location information.
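
As a rough sketch of the metadata route: XMP metadata is stored as an XML
packet inside the image file, and text entries in it can carry xml:lang
attributes. The example below assumes the packet is present uncompressed
in the file bytes and is wrapped in an x:xmpmeta element; the function
name xmp_languages is hypothetical. It uses only the Python standard
library and does not cover other metadata formats such as EXIF or IPTC.

```python
import re
import xml.etree.ElementTree as ET

# ElementTree spells the xml:lang attribute with the full XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def xmp_languages(data):
    """Return sorted xml:lang values from an XMP packet embedded in `data`.

    `data` is the raw content of an image file as bytes. An empty list is
    returned when no packet is found or when it cannot be parsed.
    """
    # Locate the XMP packet inside the binary data (assumed uncompressed).
    match = re.search(rb"<x:xmpmeta.*?</x:xmpmeta>", data, re.DOTALL)
    if match is None:
        return []
    try:
        root = ET.fromstring(match.group(0))
    except ET.ParseError:
        return []
    langs = {
        element.attrib[XML_LANG]
        for element in root.iter()
        if XML_LANG in element.attrib
    }
    # "x-default" marks the default entry and names no actual language.
    langs.discard("x-default")
    return sorted(langs)
```

It could be called on the bytes of a file opened in binary mode. Note that
the language recorded in the metadata describes the caption or title, which
may or may not match the language of the writing visible in the picture.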

I'm interested in the text cleansing; can you tell me more (off list)?

Thank you!

slave liam

Liam Quin - web slave for
with fabulous vintage art and fascinating texts to read.
