OCR project in GSoC

Arulalan · March 24, 2009, 6:50pm

Hi to all,

I plan to do a GSoC project on OCR (Optical Character
Recognition).

That is,

If we scan a full page of text from a book, it will open in OpenOffice
in Word format, so that we can edit the text of the scanned page. I plan
to convert scanned letters to words for the Tamil and English languages,
and I will try to support a few more languages as well. This OCR project
can be done using RMagick, and I will do this successfully.

This is my idea. If any of you have suggestions or can guide me in
doing this, please let me know.

Thanks,

Arulalan.

Arulalan · March 24, 2009, 10:53pm

Hi,

What about Google Tesseract???

Harold wrote:
> There are many ways to accomplish this, none of them are easy…

Arulalan · March 24, 2009, 8:21pm

There are many ways to accomplish this, none of them are easy…

There’s ai4r’s backpropagation neural network implementation, with a
simple OCR example at http://ai4r.rubyforge.org/neuralNetworks.html
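
To give a rough feel for what classifying a small bitmap involves, here is a minimal sketch in plain Ruby. It does not use ai4r or a neural network at all: it swaps in a much simpler technique, nearest-template matching by Hamming distance, and the 5x5 glyphs and the names TEMPLATES, to_bits, distance and recognize are all made up for illustration. Real scanned text would first need binarization and character segmentation, which this skips entirely.

```ruby
# Hypothetical 5x5 glyph templates: '#' is ink, '.' is background.
TEMPLATES = {
  "T" => %w[##### ..#.. ..#.. ..#.. ..#..],
  "L" => %w[#.... #.... #.... #.... #####],
  "O" => %w[##### #...# #...# #...# #####]
}.freeze

# Flatten a glyph (array of row strings) into an array of 0/1 pixels.
def to_bits(rows)
  rows.join.chars.map { |c| c == "#" ? 1 : 0 }
end

# Hamming distance: the number of pixels where two glyphs differ.
def distance(a, b)
  a.zip(b).count { |x, y| x != y }
end

# Return the label of the template closest to the given glyph.
def recognize(rows)
  bits = to_bits(rows)
  TEMPLATES.min_by { |_label, tmpl| distance(bits, to_bits(tmpl)) }.first
end

# A "T" with one stray ink pixel, standing in for scanner noise.
noisy_t = %w[##### ..#.. ..#.. .##.. ..#..]
puts recognize(noisy_t)
```

A trained network would replace the fixed templates with learned weights, but the input (a flattened grid of pixels) and the output (a character label) have the same shape.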

There’s also GNU Ocrad, which I’ve never used:
Ocrad - GNU Project - Free Software Foundation (FSF),
and I just found http://gtamilocr.sourceforge.net/ which does OCR for
Tamil characters as well.

I’d be glad to hear other suggestions…