Free OCR software?

Does anyone know of any Character Recognition freeware? Where I can scan in a book page and get a headstart on transcribing it?

(Note: this is for a couple of very old public domain titles that I want to give to a friend and, probably, also “donate” to Project Gutenberg.)

If you can scan it into a TIFF, the program that opens it (as a TIFF) comes with OCR software included. It will turn the scan into a Word doc, but be warned: it’s not perfect and you will need to reformat, perhaps heavily. Also check every word if accuracy is important. Still, it saves a lot of time if you’d otherwise just have to retype it.

I have done this on WinXP on several different systems. Not sure if it’s universal to all versions of Windows Imaging or if my computers just happened to have it.

To expand & clarify what missbunny said, decent OCR is built into Microsoft Office 2003 & later (maybe 2000 also, I don’t remember).

Assuming you have Office 2003 or later, …

It’s one of the optional tools that may or may not have been installed depending on who did the install & what SKU of Office you have (student, home, pro, etc).

On the start menu look under Microsoft Office -> Tools -> Document Imaging (the exact name varies from version to version, but this is close).

That program can open TIFFs & make word docs from them with just a couple clicks. If you have a scanner properly set up, it can also pull the scan directly.

For computer-printed text the OCR rate is nearly 100%. For typeset books it’s pretty good provided you can get the pages 100% flat; it does NOT like where the words curve into the binding margin.

For faxes of photocopies of children’s handwriting, OCR works not so much.
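If you want to put a rough number on that "OCR rate" for your own scans, here is a minimal Python sketch. It is not part of any of the tools mentioned in this thread; it just assumes you have the OCR output as plain text plus a hand-corrected copy of the same page, and it uses only the standard library. The function name `char_accuracy` is made up for illustration.

```python
import difflib


def char_accuracy(ocr_text: str, reference: str) -> float:
    """Return the fraction of reference characters the OCR output matched.

    `reference` is the hand-corrected text; `ocr_text` is the raw OCR result.
    """
    if not reference:
        # Nothing to compare against: perfect only if OCR is also empty.
        return 1.0 if not ocr_text else 0.0
    # autojunk=False stops difflib from ignoring very common characters
    # (such as spaces) on long pages.
    matcher = difflib.SequenceMatcher(None, reference, ocr_text, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(reference)


if __name__ == "__main__":
    corrected = "The quick brown fox"
    raw_ocr = "The qu1ck brown fox"  # one misread character
    print(f"character accuracy: {char_accuracy(raw_ocr, corrected):.1%}")
```

Running it on one flat page and one page that curves into the binding should show how much the curve costs you, which tells you whether it is worth re-scanning before you start proofreading.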

SimpleOCR has a standalone freeware OCR program to demo their OCR SDK. Don’t know if it’s any good. (I installed it, then found I’d got an OCR bundled with something else.)
You have to register to download, and the free version doesn’t handle handwriting.

Dang. I have Office 2001. I’ll try the SimpleOCR app. Thanks.

You said in the OP that you’re going to scan the book. Doesn’t your scanner have some variety of OCR software? I bought a fairly cheap (<$100) Canon 4400F recently, and it has OCR software bundled on the CD. Also, if you can find yourself a copy of Adobe Acrobat Pro, it has OCR included.

No, the scanner has no such software. And I have Acrobat regular; Pro is like $500.

Distributed Proofreading has an OCR pool that will accept scans from anyone, run them through their high-quality, professional OCR programs, and then send the results on to the DP proofreading process and on to Project Gutenberg.

So when you have scans, they will do the OCR for you.

More info is here: http://www.pgdp.net/phpBB2/viewtopic.php?t=4957

Awesome! Thanks!

I did a quick look at SourceForge.net and found 49 references to OCR software. I have to meet some friends for dinner, so I haven’t researched better, but I saw one or two that might fit your needs.

http://sourceforge.net/search/?type_of_search=soft&words=ocr

The people at Distributed Proofreaders who do OCR work (not me, mostly) keep a close eye on open-source OCR projects, and so far, haven’t found one that they recommend.

To quote the site administrator:

But often a version that is one or two releases behind the current version works fine for their proofreading needs, and many people obtain such an older version for a very cheap price from ebay or similar sites.

ABBYY has a free trial that gives you 15 free tries. After that, it costs about $200 to buy.