I have hundreds of pages of old journals from the nineties, where the digital versions are lost forever.
I really need them as Microsoft Word documents, and I wonder: Is there a way to scan the sheets and then, one way or another, make a Word document of the result?
(I have the other hundreds of pages as Word documents and would like to index them all, etc., which is why I’m aiming for that format.)
You need OCR (optical character recognition) software. The software that came with my Lexmark scanner/printer is pretty good. Do you have a scanner? Do you know if you have any OCR software?
Are the journals typed or handwritten? OCR programs will recognize some handwriting, but do better with type. And once you do get them converted, you’ll likely have to go in and manually fix a lot of errors.
Hundreds of pages would certainly take some time to do.
Most (all?) scanners come with software to read the thing and convert it to text (with or without formatting it like the original). The hardest part will be proofreading it.
The above is if you’re talking about having the stuff on your system for personal use. I’m presuming you’re not planning on anything that will get you in trouble with the copyright police.
Wow, thank you! That was more than I was hoping for - I had no idea it could be that simple (though time consuming, but I’m in no hurry). I don’t have a scanner, but now I will definitely get one (with the software mentioned).
The journals were once written on a Mac in Times New Roman, and then printed, so the software should have no difficulties recognizing the characters.
Often, the OCR software that comes for free with a scanner is worth what you paid for it – little or nothing. It will work for an occasional page or two, but it’s usually a crippled version, not up to heavy use. So you may wish to invest in buying a better OCR package if you’re doing hundreds of pages. And it would probably be worthwhile to pay extra for a scanner with an automatic page feeder.
Distributed Proofreaders, which has thousands of people doing this, has info on their wiki giving their recommendations on OCR software and scanner reviews. Most of them seem to prefer ABBYY FineReader software (though many say they do fine with older versions, rather than needing the newest one).
You can also get a lot of info on proofreading from their website. For example, their DPCustomMono font is available for download. It’s ugly looking, but works very well to identify OCR mis-scans (like “and” scanning as “arid”).
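If you’re comfortable running a small script, you could also have the computer flag the usual suspects before you proofread by eye. Here is a rough Python sketch of that idea. The word list and the function name are made up for illustration (they are not from Distributed Proofreaders or any OCR package), and it assumes you have saved the OCR output as plain text. It only flags words for a human to check, since “arid” is sometimes the right word.

```python
import re

# Hypothetical starter list: words that often show up when OCR misreads
# a more common word. Extend it with the errors you actually see.
SUSPECT_WORDS = {
    "arid": "and",      # "n" read as "ri"
    "tbe": "the",       # "h" read as "b"
    "modem": "modern",  # "rn" read as "m"
}


def flag_suspects(text, suspects=SUSPECT_WORDS):
    """Return (line_number, word_as_found, likely_intended_word) tuples."""
    hits = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        # Look at alphabetic words only; punctuation and digits are skipped.
        for word in re.findall(r"[A-Za-z]+", line):
            intended = suspects.get(word.lower())
            if intended is not None:
                hits.append((line_no, word, intended))
    return hits


if __name__ == "__main__":
    sample = "We packed bread arid cheese.\nTbe weather was fine."
    for line_no, word, intended in flag_suspects(sample):
        print(f"line {line_no}: '{word}' (did the page say '{intended}'?)")
```

You would still have to look at each flagged line yourself, but it narrows down where to look across hundreds of pages.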