I've still a treasure in this world--A Picture Of Some Text

I’m disputing a bill from my former doctor. We have asked for and gotten a printout of all the charges we ever incurred, but it looks like it was typed up on the old Underwood, and then scanned into a PDF document, taking care that each page is slightly off true vertical.

The result is that I’ve got about fourteen pages of text, but each page is a picture. I subscribe to a PDF conversion service, but the files come back as pictures. Granted, they are now MS Office pictures, but I still can’t edit the text. I’d like to convert them to spreadsheet form so I can work with the numbers, but so far I haven’t had any success doing so.

Is there any way to convert text in an MS compatible picture to actual text?

You need to use OCR (Optical Character Recognition) software. There are lots of different packages out there, but all of them do very poorly with crummy images.
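Once OCR has given you plain text, you still have to get it into columns before a spreadsheet is any use. Here is a rough sketch of that step in Python, using only the standard library. It does not do the OCR itself. The line layout (a date, a description, then an amount at the end of each line) is purely my assumption about what a billing printout might look like, and the names `LINE_RE`, `ocr_text_to_rows`, and `rows_to_csv` are made up for illustration, so adjust the pattern to whatever your pages actually contain.

```python
import csv
import io
import re

# ASSUMED layout, not taken from the actual bill:
#   <date m/d/y>  <description>  <amount with two decimals>
LINE_RE = re.compile(
    r"^\s*(\d{1,2}/\d{1,2}/\d{2,4})\s+(.+?)\s+\$?(-?[\d,]+\.\d{2})\s*$"
)


def ocr_text_to_rows(text):
    """Split OCR output into [date, description, amount] rows.

    Lines that do not fit the assumed layout are returned separately
    so they can be checked by hand instead of being silently dropped.
    """
    rows, rejects = [], []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        match = LINE_RE.match(line)
        if match:
            date, description, amount = match.groups()
            # Drop thousands separators so a spreadsheet reads a number.
            rows.append([date, description, amount.replace(",", "")])
        else:
            rejects.append(line)
    return rows, rejects


def rows_to_csv(rows):
    """Return the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["date", "description", "amount"])
    writer.writerows(rows)
    return buffer.getvalue()
```

Save the CSV text to a file and Excel or any other spreadsheet should open it. With poor scans, expect a fair number of lines to land in the rejects list, and check the amounts that did parse against the printout, since OCR often confuses characters such as 1 and l or 0 and O.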

Have you tried just selecting the text with the Acrobat text selection tool? I was surprised to see that it worked for a PDF I got that was clearly made up of scanned images.

You say they’re “MS Office pictures”. Not quite sure what that is, but the MS Office Document Imaging tool does a decent job of OCR-ing text out of scans, JPGs, and such, and creating Word docs from them.

If you have Office Pro from 2000 or later, it is probably in your start menu under MS Office >> MS Office Tools >> MS Office Document Imaging.

That will only work if you converted it to a PDF from an MS Word file or something, not from a JPG or whatever made from a scan.

How long would it take you to get on a word processor and transcribe the text by hand?

I hate companies that do things like that. I would call the doctor’s office and tell them to send me something else. I’m not getting paid to convert their paperwork into something usable.