.pdf to MS Word.

I’ve got a small number of PDFs that I would like to have in Word format. Can anyone recommend a good conversion utility? I want freeware or modestly priced shareware; I don’t want to pay $70 for something I’ll use only once. A demo version that expires after a few days would be fine, as long as it’s fully functional.

I’ve searched and found a few but none I’ve tried are suitable, having one or more of the following problems:
[ul]
[li]limited to 4 pages[/li]
[li]output is text only, does not include the graphics from the pdf.[/li]
[li]outputs a series of forms text objects, rather than editable word processor text.[/li]
[li]demo version inserts random characters into the output.[/li]
[li]$70 registration fee required.[/li]
[/ul]

Anyone?

Google the phrase “.pdf to text”; I got loads of hits from various freeware sites.

Here’s a few to start you out.
http://www.a-pdf.com/text/
www.download.com/Free-PDF-Text-Reader/3000-10743_4-10373188.html
http://www.softpedia.com/get/Office-tools/PDF/PDFTXTPDF-to-Text.shtml

I haven’t used these; they’re just among the first hits.

http://www.hellopdf.com/

I haven’t used it before but it looks promising, doesn’t look like it’s limited either. It was Firefox’s auto-redirect when putting “pdf to doc converter” in the address bar.

I happened to see this one (googled “.pdf”).

http://www.cutepdf.com/Products/CutePDF/writer.asp

I haven’t tried it, but I did use cuteFTP a while back and was very pleased with it.

FWIW, I usually just cut and paste text when I need to quote it from a .pdf. Only I just tried to C&P a page with graphics on it into Word and the graphics did not transfer. (I did get all of the text by using Select all, and I did not try copying and pasting the image itself.)

I used to have that, and at least the old build only converted TO .pdf, not from. A cursory glance at the current page makes me think the new version is the same.

I used something called, I think, Able2Extract, which supposedly could be set to copy text only, enhanced text (I think that was bold, italics, underlining, etc.), or graphics. But if it did graphics, it did the whole page as a graphic, so the text could not then be edited. With the other options the graphics came out as either blobs or boxes.

It wasn’t free, but it wasn’t terribly expensive, either. The trial version, which we used first, would only do four pages at a time and IIRC had some kind of watermark that had to be dealt with.

I believe that a new version of OpenOffice should be able to do it. We use the StarOffice version, and I’ve been told the new release will allow you to edit PDFs, which should allow export to Word. I haven’t had time to try the beta yet, though.

The pickings are very few for this format. Adobe has in the past had programs that go from PDF to other non-protected formats removed from public distribution.

You can find many that will convert to PDF format, because Adobe benefits from you needing its reader.

If you do find a free program that does a good job at a conversion, please say so in this thread.

Nobody need tell me to use Foxit Reader, because it’s not a free converter, it’s a free reader.

I’d likely use OpenOffice or AbiWord. While I haven’t used the following, you might want to look at the Free File Convert web utility or one of the options listed here.

Guys, not that I don’t appreciate you Googling the phrase for me, but I noted in the OP that I already did so. I tried several, and found they don’t work how I want.

I’m looking for recommendations from people that use a particular piece of software. If you haven’t actually tried it, there’s no point linking to it.

I just downloaded the converter from hellopdf.com. The free version only converts 3 pages and allows a maximum of 3 conversions per document. So I’m stuck. The pay version is 39.95. I suspect the other “free” converters offer the same deal…

Likely. Keep in mind that one of the main, uh, raisons d’être of Acrobat is to make such things impossible; to “lock” a document and make it uneditable.

A slight clarification…

Acrobat was intended to keep the formatting of a document (pagination, layout, fonts) constant, no matter where it might be viewed or printed. It is possible to inhibit copying of text out of a PDF, but that wasn’t the main purpose.

Ah, my apologies. I took “freeware or modestly priced shareware” as a program to install; the web utilities were not only unknown to me, but I didn’t think they were covered by the OP.

As I said, I’ve used AbiWord and OpenOffice to go from .pdf to .doc, but it wasn’t the easiest thing and I’m not sure they’ll do all you want them to. Again, my apologies for the unwanted links.

The reason that there are very few tools to do this is that it is very, very difficult due to the nature of a PDF. It is not another document format (like Word or WordPerfect or OpenOffice). It is a set of instructions for the printing (or display) of a page. And the generation of those instructions, like the PostScript that PDF derives from, is geared towards optimal printing, not editing. This means that individual characters may be individually located on the page, thus breaking up words. Graphics may be PostScript-style vectors, compressed bitmaps, and other things. Trying to work backwards to convert page output to something useful can be nearly impossible, as word processor primitives do not map to the PostScript ones.

Thus there are plenty of restrictions and few comprehensive solutions, and they cost money. If you want the tools, you will probably have to pay - or wait for OpenOffice 3.0.

Si

Have you tried the Adobe full Acrobat package? Although I don’t use it, I understand it has many features and I think this is one of them. Could cost a few bucks, or whatever you call money down there, but might be useful for other tasks, too.

PostScript stores text as text internally, and PDF is very close to PostScript. It shouldn’t be hard for a programmer conversant in the language to extract it, and it certainly is not impossible.

No, that’s wrong, but it should be able to extract it.

Not necessarily. Sometimes text may be converted to a PostScript path if the driver decides to. But the main problem is that the text elements in the PDF are not necessarily in word order. And spaces don’t have to exist if the word spacing has been modified by justification. So just finding the letters does not help; you need to build up the whole page and scan along the lines to figure out the words and spaces - a bit like OCR.
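To make the "build up the page and scan along the lines" idea concrete, here is a rough Python sketch. It is only an illustration under simplifying assumptions, not how any particular converter works: it assumes the glyphs have already been pulled out of the PDF with their positions, and the `Glyph` class, the `rebuild_text` function, and the two threshold parameters are all names made up for this example.

```python
from dataclasses import dataclass


@dataclass
class Glyph:
    """One character as placed on the page (hypothetical structure)."""
    char: str
    x: float      # left edge of the character
    y: float      # baseline height; PDF y grows upward from the page bottom
    width: float  # advance width of the character


def rebuild_text(glyphs, line_tolerance=2.0, space_ratio=0.3):
    """Group glyphs into lines by baseline, then insert spaces at wide gaps.

    line_tolerance and space_ratio are guessed thresholds (assumptions);
    real documents would need them tuned per font and layout.
    """
    # Sort top of page first, then start a new line whenever the baseline jumps.
    lines = []
    for g in sorted(glyphs, key=lambda g: -g.y):
        if lines and abs(lines[-1][0].y - g.y) <= line_tolerance:
            lines[-1].append(g)
        else:
            lines.append([g])

    rebuilt = []
    for line in lines:
        # Within a line, reading order is left to right.
        line.sort(key=lambda g: g.x)
        text = line[0].char
        for prev, cur in zip(line, line[1:]):
            gap = cur.x - (prev.x + prev.width)
            # No space character is stored, so infer one from the gap width.
            if gap > space_ratio * prev.width:
                text += " "
            text += cur.char
        rebuilt.append(text)
    return "\n".join(rebuilt)


# Glyphs deliberately listed out of reading order, with no space characters.
sample = [
    Glyph("u", 25, 700, 6), Glyph("k", 6, 686, 6), Glyph("H", 0, 700, 6),
    Glyph("o", 19, 700, 6), Glyph("i", 6, 700, 3), Glyph("o", 0, 686, 6),
    Glyph("y", 13, 700, 6),
]
print(rebuild_text(sample))
```

Even this toy version has to guess at two thresholds, and it ignores columns, rotated text, kerning, and text that was converted to paths, which is roughly why the real tools are hard to write.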

I am not saying that the problem is insoluble, but it really is hard to solve. So few people have done the job, and those who have expect some money for their efforts.

Si

This one claims to be free: http://www.hellopdf.com/