saving .pdf file as text?

Knighted_Vorpal_Sword · May 14, 2002, 2:22pm

Is it possible to save a .pdf file as either a text or Word file? If so, how?

RealityChuck · May 14, 2002, 2:24pm

Not directly, but you can press Ctrl+A to select all and then copy the text to the clipboard. It can then be pasted into a document.

handy · May 14, 2002, 2:39pm

You can also use the adode website to do a free PDF to HTML & cut & paste from there if you want.
It depends on the complexity of the doc. Just search the web for pdftohtml.
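For anyone who would rather script the "convert to HTML, then cut and paste" route, here is a rough sketch. It assumes the open-source command-line tool named pdftohtml is installed and on your PATH, and that its `-noframes` option (one HTML page instead of a frameset) is available in your build. The helper names (`pdftohtml_command`, `convert`, `html_to_text`) are made up for illustration, and the "paste" step is approximated by stripping the tags with Python's standard HTML parser.

```python
import shutil
import subprocess
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collects visible text from an HTML page, skipping script/style bodies."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0  # depth inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style.
        if not self._skip and data.strip():
            self.parts.append(data.strip())


def html_to_text(html):
    """Return the text content of an HTML string, one chunk per line."""
    collector = TextCollector()
    collector.feed(html)
    collector.close()
    return "\n".join(collector.parts)


def pdftohtml_command(pdf_path, out_base):
    """Build the command line (hypothetical helper; flags are an assumption)."""
    return ["pdftohtml", "-noframes", pdf_path, out_base]


def convert(pdf_path, out_base):
    """Run pdftohtml if it is installed; raises if the tool is missing."""
    if shutil.which("pdftohtml") is None:
        raise RuntimeError("pdftohtml not found on PATH")
    subprocess.run(pdftohtml_command(pdf_path, out_base), check=True)
```

After `convert("report.pdf", "report")` you would read the HTML file the tool wrote and pass its contents to `html_to_text`. As noted above, how clean the result is depends on the complexity of the doc.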

handy · May 14, 2002, 2:44pm

Oh, that should be adobe.com

joemama24_98 · May 14, 2002, 2:57pm

There’s a program called GhostView (you’ll also need GhostScript) that you can download that will perform the conversion.
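If you want to skip the GhostView front end and drive GhostScript directly, something like the following might work. This is a sketch under assumptions: it relies on a reasonably recent Ghostscript build that includes the `txtwrite` output device, and the executable names it looks for (`gs`, `gswin64c`, `gswin32c`) may differ on your system. The function names are invented for the example.

```python
import shutil
import subprocess


def ghostscript_text_command(pdf_path, txt_path, exe="gs"):
    """Build a Ghostscript command that writes the PDF's text to txt_path."""
    return [
        exe,
        "-q",                         # suppress startup banner
        "-dNOPAUSE",                  # do not pause between pages
        "-dBATCH",                    # exit when the file is done
        "-sDEVICE=txtwrite",          # text-extraction device (assumed present)
        f"-sOutputFile={txt_path}",
        pdf_path,
    ]


def pdf_to_text(pdf_path, txt_path):
    """Find a Ghostscript executable and run the conversion."""
    exe = (
        shutil.which("gs")
        or shutil.which("gswin64c")
        or shutil.which("gswin32c")
    )
    if exe is None:
        raise RuntimeError("Ghostscript executable not found on PATH")
    subprocess.run(ghostscript_text_command(pdf_path, txt_path, exe), check=True)
```

Calling `pdf_to_text("manual.pdf", "manual.txt")` should leave the extracted text in `manual.txt`, provided the PDF actually contains text and not just page images.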

11811 · May 14, 2002, 5:05pm

The only problem with the copy and paste is that you lose the formatting of columns, indents, etc.

It’s not without its benefits, though.

Chrome

Koxinga · May 14, 2002, 9:23pm

I use a software package called Paper Port, which serves as a sort of driver for my Visioneer brand scanner. I first convert the PDF document into a Paper Port document, and then convert again into Word. It’s very convenient, and IIRC it preserves the column formatting.

Bear in mind a potential pitfall accompanying all of the methods mentioned in this thread: if the author of the PDF document did not perform a “paper capture” on the document, then you won’t be able to copy or convert any text at all. As far as your computer is concerned, it’s just a great big (nontext) image.
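One cheap way to spot that pitfall after the fact: run whatever extractor you like, then check whether it produced almost nothing. The sketch below is only a heuristic. The function name and the threshold of 20 visible characters per page are arbitrary assumptions, not anything from Adobe, and you need to supply the page count yourself.

```python
def looks_like_image_only(extracted_text, page_count, min_chars_per_page=20):
    """Guess whether a PDF is just scanned images, given its extracted text.

    Heuristic: if the extractor returned fewer than min_chars_per_page
    non-whitespace characters per page on average, there is probably no
    text layer. The threshold is an assumption; tune it for your documents.
    """
    if page_count <= 0:
        raise ValueError("page_count must be positive")
    visible = sum(1 for ch in extracted_text if not ch.isspace())
    return visible / page_count < min_chars_per_page
```

If this returns True, copy and paste and the converters above will likely give you nothing useful, and you are in OCR territory.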

Koxinga · May 14, 2002, 9:25pm

And of course the best way to get to the text in a PDF document is to use Adobe Acrobat, if you’re willing to spend around $300-$400 for it. You can capture your own documents and do a format-preserving copy and paste.

kanicbird · May 14, 2002, 9:51pm

Paperport may be using OCR (optical character recognition), which it does use in converting scanned images to editable text. Actually, I’m 99% certain that’s how Paperport does it, because Paperport stores image files, not text. So your PDF file would be subject to OCR error.

Koxinga · May 14, 2002, 10:13pm

No, OCR wouldn’t enter into it because the characters have already been “recognized”, assuming that the PDF file has already been through the capture routine.
