Why do some .pdf files print so deucedly SLOW?

A good portion of my day is spent printing the .pdf attachments to e-mails. Some .pdf files print at the same speed as any Word document prints, but some print excruciatingly slowly. As in, print a page, wait ten seconds, print the next page, etc. When I’m printing an 80-page agreement, this really slows down the proceedings, as you can imagine.

Why does this happen? I can’t see any pattern in the image quality of the slowly-printed documents; some are sharp, some are kind of sketchy, it doesn’t seem to matter.

WAG: Even if the document is sketchy, the information that produced said sketchiness is not guaranteed to be less than that of a sharp document. When printing a Word file (and I may be totally wrong), fonts can be programmed to use a very small amount of data to transmit. However, a .pdf is more of a picture, and takes as long to print as a regular picture would. [/WAG]

I work converting, editing, and sometimes wishing I’d never heard of PDF files, but it contributes to my paycheck so I don’t complain too much.

Pygmy Rugger is spot on.
A PDF file is just one highly modified TIFF (a lossless picture format) file embedded per page.

That’s the simple version; there is more to it than that. But if I told you, I’d put you and everyone else to sleep.

One thing I’ve found helps is using a PostScript driver (as opposed to PCL) - PDFs generally spool and print much faster this way.

However. Not always. Sometimes we’ll get PDFs here that print slowly no matter what. Or that yield nothing but “PCL XL” errors (some kind of corruption in the document). Or that I’ve got to crank out the HP Universal print driver to get them to process.

So Cabin_Fever, I’d be happy to hear your advice! I won’t doze off.

A PDF file can contain both text and images. If the person that created it didn’t know that, they might have used image code when they could have used text. Or they were forced to scan text, making it into images. Or the images are grayscale or color when they originally were single-color text. Or they didn’t compress when they could have and the data transmission time suffers.

Image data will always take longer to print than text, and badly converted images can take a lot longer.

I’m saving up everyone’s suggestions to pass along to our IT guys - I’m sure they’ll love that! And Cabin_Fever, I’d like to hear it, too. We legal secretaries are faced with dealing with an exponentially-increasing quantity of .pdfs, and none of us are great computer experts. Any info is welcome.

Amusingly enough I am “the printer guy” at a large Bay Area law firm. I feel your pain on the PDFs - it’s a nice universal format but it sure isn’t perfect. One of our constant itches.

Part of the problem here is that you may be printing a document created in one page description language - Postscript, which is Adobe’s and Apple’s standard - in another page description language - PCL, which is HP’s standard. Have you tried using a Postscript printer?

Heh. Given that the most popular method of exchanging documents between attorneys has become PDFs during the last few years, yours has got to be a very valuable “niche job”. Only five years ago, hell, three years ago, faxing was the way we swapped our documents back and forth. No wonder Adobe has the moola to inhabit that big fat skyscraper I see a block or two away.

I recently updated our drivers for our aging HP LaserJet 4000, and I also set everyone’s default print resolution down to 600 DPI from 1,200. I think this has helped, but only time will tell. I guess I’ll find out next time one of the bosses decides to print a 300-page document with everything formatted as a high-resolution picture, though I don’t think anything’s gonna help it with that.

Our far newer Xerox WorkCentre 7655 seems to be able to handle all of this stuff like a champ though, so I try to gently persuade them to print big fat PDFs to that instead.

I have to print some large PDF files at my internship job sometimes. I have had success dramatically reducing the print time by lowering the resolution to 300 dpi from the normal 600 dpi or 1200 dpi that is so common. The difference in resolution isn’t noticeable enough to matter. Try it out. I can get 30-50 megabyte PDFs to print almost like they are Word documents.

The reason this works is that the files you send from your computer to the printer have to be converted to a format the printer understands. Regular text documents are easy but PDFs balloon to an enormous size for some reason. The next time you print, double-click on the printer icon in your lower-right icon tray and note the size of the file that is sent to the printer. For a Word document, this is usually in kilobytes but for PDFs, this is usually many megabytes.

If you want this problem to really go away, perhaps you can get your IT department to upgrade the processor and memory in the printer itself.

Oh yeah, there is likely a hard drive in there too if it is a large all in one machine.

Can you explain this in more detail? I’m curious about why the 2 (PS and PCL) make a difference. I always assumed there was one, since both drivers are available on most (all?) HP LaserJets I’ve installed at my hospital. I just always used PCL without really understanding why.

PostScript is a Page Description Language - a real programming language based around a stack and postfix operator notation. It has been used for printing, for displays (Display PostScript, used on the NeXT computer), and as a document exchange format (PDFs). PostScript is ASCII text.
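To make "stack and postfix operator notation" a bit more concrete, here is a minimal sketch in Python of how a postfix program gets evaluated. It is not a PostScript interpreter; the function name `eval_postfix` is made up for this illustration, and it only handles numbers plus a handful of operators that share their names with real PostScript ones (`add`, `sub`, `mul`, `div`, `dup`, `exch`).

```python
def eval_postfix(program):
    """Evaluate a whitespace-separated postfix program and return the final stack.

    Illustration only: numbers are pushed, operators pop their operands
    from the stack and push the result back.
    """
    stack = []
    binary_ops = {
        "add": lambda a, b: a + b,
        "sub": lambda a, b: a - b,
        "mul": lambda a, b: a * b,
        "div": lambda a, b: a / b,
    }
    for token in program.split():
        if token in binary_ops:
            b = stack.pop()          # top of stack is the second operand
            a = stack.pop()
            stack.append(binary_ops[token](a, b))
        elif token == "dup":         # duplicate the top item
            stack.append(stack[-1])
        elif token == "exch":        # swap the top two items
            stack[-1], stack[-2] = stack[-2], stack[-1]
        else:                        # anything else is treated as a number
            stack.append(float(token))
    return stack


# 8.5 inches at 72 points per inch: operands first, operator last.
print(eval_postfix("72 8.5 mul"))   # [612.0]
```

The point is only that operands come first and the operator comes last, which is why PostScript code reads "backwards" compared to most languages.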

The simplest PostScript file just says

  • render this text

The next level says

  • here are a pile of curves
  • define them as a font
  • use them to render the text

Then things get … complicated.

The conversion from raw document (ie a Word file, or a DTP package) can follow several paths.
The first is direct output - the program itself uses internal knowledge to deliver its internal representation as PostScript. This preserves as much information as possible about the internal data, but may not be optimal in terms of PDF features.
The second is import conversion - this is where the PDF toolset can import a document format (through a native format importer) and output PDF - this usually has very good PDF features, but the file import may not cover all features.
The final approach is Printer Driver conversion. Here the native document program prints the output via a PostScript printer to a file that is wrapped into a PDF. Here, all the display elements (called GDI operations in Windows) are converted to Postscript commands. These can be very efficient (put this string of text as TimesRoman 10pt here) or not (put this character here, this one here and this one there using these glyphs). It depends on how the source package generates the print output based on the print device capabilities. For example - a drawing package may not clip overlaying primitives and curves, relying on the Postscript processor to do that. On the other hand, it may just rasterise everything and dump big bitmaps - hugely inefficient.
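As a rough illustration of the "efficient or not" point, the Python sketch below builds PostScript-style text commands for the same line both ways: once as a single string, once character by character. The helper names and the fixed 5-point character advance are assumptions invented for the example; real drivers are more involved, and only `moveto` and `show` are actual PostScript operators.

```python
def ps_escape(text):
    """Escape the characters that are special inside a PostScript string literal."""
    return text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")


def show_as_string(text, x, y):
    """Efficient form: position once, then draw the whole string."""
    return [f"{x} {y} moveto ({ps_escape(text)}) show"]


def show_per_glyph(text, x, y, advance=5):
    """Inefficient form: one positioning command per character.

    `advance` is a made-up fixed character width in points.
    """
    return [
        f"{x + i * advance} {y} moveto ({ps_escape(ch)}) show"
        for i, ch in enumerate(text)
    ]


line = "This Agreement is made between the parties."
compact = "\n".join(show_as_string(line, 72, 720))
verbose = "\n".join(show_per_glyph(line, 72, 720))

# Same visible text, very different amounts of data to spool.
print(len(compact), "bytes as one string")
print(len(verbose), "bytes placed glyph by glyph")
```

Multiply that difference by every line on every page of an 80-page agreement and the spool file grows accordingly, before any bitmaps are even involved.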

The final problem is Fonts. Postscript has limited default fonts. Some printers have more, and where they can be used, they are. Some fonts can be permanently installed into the printer. Other fonts can be uploaded as part of the print job. TTF fonts do this - they are converted into Postscript curves and rendered as required. Some fonts cannot be uploaded - because of tags within the font file (copyright issues) or because they were bitmap fonts to start with. These can be uploaded as bitmap fonts, or all the text is rendered into bitmap glyphs and placed on the page. This causes the postscript/pdf file to blow out to a large size.

PCL (HP’s Printer Control Language) is similar to Postscript, but is not as feature rich and is more focussed on rasterisation. It is also not a text format.

So - PDFs are not just TIFF files, but they could be. Slow PDF prints usually relate to lots of big bitmaps, or page by page font downloads, or complex, inefficient page layouts.

Also, the fastest way to print a PDF is to a Postscript printer - the Postscript gets passed through. Otherwise, the PDF viewer has to render the Postscript to a display device, and then pass that to a printer device as GDI primitives, that are then rendered to the print device - this can be slow, and the DPI resolution can have an impact. Doubling the DPI setting quadruples the amount of data to send to a pure raster device.
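The "doubling the DPI quadruples the data" arithmetic can be checked with a quick Python sketch. The assumptions here are mine: a US Letter page (8.5 x 11 inches), the whole page sent as one uncompressed bitmap, and 1 bit per pixel for black and white. The function name `raster_bytes` is just for illustration.

```python
def raster_bytes(dpi, width_in=8.5, height_in=11.0, bits_per_pixel=1):
    """Uncompressed size in bytes of one full-page bitmap at the given DPI."""
    pixels = round(width_in * dpi) * round(height_in * dpi)
    return pixels * bits_per_pixel // 8


for dpi in (300, 600, 1200):
    print(f"{dpi:>5} dpi: {raster_bytes(dpi):>12,} bytes per page")

# Output:
#   300 dpi:    1,051,875 bytes per page
#   600 dpi:    4,207,500 bytes per page
#  1200 dpi:   16,830,000 bytes per page
```

Both the width and the height in pixels double, so the pixel count goes up by four each time. Real print jobs are usually compressed, so actual numbers will be smaller, but the ratio is why dropping from 1200 to 300 dpi can make such a difference on bitmap-heavy PDFs.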

Si

Well put, Si, and a good description of what Postscript is in your post. This is what happens when people scan text documents into PDFs. They look like text files, but they render as graphics. Basically, not much is gained except portability when this is done; scanning to a TIF produces the same effective result.

I use an HP printer with an internal postscript engine as well as PCL, and it works well for both. Even tho it’s an old printer, I suspect the internal rendering is more efficient than an external PC’s postscript interpreter, and it offloads the bulk of the processing to the printer.

Hard to believe, but I see people write documents in Word, print them out, then scan them into grayscale PDFs when they could have printed directly to an Acrobat PDF Distiller. The file size difference between the two end results can be 100:1.

PDFCreator is a really good FOSS (free open source software) PDF Print creator. Really tidy, really easy to use.

Of course, OpenOffice.org outputs to PDF directly. Office 2007 was going to do the same, but licensing issues got in the way (and MS are pushing their own PDF equivalent).

Si