Companies use it for online versions of their printed manuals. (I just got a bike computer that had a manual, but not in English.) It was a relatively simple matter to print the manual to PDF and e-mail it to me. I think that is the main reason. If you already have a paper manual, making a PDF is pretty easy.
I’m not sure how you would do the US Passport application in HTML (of course, the form could be redesigned to have the same info but be more HTML-friendly): http://travel.state.gov/DS-0011.pdf
Right, the thing is, if you already have anything, making PDF is pretty easy. PDF is simply the best document format that it’s easy to convert to. By comparison, HTML is near the opposite end of the spectrum.
Another advantage is that a PDF document is a single file that can be saved on disk, e-mailed, distributed on CD-R, etc. An HTML document consists of a number of files, one for each image (including icons, bullets and buttons).
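To put a rough number on the "one file per image" point, here is a minimal Python sketch that counts how many files a page needs in order to display. The sample page and its file names are made up for illustration, and it only looks at `<img>` tags, so treat it as a lower bound rather than a real dependency checker.

```python
from html.parser import HTMLParser


class ImageCollector(HTMLParser):
    """Collect the distinct image files a page refers to via <img src=...>."""

    def __init__(self):
        super().__init__()
        self.images = set()

    def handle_starttag(self, tag, attrs):
        src = dict(attrs).get("src")
        if tag == "img" and src:
            self.images.add(src)


def files_needed(html_text):
    """Return the page itself plus every distinct image file it references."""
    collector = ImageCollector()
    collector.feed(html_text)
    return 1 + len(collector.images)


# Hypothetical manual page: a bullet icon used twice and one figure.
SAMPLE_PAGE = """
<html><body>
<p><img src="bullet.gif"> Step one</p>
<p><img src="bullet.gif"> Step two</p>
<img src="wiring-diagram.gif">
</body></html>
"""

print(files_needed(SAMPLE_PAGE))  # 3: the page plus two distinct images
```

The equivalent PDF would be one file, which is the whole point when you want to e-mail it or burn it to a CD-R.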
PDF is also good for mathematical equations and symbols. HTML doesn’t have an equation display system, so if you want to use HTML to display scientific papers you need to use a GIF file for every single special character and equation. There are programs that generate such HTML documents automatically, but the result usually looks poor.
PDF is ideal for forms that need to be printed out and submitted by mail or fax. It’s a lot of effort to set up an electronic submission system, but it’s easy to put a PDF file on the web and say “fill it out and fax it to us.” Actually PDF is ideal for anything that’s meant to be printed out, or as a printer-friendly version of an HTML document.
But I do agree that many people misuse the format. Just last week I went to the official home page for the Tokyo governor’s election to see the complete list of candidates. The “candidate list” linked to a PDF file. It was just a simple list that had been printed out, scanned back into a computer and saved as a PDF file!
Off subject slightly, but at our small daily newspaper, we get ad agency ads in PDF format. Why? Because it doesn’t matter if we have the fonts originally used in the ad or not. The ad will print just fine. And it works across platforms, which is good because many ad agencies are Mac based and we at the newspaper are (unfortunately, IMHO) PC-based.
From what I’ve read, buffer overflow is still a theoretical exploit, but the Peachy Virus is real. Of course, that only affected users who had Adobe’s file creation software installed (not people who only had Acrobat Reader), but it was still a virus spread through PDF files.
Ordinarily, yes, but you can save the entire layout of a page, pictures, text and all, in one file with IE. Just go to the File Menu->Save As, and when prompted with a Save box, set the “Save as Type” to “Web Archive, single file (*.mht).” I’m not sure if the other browsers have a similar feature.
As to the OP, PDF is probably used more than it needs to be on the web. It doesn’t bother me too much, though. I dislike having to wait a couple of seconds for the Reader plug-in to load in my browser, but it’s no big deal. And the text in Acrobat usually looks much smoother and more pleasing to the eye than anything else I see on the web.
Really? Other than printing and scanning? The only thing I don’t like about .pdfs is that some of them are totally locked. OK, you don’t want me exporting stuff or copying text, fine. But why can’t I just add highlighting, or set freakin’ bookmarks? (With the full version of acrobat that is).
Otherwise, PDFs rock, especially for my most common uses: equipment manuals and component datasheets. IC datasheets in .pdf always look exactly like the old printed databooks (do they even print them anymore?) **and** they’re searchable. You can’t grep dead trees… or images. You can also easily zoom in on any part of the document.
They’re also easy to create. WTF should I have to learn to, or pay someone to, write a php/perl/asp/cfm/jsp script when all I need to do is click “print” and select the pdfwriter? Some of my pdfs really are that simple: load, say, a plot in HPGL and print it to a pdf, because that’s one format everyone can view.
1. Having a manager who insists that “It has to look the same to everybody” is not a plus for PDF, it is a minus for the manager. Not at all a relevant argument, especially since …
2. There is more than one PDF displayer out there. I don’t use Acrobat under Unix flavors (and sometimes not even under MS-OSes when I’m working between systems). So it will not look the same to everybody. It’s not supposed to look different, but things aren’t perfect in the world of computer software (well duh!). It also won’t print the same. I can change all sorts of print settings to produce a lot of different effects. I can change default fonts, which in turn changes a lot.
3. PDF files have no security whatsoever. They are no more read-only than a pencil and paper. The readers I use, mentioned in 2 above, do not pay attention to read-only flags at all.
4. PDF, like its predecessor PS, does have extensive scripting capabilities. You can for all practical purposes write any self-contained program you want (i.e., it is Turing complete). From time to time, flaws have been found that allow Bad Things to happen in suitably written code. These have been fixed. There may be more errors to be found. Not all people have upgraded to fixed versions. Not all PDF readers/printers have the same error immunity. Definitely raises a lot of security questions.
5. The real PDF stupidity: people who scan in paper docs, and make the PDF file just a series of images, 1 image = 1 page. This is Not Good.
6. Proprietary format. 'Nuff said.
Given the many abuses possible of PDF, it would be a real good idea if it went away.
It’s certainly true that most techies would rather see things in HTML layout and we can look at an HTML layout on several browsers and say they “look the same” because the basic layout is the same. However, different people have different priorities, goals and requirements. A lot of people want it to look exactly the same to everybody because their job functions (marketing, brand recognition, usability, etc.) require this. Sometimes it’s pointless, but in other cases they have very good reasons. There are a lot of usability experts who study color, font, layout and how they affect branding, and PDF provides them with a tool to meet their requirements. While ftg points out that there can still be variations in the display of PDFs, they’re a whole lot more consistent than HTML browsers and that makes them valuable to people who need these features. Saying offhand that exact layout control is not meaningful is to ignore all the research to the contrary (which is typical of people who do HTML layout, judging by the design of most websites).
PDFs also provide functions that are more familiar to many users. While it is certainly possible with most browsers to save an HTML document to a single file (bundling related images, etc.), it takes much less sophistication to do the same with a PDF. The same goes for other mundane features like searching for keywords in a document or printing a single page out of hundreds. In addition, many people who do lengthy HTML documents do pay attention to the usability experts and break their documents into many linked pages instead of one long HTML file. This helps the user browsing the content, but it makes saving an archival or portable copy much more difficult. Sure the website can provide two copies, bundled or linked, but many don’t. In addition, many don’t bother to create proper relative links so the entire site breaks when you download a local copy.
PDFs also provide a “step in the right direction” for people who want to exchange even worse formats like Microsoft Word documents. I’ve dealt with a lot of companies who provide things like specifications, technical documentation, proposals, etc. in DOC format. This is horrible for many reasons including security issues, revision histories, content protection, etc. While it might be nice if these companies provided an HTML version, they’re not willing to take the time or effort to do that, but they can be convinced to print to PDF. Personally, I’d prefer they print to PDF rather than using the automatic HTML export from apps like Word because that output is so broken and invalid as to be almost useless in the real world.
I am in no way a great defender of PDFs. I don’t use them much and I’m philosophically against proprietary file formats. However, they are useful in some cases and I see no other options which provide the same combination of widespread support, layout control etc.
Can you elaborate here? Do you object to a single company having control over the format, or do they make it difficult for other software writers to use the format? It’s not like they forbid others to make PDF writers - I haven’t installed any Adobe products on my work PC, but I can use the ps2pdf utility to convert LaTeX documents into PDF and use xpdf to read them.
PDFWriter is easy to use, as you essentially just print to a PDF file. Find me a techie who wants to sit down with a user, explain how to save a Word document as HTML, then go into the HTML document and make it look a little bit less like complete crap.
If you have non-technical people who have to post documents to a company intranet, PDF is the way to go. Fonts, image linking, and page format become a non-issue (most of the time, anyway). The only real problem is that some users will screw up the page orientation and print Portrait oriented docs to a PDF writer set on Landscape (which it will happily do, just to spite you).
If you are sending a contract out to a client in electronic form, it is considerably more convenient and less likely to be screwed around with in locked PDF format than in, say, the raw Word document. And, again, the formatting and fonts are much more likely to come across correctly.
HTML is useless for documents that have to print correctly unless they have no variation in their layout. If you’re assembling a document with ASP (say, from user-entered data), it is a bitch and a half to get it to format correctly. Browsers don’t always do page breaks correctly, for example, and one bad page break wrecks the entire document.
If you format a Word document correctly, Acrobat will also scan it for bookmarks you set during the creation process and create a decent list of bookmarks. Doing that in HTML is much more tedious.
Basically, Acrobat is currently the best solution for documents that have to look good or be printed from a web browser, particularly if the person creating the document is not technically savvy enough to make a full-blown web page.
PDF is unmatched for web presentation of mathematics. Math fonts are not well standardized across browsers and platforms (I’ve had some disasters posting equations on the web that appeared with different symbols on different browsers, and with symbols entirely missing on some platforms). The new HTML “standard” character entities for common math symbols are not widely supported (in my own tests earlier this year, fewer than 20% can be rendered in all major browsers). PDF, however, embeds the fonts used in a document as part of that document. That’s a major advantage for me.
I’ve studied the Postscript (PS) and PDF specifications extensively. PDF has no general-purpose looping construct. It was deliberately removed when going from PS to PDF. PDF does not allow you to create functions (in the usual programming sense of a callable body of PDF code - there’s something in the language called a “function” but it’s much more limited) so you can’t do recursion either. No looping plus no recursion pretty much guarantees that PDF is not Turing complete and is not a particularly powerful scripting language.
If anyone has found a way to use PDF’s limited facilities to do general iteration, I’d be interested in hearing about it.
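To make the "no looping, no recursion" argument concrete, here is a toy sketch in Python. It is not PDF: the three operators (`push`, `add`, `mul`) are made up for illustration. The point is only that when a language has no jump, loop, or call operator, every instruction runs at most once, so any program halts after as many steps as it has instructions. A language with that property cannot express unbounded computation.

```python
def run_straight_line(program):
    """Execute a flat list of (operator, operand) pairs on a stack, top to bottom.

    The toy language has no jump, loop, or call operator, so each instruction
    runs exactly once and execution always ends after len(program) steps.
    (The Python for-loop belongs to the interpreter, not to the program.)
    """
    stack = []
    steps = 0
    for op, arg in program:
        steps += 1
        if op == "push":
            stack.append(arg)
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "mul":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown operator: {op}")
    return stack, steps


# (2 + 3) * 4, written in postfix order like a PS-style operator stream.
program = [("push", 2), ("push", 3), ("add", None), ("push", 4), ("mul", None)]
result, steps = run_straight_line(program)
print(result, steps)  # [20] 5
```

If the description of PDF above is right, its page descriptions behave like `program` here: a bigger result needs a longer file, not a cleverer one.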
An aside: It is possible to embed Javascript code inside a PDF document (just as one can embed Javascript inside an HTML document), and Javascript is Turing complete, but that no more makes PDF a Turing complete language than it makes HTML a Turing complete language. And from a practical point of view, PDF’s DOM is so much more limited than HTML’s, it’s hard for me to believe that Javascript in PDF poses anywhere near the security issues that it does in HTML.
I’m not talking about basic web pages; I’m talking about stuff like my aforementioned online job application. Using a basic html form that spits out a simple tabled page (for printing) or sends it as plain text (for “submission”, for lack of a better word) seems a heckuva lot more efficient.
Not for you, mind you, but for the IT guy who set up the application system in the first place.
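For what it's worth, the kind of thing I have in mind is tiny. Here is a rough Python sketch of the two outputs; the field names are hypothetical and a real application system would also need the form handling and mail delivery around it.

```python
from html import escape


def as_plain_text(fields):
    """Render submitted form fields as 'Label: value' lines for submission."""
    return "\n".join(f"{label}: {value}" for label, value in fields)


def as_html_table(fields):
    """Render the same fields as a bare two-column table for printing."""
    rows = "".join(
        f"<tr><th>{escape(label)}</th><td>{escape(value)}</td></tr>"
        for label, value in fields
    )
    return f"<table>{rows}</table>"


# Hypothetical job-application fields.
application = [("Name", "J. Smith"), ("Position", "Clerk")]
print(as_plain_text(application))
print(as_html_table(application))
```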
I can see everybody’s point about those occasions where appearance is important; I actually like PDFs for times where a form is meant to be printed out and filled in by hand.
Still, I can’t see how that’s the case most of the time. My employer sends out memos in PDF. They’re literally just single pages, Times New Roman, meant to be read by everybody on the job who all use the exact same system. And they read like this:
No fancy formatting. What’s more, the “memo” page on our intranet is dynamically generated from a list of links to various PDF files stored in an SQL database. So the PDFs are nothing more than a giant bandwidth suck.
And I see it used like that all over the web.
I dunno. Maybe I’m too much of a “content is king” type to get it. Also, at home, I’m on dialup, and I run Linux, so I gotta use xpdf instead of Acrobat. So the whole PDF experience probably sucks for me more than most.
Sorry for turning this into more of a rant than a GQ.
However, the PDF scanner for Windows has a cool feature that it will OCR the scanned data and underlay its interpretation behind the image. Although the OCR may approach 98% accurate, it’s still stupid looking if a human reads it, yet if you just provided scanned image, you couldn’t search it. By providing both, a scanned document is 98% searchable and (assuming the scan is clean) 100% readable.
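A quick back-of-the-envelope check on what "98% accurate" means for searching. This sketch assumes the 98% figure is per character and that errors are independent; neither is stated above, so take the numbers as illustrative only.

```python
def word_hit_rate(char_accuracy, word_length):
    """Chance that every character of a word was recognized correctly,
    assuming independent per-character errors (an assumption, not a given)."""
    return char_accuracy ** word_length


for length in (4, 8, 12):
    print(length, round(word_hit_rate(0.98, length), 2))
```

Under those assumptions, longer search terms are found somewhat less often than the headline figure suggests, which is one more reason to keep the scanned image on top for reading.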
Aside from product manuals and other documents where page numbering or complex page layout is important, I’ve only seen the format used as a cheat when you want to publish something electronically but don’t want to spend the time to render it in HTML. Usually it’s things like reports and papers where the original document was written with a word processor or page layout program, and although it would be perfectly acceptable to render in HTML, it’s additional effort that can be largely skipped by going to PDF.
So far I haven’t seen CNN publish their home page as PDF or anything like that. Just situations where it’s easier to make a PDF than to make HTML.
At least PDFs can be read on multiple platforms in a form that’s reasonably close to what everyone else sees. I frequently dump Word documents into a plain text editor to get things like driving instructions or Christmas lists … why, oh why?
I’ll also point out that the Acrobat Reader has a terrible user interface, is sloooooooow, awkward, proprietary, and just generally irritating to use.
Examples?
Sure.
The in-browser version tends to take over key combinations that are most important to users of web browsers; for instance, hitting CTRL-N when inside a .PDF doc (which is, itself, inside IE) will bring up a “Go to page X” dialog box. What you wanted to do was most likely open a new IE window.
The “hand” panning tool: slow and awkward. Moving from page to page in a .PDF is surprisingly difficult, and the program has a tendency to interpret certain movements as “Oh, let’s move back to the beginning of the entire document.”
Slowness. Being a vector format, slowness is somewhat inherent – but it needn’t be this slow. After initially rendering the document, why not simply cache it as an image, or as a combination of text and image, until such time as it needs to be re-rendered at a different resolution? This would greatly speed up panning around .PDF docs.
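The caching idea is simple enough to sketch in Python. This is only an illustration of the suggestion above, not how Acrobat Reader works: `fake_render` is a stand-in for the slow vector-to-bitmap step, and the class and method names are made up.

```python
class PageCache:
    """Cache rendered pages so panning reuses the bitmap instead of re-rendering."""

    def __init__(self, render):
        self._render = render   # expensive function: (page, zoom) -> bitmap
        self._bitmaps = {}      # (page, zoom) -> cached bitmap
        self.render_calls = 0

    def get(self, page, zoom):
        key = (page, zoom)
        if key not in self._bitmaps:
            self.render_calls += 1
            self._bitmaps[key] = self._render(page, zoom)
        return self._bitmaps[key]


def fake_render(page, zoom):
    # Stand-in for the slow rendering step; returns a label, not pixels.
    return f"bitmap(page={page}, zoom={zoom})"


cache = PageCache(fake_render)
cache.get(1, 100)  # first view: rendered
cache.get(1, 100)  # panning at the same zoom: served from the cache
cache.get(1, 200)  # zoom changed: rendered again at the new resolution
print(cache.render_calls)  # 2
```

A real viewer would also have to cap the cache size, since full-page bitmaps at high zoom get large quickly.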
Anyway, sorry for the hijack, but I’m just not a big fan of the Acrobat Reader. PDF may have its uses, but I’ll agree that it’s vastly overused, and would be greatly improved if the format were open to other developers to try their hand.
Or not, if the fonts are embedded in the file… Then the correct fonts will always display, even on a computer that doesn’t have that font installed…
Yeah, but stupid people could do that with a regular image, and save it as a bitmap besides.
Yeah, but there’s no other format, open or otherwise, with similar features that’s so widely and effectively distributable. Did you build your car from scratch, or did you buy a car from a giant corporation using proprietary parts?