Well, that was one of the more frustrating experiences I’ve had with a computer.
Following Mort Furd’s instructions, I extracted the links from each of the PDFs and did a find-and-replace to turn them all into the correct URLs. I then downloaded Flashget. Unfortunately I couldn’t get Flashget to pick up all of the links. I experimented by taking away the extra text, leaving only the links, and it did work, but there were too many links to make this a practical proposition.
I suspect that Flashget would have done what I wanted, but my patience with its rather brief help file was waning. I decided to turn all the links into working HTML links by doing another find-and-replace and placing the bare minimum of tags at the start and end of the page. Although I used to have a good working knowledge of basic HTML, I couldn’t remember much; I initially tried using square brackets! Opera kept opening my feeble attempts as pure text, displaying all of the HTML source. Obviously my HTML was wrong.
Next step was to open a blank page in a WYSIWYG editor, go to the source code and place all my links in the body. I had relearnt enough now to know that my links were correctly tagged.
Unfortunately, when I saved the file, Netscape Composer nicely placed a few extra bits in my links which caused them to link to the local computer. Specifically I found it had added %3F at the start and end of each of my links. Now, I don’t know what this little group of characters really does, but I suspected it was the cause of my links being screwed when I opened the page in a browser.
So, I copied all of the source over to Word again, intending to find-and-replace all those %3Fs. Although Word displayed it as a web page, I found the menu item that lets you view the source. I did that and it opened up a little HTML editor which had a find-replace function.
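For anyone who would rather not round-trip through Word for this step, here is a rough Python sketch of the same clean-up. It assumes the stray `%3F` appears only at the very start and end of each `href` value, as it did in my case; the function name `strip_stray_3f` and the example URL are made up for illustration.

```python
import re

def strip_stray_3f(html_source: str) -> str:
    """Remove "%3F" from the start and end of every double-quoted href value."""
    def clean(match):
        url = match.group(2)
        # Only trim at the edges, so a genuine %3F inside a URL is left alone.
        url = re.sub(r'^(?:%3F)+|(?:%3F)+$', '', url, flags=re.IGNORECASE)
        return f'{match.group(1)}"{url}"'

    # Match href="..." attributes (straight double quotes only).
    return re.sub(r'(href\s*=\s*)"([^"]*)"', clean, html_source,
                  flags=re.IGNORECASE)

# Hypothetical example of a link mangled the way Composer mangled mine.
sample = '<a href="%3Fhttp://example.com/a.pdf%3F">a</a>'
print(strip_stray_3f(sample))
```

This only handles links written with straight double quotes, which is what a browser expects anyway.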
With the changes made, I saved, I opened, and it worked! Now I could open it in IE and use Flashget to download all links on the page.
I thought about everything I’d done and tried to eliminate as many steps as possible so I could quickly do the same with the rest. At this stage I still had a problem with my own HTML coding and so I needed to use Composer or something similar.
On my attempt with the next PDF it all worked smoothly. In fact, I used Mozilla’s Composer rather than Netscape’s and it didn’t add those %3Fs, so it was all done in a couple of steps.
“This is getting better” I thought.
When I tried the last one, though, I ran into problems. Initially I shunned Composer and used Word to get me a blank web page in which to insert my links. However, Word insisted on removing everything between the quote marks in my links when I saved. I went back to Composer. It decided it preferred to include %3F in my links after all. I went through and removed them with Word and resaved, but when I opened the page, the links still pointed at the local computer. I spent a lot of time comparing one of my successful pages with this one and couldn’t find anything. I then tried pasting my links from the good page into my latest failure and it worked. So there was something wrong with the actual links. I compared the links character by character and they were identical to the good ones. At around the peak of my frustration I noticed the quote marks were different. The good links had basic straight up and down " and the bad ones were slanted.
Back in Word I check the font; it’s the same as the good example. I try retyping some of the quotes but they’re still the slanted ones. Grasping at straws, I try different fonts. No change. I EVENTUALLY note that there is a little box in the AutoCorrect window. It resides under the words Replace as you type. It’s ticked. Beside it is this harmless looking phrase, “Straight quotes” with “smart quotes”. I untick it.
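If the slanted quotes have already made it into a file, they can also be swapped back in one pass. The following is a minimal Python sketch, assuming the culprits are the four usual curly quote characters; the function name and the example link are hypothetical.

```python
# Map the four common "smart" quote characters to plain ASCII quotes.
SMART_TO_STRAIGHT = str.maketrans({
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
})

def straighten_quotes(text: str) -> str:
    """Return text with curly quotes replaced by straight ones."""
    return text.translate(SMART_TO_STRAIGHT)

# A made-up link with curly quotes around the URL, like my broken ones.
bad_link = "<a href=\u201chttp://example.com/page1.pdf\u201d>page1</a>"
print(straighten_quotes(bad_link))
```

A browser only treats the straight " as an attribute delimiter, which would explain why the links looked identical character by character yet still misbehaved.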
I can now get my links working consistently. I’ve even found what the minimum set of HTML tags is to get everything working, so I don’t need to shag around with a WYSIWYG editor again.
All my PDFs are downloaded and working correctly.
Although it took longer doing this than if I’d just gone through the Contents pages and downloaded each page manually, the document gets updated every few months. Next time I’ll be able to extract the links, do a find-and-replace, add <HTML><BODY> at the top and </BODY></HTML> at the bottom of the page, and it’ll probably take about 5 minutes for all of them.
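The find-and-replace plus wrapper tags could probably be scripted as well. Below is a rough Python sketch, assuming the extracted links are already one full URL per line; `build_link_page` and the example URLs are my own inventions for illustration, not anything Flashget needs.

```python
from html import escape

def build_link_page(urls):
    """Wrap each URL in a bare-bones link inside <HTML><BODY> ... </BODY></HTML>."""
    lines = ["<HTML><BODY>"]
    for url in urls:
        url = url.strip()
        if not url:
            continue  # skip blank lines left over from the extraction
        safe = escape(url, quote=True)  # keep &, < and quotes from breaking the tag
        lines.append(f'<A HREF="{safe}">{safe}</A><BR>')
    lines.append("</BODY></HTML>")
    return "\n".join(lines)

# Hypothetical list of extracted links.
example_urls = [
    "http://example.com/docs/page1.pdf",
    "",
    "http://example.com/docs/page2.pdf",
]
page = build_link_page(example_urls)
print(page)
```

Saving the printed text as a .html file and opening it in the browser should give a page of plain links that a download manager can pick up. Because the script writes straight quotes itself, it also sidesteps the smart-quotes trap.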
Thanks to you guys for your help which did, despite my lack of understanding of my computer and its programmes, get me the result I was looking for.