How to save an ENTIRE web page offline?

Anise · January 13, 2013, 5:59pm

Hi all! (waves)
So here’s what I’m trying to figure out: what’s the best way to save an entire web page and all its internal links and files for offline use? It’s a self-uploading archive, and I really want to be able to access the database of all my fics and reviews while on vacation (when I’ll probably almost never have wifi access). Especially the reviews. I don’t care about images anywhere, including the front page (which seems to be about all that gets saved with Firefox’s File > Save Page As option). There are very few images anyway; almost everything is text. Would I need to save the whole database, every file and fic, etc.? Is there a way to do this?

Many thanks in advance to all the smart people here!

zombywoof · January 13, 2013, 6:05pm

I use this (SiteSucker).

Sunspace · January 13, 2013, 6:48pm

I’m going to have to try that.

goldmund · January 13, 2013, 6:53pm

Since you’re a Firefox user, check out the ScrapBook extension, which is made for exactly this purpose.

Anise · January 13, 2013, 6:54pm

Thanks, but SiteSucker seems to only work with OS X. I am not cool enough to have a Mac. Is there anything that people would recommend for Windows?

Reply · January 14, 2013, 1:37am

And if it’s downloading too slowly for you, disable its speed limits.

AaronX · January 14, 2013, 1:49am

If you don’t need Javascript or Flash, I’d print it as a PDF.

Absolute · January 14, 2013, 1:55am

You can download Safari for Windows from Apple for free. It has a “Web Archive” format that saves the entire page, plus all resources, images, etc. for offline viewing.

EDIT: Never mind, I didn’t realize you wanted it to follow the links too. Web Archive will only work for a single page. Ditto for saving a PDF.

eldowan · January 14, 2013, 3:18pm

I use wget: here’s a Windows copy of it.

I’ve used it to make full backups of internal websites, including download files, documentation, etc…

It even keeps things in relative link format, so that you can run the site from your hdd or thumb drive.
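
For anyone who wants to try the wget route, here is a minimal sketch of what a mirroring run might look like, written as a small Python helper that only assembles the command line and prints it. The start URL and the helper name build_wget_mirror_command are placeholders, not anything from the posts above; the flags are standard GNU wget options, but check them against the wget build you actually install.

```python
import shlex


def build_wget_mirror_command(start_url, output_dir="site-mirror"):
    """Return an argv list for a wget mirroring run (hypothetical helper).

    Nothing is executed here; the list can be passed to subprocess.run()
    or printed and pasted into a command prompt.
    """
    return [
        "wget",
        "--mirror",            # recursive download, unlimited depth, with timestamping
        "--convert-links",     # rewrite links so the saved copy works from disk
        "--adjust-extension",  # save HTML pages with an .html suffix
        "--no-parent",         # never climb above the starting directory
        "--wait=1",            # pause one second between requests
        "--directory-prefix=" + output_dir,  # where the copy is written
        start_url,
    ]


if __name__ == "__main__":
    # Placeholder URL; substitute the archive's real address.
    command = build_wget_mirror_command("https://example.com/archive/")
    print(shlex.join(command))
```

The --convert-links option is what gives the relative-link behaviour described above, so the copy can be browsed from a hard drive or thumb drive. Since images don't matter here, page requisites are deliberately not requested.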

hogarth · January 14, 2013, 3:20pm

This is what I’ve used in the past. Worked fine.

Kenm · January 14, 2013, 4:39pm

Firefox’s save choices are Web Page, complete; Web Page, HTML only; Text Files; and All Files.

Saving a page using “Web Page, complete” saves everything in a folder, including links.

But if the page saved is dragged into the same folder as the rest of its files, the pathways are severed, breaking the links. The html page must remain separate from the folder when it’s time to view the page.

Anise · January 14, 2013, 10:44pm

I tried HTTrack and ran the program for 13 hours. Sorry, I don’t think I’m smart enough to make it work.

(hunts unsuccessfully for brain) (cannot find)(I’m SURE I saw it on that MRI last year…)

Anyway. I’ll try wget, but this entire project may not be approved of by the Fates, in which case I’ll just work on it when connections are to be had.

Reply · January 14, 2013, 11:31pm

It’s probably one or more of these:

You’re digging too deep (having it download the page, anything it links to, anything THOSE pages link to, etc.)
You’re having it go outside the starting domain
You’re restricted by its speed limit (which, by default, is 25 KB/sec, I think, when a modern connection is easily capable of much more)
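
To make the first two points concrete, here is a rough sketch of how a depth limit and a same-domain rule shrink a crawl. It is not how HTTrack works internally; it only walks an in-memory map of page links (the links argument, a stand-in for real fetching), and the function name plan_crawl is made up for illustration.

```python
from collections import deque
from urllib.parse import urlparse


def plan_crawl(start_url, links, max_depth):
    """Return the pages a breadth-first crawl would visit, in order.

    links maps each URL to the URLs it links to (a stand-in for
    downloading and parsing pages). Pages more than max_depth clicks
    from the start, or on a different host, are skipped.
    """
    start_host = urlparse(start_url).netloc
    seen = {start_url}
    order = []
    queue = deque([(start_url, 0)])
    while queue:
        url, depth = queue.popleft()
        order.append(url)
        if depth == max_depth:
            continue  # depth limit reached: do not follow this page's links
        for target in links.get(url, []):
            if urlparse(target).netloc != start_host:
                continue  # stay inside the starting domain
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return order


if __name__ == "__main__":
    # Tiny made-up site: one external link and a chain of internal pages.
    sample_links = {
        "http://a.test/": ["http://a.test/1", "http://b.test/x"],
        "http://a.test/1": ["http://a.test/2"],
        "http://a.test/2": ["http://a.test/3"],
    }
    for page in plan_crawl("http://a.test/", sample_links, max_depth=2):
        print(page)
```

Raising max_depth by one level, or dropping the host check, can multiply the number of pages fetched, which is one plausible way a run ends up taking 13 hours.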

Also, is this a dynamic site (served from a database)? If so, you won’t be able to easily mirror it from the client side unless your spider performs all the searches you would…

Which website is it?

goldmund · January 14, 2013, 11:31pm

Not sure if you missed it but the Firefox extension I posted earlier should do the trick.

https://addons.mozilla.org/en-us/firefox/addon/scrapbook/

njtt · January 15, 2013, 12:37am

The “official” way to do it, with Firefox, would be to use the Mozilla Archive Format add-on.

deltasigma · January 15, 2013, 3:20am

Except how do you read that? I tried saving something in MAFF format once and I don’t think Firefox was able to read it.

njtt · January 15, 2013, 4:38am

:dubious: Very easily. Double click on the file.

I just did it with this page. I saved it to a file in MAFF format, then double clicked on the file and it loaded right up in Firefox. No problem.

Maybe you need to have the MAFF extension installed for it to load, as well as to save. I don’t know. Anyway, the MAFF extension also allows you to save in other formats such as MHT (the one used by Internet Explorer).

deltasigma · January 15, 2013, 5:04am

I tried it just now and it did in fact work, so obviously something was amiss before. I was able to save in that format back then, so clearly I had the extension loaded. Why the file wasn’t recognized, I don’t know. Maybe I tried to open it from Windows Explorer and there wasn’t an appropriate file association. This time I opened it from Firefox.

Anise · January 15, 2013, 5:39am

I really will put in some more time on this at some point. But after trying one thing and realizing just how complex this can be, I honestly think that I just don’t have enough free brain cells for it right now. Thanks to everyone who answered!
