Saved a web page, support files folder gets copied with HTML file. How does this work?

SenorBeef · August 26, 2009, 11:36am

I saved a webpage using Firefox. It created an .htm file with the page’s content; let’s call it webpage1.htm.

It then also created a folder - webpage1_files, which contains the images contained within that webpage. Fine, logical enough.

However, I wanted to e-mail the contents of the page for someone else to print out. I wanted to make sure the page looked presentable enough with just the .htm itself, and not the associated images, so I copied the webpage1.htm file to another directory to look at it again. Except the webpage1_files subfolder came with it and was created in the new directory, even though I only copied the single .htm file.

How does this work? How does the Windows file system know that the webpage1_files subfolder is intrinsically linked to the webpage1.htm file, and needs to be moved with it? Is that data contained within the file somehow? If I attach webpage1.htm to an e-mail message, will the image data be included, and when the person on the other end of the e-mail saves it, will it create the webpage1_files subfolder too?

LSLGuy · August 26, 2009, 1:46pm

(Assuming XP) If you open the disk Explorer, then Tools >> Folder Options … >> View tab, then scroll down a ways, you’ll see a setting for “Managing pairs of Web pages & folders”. That will let you experiment with the various behaviors Windows offers.

As to how it works inside, I don’t have a technical answer off the top of my head, but I would expect the link to be stored in an alternate file stream of the htm file, just as document properties (author, revision, etc) are.

yoyodyne · August 26, 2009, 2:20pm

The file system just knows that an HTML file and a folder with the same name plus “_files” are associated, and by default copies them together. It’s simply the names.

If you email the HTML file the _files folder and images don’t go with it.
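
To make the "it's simply the names" point concrete, here is a minimal sketch of that kind of name-based pairing. The function name `companion_folder` is made up for illustration, and it only checks the `_files` suffix seen in this thread; it is not Windows' actual code.

```python
from pathlib import Path
from typing import Optional


def companion_folder(html_path: str) -> Optional[Path]:
    """Return the sibling '<name>_files' folder for an .htm/.html file, if any.

    Hypothetical helper: it mimics pairing purely by file name and
    stores or reads no link data anywhere.
    """
    page = Path(html_path)
    if page.suffix.lower() not in {".htm", ".html"}:
        return None  # only web pages take part in the pairing
    candidate = page.with_name(page.stem + "_files")
    return candidate if candidate.is_dir() else None
```

For `webpage1.htm` this returns the `webpage1_files` folder sitting next to it, and `None` if no such folder exists.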

yabob · August 26, 2009, 2:58pm

You can illustrate this by simply renaming any old file to .htm, saying “yes” to the “are you sure” message, then creating a matching “*_files” folder to go with it and putting anything in it. Windows will then happily copy the folder with the file.

To do what you wanted to do, you could rename the .htm to a .txt, copy it (which won’t cart the folder along), rename it back to .htm and look at it.
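
Another way to get just the page, assuming (as far as I know) that the pairing is applied by Explorer's copy operation and not by ordinary file APIs, is to copy the file from a script. A minimal sketch; `copy_page_only` is a made-up name:

```python
import shutil
from pathlib import Path


def copy_page_only(html_path: str, dest_dir: str) -> Path:
    """Copy a single .htm file into dest_dir, leaving any *_files folder behind.

    shutil.copy2 copies exactly one file (plus its timestamps); it has no
    notion of a companion folder.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(html_path, dest))
```

The copy will show broken image links when opened, which is exactly the "page without its images" view the original question was after.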

It’s a kludge, is what it is. Note that when the browser saves the HTML file, it has to modify the contents so that image tags and so on in the file point to the companion directory. It then relies on the OS to keep the two together. It would be far more sensible to create an archive format for “save” that the browser knows how to open and display as a unit (could be a modified .zip with some metainformation, much like a Java JAR, which allows you to manipulate it with zip, if you like). Then, you could actually mail the whole ball of wax to somebody, and the OS wouldn’t have to do anything special.

yoyodyne · August 26, 2009, 3:03pm

Simpsons did it.

jjimm · August 26, 2009, 3:09pm

Internet Explorer already does this, as a .mht file.
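
For the curious, an .mht file is essentially a MIME `multipart/related` message: the HTML is the first part and each resource follows as its own part, labelled with the URL it came from. A rough sketch using Python's standard `email` package; `build_mht` is a made-up name, and treating every image as PNG is an assumption to keep the example short:

```python
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from typing import Dict


def build_mht(html: str, page_url: str, images: Dict[str, bytes]) -> bytes:
    """Bundle a page and its images into one MIME multipart/related message."""
    msg = MIMEMultipart("related", type="text/html")

    # The root part is the page itself, tagged with its original URL.
    root = MIMEText(html, "html", "utf-8")
    root["Content-Location"] = page_url
    msg.attach(root)

    # Each image becomes its own part, keyed by the URL the page refers to.
    for url, data in images.items():
        part = MIMEImage(data, _subtype="png")  # assumption: all images are PNG
        part["Content-Location"] = url
        msg.attach(part)

    return msg.as_bytes()
```

Because everything lives in one file, there is no companion folder for the OS to keep track of.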

yabob · August 26, 2009, 3:19pm

Interesting. I’m not that surprised that something exists - it seems like the sensible thing to do, but the catch is that it hasn’t become standardized as the “save” operation that everybody’s familiar with in their browser.

Piggybacking on zip rather than MIME might be a bit more flexible, though. I’m thinking in terms of having a manifest like a Java JAR, which could contain a whole translation table for how the resources in the file were packaged from their original URLs, instruct the browser which resource to open by default, support multiple URLs in the archive, and be enhanced with other metainformation which might make sense.
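
A rough sketch of what such a zip-with-manifest could look like. Everything here is hypothetical: the `manifest.json` name, the `res/N` member layout, and both function names are invented for illustration and do not describe any existing browser format.

```python
import json
import zipfile
from typing import Dict


def save_page_archive(archive_path: str, default_url: str,
                      resources: Dict[str, bytes]) -> None:
    """Pack resources into a zip with a URL-to-member translation table."""
    manifest = {"default": default_url, "resources": {}}
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for index, (url, data) in enumerate(resources.items()):
            member = f"res/{index}"  # hypothetical layout inside the archive
            zf.writestr(member, data)
            manifest["resources"][url] = member
        # The manifest tells a reader what to open first and where each
        # original URL ended up inside the archive.
        zf.writestr("manifest.json", json.dumps(manifest, indent=2))


def load_default(archive_path: str) -> bytes:
    """Return the bytes of the resource the manifest marks as the default."""
    with zipfile.ZipFile(archive_path) as zf:
        manifest = json.loads(zf.read("manifest.json"))
        return zf.read(manifest["resources"][manifest["default"]])
```

Since the result is an ordinary zip, it can still be inspected or repacked with any zip tool, which is the flexibility argued for above.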
