Is it possible to recover a file from the Windows swap file?

Suppose I edit a Word file on Windows 7. I close the file. Suppose further that Windows has put that file from RAM into the paging file at some point and it’s still there at the moment I close Word. Is it possible to recover that Word file from the paging file? This is a question about security, whether someone else using the same computer could recover my sensitive data by hacking into the paging file. I assume a reboot will wipe out the paging file.

This is a hypothetical; I am just wondering about how the swap file works and not really trying to protect data.

Anything that has been written to the hard disk is recoverable. The page file is like any other file in this sense: deleting it does not remove it from the hard disk. If the ones and zeroes that comprise your Word document were written to any part of the disk, someone determined enough could find them, unless they have been replaced with something else. In other words: you can’t count on a reboot to wipe out the page file data.

There are utilities out there that will overwrite your hard disk so that software-based forensic utilities (like EnCase, which a lot of law enforcement agencies use) will not be able to recover the data, but you should never assume any of it is bulletproof.

Hell, a determined enough hacker with the right equipment could recover the data from your computer’s RAM if he were of a mind to. RAM does not lose its charge instantly, and chilling the chips (literally; I believe they use liquid nitrogen) delays the discharge enough that an attacker can recover such sensitive information as encryption keys, or information stored in RAM by whatever programs happen to be open (so if your browser’s “private browsing mode” was active, the sites you were visiting could be determined).

What CoastalMaineiac says is true, but the actual problem may not be what you think. If the system was paging out the doc, it would use the actual Word file for that process, not the system page file.

The system would use the page file for paging out memory pages as needed. This means that: one, unless your Word doc was very small, only pieces of it would be paged out; two, those pieces have a good chance of being disconnected in page-sized fragments throughout the page file; and three, the pieces would most likely be some intermediate representation of the doc, as the internal working version of a Word doc is different than the version stored in a file on disk. Nonetheless, someone looking most likely would be able to find sections of text.

You have to consider the resources of the wannabe recoverer. The NSA, FBI, RCMP, KGB, MI6, whatever agencies are going to have a lot more money and resources available to try to recover your data if they really want it. Your little sister, not so much.

This is what I was thinking. I know that, in general, even a deleted file can be recovered. But I have no idea what format the swap file is stored in, and how difficult it would be to actually read it. I am a software development manager with a Comp Sci degree and studied all this stuff. I wrote a paging system in Data General assembly language for an OS class project. But my degree is from 1979 before anybody even heard of Windows. I imagine that the swap file is an image of memory blocks with each having some kind of header and probably fragmented as well. So my question isn’t can you recover the swap file from the hard drive after it’s been deleted; my question is if you had easy access to it, how sophisticated would you have to be to get specific data out of it? To do it from scratch you would have to be a Windows OS programmer but maybe there are hacker tools out there to simplify the task. I do not want to actually do this so don’t link to any such tools. I am just wondering what it would take.

The page file is raw data. The headers that keep track of this stuff only reside in memory, as the page file is non-persistent. It would be trivially easy for someone to look at that data with a hex editor, but there would be so much chaff that it would be hard to locate something unless you knew what you were looking for ahead of time. Even then, as in the case of your Word doc, it would be both incomplete and non-contiguous. With enough time and the right tools, you could likely recover what there is to recover. It does not require sophisticated utilities or programming knowledge, just sweat.
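To give a feel for how little sophistication the "sifting through chaff" step takes, here is a rough sketch of the kind of scan a hex-editor search boils down to. It is not a page file reader: it just looks for runs of printable characters in any blob of raw bytes, in both plain ASCII and UTF-16LE. The function name and the minimum run length are made up for illustration.

```python
import re

def find_text_runs(blob: bytes, min_chars: int = 6):
    """Yield (offset, text) for runs of printable characters in raw bytes."""
    # Printable ASCII stored one byte per character.
    ascii_run = re.compile(rb"[\x20-\x7e]{%d,}" % min_chars)
    # The same characters stored as UTF-16LE: each one followed by a zero byte.
    utf16_run = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_chars)
    for match in ascii_run.finditer(blob):
        yield match.start(), match.group().decode("ascii")
    for match in utf16_run.finditer(blob):
        yield match.start(), match.group().decode("utf-16-le")
```

Anything it finds would still be fragments in no particular order, which is the "sweat" part.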

Remember to practice safe hex, always use a virus scanner and remember to cover your disks.

Don’t assume that. There is no need for an OS kernel to wipe the paging file on startup or shutdown. It keeps track of what pages it has written to each location of the paging file, so there is no risk of accidentally reading old data from a previous boot and therefore no need to wipe the paging file. Some systems may have a setting that security-conscious users can enable to wipe the paging file on shutdown (for Windows, see http://support.microsoft.com/kb/314834), at the cost of shutdown taking longer.
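For the curious, the Windows setting is, as far as I know, a registry DWORD named ClearPageFileAtShutdown under the Memory Management key. The sketch below only reads it and changes nothing. The function name is made up, and it assumes the value lives at the path shown; on anything other than Windows it simply returns None.

```python
import sys

# Assumed location of the setting; treat the path as an assumption to verify.
KEY_PATH = r"SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management"
VALUE_NAME = "ClearPageFileAtShutdown"

def read_clear_pagefile_flag():
    """Return the DWORD value (normally 0 or 1), or None if it cannot be read."""
    if sys.platform != "win32":
        return None  # winreg only exists on Windows
    import winreg
    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, KEY_PATH) as key:
            value, _value_type = winreg.QueryValueEx(key, VALUE_NAME)
            return value
    except OSError:
        # Key or value missing, or access denied.
        return None
```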

Word may memory-map the document to read it (so that if memory pressure requires, the pages that have been read into memory may simply be dropped and re-read from the source file later). Or it may read the file manually into a buffer using I/O function calls - in which case the document contents could be temporarily written to the paging file. Hard to say for sure without seeing the source code to Word.
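The two ways of getting at a file's contents look like this in a small sketch. It says nothing about what Word really does; the function name is invented. The read() call copies the bytes into the program's own memory, while the mmap view is backed by the file itself.

```python
import mmap

def read_both_ways(path, prefix_len=16):
    """Return (full contents via read(), first bytes via a read-only mapping)."""
    with open(path, "rb") as f:
        # Ordinary I/O: the data is copied into a private buffer owned by
        # this process. If the OS evicts it, it has to go to the paging file.
        buffered = f.read()
        # File-backed mapping: clean pages can be dropped and later re-read
        # from the file, so they need not go to the paging file.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as view:
            mapped_prefix = view[:prefix_len]
    return buffered, mapped_prefix
```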

Additionally, changes that you make to the document while editing could not be made directly to a memory-mapped view of the file. If they were, the changes would be subject to being written to disk at any time at the OS’s whim - defeating the point of Word having a ‘Save’ button.
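A quick way to see why edits through a shared mapping would defeat the 'Save' button: in the sketch below (names invented, nothing to do with Word's real code), a write through an ACCESS_WRITE mapping ends up in the file without any explicit save, while the same write through a private copy-on-write mapping (ACCESS_COPY) never reaches the file.

```python
import mmap

def edit_through_mapping(path, new_prefix, access):
    """Overwrite the start of the file via a mapping; return the file's bytes afterwards."""
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), 0, access=access) as view:
            view[:len(new_prefix)] = new_prefix
            if access == mmap.ACCESS_WRITE:
                # Shared mapping: push the modified pages to the file now.
                # Without this the OS would still write them back on its own schedule.
                view.flush()
    with open(path, "rb") as f:
        return f.read()
```

With a file containing "original text", the ACCESS_COPY call leaves the file alone and the ACCESS_WRITE call changes it on disk.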

Finally, Office documents now are actually Zip archives. Word may memory-map the file, but it must decompress the Zip archive to a normal memory buffer that is subject to being written to the paging file.
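You can see the Zip point with nothing but the standard library. This sketch builds a toy archive with a single word/document.xml member (a real .docx has more parts, so the toy is not a valid Word file) and then decompresses that member. Both helper names are made up. The bytes returned by extract_member sit in an ordinary memory buffer, which is the copy that could be paged out.

```python
import io
import zipfile

def make_toy_docx(body_xml: bytes) -> bytes:
    """Build an in-memory Zip archive holding only word/document.xml."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("word/document.xml", body_xml)
    return buf.getvalue()

def extract_member(docx_bytes: bytes, member: str = "word/document.xml") -> bytes:
    """Decompress one member of the archive into a plain bytes buffer."""
    with zipfile.ZipFile(io.BytesIO(docx_bytes)) as archive:
        return archive.read(member)
```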