The leak of some MS Windows source code is causing a lot of concern. Judging by the uproar, it seems that Windows’ code has been kept entirely secret until now. But it seems hard to believe that, given all the hackers/crackers et cetera who don’t like Microsoft or who just want to mess with people’s computers, no one has ever been able to hack/crack a Windows install disk and gain access to the whole code. Is that true? Has no one ever managed to get past whatever encryption or cipher Microsoft uses and compromise a Windows CD? How does Microsoft (or any company selling expensive software it doesn’t want copied and/or freely distributed) protect its source code when it essentially gives you the entire source code on disks? What stops a hacker/cracker from taking a Windows disk home and finding some way to view the entire code? If they could, then this latest leak of some code wouldn’t be making headlines. So what makes Windows and other software sold on disk so secure?
Windows install disks do not have source code on them. Why would they?
I have only a limited understanding of computer programming, but what you get on an install CD is a compiled version of the source code. Microsoft translates all the source code from human-understandable programming languages into computer-understandable instructions, groups them into the various classes of files, and then places them on the disk and sells it to you. From these translated instructions it is practically impossible to reverse engineer what the original code was. That is why getting hold of a retail disk has never given anyone the MS source code, and why the leak is a big deal.
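To make the "translated instructions" idea concrete, here is a rough sketch in Python. It is only an analogy: Python compiles to bytecode for its own virtual machine rather than to the native machine code found on a Windows CD, and Python bytecode keeps more information than native code does. The function add_tax and its parameters are made up for illustration.

```python
# Human-readable source: this is what the programmer writes.
SOURCE = "def add_tax(price, rate):\n    return price * (1 + rate)\n"

# compile() translates the source text into a code object for the Python VM.
module_code = compile(SOURCE, "<example>", "exec")

# Running the compiled module defines add_tax in this namespace.
namespace = {}
exec(module_code, namespace)
add_tax = namespace["add_tax"]

# The instructions the interpreter actually executes are a string of bytes.
raw = add_tax.__code__.co_code

print(SOURCE)
print(raw.hex())
```

The first thing printed is readable by anyone who knows the language; the second is a run of hexadecimal bytes that still does exactly the same job but tells a casual reader nothing.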
- “Source code” is what the programmer writes, and it is easy (for a programmer) to understand. It then gets fed through a compiler and linker, and the result is “executable code” that your computer understands but that is not directly human-readable. What you get on an MS CD is executable code only.
- Now there are decompilers that can attempt to “convert back” executable code into source code. The practical problem is that an operating system uses a vast number of variables, and in the source code those variables are given names that describe what they control. When source code is converted into executable code, the variable names are normally not preserved at all; instead, the variables are reduced to numbered locations. Some of them hold character strings and so can be identified, but most do not. So you can run a decompiler on an OS kernel, but it will identify tens of thousands of different variables, and you as the would-be cracker have no easy way to determine which variable controls what, because the original descriptive names are not recovered.
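A small sketch of what losing the names does to readability. This is not a real decompiler; it is a toy that takes a made-up Python function (total_price and its variables are invented for the example) and swaps every descriptive name for a numbered placeholder, which is roughly the state a decompiler's output is in. It assumes Python 3.9 or later for ast.unparse.

```python
import ast

# Hypothetical original source, with descriptive names.
SOURCE = """
def total_price(unit_price, quantity, tax_rate):
    subtotal = unit_price * quantity
    return subtotal * (1 + tax_rate)
"""


class StripNames(ast.NodeTransformer):
    """Replace every identifier with a numbered placeholder (v0, v1, ...)."""

    def __init__(self):
        self.mapping = {}

    def _alias(self, name):
        # Give each distinct name the next free number, reusing it afterwards.
        if name not in self.mapping:
            self.mapping[name] = f"v{len(self.mapping)}"
        return self.mapping[name]

    def visit_FunctionDef(self, node):
        node.name = self._alias(node.name)
        self.generic_visit(node)
        return node

    def visit_arg(self, node):
        node.arg = self._alias(node.arg)
        return node

    def visit_Name(self, node):
        # Note: this toy would also rename builtins; the example uses none.
        node.id = self._alias(node.id)
        return node


def strip_names(source):
    """Return the same program with all descriptive names removed."""
    tree = StripNames().visit(ast.parse(source))
    return ast.unparse(tree)


stripped = strip_names(SOURCE)
print(stripped)
```

The stripped version computes exactly the same thing, but nothing in it says that one value is a price and another a tax rate. Now imagine that across tens of thousands of variables.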
To give an example of how hard it is to understand decompiled code, assume someone showed you an immense novel with a few basic words (articles and such) readable but all the rest labelled by type and number. E.g.,
“The Noun237 Verb180 a Adjective2003 Noun624.”
for thousands of pages.
While it can be parsed, understanding the actual meaning is very hard.
Crackers have to know where to look to find possible holes to exploit. Even though the leaked code is quite large, I expect it will take a single cracker less than a week to find code worth targeting. After that, it’s just a matter of days to find an exploit.
Even though the code is just for 2000 and NT4, XP probably shares an enormous percentage of the same code. So a lot of machines are going to be compromised in the coming months. And there could be multiple new attacks every single day. It will make recent worms/viruses seem trivial by comparison.
Note: as usual many of the newspaper articles about this are idiotic. One I read today pointed out how difficult it would be to recompile the source code after modifying it. ??? Crackers won’t be modifying the code. They’ll be designing viruses and worms to attack it. The source code lets them find the weak spots.
One of the reasons MS is so worried about the source code being released is that those who have legitimately seen it have consistently reported that it is really lousy code. But that’s been whispered in semi-secret. Now that it will all be quasi-open to see, businesses, governments and such will be appalled at the revelations.
Keep in mind that MS OSes are used in those already dodgy voting machines. This will hopefully kill the use of those.
Even though I don’t subscribe to the whole liberal hippy-ness and feel-goodness of the open source movement, I do believe in it, and this is a damn good example of when open source is a great, great thing. Information systems professionals know that security through obscurity is essentially no security at all. The fact that Microsoft is worried about exploits is enough to demonstrate that there’s the possibility of an exploit. When source code is open, you have the whole world working to ensure that exploits aren’t possible. And when they do slip through, they can be fixed practically instantly.
I have an XP machine, but my main computers are my Macs. “Darwin” is the Unix-like layer beneath the human interface, i.e., it does the brunt of the work not related to eye-candy. It’s open-source. Is Apple worried?
(I’m not trying to be pro-Apple [even though I am], but only wish to point out the difference open code makes).
Microsoft has been compromised before, and its source code has been seen. I don’t know if it has been compressed into an archive and made widely available before.
This article from October 2000 provides one reference.