OK, I just know this is a dumb question, but I need some ignorance dispelled here.
I recall instances where some proprietary part of Windows has been leaked on to the net, allowing hackers to attack vulnerabilities, etc.
My question (and here comes the dumb bit): how is anything in Windows a secret when the code is all there on any computer that uses it, and can presumably be disassembled and examined at leisure?
What gets leaked is source code, which is different from the runnable software that the computer actually executes. Software source code is written in higher-level languages such as C, C++, and C#, which are much easier to understand than the executable code they are compiled into.
The runnable software can be reverse-engineered into source code, but the result of that process is still pretty hard to understand.
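#include <stdio.h>   // printf
#include <stdlib.h>  // abs
#include <tchar.h>

// Tally[v - 1] counts how often v came out as the smallest of four values.
int Tally[6] = { 0, 0, 0, 0, 0, 0 };

// Minimum of two ints without a branch: (a + b - |a - b|) / 2.
int min( int a, int b )
{
    return ( a + b - abs( a - b ) ) / 2;
}

void FindMin( int a, int b, int c, int d )
{
    Tally[ min( min( a, b ), min( c, d ) ) - 1 ]++;
}

int _tmain( int argc, _TCHAR* argv[] )
{
    // Run through every combination of four values from 1 to 6.
    for ( int i = 1; i < 7; ++i )
    {
        for ( int j = 1; j < 7; ++j )
        {
            for ( int k = 1; k < 7; ++k )
            {
                for ( int l = 1; l < 7; ++l )
                {
                    FindMin( i, j, k, l );
                }
            }
        }
    }

    // Print each value alongside its count.
    for ( int n = 1; n < 7; ++n )
    {
        printf( "%d: %d\n", n, Tally[n - 1] );
    }
    return 0;
}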
And here’s what that actually looks like as an executable:
I wrote that code as a test of some kind, and I don’t really remember what it does. Figuring it out from the executable is significantly more difficult, as you might imagine.
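If you want to see for yourself what an executable looks like as raw bytes, here is a minimal hex-dump sketch in C++. The helper names readBytes and hexDump are my own, not part of any library, and hexDump assumes perLine is greater than zero.

```cpp
#include <cstddef>
#include <cstdio>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Read a whole file into memory as raw bytes.
// Returns an empty vector if the file cannot be opened.
std::vector<unsigned char> readBytes(const std::string& path)
{
    std::ifstream in(path, std::ios::binary);
    return std::vector<unsigned char>(std::istreambuf_iterator<char>(in),
                                      std::istreambuf_iterator<char>());
}

// Format bytes as lowercase hex, perLine bytes per row,
// each row prefixed with the offset of its first byte.
std::string hexDump(const std::vector<unsigned char>& bytes, std::size_t perLine = 16)
{
    std::string out;
    char buf[32];
    for (std::size_t i = 0; i < bytes.size(); ++i)
    {
        if (i % perLine == 0)
        {
            std::snprintf(buf, sizeof buf, "%08zx ", i);
            out += buf;
        }
        std::snprintf(buf, sizeof buf, " %02x", static_cast<unsigned>(bytes[i]));
        out += buf;
        // End the row when it is full or when the data runs out.
        if (i % perLine == perLine - 1 || i + 1 == bytes.size())
            out += '\n';
    }
    return out;
}
```

Calling something like std::fputs(hexDump(readBytes("program.exe")).c_str(), stdout) from main, with "program.exe" standing in for whatever file you have handy, prints rows of offsets and hex pairs. A Windows executable starts with the bytes 4d 5a (the letters "MZ"), and after that it is mostly wall-to-wall numbers.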
Just to add this… ‘disassembly’ of an executable is a very basic transform of the machine code instructions into marginally human-readable commands… taking the form of a huge mass of instructions like ‘load from memory address 55323 into register 17. Add registers 12 and 19 into register 3. If register 7 is anything but zero, branch ahead 234 lines. Store from register 22 into memory address 28341. Now jump to line 48124.’
This will tell you, kind of, the stuff that a program is actually doing and how it is doing it, but in such a disorganized way that it’s hard to track the significance of anything that’s going on… you can see plenty of trees, but can’t get the lay of the forest.
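To show how mechanical that transform is, here is a toy disassembler in C++ for a made-up machine. Everything about the instruction set is an assumption for illustration: four bytes per instruction (opcode, then three operand bytes), the opcode numbers, and the mnemonics. Real processors are far messier.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Turn the bytes of a hypothetical 4-bytes-per-instruction machine into text.
// Layout (invented for this sketch): opcode, operand a, operand b, operand c.
// Where an address or distance is needed, b and c form a 16-bit number.
std::string disassemble(const std::vector<unsigned char>& code)
{
    std::string out;
    for (std::size_t i = 0; i + 4 <= code.size(); i += 4)
    {
        const int op = code[i];
        const std::string a = std::to_string(static_cast<int>(code[i + 1]));
        const std::string b = std::to_string(static_cast<int>(code[i + 2]));
        const std::string c = std::to_string(static_cast<int>(code[i + 3]));
        const std::string imm = std::to_string(code[i + 2] * 256 + code[i + 3]);

        switch (op)
        {
        case 0x01: out += "load r" + a + ", [" + imm + "]\n"; break;        // memory -> register
        case 0x02: out += "add r" + a + ", r" + b + ", r" + c + "\n"; break; // a = b + c
        case 0x03: out += "bnz r" + a + ", +" + imm + "\n"; break;           // branch if not zero
        case 0x04: out += "store r" + a + ", [" + imm + "]\n"; break;        // register -> memory
        case 0x05: out += "jmp " + imm + "\n"; break;                        // unconditional jump
        default:   out += "??? (opcode " + std::to_string(op) + ")\n"; break;
        }
    }
    return out;
}
```

Feed it twenty bytes and you get five lines along the lines of "load r17, [55323]" and "jmp 48124". Each line is a faithful translation of what the machine does, and none of them says why.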
Some programming languages have had decompilers worked out, which can go one step further than disassembly and make a guess at expression statements, if-then and looping structures, functions and subprocedures. However, certain elements of the source code normally never make it into the compiled program, such as programmer comments, local variable names, and most function and subprocedure names (unless debugging symbols were shipped along with it). Thus, decompilers can’t see the comments at all, and they have to make up their own variable names, etcetera. This means that it can still be enormously difficult to figure out what the decompiled source code was trying to do.
(Heck, sometimes it’s even hard to tell that with the original source code, depending on how good the author was as far as writing self-documenting code - choosing variable names that mean something, etcetera.)
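As a rough illustration of what gets lost, here is the same small function twice in C++: once the way a programmer might write it, and once the way a decompiler might hand it back. The second version is written by hand in the style such tools tend to produce. The names sub_401000, a1, a2 and v3 are made up for this sketch and are not real decompiler output.

```cpp
#include <cstdlib>

// As written by the programmer: the name and the comment carry the intent.
// Returns the smaller of two ints without using a comparison.
int min_of(int a, int b)
{
    return (a + b - std::abs(a - b)) / 2;
}

// As it might come back from a decompiler (hand-written imitation):
// the arithmetic survives, but the name, the comment, and the call to
// abs have been replaced by placeholders and raw steps.
int sub_401000(int a1, int a2)
{
    int v3 = a1 - a2;
    if (v3 < 0)
        v3 = -v3;
    return (a1 + a2 - v3) / 2;
}
```

Both functions return the same result for every input that doesn't overflow, but only the first one tells you it is computing a minimum. Now imagine that for millions of lines.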
Considering the millions of lines of code there must be in Windows, and if it all presents in the same way as ultrafilter’s example, I can see one might have a slight problem disentangling it!
Well, you might actually be able to figure out ultrafilter’s code. As shown, the compiled version is gibberish, but if you format it properly, it makes a lot more sense. Since the logic is pretty simple, you might be able to follow it.
That said, the logic in Windows is not simple. What it does is not simple, and neither is how it does it, since the logic is spread across dozens of libraries.