From my understanding, a malware/virus program N will have some property P that endows it with its nefarious function. Why can't an anti-malware/anti-virus program be made powerful enough to just scan for P? Is it because malware/virus authors are constantly finding new ways to hide P? Or do anti-malware/anti-virus programs merely scan programs to see whether they are identical to an item in their library of nefarious agents?
A lot, but not all, of what they do is check whether a program matches an item in their library of nefarious agents.
Back in the ancient days of personal computers (as in folks were using MS-DOS) there were two types of virus scanners. The first type checked to see if the program was doing certain types of things that viruses did, like overwriting the boot sector, attaching code to files, etc. This type of virus checker did not need to be updated as it just looked for the things that the virus did, as per the OP.
The second type of virus scanner looked for a virus’s “fingerprint”, certain combinations of bytes within the virus that could be used to identify it. These had to be updated constantly as new viruses were found so that their fingerprints could be added to the scanner’s database.
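To make the "fingerprint" idea concrete, here is a minimal sketch of that kind of scanner. The signature names and byte patterns are made up for illustration and do not correspond to any real virus or any real product's database format.

```python
from typing import List

# Hypothetical signature database: name -> byte pattern.
# These patterns are invented for this example.
SIGNATURES = {
    "Example.A": bytes.fromhex("deadbeef0102"),
    "Example.B": b"\x90\x90\xeb\xfe",
}

def scan(data: bytes) -> List[str]:
    """Return the names of all signatures found anywhere in data."""
    return sorted(name for name, pattern in SIGNATURES.items() if pattern in data)

clean = b"just an ordinary program"
infected = b"header" + SIGNATURES["Example.B"] + b"payload"

print(scan(clean))     # []
print(scan(infected))  # ['Example.B']
```

Anything not already in SIGNATURES sails straight through, which is why this kind of scanner needs constant updates.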
What people found was that the second type was much more effective, as it would catch viruses whose programmers had come up with new and interesting ways of creating their code that hadn’t been anticipated by the folks who wrote the first type of virus scanner. So now all modern virus scanners work this way. Virus scanners have also integrated some of the techniques of the first type of scanner into their software, but mostly it’s the fingerprints (generally called “virus definitions” these days) that catch the bugs.
This isn’t true any longer. There are tens of millions of new virus variations released every year and no way for signatures to keep up.
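A toy illustration of why variants are such a problem for signatures: the same payload, re-encoded with a different one-byte XOR key, no longer contains the fingerprint taken from the first copy. This is only a sketch under simplifying assumptions. The payload is a harmless stand-in string, and real polymorphic malware is far more elaborate than a single XOR.

```python
def xor_encode(payload: bytes, key: int) -> bytes:
    """XOR every byte with a one-byte key (applying it twice decodes)."""
    return bytes(b ^ key for b in payload)

payload = b"PRETEND-MALICIOUS-PAYLOAD"  # harmless stand-in for real code

variant_a = xor_encode(payload, 0x41)
variant_b = xor_encode(payload, 0x42)

# Fingerprint taken from the first variant the analysts happened to see.
signature = variant_a[:8]

print(signature in variant_a)                  # True
print(signature in variant_b)                  # False
print(xor_encode(variant_b, 0x42) == payload)  # True
```

Both variants decode to the same payload, but only the one the signature was cut from gets caught.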
The definitions-based antiviruses were NOT more effective. They just were able to give a name to the viruses.
Thus, Antivirus A (behavior-based) would say “There are three viruses on your system and they have been cleaned.” Antivirus B would say “There are three viruses on your system: NY-B, Stoned, and Michelangelo. They have been cleaned.” Computing magazines would call Antivirus B the better scanner. Companies would buy Antivirus B, so Antivirus A would either have to switch to the definitions-based model or go out of business.
That’s been the model ever since.
And the most effective anti-virus is the “don’t click on it or go there” button, which the operators of computers seem unable to use.
I’m not 100% certain, but I’m pretty sure that a program as described by the OP would be impossible, as a consequence of the undecidability of the halting problem. If you can’t even tell whether a program will ever produce output, then you surely can’t tell what it’ll output.
This is true. When I taught Theory of Computation, I’d sometimes do a first/second day lecture about how there cannot exist a perfect anti-virus detection program. It’s a slightly disguised version of the halting problem.
Basically any attempt to write a program to determine anything non-trivial about the external behavior of programs is doomed (this is essentially Rice’s theorem).
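The trick behind that argument can be shown in miniature. This is a sketch of the diagonal idea, not a formal proof: "programs" here are just Python functions that return the string "harm" or "benign", and the names Program, Detector, make_contrarian and fooled are all made up for the example.

```python
from typing import Callable

Program = Callable[[], str]           # returns "harm" or "benign" (stand-ins only)
Detector = Callable[[Program], bool]  # True means "this program will do harm"

def make_contrarian(detector: Detector) -> Program:
    """Build a program that asks the detector about itself, then does the opposite."""
    def contrarian() -> str:
        return "benign" if detector(contrarian) else "harm"
    return contrarian

def fooled(detector: Detector) -> bool:
    """True if the detector's verdict on its contrarian program is wrong."""
    program = make_contrarian(detector)
    verdict = detector(program)
    actual = program() == "harm"
    return verdict != actual

print(fooled(lambda p: True))   # True
print(fooled(lambda p: False))  # True
```

Any detector that answers without running the program gets the contrarian wrong. A detector that tries to answer by running the program ends up calling itself forever (Python gives up with a RecursionError), which is the halting problem showing through.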
True, but you can watch it while it’s actually being executed and ring alarm bells if it does certain general sorts of things.
This is a lot harder than it looks, both in theory and especially in practice. If there’s a new exploit, the program might make an undetectable change to either the monitoring program or the interface it’s using to monitor. So the monitoring program continues to report “Everything’s hunky-dory!” while your system gets turned to Swiss cheese.
One of the nastiest exploits out there writes to the firmware of the hard drive. No virus scanner using regular disk access can see it. (And HD makers have failed to provide software and hashes so that others can check if this has happened to them. And if they did, what if the viruses checked for the firmware checking software and …)
There’s also the issue that anything a virus does maliciously, some software needs to do legitimately. How do you distinguish between, say, the legitimate registry edits and the illegitimate ones?
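Here is a small sketch of that ambiguity with a rule-based behavior monitor. The event records, process names and action names are hypothetical. A real monitor would hook operating system APIs rather than read a list of dictionaries.

```python
from typing import Dict, List

# Hypothetical set of actions the monitor treats as virus-like.
SUSPICIOUS_ACTIONS = {"write_registry_run_key", "overwrite_boot_sector"}

def alerts(events: List[Dict[str, str]]) -> List[str]:
    """Return the names of processes that performed a suspicious action."""
    return [e["process"] for e in events if e["action"] in SUSPICIOUS_ACTIONS]

events = [
    {"process": "installer.exe", "action": "write_registry_run_key"},  # legitimate auto-start entry
    {"process": "dropper.exe", "action": "write_registry_run_key"},    # malicious persistence
    {"process": "notepad.exe", "action": "read_file"},
]

print(alerts(events))  # ['installer.exe', 'dropper.exe']
```

The rule sees the same action from both processes, so it either flags the legitimate installer as well or lets the dropper through.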
As Yoyodyne noted, because of the vast number of viruses, let alone the types of malware out there, you can’t rely on just signatures. You need different types of security.
If it were just signatures, you can imagine the anti-virus software writers saying: stop everything that looks like “x” (and so on, into the millions) now.
Then the bad guys wrote trojan horses and got you to open and launch things that looked harmless (my vacation photos, naked wife). And security companies wrote code to stop you from doing that.
Then the bad guys said, well, we’ll tell you to fill in this form from “your bank”, and you did it, and security companies tried to stop that.
Then the bad guys looked for holes in your browser, and some of them broke into your machine that way, and security companies started paying attention to vulnerability protection.
You didn’t like spam. Anti-spam was born.
There’s actually a whole lot more to this, but hopefully you’re getting the picture. For every hole that gets plugged, a new hole somewhere opens. Teams of people on both sides, good and bad, spend all day every day trying to anticipate the next gap and either exploit or block it.
Why? Because there’s lots and lots and lots of money at stake. Welcome to the online economy.