How do hackers test for virtual machines?

I was recently reading an article about Google’s efforts to keep its app store free from malware. It’s mostly an automated process and very efficient but some malware does get through to the store in spite of their safeguards. The article said that the hackers were extremely good at hiding their code and that the code could lie dormant if it detected that it was being run in a virtual machine.

But how could it do that? If a virtual machine is an exact copy of the original how could the code tell the difference?

They ask nicely.

VMs aren’t exact copies. They advertise that they’re VMs, and they have standardized ways of doing so. I agree this is a problem, not just for security but for basic software development: Ideally, you should be able to run a VM as a guest inside a VM, so you can develop a new VM inside a working VM and be certain that the new VM will work the same way on real hardware as it does on the VM it was tested on.
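
To make that concrete, here’s roughly what the “advertising” looks like on x86: the CPUID instruction has a “hypervisor present” bit, and hypervisors hand back a vendor string on a dedicated leaf. A minimal sketch, assuming an x86-64 machine and GCC or Clang (the intrinsics come from `<cpuid.h>`):

```c
/* Minimal sketch: how a guest can ask the CPU "am I under a hypervisor?"
 * Assumes x86-64 and GCC/Clang (uses the intrinsics from <cpuid.h>). */
#include <cpuid.h>
#include <stdio.h>
#include <string.h>

int main(void) {
    unsigned int eax, ebx, ecx, edx;

    /* Leaf 1: bit 31 of ECX is the "hypervisor present" bit.
     * Real hardware leaves it 0; hypervisors conventionally set it. */
    __get_cpuid(1, &eax, &ebx, &ecx, &edx);
    if (!(ecx & (1u << 31))) {
        puts("No hypervisor advertised.");
        return 0;
    }

    /* Leaf 0x40000000: hypervisors return a 12-byte vendor ID in EBX:ECX:EDX,
     * e.g. "KVMKVMKVM", "VMwareVMware", "Microsoft Hv", "VBoxVBoxVBox". */
    char vendor[13] = {0};
    __cpuid(0x40000000, eax, ebx, ecx, edx);
    memcpy(vendor + 0, &ebx, 4);
    memcpy(vendor + 4, &ecx, 4);
    memcpy(vendor + 8, &edx, 4);
    printf("Hypervisor advertised: %s\n", vendor);
    return 0;
}
```

On bare metal it prints the first message; under KVM, VMware, Hyper-V, or VirtualBox you’d typically see their vendor IDs.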

Now, to be clear: VMs aren’t emulators. A VM leverages hardware features to run an OS inside an OS, such that most opcodes run directly on the hardware with essentially zero overhead. An emulator, OTOH, is a software recreation of some piece of hardware, and a sufficiently faithful emulator is, in principle, indistinguishable from the real thing.

That’s where I was mistaken. I was thinking of a VM as similar to an emulator. The question arises though as to why in that case they don’t use emulators to test for malware if they really can’t be distinguished from the real machine? (I may be asking a really dumb question there but my expertise in these matters is non-existent).

Well, set aside the fact that emulators might be distinguishable from real hardware if they’re not perfect emulations. (Timing is the classic tell: a given operation takes one amount of time on real hardware and a different amount on emulated hardware, and if the software being run has access to an accurate clock it can figure that out.) Beyond that, emulation is inherently more resource-intensive than running the software on real hardware, because the host CPU has to execute every single emulated opcode itself, model every single piece of emulated hardware itself, and possibly fake up bits of the outside world, depending on what tests you want to run.
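
The timing tell can be surprisingly crude and still work. A toy sketch of the idea, assuming x86-64 with GCC or Clang; the instruction being timed (CPUID) is one that hypervisors have to intercept, and the threshold at the end is made up purely for illustration:

```c
/* Toy sketch of a timing tell: time a burst of CPUID instructions with the
 * TSC. On bare metal the cost is small and stable; under emulation (or a
 * trapping hypervisor) it tends to be much larger and noisier.
 * Assumes x86-64 GCC/Clang; the threshold below is made up for illustration. */
#include <stdint.h>
#include <stdio.h>
#include <x86intrin.h>   /* __rdtsc */
#include <cpuid.h>

int main(void) {
    unsigned int a, b, c, d;
    uint64_t start = __rdtsc();
    for (int i = 0; i < 1000; i++)
        __get_cpuid(0, &a, &b, &c, &d);   /* trapped by hypervisors, slow under emulation */
    uint64_t cycles = __rdtsc() - start;

    uint64_t per_call = cycles / 1000;
    printf("~%llu cycles per CPUID\n", (unsigned long long)per_call);

    /* Hypothetical threshold: a few hundred cycles is typical on bare metal,
     * thousands+ suggests trapping/emulation. Real malware calibrates this. */
    if (per_call > 1000)
        puts("Timing looks virtualized/emulated.");
    else
        puts("Timing looks like bare metal.");
    return 0;
}
```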

(The rule of thumb in the emulation community is that you need to wait a couple generations before it becomes feasible to run an emulated system on current-generation hardware. It takes that long for hardware to get sufficiently faster than what it’s emulating that the inherent inefficiencies don’t make the experience of using the emulated system actively unpleasant.)

So. Running a complex program under emulation is a resource-intensive task, and doing so while checking for non-trivial behavior is doubly so. And even so, the malware is going to try its damnedest to do things you never contemplated, so it will be a serious stress test.

Thank you, Derleth, I see the problem now.

The definition of “virtual machine” is very wide.

A Java virtual machine doesn’t emulate hardware at all; the program simply can’t run if the real platform underneath isn’t there. That’s because the Java VM was designed merely to run the bytecode and then make use of whatever hardware and resources that platform provides, whether the platform is a full OS or a larger application hosting it.

But the OP’s question was in terms of Android. In Android, “VM” and “emulator” are effectively synonymous. That’s because Android translates app code: versions 1–4 use just-in-time translation (Dalvik), while 5+ translate at install time (ART), so the installed app is saved as machine code matched to the device’s CPU. So an Android device doesn’t run a VM on its ARM CPU (phones, etc.); you only get a VM when the system is running inside an emulator, and the main thing that emulator has to emulate is the ARM CPU itself, since it’s not practical to convert the apps to x86 machine code (unless someone writes that translator… it’s about time we got Android apps integrated into Windows).

In the context of Android, the VM and the emulator amount to the same thing.
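
For what it’s worth, the check on the Android side has historically been even simpler than the x86 one, because the emulator announces itself through system properties. A rough sketch in NDK C, assuming the classic QEMU-based emulator images (newer images may not set all of these):

```c
/* Rough sketch of the Android-side check (NDK C, using __system_property_get
 * from <sys/system_properties.h>). The classic QEMU-based emulator sets
 * ro.kernel.qemu=1 and uses "goldfish"/"ranchu"/"sdk" identifiers;
 * newer emulator images may differ. */
#include <stdio.h>
#include <string.h>
#include <sys/system_properties.h>

static int prop_contains(const char *name, const char *needle) {
    char value[PROP_VALUE_MAX] = {0};
    __system_property_get(name, value);
    return strstr(value, needle) != NULL;
}

int looks_like_emulator(void) {
    return prop_contains("ro.kernel.qemu", "1")        /* classic emulator flag */
        || prop_contains("ro.hardware", "goldfish")    /* old emulator board    */
        || prop_contains("ro.hardware", "ranchu")      /* newer emulator board  */
        || prop_contains("ro.product.model", "sdk");   /* SDK build strings     */
}

int main(void) {
    printf("emulator? %s\n", looks_like_emulator() ? "probably" : "probably not");
    return 0;
}
```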

I guess there are also signatures a bit of malware could look for. Machine name, workgroup name, the NIC vendor, virtual machine additions, etc., could all point to it being a VM.
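
The NIC vendor one boils down to the OUI, the first three bytes of the MAC address, since the big hypervisors ship virtual NICs with well-known OUIs. A minimal sketch (how you actually obtain the MAC is platform-specific, so this just checks a string):

```c
/* Sketch of the "NIC vendor" tell: the OUI (first three bytes of a MAC
 * address) identifies the manufacturer, and the big hypervisors use
 * well-known OUIs for their virtual NICs. */
#include <stddef.h>
#include <stdio.h>
#include <strings.h>   /* strncasecmp */

static const char *vm_ouis[] = {
    "00:05:69", "00:0c:29", "00:50:56",  /* VMware            */
    "08:00:27",                          /* VirtualBox        */
    "00:15:5d",                          /* Microsoft Hyper-V */
    "52:54:00",                          /* QEMU/KVM default  */
};

int mac_looks_virtual(const char *mac) {
    for (size_t i = 0; i < sizeof vm_ouis / sizeof vm_ouis[0]; i++)
        if (strncasecmp(mac, vm_ouis[i], 8) == 0)
            return 1;
    return 0;
}

int main(void) {
    printf("%d\n", mac_looks_virtual("00:0C:29:ab:cd:ef"));  /* 1: VMware OUI */
    printf("%d\n", mac_looks_virtual("3c:22:fb:12:34:56"));  /* 0             */
    return 0;
}
```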

Plus, unless it’s trying to be incommunicado, most VMs run a known set of guest tools (e.g. VMware Tools) that allow the host machine to manipulate the VM: reclaim memory, monitor processes, etc. VMs also support only a limited set of (emulated) peripherals, like VMXNET network cards…
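
Those tools are themselves a tell: the guest-side daemons run under predictable names. A sketch of that check on Linux, walking /proc for the usual suspects (a locked-down VM might not run any of them, so it’s a hint, not proof):

```c
/* Sketch of the "guest tools" tell on Linux: walk /proc and look for the
 * helper daemons the common hypervisors install in the guest. The names
 * below are the usual ones (vmtoolsd for VMware Tools, VBoxService for
 * VirtualBox Guest Additions, qemu-ga for the QEMU guest agent). */
#include <dirent.h>
#include <stdio.h>
#include <string.h>

static const char *tool_procs[] = { "vmtoolsd", "VBoxService", "qemu-ga" };

int guest_tools_running(void) {
    DIR *proc = opendir("/proc");
    if (!proc) return 0;

    struct dirent *entry;
    char path[64], comm[64];
    while ((entry = readdir(proc)) != NULL) {
        if (entry->d_name[0] < '0' || entry->d_name[0] > '9')
            continue;                       /* only numeric PID directories */
        snprintf(path, sizeof path, "/proc/%s/comm", entry->d_name);
        FILE *f = fopen(path, "r");
        if (!f) continue;
        if (fgets(comm, sizeof comm, f)) {
            comm[strcspn(comm, "\n")] = '\0';
            for (size_t i = 0; i < sizeof tool_procs / sizeof tool_procs[0]; i++) {
                if (strcmp(comm, tool_procs[i]) == 0) {
                    fclose(f);
                    closedir(proc);
                    return 1;
                }
            }
        }
        fclose(f);
    }
    closedir(proc);
    return 0;
}

int main(void) {
    puts(guest_tools_running() ? "guest tools found" : "no guest tools found");
    return 0;
}
```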

There are sort of two questions here.

  1. How would malware detect that it’s on a current VM?

This has been answered. It’s pretty easy right now, since there are relatively few VMs and they all have some obvious tells.

  2. Could you create a VM that could not be distinguished from hardware by processes running in it?

This one, I’m not sure. I’m inclined to say that you could not, given sufficient resources on the part of the attacker, but I don’t know. This is sort of the AI in a box problem. I suspect that computers are so complicated that there are sufficient emergent properties in any system built on top of them that an attacker could learn to recognize.

I don’t have to be perfect, I just have to be better than you.

It would be an arms race, and it would begin with VMs not outright telling their guests that they’re VMs. That alone would thoroughly fool the current generation, which relies on such honesty. After that, you’d get into the kinds of behavioral things that malware could look for but that normal software wouldn’t notice, though the fact that VMs (in the sense we’re talking about here, i.e. hypervisors) ideally don’t perform emulation should help. After that, who knows?