Ooogh - what fun! A chance to write about viruses!
There are a lot of points to make - I am just skimming the surface with this brain-dump.
- Definition of a virus.
Self-replication comes into it - but it is not as easy as a ‘set of memory locations which can reproduce itself’. Polymorphic viruses, for instance, reproduce as a bit string which is completely different - but which is still the ‘same’ virus. And how about ‘hoax’ viruses, self-replicating and changing rumours and things of that kind? Nevertheless - I think we all understand what you mean. I just wanted to point out that there would be quite an argument as to what constituted a virus at the limit. See the second point below.
- Random creation and minimum size.
JPGs and other bit strings processed in computers are rarely random - they have characteristic structures, which are not like code. So a fragment of video memory or database innards would be very unlikely to function as a virus if it were treated as code. Anything may happen at a boundary, of course, but a boundary will usually fall on a byte junction, and modern systems tend to have large buffers, so you are more likely to get 2k-sized chunks of…well…swap file is probably the most common thing to end up in memory. Much of this will be executable code, so the best chance of getting a new virus by accident is if you are already processing very similar code.
As a point of interest, there was a short period in the 1990s when virus writers tried to write the smallest possible virus. Without looking it up in detail, I remember the ‘Tiny’ family of viruses, which were between 100 and 200 bytes. Mark Ludwig proposed a 7-byte memory-resident overwriter as part of a discussion about ‘artificial life’ arising - reference here:
http://www.arn.org/docs/newman/rn_artificiallife.htm
In fact, this reference covers a lot of what you might want to know.
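To put a rough number on the 7-byte case, here is a back-of-envelope sketch in Python. It rests on two assumptions of mine that are not in the reference: that the bytes are uniformly random (which, as I said above, real memory contents are not), and that exactly one 7-byte sequence would work. The function names are made up for illustration.

```python
from fractions import Fraction


def chance_of_exact_match(length_bytes: int) -> Fraction:
    """Chance that `length_bytes` uniformly random bytes equal one specific sequence.

    Assumption: every byte value is equally likely and independent.
    """
    return Fraction(1, 256 ** length_bytes)


def expected_tries(length_bytes: int) -> int:
    """Average number of independent random strings drawn before one matches."""
    return 256 ** length_bytes


print(chance_of_exact_match(7))  # 1/72057594037927936, i.e. 1 in 2**56
print(expected_tries(7))         # 72057594037927936
```

If more than one byte sequence would do the job, the odds improve in proportion, so treat this as an order-of-magnitude guide only.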
- Mechanisms for creation.
Last bit - honest!
Virus writers have developed a few techniques for ‘evolving’ or making new viruses. These have been uniformly unsuccessful in their aim of making virus detection impossible. Examples are the binary virus, and polymorphic techniques such as the Mutation Engine.
One unexpected effect from the creation of macro viruses is ‘virus breeding’. This occurs when two macro viruses infect the same system - the virus macros both overwrite the central store (normal.dot?) and, depending on the order in which they do so, a working virus made up of a combination of the two may be the result. (Usually, the whole thing falls over.)
All of the above is generally known. A last point is an issue which is not so well understood - that of ‘random’ bit changes in memory. I have never seen this happen, but I have had it reported to me by one professional virus researcher - Alan Solomon. You will appreciate that, to be sure that a bit has changed for no reason, you have to have an unusual set-up - a system where you are monitoring all aspects of the memory and processing, and where you can show that no intended change has occurred. You only find this sort of thing in a Virus Lab, and Alan told me one day that they had isolated an instance where a bit changed value for no obvious reason. He theorised that this may actually happen quite often - ionising cosmic rays or other particles hitting a chip, failing memory, or poor connections may all cause a bit to change. If you are running without parity, this will not be noticed during normal running.
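For anyone who has not met parity before, here is a minimal sketch in Python of why it catches this sort of thing. It is a software illustration of the idea only, not how any particular memory hardware does it, and the function names are my own.

```python
def parity_bit(byte: int) -> int:
    """Even-parity bit for one byte: 1 if the byte has an odd number of 1 bits."""
    return bin(byte & 0xFF).count("1") % 2


def store(byte: int) -> tuple[int, int]:
    """Pretend to write a byte to memory along with its parity bit."""
    return byte, parity_bit(byte)


def looks_intact(byte: int, stored_parity: int) -> bool:
    """Recompute parity on read and compare it with the stored bit."""
    return parity_bit(byte) == stored_parity


value, parity = store(0b01000001)
one_flip = value ^ (1 << 3)               # a single bit changes 'for no reason'
two_flips = value ^ (1 << 3) ^ (1 << 5)   # two bits change

print(looks_intact(value, parity))      # True
print(looks_intact(one_flip, parity))   # False: the single flip is caught
print(looks_intact(two_flips, parity))  # True: a double flip slips through
```

Note that parity only tells you something changed; it cannot say which bit, and an even number of flips in the same byte goes unnoticed.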
This explains some of the situations where the same virus is found with odd single-bit changes in a non-critical area - what virus researchers often call variants. A good proportion of these are obviously intentionally modified to avoid detection, but some are pretty inexplicable.
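Spotting a single-bit difference between two samples is simple enough to sketch in Python. This is only an illustration of the comparison, not a description of any real lab's tooling; the function name and the sample bytes are made-up placeholders.

```python
def bit_differences(a: bytes, b: bytes) -> list[tuple[int, int]]:
    """Return (byte offset, bit number) for every bit that differs between a and b."""
    if len(a) != len(b):
        raise ValueError("samples must be the same length to compare bit by bit")
    diffs = []
    for offset, (x, y) in enumerate(zip(a, b)):
        changed = x ^ y  # 1 bits mark positions where the two bytes disagree
        for bit in range(8):
            if changed & (1 << bit):
                diffs.append((offset, bit))
    return diffs


# Placeholder data standing in for two captured samples.
original = bytes([0x10, 0x20, 0x30, 0x40])
variant = bytes([0x10, 0x20, 0x32, 0x40])

print(bit_differences(original, variant))  # [(2, 1)]
```

A result with exactly one entry is the ‘odd single-bit change’ case; it says nothing about whether the change was deliberate or accidental.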
end dump - need to work!