What is the biggest program ever written?

What is the largest and most complex software program out there? (Actually, I guess there could be two answers to that, as the biggest wouldn’t necessarily have to be the most complex).

Is there a limit to the size a program can be, i.e., the larger it gets, the greater the number of possible bugs, until it gets tangled up in its own complexity?

Since (I think) most complex software suites are composed of a number of interacting program modules that are virtually standalone “programs” by themselves, this question may be difficult to answer in a meaningful way.

There is a technical term for that phenomenon. It’s called “Windows.”

Tough to say what the largest program is because so much of software development is proprietary. This guy claims it might be the Visanet transaction system. Not a bad guess – your candidates are probably operating systems (IBM or Windows), financial transaction handlers, or something government related (tax or military applications). The answer may change depending on how you view what “program” means. Operating systems come with lots of API packages, integrated development environments, and browsers. And as we know from the various Microsoft monopoly cases, the boundary between part of the operating system and application is poorly defined.

If proper design philosophies are followed (modularization, documentation, and so on) there’s no reason why there should be an upper bound on the size of a program, memory and disk space permitted. At least in theory, you can write modular programs where each component can be developed and tested independently.

Yeah, you need a good definition of “program.” If it exists in several binaries, some of which are shared with other “programs” (for instance dlls), do you count that?

As a WAG, I’ll throw out the software that “runs” space shuttle missions on NASA’s mainframes. It may or may not fit your definition of program, but my WAG is based on the observation that NASA is usually the first to come across new stumbling blocks - the first to need terabytes of information, the first to require good source control, etc. I’m guessing they’re usually first because their problem is of wider scope than anything else out there.

It’s often convenient to define a program as anything that can be modeled by a Turing machine with no input. It’s not necessarily intuitive, but until someone comes along with a better definition, let’s run with it.

Given that, the largest program is undoubtedly going to be somebody’s big cluckin’ database, the DBMS used to do queries, and the longest query that’s ever been executed on it.

Most of that software is 30 years old. I don’t think you could even fit a modern operating system onto the Mission Control mainframes. I’m not sure where you get the idea that NASA rides the cutting edge of computer bottlenecks, but they’re generally considerably behind even the typical home computer. So far as I know, the major driver currently is actually the US nuclear Stockpile Stewardship program: The idea there is to perform elaborate computer simulations of nuclear explosions, to keep our weapons systems up-to-date without needing actual tests. Even there, though, the programs generally require lots of memory and CPU operations, but the executable code itself isn’t all that terribly large.

We can’t include the size of the data in the definition of the size of a program, or else I could trivially write as big a program as I wish. In fact various web crawlers would probably qualify under that definition.

As for the OP, there is no common measure of complexity. It could be function points, or lines of code, or any number of things. I’ve seen some quite small programs tangled up in their own complexity, due to the incompetence of the coder. So I don’t know the answer, or even how to find out.

The largest program, in lines of code, without a known bug would be an interesting question.

We certainly can. Compression theorists do it all the time. If you’ve got a better definition, please offer it up.

Hmm, but this “largest program” could be written by a Sophomore CS Major in a week. I don’t think this is really in the spirit of the OP.

Lines of code is a much better metric to use, and indeed is what most software companies use to measure complexity. If we also define a program as something that completes a set of related tasks, then we can start looking at candidates. I think Windows Vista is probably a good contender, even if we exclude the web browser and media player. Most estimates tend to put its lines of code in the tens of millions. Even all of the major government software projects I’ve heard of only run into a few million lines at most.

So, how is that definition? We can quibble a little about what constitutes a “related task”, but it sure as heck beats a “let’s include the data it uses as an input” definition.

That’s interesting. I seem to recall that a lot of sites were saying this about Windows XP when it came out.

Here’s one example:

“Windows XP has 35 million lines of code; it is the biggest program ever written.”

It’s a safe bet that Windows Vista has added more than a few million lines to that!

Would distributed processing algorithms be a fair inclusion? - admittedly it’s multiple copies of the same client program running on different computers, but they act together in concert as a single meta-process.

No, they actually update it. http://www.fastcompany.com/online/06/writestuff.html


Also, NASA was the first to reach each SEI level. That’s by necessity. They’ve been the first ones to hit each ‘type’ of software complexity problem.

Next time, why don’t you just ask for a cite, instead of declaring someone wrong?

Windows Vista is clearly larger than the LoC I was able to find with regard to running the space shuttle, so my self-described WAG was off, but NASA isn’t running the Apollo XI computers anymore either.

The stuff I cited above has 1 known bug in 440,000 lines of code.
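To put that figure on the usual scale, here is the arithmetic as a quick sketch. The function name `defects_per_kloc` is my own, and the only inputs are the numbers quoted above (1 known bug, 440,000 lines).

```python
def defects_per_kloc(known_defects: int, lines_of_code: int) -> float:
    """Known defects per thousand lines of code (KLOC)."""
    if lines_of_code <= 0:
        raise ValueError("lines_of_code must be positive")
    return known_defects / (lines_of_code / 1000)


# Figures quoted in the post above: 1 known bug in 440,000 lines.
shuttle_density = defects_per_kloc(1, 440_000)
print(f"{shuttle_density:.4f} known defects per KLOC")
```

That works out to roughly 0.002 known defects per thousand lines. It only counts bugs somebody has found and logged, so it is a lower bound on the real number.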

When I worked at DEC we upgraded Houston (is that Mission Control?) to run on a modified and hardened version of Digital Ultrix 2.0. That was their first major upgrade since the 60’s. As far as I know, that O/S is still in place.

According to Wikipedia (lines of code, in millions):

Year  Product           Lines (millions)
1993  Windows NT 3.1    6
1994  Windows NT 3.5    10
1996  Windows NT 4.0    16
2000  Windows 2000      29
2002  Windows XP        40

Product               Lines (millions)
Red Hat Linux 6.2     17
Red Hat Linux 7.1     30
Debian 2.2            56
Debian 3.0            104
Debian 3.1            213
Sun Solaris           7.5
Linux kernel 2.6.0    6.0
Gimp-2.3.8            0.65

Others I’ve seen in various technical journals:

Product   Lines (millions)
SABRE     200 (as of Y2K project)
SAP       160
IRS       62

Number of non-commented lines of code is one metric - when I was in Bell Labs we even had a tool which could produce this for C programs. The sum of the sizes of all appropriate executables, including DLLs or the like, might be another.
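For anyone curious what such a counter looks like, here is a minimal sketch in Python. It is not the Bell Labs tool, which I have not seen; the names `strip_comments` and `count_ncloc` are made up for illustration. It assumes plain C source, ignores preprocessor tricks and line continuations, and counts any line that still has something on it after comments are removed.

```python
import re

# One pass over the source: string and character literals are matched first so
# that comment markers inside them are left alone; comments are then dropped.
_TOKEN = re.compile(
    r'"(?:\\.|[^"\\])*"'     # string literal (kept)
    r"|'(?:\\.|[^'\\])*'"    # character literal (kept)
    r"|/\*.*?\*/"            # block comment (dropped)
    r"|//[^\n]*",            # line comment (dropped)
    re.DOTALL,
)


def _drop_comment(match: re.Match) -> str:
    text = match.group(0)
    if text.startswith("/*"):
        # Keep the newlines so code after the comment stays on its own line.
        newlines = text.count("\n")
        return "\n" * newlines if newlines else " "
    if text.startswith("//"):
        return ""
    return text  # a literal: keep it unchanged


def strip_comments(source: str) -> str:
    """Return C source with comments removed (a rough approximation)."""
    return _TOKEN.sub(_drop_comment, source)


def count_ncloc(source: str) -> int:
    """Count lines that are neither blank nor comment-only."""
    return sum(1 for line in strip_comments(source).splitlines() if line.strip())
```

Feed it the text of a .c file and it returns the non-comment line count. Different tools make different calls on things like a lone brace or a preprocessor directive, which is one reason published line counts for the same product rarely agree.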

That being said, the answer of Windows XP sounds plausible to me. Hardly anyone else would have the resources to write a program that big, or would want to, and there is relatively speaking lots of program and very little data in there.

That’s very impressive - more so than if no bugs were known, which might be because no one was tracking them.

Actually there are lots of common measures of complexity. Several commercial tools can provide McCabe cyclomatic complexity and Halstead measures. These are the most well-known and widely used but there are lots of others. Lines of code and function points give an indication of size but not necessarily complexity.
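To show the difference between size and complexity, here is a rough sketch of a McCabe-style count for Python source, using the standard `ast` module. It uses the common shortcut of "decision points plus one"; the function name and the exact set of nodes counted are my own choices, and commercial tools differ in the details.

```python
import ast

# Constructs treated as one decision point each (an assumption of this sketch).
_BRANCH_NODES = (ast.If, ast.For, ast.While, ast.ExceptHandler, ast.IfExp)


def cyclomatic_complexity(source: str) -> int:
    """Approximate McCabe complexity: number of decision points + 1."""
    decisions = 0
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, _BRANCH_NODES):
            decisions += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" adds two extra paths, one per operator.
            decisions += len(node.values) - 1
    return decisions + 1


straight_line = "def f(x):\n    y = x + 1\n    z = y * 2\n    return z\n"
branchy = (
    "def g(x):\n"
    "    if x > 0 and x < 10:\n"
    "        return 1\n"
    "    for i in range(x):\n"
    "        if i % 2:\n"
    "            return i\n"
    "    return 0\n"
)
print(cyclomatic_complexity(straight_line), cyclomatic_complexity(branchy))
```

The straight-line function scores 1 no matter how many statements you add to it, while the second function scores 5 in seven lines. That is the sense in which lines of code measure size but not necessarily complexity.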

Then NSA would seem the place to look.

No it wouldn’t. I could generate an arbitrarily large program that simply repeatedly outputs the phrase, “Hello, world!”, and make it provably correct (i.e., bug-free).
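A quick sketch of what I mean, in Python. The names `make_hello_program` and `run_and_capture` are just for illustration. The generator emits a program of whatever length you ask for, and the generated program's behavior can be checked by inspection: every statement is the same print call.

```python
import contextlib
import io


def make_hello_program(n_statements: int) -> str:
    """Return Python source consisting of n_statements identical print calls."""
    line = 'print("Hello, world!")'
    return "\n".join([line] * n_statements) + "\n"


def run_and_capture(source: str) -> str:
    """Execute generated source and return everything it printed."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, {})
    return buffer.getvalue()


big_program = make_hello_program(100_000)
print(len(big_program.splitlines()), "lines of code")
```

Pick any line count you like and you get a "program" that large with no bugs in it, which is why size without a known bug does not tell you much on its own.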