Are incredibly complicated pieces of software built from lines-of-code up?

I don’t really understand software or programming other than that “software is made of codes that run and stuff.”

I understand that back in the day (and based on elementary school memories of writing basic programs in BASIC and stuff), software was actually written by people typing out strings and strings of code that looked like garbled nonsense, but when that code was run, it turned into a game or an animation of a picture of titties or whatever.

Now, these days, is a big piece of complicated commercial software, like an OS or a game or video editing suite still actually created by people sitting there and typing in lines of code? Or are there, uhh, “maker” programs that give more intuitive toolsets and interfaces that people use to create such things, then the actual code is somehow generated or compiled?

Remembering that it took hours just to make a basic pac-man style game, I can’t imagine that something like Mac OSX or FinalCut Pro is created by hundreds of guys typing out “xrun=$5urlt+clockdiv65765464569integer9349349” for hours and hours and hours at a time, but maybe it is.

Sub-question; I hear nerds and friends that work in programming complain about how “good” or “bad” someone else’s coding is. Is there a lot of variation with how efficiently and successfully code operates?

It depends. More “basic” things (by “basic” I mean “provides a base for other things to run on”, not “simple”), like an OS, tend to be built literally from the lines of code up, from my understanding (I’m more on the application side of things and still need to study this). However, larger applications (think a game) tend to have “tools programmers” who make a set of tools for the other guys to use. If you’ve ever played a game that has a world creator (like Warcraft 3), it’s usually fairly close to what the game makers used to make the game; in some cases it’s the exact same thing. Often the engine they buy or use will come with a tool-set and they just need to modify it. Without an existing engine, development can slow by 3-5 years, so you can tell that line-by-line work (which engine programming boils down to in a lot of cases) takes a lot longer.

For other applications there are ways to simplify things. For example, Microsoft’s Visual Studio and NetBeans provide IDEs (essentially the same kind of tools I mentioned for games above) that allow you to lay out your application visually and attach code to it via an interface. However, most programmers still go into the code and mess with it to clean it up and make it run more efficiently. This is still quite a bit faster than pure line-by-line work, though.

As for your bonus question, code can be a nightmare. Certain approaches run slower. Some code is the equivalent of taking out every book and looking at the cover until you find the right one; some is the equivalent of only looking at the spines and taking out the one you want; and some is like taking out every book and then going back to the right one after you’ve looked at all of them. Not to mention a lot of people don’t comment their code, so you may have no idea what they were trying to do (and many people who don’t comment can’t even remember what their own code does). Because of the physical nature of the hardware, certain things just run better if you know how data is stored and accessed and what path each operation takes. “Bad” programmers don’t know the intricacies and can produce code that is SIGNIFICANTLY slower than others’. In one of my threads, I think someone brought up a new programmer who hadn’t learned all of the lower-level stuff, and whose program’s speed was improved 2000% (or something similar) by changing the mechanisms.
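The library-book analogy above can be sketched in a few lines of Python. Both functions below give the right answer, but the first checks every item one by one while the second goes straight to the one it wants via an index; the function and variable names here are invented for illustration.

```python
# Two ways to find a book that both work, with very different costs.

def find_by_opening_every_book(shelf, title):
    """Linear scan: look at each book until we hit the one we want."""
    for position, book in enumerate(shelf):
        if book == title:
            return position
    return -1

def find_by_index(index, title):
    """Indexed lookup: a dict maps titles straight to positions."""
    return index.get(title, -1)

shelf = ["Moby Dick", "Dune", "Code Complete", "Dracula"]
index = {book: i for i, book in enumerate(shelf)}  # built once, reused

assert find_by_opening_every_book(shelf, "Code Complete") == 2
assert find_by_index(index, "Code Complete") == 2
```

On a four-book shelf the difference is invisible; on a million-record database it is the difference between milliseconds and minutes.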

I’m by no means a programming guru (working on it, consider me an apprentice), but this is what I understand as of now. If I screwed something up I leave it to those who have been doing this for years to correct me.

My experience with commercial software is somewhat limited, but I do know that for one of your examples (video games), a company will typically license an “engine” for the graphics. Two examples from the past would be the Quake engine and the Unreal Engine; one or the other was used in a large share of computer games with 3-D graphics. The engines are of course fully customizable, but they provide a starting point and save developers a ton of time on basic graphics code. Occasionally you will hear a developer claim they were unable to get the effects they wanted from a particular engine, so they re-wrote at least part of it (unfortunately I am unable to think of a specific example off the top of my head). I believe the Source engine (the one behind Half-Life 2) is one of the most widely used ones right now.

Another part of the answer is that most programming languages now are OOP, short for Object Oriented Programming. What that means is, instead of writing an entire program from scratch, you can make “calls” or references to pre-written sections of code. As a very simple example, I have a Java program that draws a snowman on the screen. I have a pre-written module that draws a circle. All I need to do is call the circle module, giving it the size, location, and color of the circle I want drawn. I call the module 3 times, giving it different coordinates for each section of the snowman. I don’t need to look at one line of code in the circle-drawing module; I just need to know what its input is. There are a huge number of pre-written modules for most OOP languages, and they can be major time savers. Also, many commercial software companies create their own modules. For example, a bank might have its own interest calculation module which can be called by any program the bank writes. Simple examples, but you can see how OOP saves a ton of time and effort by reducing coding, bug testing, etc.
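The snowman example above, sketched in Python rather than Java: the `draw_circle` function and its parameters are stand-ins invented for illustration, not any real graphics library. The point is that the caller never looks inside the module; it only needs to know what inputs it takes.

```python
def draw_circle(x, y, radius, color):
    """Stand-in for a pre-written drawing module. Here it just
    records what it was asked to draw instead of painting pixels."""
    return {"x": x, "y": y, "radius": radius, "color": color}

def draw_snowman():
    # Three calls to the same module with different sizes and positions.
    return [
        draw_circle(100, 300, 60, "white"),  # bottom
        draw_circle(100, 200, 45, "white"),  # middle
        draw_circle(100, 130, 30, "white"),  # head
    ]

snowman = draw_snowman()
assert len(snowman) == 3
```

The snowman programmer writes three lines; the hundreds of lines that actually rasterize a circle live in the module and never need to be read.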
Also, I have written code since the TRS-80 days, and the newer coding interfaces are just plain slick. They color code different keywords, check for syntax errors on the fly, have tons of nifty debugging options, and are just an amazing leap from the old days. I haven’t come across any that actually write code for you, although some may have the ability to generate certain functions that would be tedious to type out yourself.

Finally, to answer your sub-question (and this is something I am totally familiar with): different programmers have different styles, and there are usually a lot of different ways to accomplish the same thing. Some programmers write code that is very neat and easy to read and follow. Others might write code that accomplishes the same thing but is very sloppy, hard to follow, and poorly documented (every programming language has the ability to add “comment lines”). Like writing styles, there’s a lot of variation between different people. Sometimes a programmer will come up with code that is less complex but harder to read (maybe they use programming shortcuts, for example). Some of the more skilled programmers like to “show off” by putting code in a program for no other reason than to make other programmers say “how the hell did he do THAT?”, but that’s another rant…

Most of what a professional programmer does is writing lines of code, like you described (but of course they don’t look like gibberish to the programmers). However, the lines of code will mostly call functions which were already written by other programmers. Ultimately, everything comes down to lines of code, but the guy at the game company probably never saw most of them, since they were the lines of code used to make the functions he used.

As far as “good code” and “bad code” goes, some code really will be a lot slower or less efficient than others, but learning which things to avoid is pretty elementary stuff, that you’ll learn in your introductory classes. Mostly, when people say code is “good” or “bad”, they’re referring to how easy it is for a human to read it. This is important, because some other programmer might later need to make modifications to the code. In the worst cases, even the original programmer can’t figure out what he was doing, after coming back from a hiatus.

Professional programmers learn to design and structure their code to reduce the complexity of large projects. The primary skill is abstraction. Abstraction is the art of taking a problem and eliding unnecessary details to simplify it to the point where it’s manageable. The details themselves can be handled separately, so they’re also of manageable size (and if a detail isn’t of manageable size, you abstract its details iteratively until everything is manageable). That’s quite an abstract definition, so an example might clarify the kind of thing I’m getting at.

Suppose we have a program that wants to save data to a file on a drive. This is quite a complicated problem. Consider that there are dozens of hard drive manufacturers out there, and each of them has released many different types of drives. Every kind of drive from these different manufacturers will have a different protocol for performing operations on it, like writing data. You can think of each type of drive as speaking its own language.

This would seem to present an almost insurmountable problem to the programmer wanting to write to a file: his program must speak hundreds of different “languages” to do so. And yet there hardly exists a program out there that doesn’t read from or write to files. The solution to the problem lies in abstraction. If you think about it, the programmer only wants to write to a file. He really doesn’t care what kind of drive it gets stored on or what language the drive speaks. The language for speaking to the drive is the kind of detail I was talking about earlier: there’s a lot of complexity in actually dealing with that detail, but the programmer doesn’t want to get bogged down in the details of how to write a file to whatever kind of drive you have. So what happens is that the program basically says, “Write my data to the file called c:\MyFiles\data.txt”, and the operating system itself handles the details of talking to the drive.
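Here is what that abstraction looks like from the application programmer's side, as a small Python sketch: one call says "write my data to this file," and the operating system and its drivers handle whichever drive protocol is actually underneath. (The file name is made up; `tempfile` is used so the sketch writes somewhere harmless.)

```python
import os
import tempfile

# "Write my data to the file called data.txt" - no drive protocol,
# no manufacturer-specific commands, just a path and some bytes.
path = os.path.join(tempfile.gettempdir(), "data.txt")
with open(path, "w") as f:
    f.write("hello, drive\n")

# Reading it back goes through the same abstraction.
with open(path) as f:
    contents = f.read()
assert contents == "hello, drive\n"
```

The same two calls work unchanged whether the bytes land on a spinning disk, an SSD, or a network share; that is exactly the detail-hiding the poster describes.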

Programmers apply the same kind of thing to their whole program (the good ones do, anyway). You have one part of your program deal with the “big picture” and several parts beneath it, each dealing with a closer view of a part of the big picture, and each part might have even more parts beneath it dealing with an even smaller part of the big picture in even more detail, and so on.

It’s very much like the hierarchy that tends to develop in human organisations, like the government, for example. You have the President setting the broad direction of the government, the Secretaries and other officials implementing that direction for one part of the government, and you have many layers all the way down to the front-line workers, each of whom deal with a tiny part of the government’s business in very close detail.

Does that give some idea of how complexity in large software projects is managed, or have I just made a muddle of things?

At bottom, yes, large software projects are made up of millions of lines of code. As others have alluded to, this is far too complex for any one person to understand. So we organize similar bits together, and stop worrying about details that are too far beneath our level to care about.

Right now you may hear a lot about the .NET framework, which is Microsoft’s attempt to create a library that handles all sorts of common tasks. If you need to write to the hard drive, you just need to find the appropriate .NET component, and tell it what to do. Similarly, you can find components to make internet connections, or to draw graphics on the screen, or to do pretty much anything that a lot of programs want to do. In theory, programming just becomes a matter of finding the right components and putting them together in the right way.

One other thing to keep in mind is that not all programming languages are created equal. One line of assembly is different from one line of C, which is different from one line of Python, which is different from one line of Haskell.
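To make that concrete: here is a single line of Python next to a hand-spelled-out version of the same job, roughly what you would have to write yourself in a lower-level language without a library sort. The `sort_by_length` helper is invented for illustration.

```python
# One line of Python: sort words from shortest to longest.
words = ["operating", "system", "os"]
by_length = sorted(words, key=len)

# The same job spelled out by hand (an insertion sort), which is
# closer to the amount of code one "line" costs in C or assembly.
def sort_by_length(items):
    result = list(items)
    for i in range(1, len(result)):
        j = i
        while j > 0 and len(result[j - 1]) > len(result[j]):
            result[j - 1], result[j] = result[j], result[j - 1]
            j -= 1
    return result

assert by_length == sort_by_length(words) == ["os", "system", "operating"]
```

In assembly the same task would be dozens of instructions plus the comparison logic, which is why counting "lines of code" across languages tells you very little.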

It really does boil down to this. Of course, there are tons of tools available to assist with certain types of things. For example, my current project is being written using C#/.net and I am using something called Visual Studio to do my programming, so for the visual design parts of the program (laying out buttons and forms) I can do it graphically, point and click style. However to be honest, I spend maybe 5% or less of my programming time doing that, and most of the rest of it just typing lines of code directly (or worse, staring at lines of code and trying to figure out what the problem is). Visual Studio helps out here as well a little, by providing tools to help me find documentation easily as I work.

Bear in mind that there’s a lot of non-programming in a typical day. Planning and design, testing, bug fixing, writing documentation, and arguing are all part of a regular work day for most people in the field. Depending on how structured the company is, this stuff will consume more or less of your time. Small startups may have rooms full of people who do little else other than crank out source code all day, and large corporate institutions sometimes put their programmers through 7.5 hours of meetings a day.

Like any product, it can be good or bad in a variety of ways. Some code just plain stinks because it results in an obviously slow and buggy application. Other times, it looks like it works ok, but when you look at the actual source code, it’s a huge mess that is impossible to modify or debug. A great website for nerds to complain about the latter case is the daily WTF. I don’t expect non-programmers to find that site funny, but it’s a good example of the sort of stuff we find funny because it’s so terrible.

And of course one line of C is different from another line of C. (Except ones that are just curly braces. They’re all the same. :))

To address the OP: yes, most software applications are written as huge pieces of code, and believe it or not, that is the most intuitive and the best way to express software. A lot of programmers will argue about what language is best, or what approach is best, but the majority will agree that programming in a text-based, computer-readable language is the correct approach. Generating actual software from graphical diagrams or higher-level visual abstractions generally doesn’t cut it and is largely an academic exercise. There are certainly tools, GUI editors, and content creation systems, but the actual software, the logic behind the data and the interface, is usually written as a program.

A huge chunk of software people use is the software that runs on consumer electronics. A set-top DVD player or recorder can easily have a few hundred thousand lines of C code to make it work.

I’m an embedded software engineer and you’d be surprised how many of my coworkers don’t have any idea how to optimize anything. Some of these have graduate degrees and years of experience and they still produce incredibly slow code with “Well, it’s O(n), what more do you want?” as an excuse. If you consider O(1) to take no time, and the compilers to always optimize ideally, then “things to avoid” is pretty elementary stuff. In the real world, you need a good feel for the entire system up and down, down to the RAM latencies, in addition to having a thorough understanding of complexity to truly write code that is not just readable but is also truly excellent.
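The "it's O(n), what more do you want?" complaint above can be illustrated even in Python: both functions below are O(n), but one pays a function-call and name-lookup toll on every element while the other runs inside the interpreter's optimized C loop. The function names are made up for the sketch.

```python
import timeit

data = list(range(10_000))

def fast_sum(xs):
    return sum(xs)                 # one call into optimized C

def slow_sum(xs):
    total = 0
    add = lambda a, b: a + b       # per-element Python call overhead
    for x in xs:
        total = add(total, x)
    return total

# Identical answers, identical big-O...
assert fast_sum(data) == slow_sum(data) == 49_995_000

# ...very different wall-clock time (slow_sum is typically several
# times slower; exact ratios vary by machine, so none is asserted).
t_fast = timeit.timeit(lambda: fast_sum(data), number=100)
t_slow = timeit.timeit(lambda: slow_sum(data), number=100)
```

In compiled embedded code the same effect shows up through cache misses, branch mispredictions, and memory latency rather than interpreter overhead, but the lesson is the same: complexity class alone doesn't make code fast.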

Of course we live in a world where people somehow think a 1 minute bootup is not ridiculous for a PC that’s a thousand times faster than a PC I had 16 years ago that booted up all the way to the Windows 3.1 desktop in like 30 seconds. Out of memory problems can always be solved with more RAM, bad libraries and algorithms by a faster CPU, and insecure bad code by layering on another VM, garbage collector and abstraction layer. :rolleyes: I sound like one of those old timers defending main frames and punch cards, and I’m only in my 20’s.

Most big commercial software is basically designed in ‘layers’. For example, Microsoft provides .NET and the Windows SDK for developers to write Windows applications with: to open a dialog box, read something from the registry, and so on, you still need to invoke the functions, but you don’t have to write them yourself.

Meanwhile, the programmers write the code pertaining to the specialised function of the software. The rest is taken care of by the OS and the OS-compliant compiler.

The OS will then work with drivers (another layer of code, provided by the hardware manufacturers) to get everything in working order: keyboard, mouse, video, soundcard, etc., so programmers who write applications need not care about those low-level details.

Even at the programming-language level, there are optimized libraries of data structures and algorithms (such as Boost and the Standard Library for C++) which make coding less repetitive.

But yes, Final Cut Pro is basically the programmers’ own code + code provided by Apple for all those window widgets + libraries + the OS’s code in the end.

(PS. This is for application programming. As mentioned, for embedded programming (when you program a washing machine, for example) you usually don’t have such libraries at your disposal. Games development is also one quirky area where people like to do things from the ground up.)

I’ll cover what I haven’t seen in the answers already. Few software packages are written 100% from scratch. Most make use of binary libraries containing functions whose source code the developer will probably never actually see. The developers just write a few lines of “glue” or “goop” to integrate them; all they know is that they call a function “foo” with a parameter of “bar” and get a certain type of return value. As you mentioned, there are a number of visual tools to assist. These are more often used for visual design aspects (GUIs, games, and whatnot), and there are also form-driven ones for less visual types of software.

Pertaining directly to your question, some developer tools generate actual source code, which the developer can then customize to his/her own needs. Skipping ahead to the next point, much generated code is absolute junk, but it doesn’t matter because humans will never read 99% of it. (See the notes on code quality below for what constitutes junk.)

When people talk about “good” or “bad” code, it can mean any of several things. This is what I think people usually mean when they talk about code quality, from most common to least:
[ul]
[li]Correctness and completeness: Does it do everything it is supposed to do? Does it do what it does correctly? Do bugs seem to pop up in it every time something else changes? Does it handle the exceptions that are easily foreseeable? Does it fail gracefully when it doesn’t know what to do?[/li]
[li]Style, part 1 (understandability): Is it obvious what the code does, and how? With modern programming languages there’s little excuse for mysterious code; most everything can be designed in a self-documenting way. When that isn’t the case, comments can be inserted (though this is a last resort, because comments drift out of date and lie).[/li]
[li]Maintainability: Small behavior changes should not require widespread code changes if the code is well-designed.[/li]
[li]Complexity: Should be kept to a minimum to avoid unintended consequences.[/li]
[li]Efficiency: Naive algorithms usually have performance problems, though optimized ones can cost additional complexity.[/li]
[li]Style, part 2 (readability): This relates to comprehension: things like appropriately-named variables, use of whitespace, the endless war about bracket alignment, use of capitalization, etc. This matters less when the code is well-designed in every other aspect. But when code violates the principles above, bad readability just makes you want to strangle the author (who, alas, is sometimes you yourself).[/li]
[/ul]
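The "understandability" point is easiest to see side by side. Both functions below compute exactly the same thing; only one of them tells you what that thing is. The formula and names are a standard compound-interest example chosen for illustration.

```python
# Cryptic: correct, but what are a, b, and c?
def f(a, b, c):
    return a * (1 + b) ** c

# Self-documenting: same computation, no comment needed to follow it.
def compound_amount(principal, annual_rate, years):
    """Balance after `years` of interest compounded annually."""
    return principal * (1 + annual_rate) ** years

assert f(1000, 0.05, 2) == compound_amount(1000, 0.05, 2)
assert round(compound_amount(1000, 0.05, 2), 2) == 1102.50
```

A maintainer who finds `f` two years later has to reverse-engineer it; one who finds `compound_amount` just reads it.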
Again, that’s just my take, and many others have different takes. Here is a link to a wiki (actually The Wiki) where people discuss this in great depth. The page I linked is not particularly rich but it links to a number of other interesting conversations.

If the OP wishes to delve into the subject of good code and bad code, I recommend the book Code Complete

Other posters have brought up the layers of abstractions and libraries of functions that have already been written.

The other point is that very few huge projects, such as an operating system, start from scratch as a brand-new application. Mac OS X was not written by hundreds of guys writing code from scratch for hours and hours; rather, it was hundreds of guys modifying code that existed as the previous version of the operating system. At this point in history, writing an entire operating system from scratch would be a daunting prospect. Luckily this is unnecessary: an entire infrastructure of previous or similar operating systems exists and is built upon as new functionality is introduced.

I believe a large part of the demise of Netscape was attributed to a very bad decision they made - basically they decided to rewrite their browser from scratch, giving their competitors a nice long period of time where Netscape was dead in the water and burning through money.

Arguing, gesticulating and drawing boxes and arrows on whiteboards probably account for a good 60% of my time. The problem is that with software, there can be one million different ways of solving the same problem that are all “correct,” but with each one having its own strengths and weaknesses. In getting a product out the door, you have to strike a balance between resource usage (RAM, throughput, Flash/Other Storage, etc.), responsiveness to I/O and/or the user, and what exactly constitutes “full testing” of the product.

In my area (embedded radar software), we have some additional challenges, mostly due to harsh operating conditions, having to meet fairly high standards of reliability, and operators who sometimes express themselves so clearly that we soil our pants.

From my experience, we spend about 70% of the time talking about and refining designs before writing a single line of software, then spit the code out in about 10% of the time, and then spend about 90% of the time in the lab reworking things because the hardware doesn’t actually work the way they said it would. That would actually add up to 100%, but our schedules are rather short because, as any software manager will be happy to tell you, everything we do is at least 90% reuse.

[edit: Another vote for Code Complete]

Writing a commercial operating system like Windows or OS X from scratch would be pretty unpleasant, but writing a bare bones operating system is a pretty standard junior or senior project for a computer science degree.

We did a virtual memory system using Nachos OS - but I am quite sure that this is not as ‘hardcore’ as it used to be back then.

I take my hat off to real-time, high risk embedded applications programmer. Too stressful for me!

I’m part of a team that creates very complicated commercial software in a niche market, so I can vouch for what’s been said above. For us, the basic idea is that each of us works on one tiny aspect of the overall development, but we discuss things frequently and test the combined system constantly to make sure we are not getting out of sync with one another.


The biggest piece of software written at the lowest level that I know of is “FastCAD” by Evolution Computing: at least a few versions ago, they said it was written in assembly language to make it fast. I think this was true even for the Windows versions.

The microprocessor itself contains tiny programs. For example, the instruction to subtract one value from another would involve several steps executed in hardware. These are designed in integrated circuit components: transistors and maybe diodes, maybe grouped as gates. Modern computer microprocessors also contain “microcode”, which sequences those lowest-level instructions into larger groups. People writing programs that deal with specific microprocessor instructions (either opcodes or, much likelier, assembly) can’t get any lower, and can’t mess with things inside the microcode.
>Object Oriented Programming. What that means is, instead of writing an entire program from scratch, you can make “calls” or references to pre-written sections of code.

This isn’t what OOP means. An “object” is an instance of a “class” in the same way that a “variable” is an instance of a “type”. Objects are like variables or groups of variables such as structs, in the sense that they can contain data that can be written and read. But they are also like subroutines or functions in the sense that they contain small programs that act on data or otherwise do things. For the commonest example that it seems every introductory OOP discussion uses, a “shopping cart” on a web site is typically an object, and it has “methods” (functions) like add or delete items, and it has “fields” (variables) like customer ID and date started.
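The shopping-cart example above, as a minimal Python class. The field and method names here are the usual textbook ones, not any particular site's API; the point is simply that an object bundles data ("fields") together with the small programs that act on it ("methods").

```python
from datetime import date

class ShoppingCart:
    def __init__(self, customer_id):
        # "Fields": the data each cart object carries around.
        self.customer_id = customer_id
        self.date_started = date.today()
        self.items = []

    # "Methods": small programs that act on that data.
    def add_item(self, name, price):
        self.items.append((name, price))

    def remove_item(self, name):
        self.items = [(n, p) for (n, p) in self.items if n != name]

    def total(self):
        return sum(price for _, price in self.items)

# An object is one instance of the class, like a variable of a type.
cart = ShoppingCart(customer_id=42)
cart.add_item("book", 12.50)
cart.add_item("mug", 4.00)
cart.remove_item("mug")
assert cart.total() == 12.50
```

Every visitor to the site gets their own `ShoppingCart` instance, each with its own fields, all sharing the same method code; that instance-of-a-class relationship is what OOP actually names.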

Heck, yes. I wrote one in assembly language years ago just because there wasn’t any for the hardware we were using. Admittedly, it wasn’t a GUI and it had many limitations, but it worked well, reliably and fast.

Now if only we could say that about Windows.

Here is an example of the difference between good and bad code:

I was writing a (very!) simple script the other day to make sure that all 2000+ data points I had chosen for an analysis were within the bounds that the sponsor asked for, and cobbled something together. I set it running, and then asked one of the actual programmers in my office to check it. He suggested two simple changes that shortened the code by about six lines, eliminated a nested FOR loop, and improved the execution speed by an order of magnitude – that is, on a larger dataset, his improved code would finish in under one minute and mine would take as long as two hours.

In an “introduction to programming” course, both versions of the code would be marked correct, probably for full credit. In the real world, his code saved my sponsor a few hundred dollars this time, and a few thousand dollars over the life of the project.
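Since the original script isn't shown, here is a guess at the kind of rewrite described: checking every data point against every bound with a nested loop, versus tightening the bounds once and making a single pass. Both versions are "correct"; one does n*m comparisons, the other does m comparisons up front and then n. All names and data here are invented for the sketch.

```python
points = [3.2, 7.9, 5.5, 9.1]
bounds = [(0.0, 10.0)] * 50        # imagine one (lo, hi) pair per channel

def in_bounds_nested(points, bounds):
    """Naive version: re-scans every bound for every point."""
    ok = []
    for p in points:
        inside = True
        for lo, hi in bounds:
            if not (lo <= p <= hi):
                inside = False
        ok.append(inside)
    return ok

def in_bounds_flat(points, bounds):
    """Rewrite: collapse the bounds once, then one pass over points."""
    lo = max(b[0] for b in bounds)
    hi = min(b[1] for b in bounds)
    return [lo <= p <= hi for p in points]

assert in_bounds_nested(points, bounds) == in_bounds_flat(points, bounds)
```

With 2000 points and a few dozen bounds per point, eliminating the inner loop is exactly the order-of-magnitude difference the poster saw.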