Computer Programmer Question

I’ve heard, on more than one occasion, a statistic about the average number of “good” lines of code produced by a programmer per day. The number is usually rather small, which is understandable, but I cannot seem to find a reference for this information. Anyone out there who can help? Thanks in advance!

There is no such information that would reflect anything close to reality. Most of what a programmer does has nothing to do with writing code. Indeed, writing new code is rare; the vast majority of actual programming involves maintaining and updating old code, probably written by someone else. “Lines” are also a rather arbitrary measurement of code volume, as they generally just represent a single functional statement. It’s easy to write 200 lines of code if your function must begin by defining 200 constants. On the other hand, it may be extremely difficult to write a fast, efficient sorting algorithm which only takes ten lines of actual code. The latter example will almost certainly require a great deal more debugging and be far more prone to errors. A more useful metric is to divide a project up into modular tasks that resemble the planned structure of the code, and measure when each task is completed. This is especially easy if you’re working in a functional or object-oriented environment. It’s a bit more difficult if you’re working with spaghetti.
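To make the contrast concrete, here is a hedged sketch (in Python, chosen only for brevity) of the “ten lines of actual code” case: a naive quicksort. It is short, but the recursion and pivot handling demand far more care and debugging than two hundred rote constant definitions would.

```python
# A naive quicksort in roughly ten lines: short, but dense with
# decisions (base case, pivot choice, partitioning) that are easy
# to get subtly wrong.
def quicksort(items):
    if len(items) <= 1:
        return items
    pivot, rest = items[0], items[1:]
    smaller = [x for x in rest if x < pivot]
    larger = [x for x in rest if x >= pivot]
    return quicksort(smaller) + [pivot] + quicksort(larger)

print(quicksort([5, 2, 9, 1, 5, 6]))  # [1, 2, 5, 5, 6, 9]
```

Counting both as “lines written” treats ten hard-won lines and two hundred mechanical ones as equivalent, which is exactly the problem with the metric.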

Well, there are all kinds of “code” (different languages, environments), all kinds of programmers, and all kinds of company policies which can shape productivity.

I’ve heard averages ranging from 3 lines to 50 lines to 1,000 lines per day, depending on many factors.

Here seems to be a good article on the subject. Found with a Google search on “lines of code per day”. Include the quote marks.

New code is being written all the time; much of it is junk. Some of the junk is “good enough” in the sense that it will escape the QA process unscathed, and count as quality code, even if it’s not. This is hard to measure.

I can’t answer your question with a cite or a number, but two points:

a) It’s generally true, but the reason is that there is a ton of stuff a coding team does that’s not coding. Like designing, testing, integrating, and maintaining. So when you average how many lines of code a team develops over, say, a year and divide by the number of team members, the number is relatively low.

b) It’s an incredibly variable number. For a lot of reasons: one line of asm does a lot less than one long line of Perl, so doing the equivalent in one can easily take 50 times as many lines. Also, coding something complex like a driver in, say, C takes a lot more time per line than something simpler like UI code in Visual Basic.
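The asm-versus-Perl contrast can be sketched in one language (Python here, purely as an illustration): the same computation written as a single expressive line and then spelled out one small step per line, the way a lower-level language forces you to.

```python
# One "high-level" line versus the same work spelled out step by
# step: the line counts differ by a factor of several, the work done
# does not.
data = [3, 1, 4, 1, 5]

# High-level style: one line.
total_of_squares = sum(x * x for x in data)

# Lower-level style: one small step per line.
total = 0
i = 0
while i < len(data):
    square = data[i] * data[i]
    total = total + square
    i = i + 1

assert total == total_of_squares  # both are 52
```

Counting lines would credit the second version with several times the “output” for identical functionality.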

So… Comparing one line of code to another can be apples to oranges, but in general, the concept is true.

You may find the book “The Mythical Man Month” interesting.

Boy, in the time it takes to type it in, two excellent answers beat me.

I understand how programming works. The first thing I was really taught about programming was team, team, team. And I understand that lines are rather arbitrary. Thanks for the info, friedo! I actually came across that site before I posted, AmbushBug. Very informative. It seems like every time I preview there’s a new post! Yeah, after seeing a small variety of different languages I would have to agree with you, Bill H. Development, testing, etc., along with variations in language, also have to be factored in. I just thought that the statistic I heard was actually a statement by a certain person. I guess the actual quote I’ve heard before is just a rumor, though.

A college instructor I had once said that IBM did some study back near the ice age and found that one line per day was a realistic estimate for bug-free code. He claimed this referred to the total time it would take to write any program and completely debug it. So a 10,000-line program would take a single programmer 10,000 days to create and debug, or ten programmers 1,000 days, and so on.

Just remember: managing programmers is like herding cats.

DougC wrote

I don’t dispute that a college instructor said it, or even that someone at IBM said it, but I completely dispute its accuracy. For the reasons I cited, plus (and this is where my recommendation for reading “The Mythical Man Month” came in) the nature of coding is such that even if one coder can do it in 10,000 days, it does not follow that ten can do it in 1,000.

Also, the phrase “bug-free” is a dream at best.

In the business where I work we keep accurate records of this type of information, and we work to about 1.7 LOC/hr with a defect density of <2 per 10,000 LOC at the final verification and validation phase. This is of course government work and involves human safety, with the corresponding (high) cost. YMMV
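For scale, the rough arithmetic on that figure (assuming an 8-hour day and a 2,000-hour working year, both round-number assumptions, not from the post):

```python
# Back-of-envelope scaling of the 1.7 LOC/hr figure.
loc_per_hour = 1.7

loc_per_day = loc_per_hour * 8      # about 13.6 lines per day
loc_per_year = loc_per_hour * 2000  # about 3,400 lines per year

print(loc_per_day, loc_per_year)
```

That daily figure sits comfortably inside the 3-to-50-lines-per-day range mentioned earlier in the thread.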

Ambushbug
I liked the article you linked to but disagree with the following:

bolding mine

The problem with this is that it assumes shorter = simpler.

An (overly simplified) example: a statement in a program like

A = (B^3) * (C + D)

requires longer analysis and more skilled debugging than if the statement were coded:

X = C + D
Y = B^3
A = X * Y

The first is more “elegant”, but if it produced a wrong result, one would have to dismantle it to understand it and produce results of the components to debug it. The second is already dismantled and, since the components are already discrete, each could simply be displayed to determine which is incorrect.
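The decomposed form is runnable as written (here in Python, with placeholder values for B, C, and D that are my own, not from the post): each intermediate is a named value you can inspect directly when the result looks wrong, instead of dismantling one dense expression.

```python
# The "already dismantled" version of A = (B ** 3) * (C + D).
# Placeholder inputs, chosen only for illustration.
B, C, D = 2, 3, 4

X = C + D   # inspectable intermediate: 7
Y = B ** 3  # inspectable intermediate: 8
A = X * Y

print(X, Y, A)  # 7 8 56, same result as (B ** 3) * (C + D)
```

If A comes out wrong, printing X and Y immediately tells you which half of the computation to suspect.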

I’ve been coding off and on since 1972 and consulting in IT since 1986. My “mantra” has always been to code for the “next guy to debug the code”. The cost of manpower to correct code is almost always less than the cost of a few clock cycles.

rackman’s numbers seem about right to me, in my experience. These numbers come from taking the total NCLOC (non-comment lines of code) in a program and dividing by the time it takes to design, write, and debug the program. It also is for programming in the large, not programming in the small. Sure, we can all bang out more lines of Perl than this in a day for simple stuff, but that’s not what’s being measured.
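A minimal NCLOC counter, assuming the simplest possible definition (skip blank lines and lines that are pure `#` comments); real counters also handle block comments and string literals, so this is only the idea:

```python
# Count non-comment lines of code (NCLOC) under a deliberately
# naive definition: a line counts if it is neither blank nor a
# line whose first non-whitespace character is '#'.
def ncloc(source):
    count = 0
    for line in source.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            count += 1
    return count

sample = """# header comment
x = 1

y = x + 1  # a trailing comment still counts as code
"""
print(ncloc(sample))  # 2
```

Dividing a project’s NCLOC by total elapsed effort (design through debug) is what produces the low per-day figures being discussed.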

This also doesn’t count functionality, only lines. I’ve written in assembler, and the lines of code per day are about the same, you just can’t do as much with them. Ditto reuse.

NutMagnet is correct in general, and there are lots of better progress and complexity metrics than lines of code (such as function points.) In the long run you pay for complexity.

My motto is “code in haste, debug in leisure”

Oh crap!
“The cost of manpower to correct code is almost always less than the cost of a few clock cycles.”

That should have been more.

If it was hard for me to write it should be just as hard for the next guy to understand.

Of course, you hope I’m joking.

Whether more or less, that statement depends in great part on what you are programming.
If I’m doing assembler on a DSP or microcontroller that has to meet very tight demands for low power and low memory, then I am constrained in what I can do with clock cycles. In such a case, it doesn’t matter how easy a piece of code is to maintain - if it can’t perform the required operations fast enough at the speed the processor is running, then I am screwed.
On the other hand, if I’m doing database operations, then clarity is going to win out. The database read and write operations dominate, and a few cycles more or less of calculation between database accesses are pretty irrelevant - and in this case code clarity is cheaper than clock cycles.

On another note, the completeness and clarity of the comments in the code are also important. You might write 1000 lines of actual code a week, and I only 500 - the chances are damned high, though, that I’ve commented mine thoroughly enough that anyone can see what I’m doing.
When I’m writing assembler, this is especially noticeable. Every line has a comment explaining it, and every functional block has a lead in description.
My worst experience was with a database system I had to modify. It was virtually uncommented. It had such useful comments as “This is the payroll module,” and then not a damned thing else for hundreds of lines of code.
Making changes to that system was mostly guess work, praying, and lots of full text searches of files to find out what functions called from a zillion places actually did. Finding a spot that needed changing could take me a couple of days - and the change a few minutes, all for lack of comments.
That was a crappy system, anyway. Multiuser with simultaneous access, but not client/server. No, no. Every client had full access to a single file on a file server, and the clients regulated access by locking records. No problem internally, as we used a 100 Mbit LAN. We tried giving a customer access to it one time over an ISDN connection. That was an exercise in patience. Finding a single record could take upwards of 20 minutes, even with indexes.

It depends entirely on what your job is. I provide new engineering solutions to clients, largely customized applications. I was shocked to find I’ve written about 2,000,000 lines of FORTRAN, VBasic, C, C++, and Java. However, a lot of that “new” code is completely rewriting functions in existing applications, so even though it’s maintaining older code, it’s still “new” as far as anyone is concerned.

You’re exactly right on the code volume issue. Writing a “Hello World” MFC app takes about 10,000 lines of code behind the scenes, but it wouldn’t be fair to claim that one has written 10,000 lines by clicking through the Class Wizard.