Understanding the programming language of DNA?

Do we know anything about the “programming language” of DNA? I don’t have a better term than programming language, but what I mean is the A’s and T’s, C’s and G’s encode certain proteins (we know this much: http://en.wikipedia.org/wiki/File:GeneticCode21-version-2.svg). But do we know how all these various proteins are read and translated into actually making an organism?

What reads all these proteins and determines that it means “make two legs, with 5 toes on each foot”? Something in the programming language must understand how all this data is translated into making 4 fingers and a thumb, or making a dorsal fin, or making a nose that can smell 1000 times better than a human nose. Could we rewrite the DNA to make someone have 100 fingers on each hand? If DNA can code for 5 fingers, why not 100?

I know we know enough to identify certain genes, and sometimes we can splice them into other animals’ DNA. Goats whose milk contains spider silk protein, etc. But what reads the particular section of DNA and translates that into making spider silk? Could I sit down and make DNA that would cause an organism to produce acetaminophen? Do we know how this would work? There has to be some underlying “language” to this, right?

I’m not sure the question is entirely clear… What reads and understands the sequence of proteins produced by DNA and figures out that it means “grow two eyes here” or “grow a furry tail here”? A programming language is the best analogy I can think of.

The part about going from DNA to proteins, we understand pretty much completely. The part about going from proteins to entire functioning organisms, though… Well, we understand a few tiny snippets. A handful of traits are in fact due to the simple, direct action of a single protein. But if we wanted to create an organism that had three eyes, we’d be clueless on how to do it, unless we happened to find some organism where it’d conveniently already happened, and even if we could duplicate that specific organism, we probably couldn’t cross it with most other traits.

The answer is probably long and complex and the “language” is probably much more indirect and less precise than you may be imagining.

There is most likely no 3-D coordinate system that the DNA explicitly references during development regarding placement/positioning. Instead, I believe chemical gradients guide a lot of development, but someone more knowledgeable will need to go into detail.
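To make the gradient idea a bit more concrete, here is a minimal sketch of how a cell could get positional information without any coordinate system. It assumes a signal chemical that falls off exponentially with distance from its source, and cells that pick a fate by comparing the local concentration to two cutoffs (roughly the textbook "French flag" picture). The function names, the fate labels, and every number are made up for illustration, not measured values.

```python
import math

def concentration(distance, source_level=100.0, length_scale=10.0):
    """Signal level at a given distance from the source.
    Assumes simple exponential fall-off; all units are arbitrary."""
    return source_level * math.exp(-distance / length_scale)

def cell_fate(distance, high=50.0, low=10.0):
    """A cell only 'knows' its local concentration, not its position.
    The two cutoffs (high, low) are hypothetical."""
    c = concentration(distance)
    if c >= high:
        return "near-source fate"
    if c >= low:
        return "middle fate"
    return "far fate"

for d in (0, 15, 40):
    print(d, round(concentration(d), 1), cell_fate(d))
```

Each cell runs the same rule, yet cells at different distances end up doing different things, which is the point: position is read off the chemistry, not looked up in a map.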

You’re basically looking for a course in developmental biology. Before that you will need courses in cell & molecular biology and genetics. The short version: The system is extremely complex and there are still many unanswered questions. But it’s not the case that DNA can make any cell/organism do anything you can imagine.

To take one example, programming a cell’s DNA to make acetaminophen would require that you create a whole suite of new proteins inside a cell that can do something completely new to the cell without otherwise destroying it. Even if we knew how to do such a thing (we don’t), the process involved would be a staggering technical challenge.

While it seems like life has accomplished a nearly infinite range of engineering feats, each one is only possible when all the ingredients are just right. Spider silk requires the whole spider, not just its DNA. And overall, the many things cells can do are done with a fairly small tool chest. You’ve mostly got carbon, hydrogen, oxygen, nitrogen, phosphorus to work with. Temperature and pH have to be maintained in a fairly narrow range (except for a few extremophiles). And you’re stuck with a lot of historical baggage and inefficiency.

Basically, you can do a lot in theory with biology, but in practice it’s a real headache.

You’ve been hired to add a feature to the most god-awful mass of spaghetti code you can imagine. There’s no text editor and it takes nine months to compile. Welcome to bioengineering.

To stretch the computer analogy, we understand the machine code. We know the first level - DNA to RNA to protein. That’s been worked out for decades. Higher-level programming involves a vast number of proteins interacting with each other and an even vaster number of non-protein molecules in a myriad of incredibly complex, constantly changing, and widely varied ways. We’re still working on those. We’ve figured out, oh, a few hundred to a few thousand pathways, perhaps, but that’s just a start.

The programming language analogy has been stretched beyond its usefulness when you start asking questions like these. The pattern of nucleotides isn’t an abstraction that’s read by something - it’s a chemical, and reacts with other chemicals in the cell in various ways, switched on and off by the presence of yet other chemicals there. There’s no top-down understanding of the code anywhere - each cell does what it does with the chemicals that it’s sitting in. At the end of the process you have a functioning organism.

It’s best to understand your genes, not as blueprints but as a recipe. There’s no gene that says “Build an eye over here, and another eye over there. Oh, and make a 27 inch tail waaay over there.”

Instead you have genes that code for proteins. When the genes are activated or turned on by a complex process we don’t understand very well, your cells crank out the protein. Lots of times those proteins are enzymes, which catalyze other reactions in the cell. So digestive enzymes (like, say, lactase) catalyze the breakdown of the sugar lactose into glucose and galactose.

So cellular development is more like “Churn out lots of chemical X.” But when chemical X concentrations reach a certain point, that triggers the production of chemical Y, which does certain things, and when chemical Y gets to a certain concentration, that triggers the production of chemical Z, which happens to turn off the production of chemical X. And when chemical X concentrations are high, processes A, B, and C happen in the cells, while when they are low, processes D, E, F and G happen.
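The X, Y, Z loop described above can be played with as a toy simulation. This is only a sketch under invented assumptions: each chemical is made at a fixed rate while its switch is on, decays a little every step, and the switches flip at a single shared threshold. None of the numbers correspond to real biochemistry, and `simulate` is a hypothetical name.

```python
def simulate(steps=60, threshold=5.0, rate=1.0, decay=0.1):
    """Toy model of the X -> Y -> Z -| X loop.
    Returns a list of (x, y, z) concentrations, one tuple per step."""
    x = y = z = 0.0
    history = []
    for _ in range(steps):
        x_on = z < threshold    # enough Z switches X production off
        y_on = x >= threshold   # enough X switches Y production on
        z_on = y >= threshold   # enough Y switches Z production on
        # production (if switched on) minus a fixed fraction lost to decay
        x += (rate if x_on else 0.0) - decay * x
        y += (rate if y_on else 0.0) - decay * y
        z += (rate if z_on else 0.0) - decay * z
        history.append((x, y, z))
    return history

for step, (x, y, z) in enumerate(simulate(), start=1):
    print(step, round(x, 2), round(y, 2), round(z, 2))
```

With these made-up settings X climbs first, Y and Z follow with a delay, and once Z crosses the threshold X starts to fall again. Nothing in the loop "knows" the overall plan; the rise and fall just comes out of three local rules.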

As was said earlier, we understand very precisely how nucleotide sequences on a strand of DNA are transcribed to RNA, which is taken to protein factories (ribosomes) and turned into proteins. We don’t know very well what all those proteins do, or how genes get turned on or off, or why the whole thing doesn’t just collapse out of too much complexity.

The DNA is like the alphabet. We know how to make words (codons, the nucleotide triplets that specify amino acids) and we’re beginning to learn how to make sentences (proteins). We’re still a ways from learning how to make paragraphs (tissues) and chapters (organs). Obviously nowhere close to knowing how to write a novel (organism).

However, we can take sentences out of existing novels and insert them into another novel and have it make sense … like Bt corn or Alex Rodriguez.

Start at the Hox genes to get a very basic understanding and read from there. Unusually, these genes which drive some of the body organization are apparently read in the order they are coded on the chromosome.

I thought the pattern of nucleotides produces certain proteins based on the triplets of A-T’s and C-G’s. I’ve seen charts of it (I linked to one in my OP). There are like 30 or so proteins as I understand it. And these somehow interact with and are ‘read’ by something to ‘mean’ a finger or a toe or black skin or triple-penis.

One thing I am sure it’s not is a 3D coordinate system with “cell here” and “no cell here” or whatever. If anything it probably works through fractals at least to some extent.

But I guess the answer to my question is: we don’t yet know. I kinda figured that was probably the answer. I was hoping we’d know something about it though.

That thing about the Hox gene is pretty interesting though, I’d never heard of that.

They need a fairly narrow range, too. It’s just a different fairly narrow range.

Dumb question: is it possible to have two almost identical organisms of human-level complexity with wildly different DNA? I’m not speaking of a “real world” scenario here, so practicality isn’t an issue.

Partly true. A sequence of three nucleotides is called a “codon”, and there are 64 of them (that’s 4^3). Each codon corresponds to one of 20 amino acids, or to a sort of “punctuation mark” (some amino acids have more than one codon that codes for them). The 20 amino acids are the building blocks of proteins, but there’s a practically unlimited variety of proteins that can be made from those building blocks. When the cell is producing a protein, it reads off a section of DNA in order (by way of an RNA copy), and attaches the corresponding amino acids in that same order. That chain of amino acids is the protein.
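Since this is the one layer that really does behave like a lookup table, it can be written out as a short sketch. The table below is the standard genetic code in one-letter amino acid abbreviations, with "*" standing for the stop "punctuation mark". The `translate` helper is a hypothetical simplification: it reads the DNA coding strand directly, skips the RNA intermediate, and ignores everything a real cell does about where to start.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per amino acid, '*' = stop.
# Letters are listed in codon order TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def translate(dna):
    """Read a coding-strand DNA string three letters at a time and
    return the amino acid chain, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)

print(len(CODON_TABLE))                # 64 codons (4^3)
print(len(set(AMINO_ACIDS) - {"*"}))   # 20 amino acids
print(translate("ATGGCCTGGTAA"))       # MAW
```

That is the whole "machine code" layer. Everything the rest of the thread is about, what the resulting proteins then go on to do, is not captured by a table like this.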

And then, all of those myriad proteins interact, and modify other proteins, and enable other proteins to interact in other ways, and produce other chemicals that aren’t proteins, in an extremely complicated manner of which our limited understanding only scratches the surface. Something like dark skin is relatively simple: You have a dark-colored chemical called melanin, and if you have a lot of it the organism has a dark color and a small amount means a lighter color (though there’s still some complications in figuring out just how much melanin to produce). But fingers, or toes, or penises, are much more complicated, and come from that dance of all the many chemicals.

We know some of it. For instance, take the germinal stem cells in the fruit fly testes. Basically, every time a fly wants to make sperm cells, it starts by a stem cell dividing into two daughter cells. One daughter stays a stem cell, and the other daughter starts down the path of differentiation to become a bunch of sperm cells. What’s the difference? How do two genetically identical cells end up going down such different paths?

It turns out that it has to do with the position of the two daughters after cell division. The stem cell lives inside a “cup” made up of a certain other cell type. When it divides, one daughter cell will be down in the bottom of the “cup”, and the other one will be partially kicked out of the cup. The cells that make up this cup are programmed to secrete certain chemical signals which, above a certain concentration, tell the receiving cell to stay a stem cell. The daughter cell in the bottom of the cup gets lots of these signals, because it’s got more secreting “cup” cells around it. The other daughter gets much less signal, and so heads down a different path, instead, guided by signals from within and from yet other cells in the area.
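Boiled down to a cartoon, the decision described above looks something like the sketch below. It assumes the signal a daughter cell receives is simply proportional to how many "cup" cells it touches, and that a single cutoff separates the two outcomes. The function name, the per-cell signal value, and the threshold are all invented for illustration.

```python
def daughter_fate(contacting_cup_cells, signal_per_cell=1.0, threshold=3.0):
    """Pick a fate from the amount of 'stay a stem cell' signal received.
    All numbers here are hypothetical, not measurements."""
    signal = contacting_cup_cells * signal_per_cell
    return "stem cell" if signal >= threshold else "differentiating"

print(daughter_fate(5))  # daughter deep in the cup, lots of signal
print(daughter_fate(1))  # daughter pushed partly out, little signal
```

Both daughters run the identical rule with identical DNA. The only input that differs is where they happen to be sitting.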

So at least in some cases (and probably in just about all cases), cells are busy telling one another what to do in various clever ways.

There has to be some underlying organization and meaning to all those proteins and chemicals interacting. It’s the same for all life on earth. An elephant and a tube worm at the bottom of the ocean both rely on the same building blocks.

How about this. My dream pet is a mini pygmy elephant. Like the size of a golden retriever, with a trunk about a foot long. How far away do you think we are from being able to engineer something like that?

What, no one’s said the Flying Spaghetti Monster yet?


I once presented some research on the thyroid hormone signaling pathway, showing gene expression data, and was asked by a doctorate level researcher why I didn’t do real-time PCR for thyroid hormone.

Uhmm…because it’s not a gene product.

The Flying Spaghetti Monster (marinara be upon him) created all life on earth, DNA, etc. but that doesn’t mean we can’t understand his creation.

Do you mean convergent evolution? Lotus and water lilies are something of a real-world example.