Do biologists currently know the sequence of events that go into making any given protein or enzyme or cellular structure? I suspect that the answer is no, but how much do they know? For example, what is the sequence of steps that the body uses to generate a phospholipid molecule from a gene and how does it get from where it is made to forming the cell membrane? Is there a general name for these processes?
Proteins are easy. A protein is just a chain of amino acids strung together into a particular order. Get the right links in the chain, and it folds into the proper shape automatically. Usually, at least: Some proteins have two or more stable configurations (see: prions).
So how do you get that sequence of aminos? From the genes. A sequence of nucleic acid bases is copied from the DNA to a strand of messenger RNA (this step is called transcription; translation is the later step, at the ribosome), and the mRNA then wanders around the cell until it hits a ribosome. The ribosome fits onto the mRNA strand like a zipper pull on half a zipper, and works its way down the line. Also floating around in the cell are dozens of different kinds of transfer RNA molecules. Each tRNA has one end which matches to a particular three-base segment of mRNA (called a “codon”), and another end which carries a particular amino acid. So, for instance, one of those tRNA molecules matches up to the sequence ACG on one end, and to the amino acid threonine on the other end. There are 64 possible codons, and every one of the 20 common amino acids is associated in this way with at least one of them, and some with several. The mapping of codons to amino acids is called the genetic code.
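To make the codon bookkeeping concrete, here is a minimal Python sketch of the DNA-to-mRNA step and the codon lookup. The names (transcribe, codons, GENETIC_CODE) are made up for this example, and it assumes you start from the coding strand of the DNA, so the mRNA is just the same sequence with T swapped for U. The table is the standard genetic code written in one-letter amino acid codes.

```python
# Illustrative sketch only; all names here are hypothetical helpers.
from itertools import product

BASES = "UCAG"
# One-letter amino acid codes for all 64 codons in UCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
GENETIC_CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def transcribe(coding_strand):
    """Return the mRNA for a DNA coding strand (assumption: T is simply replaced by U)."""
    return coding_strand.upper().replace("T", "U")


def codons(mrna):
    """Split an mRNA string into consecutive three-base codons, dropping any leftover bases."""
    return [mrna[i:i + 3] for i in range(0, len(mrna) - len(mrna) % 3, 3)]


mrna = transcribe("ATGACGTAA")
print(mrna)                 # AUGACGUAA
print(codons(mrna))         # ['AUG', 'ACG', 'UAA']
print(GENETIC_CODE["ACG"])  # T (threonine)

# Count codons that code for an amino acid, and the distinct amino acids they cover.
sense = [c for c, aa in GENETIC_CODE.items() if aa != "*"]
amino_acid_count = len(set(GENETIC_CODE.values()) - {"*"})
print(len(GENETIC_CODE), len(sense), amino_acid_count)  # 64 61 20
```

The last line shows why several codons have to share an amino acid: 61 coding codons are spread over only 20 amino acids.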
Anyway, when the ribosome gets to a codon, it’ll stop there and wait, until a bit of tRNA happens to come along with the amino acid stuck to it. The tRNA then fits onto the mRNA, the amino acid it was carrying gets added to the growing protein chain, and the ribosome moves on to the next codon. Eventually, it gets to what’s called a stop codon (UAA, UAG, or UGA, which don’t code for any amino acid), at which point the protein stops growing, and the ribosome releases it into the cell.
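The ribosome's walk down the mRNA can be sketched as a simple loop. This is a toy model, not how any real tool does it: the function name translate is made up, and it assumes translation begins at the first AUG (the usual start codon, which the description above leaves out) and ignores everything else a real ribosome needs.

```python
# Toy model of translation; names are hypothetical and the start rule is a simplification.
from itertools import product

BASES = "UCAG"
# One-letter amino acid codes for all 64 codons in UCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
GENETIC_CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(mrna):
    """Read codons from the first AUG until a stop codon; return the amino acid chain."""
    start = mrna.find("AUG")  # assumption: the first AUG is the start codon
    if start == -1:
        return ""
    chain = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = GENETIC_CODE[mrna[i:i + 3]]
        if amino_acid == "*":  # UAA, UAG or UGA: release the chain
            break
        chain.append(amino_acid)
    return "".join(chain)


print(translate("GGAUGACGGCUUAAGG"))  # MTA (methionine, threonine, alanine)
```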
Molecules other than proteins are more complicated, and can be made in a variety of different ways. Some are based on proteins, but have a few modifications made to them (hemoglobin, for instance, has iron-containing heme groups attached to it). Most of the reactions that make these new molecules are catalyzed by enzymes, which are themselves proteins.
Note that while we can figure out the exact three-dimensional shape of any protein molecule once we’ve got a sample of it, we generally aren’t able to predict that shape for any given arbitrary amino acid sequence.
Oh aye mate, they’re well up on this stuff. They have their shit together. It’s a bit late for a detailed post, but a good way of thinking about this question would be to consider the primary, secondary, tertiary and quaternary structure of a protein. This gets progressively more difficult to understand and predict, with assembly of a primary structure being very firmly grasped, whereas quaternary processes are the focus of significant current research.
The question ‘…the body uses to generate a phospholipid molecule from a gene’ doesn’t make good sense to me, although that might just be me. Seriously, different areas of the sciences employ quite different terminology. Genes do not code for lipids directly; they code for the proteins that take part in the assembly of lipids. Lipids such as phospholipids are built from fatty acids. Again, the primary processes that put the structures of fatty acids together are canonical. Moving up to higher levels of assembly is progressively more challenging to describe.
I’ll post specifics tomorrow if no one else takes care of it.
Take a graduate level course in Molecular Biology. You’ll learn more than you ever wanted to know about transcription and translation.
groan
There are still unanswered questions, but the process and regulation of transcription and translation are well known. Chronos gives a good thumbnail sketch of protein synthesis. Other molecules like lipids and sugars are made by enzymes using intermediates and byproducts of several key pathways such as glycolysis, the TCA cycle and fatty acid synthesis. There are also secondary metabolic products, like limonene, mycophenolic acid and adriamycin, that are made in an assembly-line fashion by a series of enzymes. Many proteins and molecules can also undergo post-translational modification, where sugars, amino groups, lipids or oxidation products are added to make the molecule available for a specific function. As I said, we know a lot about synthesis and modification, but there are still a lot of holes in our knowledge that people are working on.
Thanks for all your replies. I am somewhat familiar with the process (DNA gets split, one of the types of RNA gets constructed, some stuff happens, eventually the cross-shaped RNA with the two amino acids is constructed, this molecule bonds with another molecule which adds the amino acids to the protein it is constructing, sometimes other molecules help it fold a certain way, etc.) but as I understand it, the protein that gets constructed may have a job making a secondary molecule which might make a third and so on. Finally, you get to a molecule which is the end result of the whole process, e.g. a phospholipid. What percentage of these processes are well understood?
A while ago, I posted a thread asking how this process evolved which didn’t get much response. I am willing to bet that no one knows for sure, but what do we know?
I am also curious how a humble computer programmer like myself might make a contribution to the understanding of the cell. Obviously, I only have an amateur’s knowledge of biochemistry and molecular biology, but as I understand it, you don’t have to dig too deep to find interesting questions in this field.
You can do what I did and join a computational biology graduate program.
There’s any number of interesting problems that are being attacked via computational methods right now. The big one you’ve probably heard of is protein folding: for a given chain of amino acids, what is the correct tertiary and quaternary structure? I need to disagree somewhat with what Chronos said - proteins don’t necessarily fold into their ‘correct’ structure automatically. There is a series of ‘chaperone’ proteins whose job it is to set up the appropriate environment for the protein to fold up correctly. Without those proteins, if you just dump the amino-acid chain in a beaker, it won’t fold up correctly, even with a properly biological solution.
Another place that’s currently under intensive research is PTMs and SNPs. PTMs (post-translational modifications) are any sort of covalent modification that is made after the protein has been assembled by the ribosome. Phosphorylation, methylation, etc. These changes will modify the function of the protein, affect how it interacts with other instances of the same and different proteins (and thus modify quaternary structure), and often modify the cellular compartment where the protein is expressed. Epigenetics and the current work on the Histone Code are almost entirely driven by PTMs on the histone proteins. In fact, part of my work is developing a computational model for how gene expression is modulated by PTMs on histones.
SNPs are in a similar boat. They’re a single nucleotide change in the underlying gene. If they’re coding and nonsynonymous (that is, in the part of the gene that gets translated, and the codon that results from the change does not code for the same amino acid as the old one), you get a change in the primary structure (sequence) of the protein. This may modify the properties of the protein considerably - quite a few SNPs show decreased fitness (enhanced susceptibility to multifactorial diseases such as Alzheimer’s, heart disease, etc), while a few show a protective effect (the Milano mutant phenotype of Apolipoprotein A-1 in the case of heart disease).
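For the programmers in the thread, the synonymous/nonsynonymous distinction comes down to a table lookup before and after the change. A rough Python sketch follows; classify_snp is a made-up helper, it works on a single mRNA codon rather than a whole gene, and it says nothing about whether the change actually matters biologically.

```python
# Rough sketch; classify_snp is a hypothetical helper, not part of any library.
from itertools import product

BASES = "UCAG"
# One-letter amino acid codes for all 64 codons in UCAG order; "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
GENETIC_CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def classify_snp(codon, position, new_base):
    """Swap one base (position 0, 1 or 2) in a codon and compare the encoded amino acids."""
    mutated = codon[:position] + new_base + codon[position + 1:]
    before, after = GENETIC_CODE[codon], GENETIC_CODE[mutated]
    label = "synonymous" if before == after else "nonsynonymous"
    return label, before, after


print(classify_snp("ACG", 2, "A"))  # ('synonymous', 'T', 'T')
print(classify_snp("ACG", 1, "U"))  # ('nonsynonymous', 'T', 'M')
```

Changing the third base of ACG still gives threonine, which is the usual pattern: the redundancy in the code sits mostly in the third position.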
Then there’s things like NRPS - Non-Ribosomal Peptide Synthesis. This is important for figuring out how bacteria and fungi produce those weird molecules that we use as antibiotics. If we can understand how it works, mechanistically, we can string together the NRPS genes into an ‘assembly line’ for a peptide, and we may be able to fundamentally improve our ability to produce antibiotics.
Oh, and speaking of antibiotics - that’s all 3D-molecular-interaction. Computation out the wazoo - anything better than our current molecular simulations would be awesome. Simulating atomistically and Newtonially (that is, no quantum mechanics, and thus no chemistry; each atom is just a ball), we can simulate maybe a million atoms, for 10 nanoseconds or so. This’ll take at least a month. If someone can come up with an accurate coarse-grained simulation that runs fast and can explore the microsecond-timescale, it’ll be a huge boon for molecular biologists.
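To give a feel for what "each atom is just a ball" means computationally, here is a deliberately tiny Python sketch: two atoms in one dimension joined by a spring, stepped forward with the velocity Verlet scheme. Everything in it is an assumption made for illustration (the simulate function, the harmonic bond, the unitless parameters); real molecular dynamics packages use far richer force fields and run this kind of loop over millions of atoms.

```python
# Toy 1D "molecular dynamics": two atoms joined by a harmonic bond, in arbitrary units.
# All names and parameter values are illustrative assumptions.

def simulate(steps, dt=0.01, k=1.0, r0=1.0, mass=1.0):
    """Integrate Newton's equations with velocity Verlet; return final state and energies."""
    x = [0.0, 1.2]  # positions: bond starts stretched by 0.2
    v = [0.0, 0.0]  # velocities: both atoms start at rest

    def forces(pos):
        stretch = (pos[1] - pos[0]) - r0
        f = k * stretch  # a stretched bond pulls atom 0 forward and atom 1 back
        return [f, -f]

    def energy(pos, vel):
        kinetic = 0.5 * mass * (vel[0] ** 2 + vel[1] ** 2)
        potential = 0.5 * k * ((pos[1] - pos[0]) - r0) ** 2
        return kinetic + potential

    e_start = energy(x, v)
    f = forces(x)
    for _ in range(steps):
        v = [vi + 0.5 * dt * fi / mass for vi, fi in zip(v, f)]  # half kick
        x = [xi + dt * vi for xi, vi in zip(x, v)]               # drift
        f = forces(x)                                            # forces at new positions
        v = [vi + 0.5 * dt * fi / mass for vi, fi in zip(v, f)]  # second half kick
    return x, v, e_start, energy(x, v)


x, v, e_start, e_end = simulate(1000)
print("relative energy drift:", abs(e_end - e_start) / e_start)
```

The cost problem is visible even here: the time step has to stay small compared with the fastest vibration, so covering a long stretch of simulated time means a very large number of force evaluations.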
And that’s just scratching the surface. I didn’t mention the ‘classical’ bioinformatics work of sequence analysis, which is the workhorse of the computational biologist, or biological networks modeling (molecular, cellular, physiological, ecological…) but there’s a ton of stuff going on there too.
Bioinformatics and data mining are getting hot at the moment, and us calculus-challenged bench jockeys need help creating analytical tools. Another area is Systems Biology, where a cell, organ, organ system or organism is treated as a whole and analyzed as a whole in response to environmental changes. It requires a huge amount of data processing because you’re dealing with an aggregate of subsystems. You need to be able to formally describe the state of each subsystem as well as the state of the whole in order to create an accurate mathematical model of the system. This starts to get into chaos theory and strange attractors and how they can apply to a wide range of molecular biochemistry investigations.
Thanks for the responses. Does anyone know the answer to my other question, namely, what do we know about how the cellular machinery evolved? Also, does any prokaryote have a complete protein map?
The closest would be E. coli K-12 MG1655. I don’t think we have every single possible combination of PTMs for every protein, protein complex, or derived peptide, but it’s probably the closest we have to what you want. There are reasonably complete metabolome and protein-protein interaction maps available too.