I have to say, this topic really attracted my attention. I wasted a couple of hours at work today trying to work through all this and understand it. I read through a couple of other sites on this, too. But I have a question or two, and I’m hoping someone with some patience can explain this to me.
Protein, as I understand it, is made up of a string of amino acids. There are approximately 20 amino acids, and these are the building blocks of life. I also understand our DNA structure is a string of 3 billion of these amino acids, but only 4 are present in DNA (A, G, C, T).
If DNA contains only A, G, C, T, how are the other amino acids brought into the creation of proteins? There must be more to the manufacture of proteins than just using the sequence of amino acids found in the DNA.
A protein is a string of amino acids. Great. But as far as I can tell, if DNA is nothing but a string of amino acids, DNA is nothing but a very, very long protein. Is that true? Therefore genes, proteins, DNA, RNA are all just strings of amino acids.
I should have read more before posting… sorry. I found my own answer. I was confused because of the similarity of names between Adenine/Alanine, Cytosine/Cysteine, Guanine/Glycine, and Thymine/Threonine. In other words, I didn’t read carefully enough.
For those that aren’t following, DNA is made up of nucleotide bases (loosely, “nucleic acids”), not amino acids. There are 5 of these (those listed, plus Uracil, which RNA uses in place of Thymine) and 20 amino acids. These combined are the building blocks of life. RNA can be thought of as a small, single-sided section of DNA (with the T of the DNA replaced with U in the RNA). The open-sided RNA in the cytoplasm attracts and repels amino acids depending on the nucleic acid. Thus, strings of amino acids are formed. The attraction/repulsion apparently is electromagnetic, which I personally found really interesting.
So I thought of another question as I was drifting off to sleep last night.
Assuming there are only 4 different nucleic acids in a string of RNA, and there are 20 amino acids from which to build a protein, the attraction of nucleic acid to amino acid cannot be 1:1. In other words, Adenine must not attract just Alanine and repel all others. Otherwise, only 4 different amino acids could be attracted to the RNA strand. Basically this means to me that it takes a string of 3 nucleic acids to attract a specific amino acid. 4^2 gives only 16 combinations, which is not enough for 20 amino acids, while 4^3 gives 64, which is more than enough.
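The counting argument above can be checked in a few lines. This is only a sketch of the arithmetic; the function name `min_word_length` is made up for illustration.

```python
# Sketch: how many bases per "word" are needed to address 20 amino acids
# using a 4-letter alphabet? (All names here are illustrative.)
ALPHABET_SIZE = 4   # A, C, G, U in RNA
AMINO_ACIDS = 20

def min_word_length(alphabet_size: int, targets: int) -> int:
    """Smallest n such that alphabet_size ** n >= targets."""
    n = 1
    while alphabet_size ** n < targets:
        n += 1
    return n

for n in range(1, 4):
    print(n, ALPHABET_SIZE ** n)                      # 1 4, then 2 16, then 3 64

print(min_word_length(ALPHABET_SIZE, AMINO_ACIDS))   # 3
```

With words of length 1 or 2 there are fewer combinations than amino acids, so 3 is the shortest word length that works.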
Based on some reading I did which called groups of 3 nucleic acids a ‘word’ (and grouping of ‘words’ a gene), I’m assuming this to be accurate. Can anyone verify if I have the concept right?
Are genes in the DNA marked somehow? How does one know when one gene begins and another one ends?
So can anyone put the Human Genome Project in perspective for me? What were they trying to accomplish? If it was simply mapping out a string of 3 billion nucleic acids, wouldn’t that be specific to a single person? I’m assuming it was going after the genes, and not the specific DNA. Which made me wonder how genes are differentiated in the DNA.
There are sequences within the DNA that mark where an RNA strand can begin to form (promoters) and other sequences that make it stop (terminators). The formation of the RNA strand is called transcription.
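As a rough illustration of transcription, here is a minimal sketch that builds the RNA strand complementary to a DNA template. It ignores the start and stop signals entirely, and the template sequence is made up for the example.

```python
# Base pairing used when RNA is built on a DNA template strand:
# DNA A pairs with RNA U, T with A, G with C, C with G.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template_strand: str) -> str:
    """Return the RNA (read 5'->3') for a DNA template written 3'->5'."""
    return "".join(DNA_TO_RNA[base] for base in template_strand.upper())

template = "TACGGTATT"        # made-up template strand, written 3'->5'
print(transcribe(template))   # AUGCCAUAA
```

Note that the result is the same as the other (non-template) DNA strand with every T swapped for U.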
The RNA in turn has specific 3-nucleic-acid sequences (called codons) that indicate where a protein chain starts forming and where it is completed. Building the protein from the RNA is called translation. You are right that there are 64 possible codons. One codon (AUG) specifies the start of protein synthesis, and three (UAA, UAG, UGA) specify the end. The rest specify amino acids.
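To make the start/stop idea concrete, here is a simplified sketch of translation using the standard genetic code (one-letter amino acid symbols, `*` for stop). It assumes translation begins at the first AUG it finds, which glosses over how real cells pick the start site; the input sequence is made up.

```python
BASES = "UCAG"
# Standard genetic code in UCAG order, one-letter amino acid symbols, "*" = stop.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(mrna: str) -> str:
    """Translate from the first AUG up to (not including) the first stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""                       # no start codon, nothing is made
    protein = []
    for pos in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[pos:pos + 3]]
        if amino_acid == "*":           # stop codon ends the chain
            break
        protein.append(amino_acid)
    return "".join(protein)

print(translate("GGAUGCCAUGGUAAGC"))    # MPW
```

Here the codons read are AUG (M), CCA (P), UGG (W), and then UAA, which stops the chain.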
This is something tricky to understand:
Each codon specifies one and only one amino acid. One amino acid, though, can have more than one codon specifying it.
Example: Codon GGU always specifies the same amino acid, glycine. Glycine, though, has codons GGU, GGC, GGA, and GGG that all specify it.
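The one-way nature of this mapping is easy to see in code. The sketch below uses the standard genetic code table: looking up a codon gives exactly one amino acid, while the reverse lookup gives a list of codons. The name `codons_for` is just for illustration.

```python
from collections import defaultdict

BASES = "UCAG"
# Standard genetic code in UCAG order, one-letter amino acid symbols, "*" = stop.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Reverse lookup: amino acid -> every codon that specifies it.
codons_for = defaultdict(list)
for codon, amino_acid in CODON_TABLE.items():
    codons_for[amino_acid].append(codon)

print(CODON_TABLE["GGU"])   # G (glycine): one codon, one amino acid
print(codons_for["G"])      # ['GGU', 'GGC', 'GGA', 'GGG']
print(codons_for["W"])      # ['UGG'] (tryptophan has only one codon)
```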
Every single human has the same genes. The difference is in the alleles, or the different variations of a gene. Take a trait like the ABO blood group.
One gene is in charge of the blood group. You have two copies of the gene, one from each parent. Say one copy carries the code for protein A and the other carries the code for protein B. Both are functional, codominant proteins that serve the same purpose. You express both and are an AB individual.
What the Human Genome Project is trying to accomplish is to know where the genes are located, in which chromosomes and in what order. The genes would be located in the same exact place for every person, but the alleles can be (and are) different among individuals.
You are correct: it takes three DNA nucleic acids to code for each amino acid in a protein. As you noted, 4^3 gives 64 possible three-nucleic-acid codes (called “codons”). One is a start codon. It codes for the amino acid Methionine, which starts every new protein chain. This codon basically tells the machinery “start manufacture of protein here.” Three are end codons – “stop the protein here.” The remaining sixty code for the remaining 19 amino acids. As you can see, there is some redundancy in this system, and most amino acids are coded for by more than one codon. Here is a table of the genetic code. Interestingly, this is very strong evidence for the evolution of all life from a single common ancestor, since every known organism uses a virtually identical genetic code, and the odds of this happening randomly are virtually nil.
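The tallies above can be double-checked against the standard genetic code. This is just a counting sketch; the variable names are illustrative.

```python
from collections import Counter

BASES = "UCAG"
# Standard genetic code in UCAG order, one-letter amino acid symbols, "*" = stop.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

counts = Counter(CODON_TABLE.values())   # codons per amino acid (and per stop)
stop_count = counts.pop("*")             # set the stop codons aside

print(len(CODON_TABLE))                     # 64 codons in total
print(stop_count)                           # 3 stop codons
print(counts["M"])                          # 1 codon (AUG) for Methionine
print(sum(counts.values()) - counts["M"])   # 60 codons left over
print(len(counts) - 1)                      # ...for the other 19 amino acids
```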
Also, the amino acids don’t just electrostatically stick to the codons. There is a further step in the process. There is a type of RNA called tRNA (transfer RNA) that is required. It is shown at the top of the link I provided. Basically, tRNA has three nucleic acids that are complementary to the codon (the “anticodon”), which causes them to stick together. The tRNA has the appropriate amino acid bound onto the other end, which the cellular machinery can then add onto the growing protein.
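Here is a small sketch of how a codon and its anticodon match up. It assumes strict base pairing (A with U, G with C) and leaves out the looser “wobble” pairing that real tRNAs allow at the third position.

```python
# RNA-RNA base pairing between an mRNA codon and a tRNA anticodon.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}

def anticodon(codon: str) -> str:
    """Anticodon written 5'->3' for an mRNA codon written 5'->3'.

    The two strands pair in opposite directions, so the codon is reversed
    before each base is complemented.
    """
    return "".join(COMPLEMENT[base] for base in reversed(codon.upper()))

print(anticodon("AUG"))   # CAU
```

So the tRNA that carries Methionine has the anticodon CAU, which lines up against AUG when the two strands run in opposite directions.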
This is all fascinating stuff! Thanks… I never really understood all this until now.
So more questions… if you don’t mind!
I’m assuming not all the genes of the human body are known. The human genome project might tell us that on chromosome <#>, in location <whatever> there is some gene, but that can’t possibly mean we know what that gene does, can it? I’m assuming there are thousands upon thousands of genes that were identified that no one has any clue what they do.
As was pointed out earlier, basically DNA testing is looking at certain genes and then (I might not phrase this right) matching the alleles. If the alleles match, you have the right person. Is that right? If that is correct, can anyone put in simple terms how this is physically done? How do you pull DNA from a nucleus, then unravel it from the protein, then isolate a particular gene in the DNA, then look at the nucleic acid sequence (allele?) of that gene?
cmosdes, there is a technology called PCR, polymerase chain reaction. It is a process in which the DNA is denatured (separated) into two strands, each of which serves as a template to create a new double helix, which is again denatured and then copied, and the cycle starts again. This is the way to obtain a lot of copies of DNA from a relatively small sample.
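Because every cycle can double the number of copies, the numbers grow very quickly. The sketch below shows the idealized arithmetic; the `efficiency` parameter is an assumption added for illustration (1.0 means perfect doubling each cycle, which real reactions do not achieve).

```python
def pcr_copies(initial_molecules: int, cycles: int, efficiency: float = 1.0) -> float:
    """Idealized copy count after a number of PCR cycles.

    efficiency is a hypothetical knob: 1.0 doubles the copies every cycle,
    lower values model a less than perfect reaction.
    """
    return initial_molecules * (1 + efficiency) ** cycles

print(pcr_copies(1, 30))   # 1073741824.0, about a billion copies from one molecule
```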
I forgot (and don’t have with me) the next steps in specifically searching for a gene, other than to say that it is done and the process is not that complicated.
cmosdes, your link states it’s highly simplified, which it is. It completely ignores the role of the ribosome in translation. The ribosome is the place where the amino acids are actually joined together to form protein molecules.