I know there are 20, but those aren’t the only ones. Plus I think those are all L-amino acids, not D-amino acids.
So including all the amino acids not in the 20 we normally talk about when we discuss amino acids, and including all the D amino acids, how many are there total?
The 20 L-amino acids that you mention are the only ones incorporated into proteins by most animals and encoded directly by DNA (add selenocysteine and pyrrolysine for #21 and #22, but that is kind of cheating). Amino acid just means something with an -NH[sub]3[/sub][sup]+[/sup] on one end and a -COO[sup]-[/sup] on the other end. I’m sure there are countless ones. Wikipedia alludes to 500 in nature. You may as well ask how many kinds of acids there are…
The 20 amino acids you hear about are the only amino acids which are coded for by DNA. All proteins synthesized by the cell initially contain only these amino acids. (Caveats: yeah, there are sometimes substitutions, like when you eat too much selenium and get selenocysteine incorporated instead of cysteine, and a few years ago someone did create an artificial tRNA that incorporated non-standard amino acids into proteins.)
Other amino acids, and there are an infinite number of them, get incorporated into proteins by post-translational modification. Examples include hydroxyproline, ornithine, citrulline, 5-hydroxytryptophan, octopine, hydroxylysine, homocystine, etc. Even D-amino acids appear in some marine peptides.
How? There are only so many molecules that you could substitute into the amino acid, and even counting lengthening the chains or making side groups on the chains, etc, I’m sure there’s a ceiling on how big it can get before it would function.
And since I’m not a biochemist, and that may be wrong, there’s only a certain amount of matter in the universe, which is not infinite. So nyahh.
According to my Biochem text (3rd Edition Nelson+Cox), there are over 100 amino acids in human and other eukaryotic cells which are created by chemical alteration and changes after protein expression. So, even without going into the purely synthetic ones, there are a hell of a lot.
If I start with glycine, I can add a methyl group and get alanine. Adding another methyl group gives another amino acid, to which I can add a methyl group to get still another, ad infinitum, worlds without end.
Minor difficulties, like the finite number of carbon atoms in the universe, do nothing to limit the theoretical possibilities.
Alanine, for example, is GCT, GCC, GCA & GCG (GCU instead of GCT in RNA), so if the 3rd DNA base mutates you will still get an alanine. The same rule applies to a lot of the amino acid codons on that website: any 3rd base will still code for the correct amino acid. What I don’t understand is why it is usually only the third base that seems to be affected by this. With a lot of the amino acids the 3rd base can be any of the 4 bases (A, T, C, G); does DNA tend to mutate more readily at the 3rd base or something?
20 is a nice number. With it you can make proteins with a wide variety of properties. The trouble is, you need 3 nucleotides to code for 20 amino acids. If there were only 16 coded amino acids, you could get by with a 2-letter genetic code, and if there were only 4 amino acids you could use a 1-letter code. Apparently 20 is enough better than 16, evolutionarily speaking, that life settled on using a 3-letter code. With that code you could specify 64 different amino acids (or 62 plus a start and a stop codon), but cells only use 20 amino acids, so you’re stuck with either having nonsense codons (which code for nothing), or having a redundant code. It would not do to have nonsense codons, so the extra codes have to be partitioned among the amino acids that are used. Theoretically, you could construct a translation system where the slop occurs uniformly in the first, second, or third base. However, in real life the codon must fit into a binding site, and binding sites for extended molecules tend to hold more tightly (more specifically) to the middle of a ligand than to the ends. (If you don’t believe that, try undoing your zipper from the middle some time.) The RNA molecule is also asymmetric, so one end will bind more tightly than the other, and all codons share a similar structure, so the end which binds more tightly will be the same for all of them. It turns out that it’s the 1st base that binds better than the third, so the slop is all concentrated at the third.
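If you want to check the counting argument, here is a minimal Python sketch. The function name `min_codon_length` is made up for illustration, and it ignores start and stop signals, as the 4 / 16 / 20 comparison above does.

```python
def min_codon_length(n_amino_acids, n_bases=4):
    """Smallest codon length whose base combinations cover every amino acid.

    Hypothetical helper: ignores start/stop signals for simplicity.
    """
    length = 1
    while n_bases ** length < n_amino_acids:
        length += 1
    return length


for n in (4, 16, 20):
    k = min_codon_length(n)
    # 4 ** k is the number of distinct codons available at that length
    print(n, "amino acids ->", k, "letter code,", 4 ** k, "possible codons")
```

With 20 amino acids, two letters give only 16 combinations, so the code has to go to three letters and 64 codons, which is where the redundancy comes from.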
But that would only work up to a point, right? After a while of adding methyl(ene?) groups, wouldn’t it cease to function as an amino acid, i.e. be too large to fit into organelles or be too big to fit through nuclear pores, etc.?
You didn’t ask “how many aAs can be used in a human protein”, you asked “how many aAs are there”. The answer is “a pochillion”.
And the ones in human proteins are all alpha (meaning the amino and acid groups are both attached to the same carbon) and all L-. But amino acid just means "at least one amino and at least one acid group in the same molecule", so something that has them separated (in beta, gamma, etc. positions) is still an aA.
PS: a pochillion is like a bazillion, only bigger.
No, it is just a statistics thing. There are 61 codons for 20 amino acids (and 3 stop codons). Some, like methionine, are only encoded by 1 codon (ATG), while others like your mentioned alanine are coded by more.
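Those counts are easy to verify. A small sketch in Python, assuming the standard genetic code written as DNA codons; the compact 64-letter string lists the amino acids in T, C, A, G order with the third base varying fastest, and the variable names are just for illustration.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon ('*' marks a stop codon)
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# How many codons encode each amino acid
degeneracy = Counter(CODON_TABLE.values())
n_stop = degeneracy.pop("*")       # remove stop codons from the tally
n_sense = sum(degeneracy.values())

print(n_sense, "sense codons,", n_stop, "stop codons,", len(degeneracy), "amino acids")
print("Met:", degeneracy["M"], "Ala:", degeneracy["A"], "Leu:", degeneracy["L"])
```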
DNA mutation happens pretty much randomly. Having the redundancy through the wobble base (and in leucine with two separate tRNAs) means that automatically it becomes less likely to mutate a base pair that matters. If the third base never mattered (which is not the case – for instance ATG is methionine and ATC is isoleucine), 1/3 of mutations would automatically be silent.
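To put a rough number on this, here is a Python sketch that counts, for each codon position, how many single-base changes leave the amino acid unchanged. It assumes the standard genetic code, starts only from the 61 sense codons, and treats every codon and every substitution as equally likely, which real genomes do not do. The name `silent_counts` is made up for illustration.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, one letter per codon ('*' marks a stop codon)
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def silent_counts():
    """Per codon position: (silent changes, all single-base changes)."""
    silent = [0, 0, 0]
    total = [0, 0, 0]
    for codon, aa in CODON_TABLE.items():
        if aa == "*":
            continue  # start only from codons that encode an amino acid
        for pos in range(3):
            for base in BASES:
                if base == codon[pos]:
                    continue  # not a change
                mutant = codon[:pos] + base + codon[pos + 1:]
                total[pos] += 1
                if CODON_TABLE[mutant] == aa:
                    silent[pos] += 1
    return silent, total


silent, total = silent_counts()
for pos in range(3):
    print("position", pos + 1, ":", silent[pos], "of", total[pos], "changes are silent")
```

Under those assumptions, about two thirds of third-base changes are silent, a handful of first-base changes are (in leucine and arginine), and no second-base changes are, so the overall silent fraction comes out closer to a quarter than to a third.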
A similar reason is often hypothesized to explain the fact that many organisms have huge quantities of repetitive and other non-coding DNA. It just makes it less probable that a given mutation will cause a deleterious effect.