Once it was established that DNA carries the genetic information, the next question was how that information is coded.
Proteins are built from 20 different amino acids, but there are only four nucleotides, so a combination of at least three nucleotides is needed to code for each amino acid.
The table below shows the number of possible codons depending on how many nucleotides are used each time.
| Singlet Code | Doublet Code | Triplet Code |
|---|---|---|
| A, U, C, G: four possible | 16 possible | 64 possible |
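The counts in the table follow from simple arithmetic: with four nucleotides, a code word of length n can take 4^n forms. The short sketch below illustrates this; the function names `codon_count` and `min_codon_length` are chosen here for illustration, and the figure of 20 amino acids is the standard count, assumed rather than taken from the table.

```python
BASES = "AUCG"  # the four RNA nucleotides listed in the table


def codon_count(length: int) -> int:
    """Number of distinct code words of the given length: 4 ** length."""
    return len(BASES) ** length


def min_codon_length(n_amino_acids: int = 20) -> int:
    """Smallest code-word length that gives at least n_amino_acids combinations."""
    length = 1
    while codon_count(length) < n_amino_acids:
        length += 1
    return length


for n in (1, 2, 3):
    print(n, codon_count(n))  # 1 4, then 2 16, then 3 64

print(min_codon_length())  # 3
```

A doublet code gives only 16 combinations, which is fewer than 20, so three nucleotides is the smallest length that can cover every amino acid.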
Francis Crick, in 1961, provided experimental evidence for this hypothesis.
Working with a bacteriophage (T4) that infects the bacterium Escherichia coli, he and his colleagues made additions and deletions of one or more nucleotides in a specific gene. These changes caused the gene to be misread and resulted in abnormal phenotypes. But when three bases were added or subtracted, the resulting phenotype was normal or almost normal.
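The logic of this experiment can be illustrated with a small sketch. The RNA sequence and the helper names (`split_codons`, `insert_bases`) are hypothetical, chosen only to show how the reading frame behaves. Inserting one base shifts every downstream triplet, while inserting three bases adds one extra triplet and leaves the rest of the frame intact.

```python
def split_codons(seq: str) -> list[str]:
    """Read a sequence in consecutive triplets, ignoring any incomplete tail."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def insert_bases(seq: str, pos: int, bases: str) -> str:
    """Return seq with `bases` inserted before index `pos`."""
    return seq[:pos] + bases + seq[pos:]


# Hypothetical message, used only for illustration.
original = "AUGGCAUCUGGACGUCCA"

print(split_codons(original))
# ['AUG', 'GCA', 'UCU', 'GGA', 'CGU', 'CCA']

# One extra base: every triplet after the insertion point is misread.
print(split_codons(insert_bases(original, 3, "A")))
# ['AUG', 'AGC', 'AUC', 'UGG', 'ACG', 'UCC']

# Three extra bases: one new triplet, and the original frame is restored.
print(split_codons(insert_bases(original, 3, "AAA")))
# ['AUG', 'AAA', 'GCA', 'UCU', 'GGA', 'CGU', 'CCA']
```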
The code was eventually broken by preparing synthetic messenger RNA in which one triplet was repeated many times. Initially, this synthetic RNA was made of a single kind of nucleotide, producing structures like AAA-AAA-AAA-AAA-AAA, which codes for a polypeptide made only of lysine (Lys-Lys-Lys...), CCC-CCC-CCC (proline: Pro-Pro-Pro...), or UUU-UUU-UUU (phenylalanine: Phe-Phe-Phe...). Subsequently, synthetic RNAs with various known nucleotide sequences were produced and added to a cell-free enzymatic protein synthesis system, and the products were analysed.
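As a rough illustration of the homopolymer experiments, the sketch below builds an RNA from a single repeated nucleotide and reads it in triplets. It uses only the three assignments named in the text; the function name `translate_homopolymer` is hypothetical.

```python
# Codon assignments taken from the homopolymer examples in the text.
HOMOPOLYMER_CODONS = {"AAA": "Lys", "CCC": "Pro", "UUU": "Phe"}


def translate_homopolymer(base: str, n_codons: int) -> str:
    """Build an RNA of one repeated base and translate it triplet by triplet."""
    rna = base * (3 * n_codons)
    codons = [rna[i:i + 3] for i in range(0, len(rna), 3)]
    return "-".join(HOMOPOLYMER_CODONS[codon] for codon in codons)


print(translate_homopolymer("A", 3))  # Lys-Lys-Lys
print(translate_homopolymer("C", 3))  # Pro-Pro-Pro
print(translate_homopolymer("U", 3))  # Phe-Phe-Phe
```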
| First base | Second base: U | Second base: C | Second base: A | Second base: G | Third base |
|---|---|---|---|---|---|
| U | UUU: Phe | UCU: Ser | UAU: Tyr | UGU: Cys | U |
| U | UUC: Phe | UCC: Ser | UAC: Tyr | UGC: Cys | C |
| U | UUA: Leu | UCA: Ser | UAA: Stop | UGA: Stop | A |
| U | UUG: Leu | UCG: Ser | UAG: Stop | UGG: Trp | G |
| C | CUU: Leu | CCU: Pro | CAU: His | CGU: Arg | U |
| C | CUC: Leu | CCC: Pro | CAC: His | CGC: Arg | C |
| C | CUA: Leu | CCA: Pro | CAA: Gln | CGA: Arg | A |
| C | CUG: Leu | CCG: Pro | CAG: Gln | CGG: Arg | G |
| A | AUU: Ile | ACU: Thr | AAU: Asn | AGU: Ser | U |
| A | AUC: Ile | ACC: Thr | AAC: Asn | AGC: Ser | C |
| A | AUA: Ile | ACA: Thr | AAA: Lys | AGA: Arg | A |
| A | AUG: Met | ACG: Thr | AAG: Lys | AGG: Arg | G |
| G | GUU: Val | GCU: Ala | GAU: Asp | GGU: Gly | U |
| G | GUC: Val | GCC: Ala | GAC: Asp | GGC: Gly | C |
| G | GUA: Val | GCA: Ala | GAA: Glu | GGA: Gly | A |
| G | GUG: Val | GCG: Ala | GAG: Glu | GGG: Gly | G |
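The codon table can be written as a lookup from triplet to amino acid. The sketch below is one minimal way to do that, listing the amino acids in the same first/second/third base order (U, C, A, G) as the table. The names `CODON_TABLE` and `translate` and the example message are illustrative assumptions, and the function simply reads from the first base onward rather than searching for a start codon.

```python
from itertools import product

BASES = "UCAG"  # same base order as the table rows and columns

# Amino acids in table order: first base, then second base, then third base.
AMINO_ACIDS = """
Phe Phe Leu Leu Ser Ser Ser Ser Tyr Tyr Stop Stop Cys Cys Stop Trp
Leu Leu Leu Leu Pro Pro Pro Pro His His Gln Gln Arg Arg Arg Arg
Ile Ile Ile Met Thr Thr Thr Thr Asn Asn Lys Lys Ser Ser Arg Arg
Val Val Val Val Ala Ala Ala Ala Asp Asp Glu Glu Gly Gly Gly Gly
""".split()

CODONS = ["".join(triplet) for triplet in product(BASES, repeat=3)]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))


def translate(mrna: str) -> list[str]:
    """Translate an mRNA triplet by triplet, stopping at the first Stop codon."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "Stop":
            break
        peptide.append(residue)
    return peptide


# Hypothetical message: AUG UUU AAA UGG UAA
print(translate("AUGUUUAAAUGGUAA"))  # ['Met', 'Phe', 'Lys', 'Trp']
```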
The code is degenerate: more than one codon may code for the same amino acid.
The code is (almost) universal: any particular codon represents the same amino acid in bacteria, plants, fungi, or animals (with small differences in some bacteria).
Only two amino acids are coded by a single codon: methionine (AUG) and tryptophan (UGG). In DNA notation these are ATG and TGG.
There are three 'Stop' codons: UAA, UAG, and UGA (TAA, TAG, and TGA in DNA). They do not code for any amino acid, but when present they signal the end of a protein.
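These properties can be checked directly against the standard codon table. The sketch below rebuilds the table as a dictionary (the variable names are illustrative, not from any library) and counts how many codons map to each amino acid.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"

# Standard genetic code in first/second/third base order (U, C, A, G).
AMINO_ACIDS = """
Phe Phe Leu Leu Ser Ser Ser Ser Tyr Tyr Stop Stop Cys Cys Stop Trp
Leu Leu Leu Leu Pro Pro Pro Pro His His Gln Gln Arg Arg Arg Arg
Ile Ile Ile Met Thr Thr Thr Thr Asn Asn Lys Lys Ser Ser Arg Arg
Val Val Val Val Ala Ala Ala Ala Asp Asp Glu Glu Gly Gly Gly Gly
""".split()

CODONS = ["".join(triplet) for triplet in product(BASES, repeat=3)]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))

# Degeneracy: how many codons specify each amino acid (or Stop).
codons_per_residue = Counter(CODON_TABLE.values())

# Amino acids specified by exactly one codon.
single_codon = sorted(aa for aa, n in codons_per_residue.items() if n == 1)

# Codons that signal the end of a protein.
stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "Stop")

print(single_codon)               # ['Met', 'Trp']
print(stop_codons)                # ['UAA', 'UAG', 'UGA']
print(codons_per_residue["Leu"])  # 6
```

The 61 remaining codons are shared among 20 amino acids, which is why most amino acids have two or more codons.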