OpenStax Biology 2e, Genetics, Genes and Proteins, The Genetic Code

The Genetic Code

The cellular process of transcription generates messenger RNA (mRNA), a mobile molecular copy of one or more genes with an alphabet of A, C, G, and uracil (U). Translation of the mRNA template on ribosomes converts nucleotide-based genetic information into a protein product. This flow of information, from DNA to mRNA to protein, is the central dogma of DNA-protein synthesis. Protein sequences consist of 20 commonly occurring amino acids; therefore, it can be said that the protein alphabet consists of 20 “letters” (Figure). Different amino acids have different chemistries (such as acidic versus basic, or polar versus nonpolar) and different structural constraints. Variation in amino acid sequence is responsible for the enormous variation in protein structure and function.
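The idea of two alphabets can be made concrete with a short Python sketch. It is an illustration only: the names `RNA_ALPHABET`, `PROTEIN_ALPHABET`, and `uses_alphabet` are hypothetical, and the one-letter amino acid codes are the standard abbreviations, which this section has not introduced.

```python
# The four-letter mRNA alphabet named in the text.
RNA_ALPHABET = frozenset("ACGU")

# The 20 commonly occurring amino acids, written with the standard
# one-letter codes (an assumption: the text does not introduce these codes).
PROTEIN_ALPHABET = frozenset("ACDEFGHIKLMNPQRSTVWY")


def uses_alphabet(sequence: str, alphabet: frozenset) -> bool:
    """Return True if every character of `sequence` belongs to `alphabet`."""
    return all(letter in alphabet for letter in sequence.upper())


# Alphabet sizes: 4 nucleotide letters versus 20 amino acid letters.
print(len(RNA_ALPHABET), len(PROTEIN_ALPHABET))

# "AUGGCU" is spelled in the mRNA alphabet but not in the protein alphabet,
# because U is not one of the 20 amino acid letters.
print(uses_alphabet("AUGGCU", RNA_ALPHABET))
print(uses_alphabet("AUGGCU", PROTEIN_ALPHABET))
```

Some letters, such as A, C, and G, appear in both alphabets, so a check like this can only rule an alphabet out. It cannot prove which kind of molecule a sequence represents.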

Structures of the 20 amino acids are given. Six amino acids (glycine, alanine, valine, leucine, methionine, and isoleucine) are nonpolar and aliphatic, meaning their side chains do not contain an aromatic ring. Six amino acids (serine, threonine, cysteine, proline, asparagine, and glutamine) are polar but uncharged. Three amino acids (lysine, arginine, and histidine) are positively charged. Two amino acids, glutamate and aspartate, are negatively charged. Three amino acids (phenylalanine, tyrosine, and tryptophan) are nonpolar and aromatic.

Structures of the 20 amino acids found in proteins are shown. Each amino acid is composed of an amino group ($\mathrm{NH_3^+}$), a carboxyl group ($\mathrm{COO^-}$), and a side chain (blue). The side chain may be nonpolar, polar, or charged, as well as large or small. It is the variety of amino acid side chains that gives rise to the incredible variation of protein structure and function.
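The five side-chain groups in the figure can be written as a small lookup table. The Python sketch below is illustrative only: the names `SIDE_CHAIN_CLASSES`, `CLASS_OF`, and `side_chain_class` are hypothetical. It places glutamine in the polar, uncharged group and glutamate in the negatively charged group, so each of the 20 amino acids appears exactly once.

```python
# Side-chain groups as described for the figure (group labels are informal).
SIDE_CHAIN_CLASSES = {
    "nonpolar, aliphatic": [
        "glycine", "alanine", "valine", "leucine", "methionine", "isoleucine",
    ],
    "polar, uncharged": [
        "serine", "threonine", "cysteine", "proline", "asparagine", "glutamine",
    ],
    "positively charged": ["lysine", "arginine", "histidine"],
    "negatively charged": ["glutamate", "aspartate"],
    "nonpolar, aromatic": ["phenylalanine", "tyrosine", "tryptophan"],
}

# Reverse lookup: amino acid name -> group label.
CLASS_OF = {
    name: group
    for group, names in SIDE_CHAIN_CLASSES.items()
    for name in names
}


def side_chain_class(amino_acid: str) -> str:
    """Return the side-chain group for an amino acid name (case-insensitive)."""
    return CLASS_OF[amino_acid.lower()]


# Sanity check: the five groups together cover 20 distinct amino acids.
assert len(CLASS_OF) == 20

print(side_chain_class("Lysine"))
print(side_chain_class("glutamine"))
```

A plain dictionary is enough here because each amino acid belongs to exactly one group in this scheme. Other textbooks draw the group boundaries slightly differently.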