Amino Acids


Amino acids are the building blocks of proteins. The sequence of amino acids in individual proteins is encoded in the DNA of the cell. The physical and chemical properties of the 20 different naturally occurring amino acids dictate the shape of the protein and its interactions with its environment. Certain short sequences of amino acids in the protein also dictate where the protein resides in the cell. Proteins are composed of hundreds to thousands of amino acids. As you can imagine, protein folding is a complicated process and there are many potential shapes due to the large number of combinations of amino acids. By understanding the properties of the amino acids you will gain an appreciation for the limits of protein folding and will learn how to predict the potential higher-order structure of the protein.

All amino acids have the same backbone structure: an amino group (the α-amino, or alpha-amino, group), a carboxyl group, an α-hydrogen, and one of a variety of functional groups (R), all attached to the α-carbon.

The general structure of an α-amino acid. The acidic group is a carboxylic acid. The carbon that is attached to the carboxylic acid is the α-carbon. If the R group were a carbon atom, it would be the β-carbon.

If all of the amino acids have the same basic structure with an amino, a carboxyl and a hydrogen fixed to the alpha-carbon, then the large variation in the properties and structure of the amino acids must come from the fourth group attached to the alpha carbon. This group is referred to as the side chain of the amino acid or the R group.

The structures of the 20 common amino acids are shown on the chart below. The simplest amino acid, glycine, is shown in the upper left. The main-chain atoms of glycine are highlighted in yellow and its side chain (H) is highlighted in green. All amino acids have the same main-chain atoms, but differ in the side chains. For clarity, the α-proton is omitted in the remaining drawings.

Chart of the structures of the 20 common amino acids

The side-chain groups of these amino acids contain many common groups of atoms called functional groups. Many of these functional groups, such as the hydroxyl group (–OH), are polar, allowing them to interact with water.

Peptide Bonds

Proteins are polymers of amino acids. The amino acids are joined together by a condensation reaction. Each amino acid in the polymer is referred to as a “residue.” Individual amino acids are joined together by the attachment of the nitrogen of an amino group of one amino acid to the carbonyl carbon (C=O) of the carboxyl group of another amino acid, to create a covalent peptide bond and yield a molecule of water, as shown below.

structural representation of the dehydration reaction that occurs to form a peptide bond

Peptide bond formation occurs by a dehydration reaction. The amino group of the second amino acid attaches to the carbonyl carbon of the first, forming the peptide bond and releasing water. The resultant dipeptide has an amino terminus (left) and a carboxy terminus (right). The main-chain atoms, which are the same for each residue in the peptide, include the nitrogen and its proton, the α-carbon and its hydrogen, and the C=O group. The R groups form the side-chain atoms.

The resulting peptide chain is linear with defined ends. Short polymers (fewer than about 50 residues) are usually referred to as peptides, and longer polymers as polypeptides. Some large proteins are formed from several polypeptides together. Because each peptide bond joins the α-amino group of one amino acid to the carboxyl group of another, there will always be a free amino group on one end of the growing polymer (the N-terminus) and a free carboxyl group on the other end (the C-terminus).
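The bookkeeping of this condensation can be sketched in a few lines of Python. The function name and the returned fields are illustrative only, and the 50-residue cutoff is the rough convention mentioned above rather than a strict rule.

```python
def describe_chain(n_residues: int) -> dict:
    """Summarize a single linear chain built from n_residues amino acids."""
    if n_residues < 1:
        raise ValueError("a chain needs at least one residue")
    bonds = n_residues - 1  # one peptide bond joins each adjacent pair
    return {
        "peptide_bonds": bonds,
        "waters_released": bonds,    # each condensation releases one water
        "free_amino_ends": 1,        # the N-terminus
        "free_carboxyl_ends": 1,     # the C-terminus
        "name": "peptide" if n_residues < 50 else "polypeptide",
    }

print(describe_chain(2))    # a dipeptide: 1 peptide bond, 1 water released
print(describe_chain(110))  # a longer chain: 109 peptide bonds
```

However long the chain grows, it keeps exactly one free amino end and one free carboxyl end.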

Note that after an amino acid has been incorporated into a protein, the charges on its α-amino and carboxyl groups have disappeared (only the free N- and C-termini of the chain remain charged), so the main-chain atoms have become polar, uncharged functional groups. Since each residue in a protein has exactly the same main-chain atoms, the functional properties of a protein must arise from the different side-chain groups.

By convention, the sequences of peptides and proteins are written with the N-terminus on the left and the C-terminus on the right. The N-terminal residue is always named first, and the name of each subsequent amino acid follows in order. The primary sequence of a protein refers to its amino acid sequence.
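As a small illustration of this convention, the sketch below writes a peptide from the N-terminus to the C-terminus using the standard three-letter and one-letter amino acid codes. The tripeptide and the function name are hypothetical examples, not taken from the text.

```python
# Standard one-letter codes for the 20 common amino acids.
ONE_LETTER = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

def primary_sequence(residues):
    """Return the one-letter sequence, N-terminal residue first."""
    return "".join(ONE_LETTER[name] for name in residues)

# Hypothetical tripeptide, listed from the N-terminus to the C-terminus.
tripeptide = ["Gly", "Ala", "Ser"]
print("-".join(tripeptide))          # Gly-Ala-Ser
print(primary_sequence(tripeptide))  # GAS
```

Because the order is read from the N-terminus, Gly-Ala-Ser and Ser-Ala-Gly are different peptides even though they contain the same residues.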

Nucleic Acids

Primarily located in the cell nucleus (hence the name), nucleic acids are replicating macromolecules. The most important are DNA and RNA. Without them, cells could not replicate, making life impossible. These molecules store the cell’s “software”: the instructions that govern its function, processes and structure. The code is composed of sequences of four bases: adenine, cytosine, guanine and thymine (uracil replaces thymine in RNA). These are read in sets of three called triplets. Each triplet specifies an amino acid, which in turn is a component of a protein macromolecule. All the intricate complexity of the human body arises from information encoded by just four bases in long DNA macromolecules.
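A quick count shows why sets of three are enough: four bases taken three at a time give 4 × 4 × 4 = 64 possible triplets, more than the 20 amino acids that need to be specified. The sketch below enumerates them and cuts a short sequence into triplets; the example sequence and function name are illustrative.

```python
from itertools import product

BASES = "ACGT"  # adenine, cytosine, guanine, thymine

# Every ordered triplet of the four DNA bases.
triplets = ["".join(p) for p in product(BASES, repeat=3)]
print(len(triplets))  # 64

def split_into_triplets(sequence):
    """Cut a sequence into consecutive, non-overlapping triplets."""
    usable = len(sequence) - len(sequence) % 3  # ignore a trailing partial triplet
    return [sequence[i:i + 3] for i in range(0, usable, 3)]

print(split_into_triplets("ATGGCCCTG"))  # ['ATG', 'GCC', 'CTG']
```

Since there are more triplets than amino acids, several different triplets can specify the same amino acid.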

In humans, mistakes in the sequences of DNA and RNA cause diseases, including sickle cell anemia, hemophilia, Huntington’s chorea and some types of cancer. Even a small error can have a dramatic effect. Sickle cell disease is caused when just one base in the DNA sequence is changed, which in turn changes a single amino acid in the hemoglobin protein. By directing chemical processes, nucleic acids instruct cells how to differentiate into the cell types that make up various organs. During development, whole sets of DNA sequences are shut down or activated to drive specific processes. These processes lead to different kinds of cells that form organs such as the heart, liver, skin and brain.

Within the cell, DNA is in turn organized into higher-level structures called chromosomes. You can see chromosomes with a light microscope, using an appropriate stain. Early study of chromosomes helped scientists discover and understand the role of nucleic acids in cellular reproduction. Errors in chromosome structure or number lead to malfunctions of life processes. For example, in humans, an extra copy of chromosome 21 results in Down syndrome.

The Backbone

Structure of RNA and DNA

Our genetic code is determined by only four bases in DNA (G, C, A, T), which are repeated and arranged in a special order. For example,

1 agccctccag gacaggctgc atcagaagag gccatcaagc agatcactgt ccttctgcca

61 tggccctgtg gatgcgcctc ctgcccctgc tggcgctgct ggccctctgg ggacctgacc

121 cagccgcagc ctttgtgaac caacacctgt gcggctcaca cctggtggaa gctctctacc

181 tagtgtgcgg ggaacgaggc ttcttctaca cacccaagac ccgccgggag gcagaggacc

241 tgcaggtggg gcaggtggag ctgggcgggg gccctggtgc aggcagcctg cagcccttgg

301 ccctggaggg gtccctgcag aagcgtggca ttgtggaaca atgctgtacc agcatctgct

361 ccctctacca gctggagaac tactgcaact agacgcagcc cgcaggcagc cccacacccg

421 ccgcctcctg caccgagaga gatggaataa agcccttgaa ccagcaaaa

This may seem like a random string of G, C, A and T, but this DNA codes for human insulin. DNA is organized as a linear polymer in a double helix and maintains the inherited order of bases, or genetic code. The “steps” of the DNA ladder carry the code that ultimately directs the synthesis of our proteins. The linear order of this code is preserved when double-stranded DNA is transcribed to single-stranded RNA.
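A listing like the one above is easier to work with once the position numbers and spaces are removed. The sketch below does this for the first two lines of the listing only; the function name is illustrative, and treating the first "atg" as the start of the protein-coding region is an assumption of this example rather than something stated in the text.

```python
from collections import Counter

# First two lines of the listing above, copied as-is.
listing = """
1 agccctccag gacaggctgc atcagaagag gccatcaagc agatcactgt ccttctgcca
61 tggccctgtg gatgcgcctc ctgcccctgc tggcgctgct ggccctctgg ggacctgacc
"""

def parse_listing(text):
    """Drop position numbers and whitespace, keeping only the bases."""
    return "".join(ch for ch in text.lower() if ch in "acgt")

seq = parse_listing(listing)
print(len(seq))              # 120 bases in these two lines
print(Counter(seq))          # how often each of the four bases occurs
print(seq.find("atg") + 1)   # 60: position (counting from 1) of the first "atg"
```

The number at the start of each line of the listing is simply the position of the first base on that line, which is why the second line begins at 61.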

Structure of a nucleotide

The fundamental unit of DNA is the nucleotide. The nucleotide contains a phosphate group (shown in orange), which gives the DNA polymer its negative charge and links the nucleotides along the backbone. The sugar is a furanose, a sugar with a five-membered ring (shown in purple). The nitrogenous base (shown in yellow) determines the type of nucleotide formed.

The numbering of the positions on the furanose rings of DNA and RNA follows a convention that uses ′ (the prime symbol) to denote the sugar positions. Thus, ribose has a nitrogenous base connected to the 1′ position and hydroxyl groups (OH) on the 2′, 3′ and 5′ positions. Using this nomenclature, deoxyribose is formally called 2′-deoxyribose (2 prime deoxyribose) to denote the loss of the hydroxyl at the 2′ position of ribose.

The major difference in the polymer backbones between DNA and RNA is the sugar used in the formation of the polymer. In DNA (DeoxyriboNucleic Acid) the 2′ position of the furanose has a hydrogen. In RNA (RiboNucleic Acid), the 2′ position of the furanose has an OH (hydroxyl) and the sugar is the monosaccharide ribose in its furanose (ring) form.

structural representations of deoxyribose and ribose, highlighting the difference between the two.

Furanose sugars

The linkage of individual nucleotides is made by a bridging phosphate group between two hydroxyl groups, one on each furanose ring. The resulting polymer is a string of furanose rings linked by phosphodiester bonds in one very long macromolecule.

Backbone of DNA

The following is a list of structural characteristics of the DNA/RNA polymer backbone.

  • Phosphate-ribose(deoxyribose)-phosphate-ribose(deoxyribose) sequence
  • Linked by phosphodiester covalent bonds
  • 3′ position on one ribose(deoxyribose) linked to 5′ position of adjacent ribose(deoxyribose) through phosphodiester bridge
  • Chain has 3′ end and 5′ end

Hydrogen Bonding Between Bases

The DNA double helix is held together by hydrogen bonds between purines on one strand and pyrimidines on the other.

Example purines and pyrimidines.

Recall that hydrogen bonds are weak interactions, unlike the covalent bonds of the phosphate-furanose backbone. Thus, DNA is held together, but can be pulled apart for transcription to RNA or for DNA replication.

To maintain equal distance between the two strands of DNA, the larger purines must bind with the smaller pyrimidines. Specifically, A always binds with T and G always binds with C in DNA. A useful memory device is that A and T are angular letters and G and C are both curvy.
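The pairing rule can be written as a small lookup table. The sketch below builds the partner strand for a short, made-up sequence and checks that every pair joins a purine to a pyrimidine. The function names are illustrative; the reverse step reflects the fact that the two strands of the helix run in opposite directions, so the partner strand is usually written reversed.

```python
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}
PURINES = {"A", "G"}       # the larger, two-ring bases
PYRIMIDINES = {"C", "T"}   # the smaller, one-ring bases

def complement(strand):
    """Base on the opposite strand at each position, in the same left-to-right order."""
    return "".join(PAIR[base] for base in strand.upper())

def reverse_complement(strand):
    """The opposite strand written in its own 5' -> 3' direction."""
    return complement(strand)[::-1]

# Every allowed pair joins one purine with one pyrimidine.
assert all((a in PURINES) != (b in PURINES) for a, b in PAIR.items())

print(complement("ATGGCC"))          # TACCGG
print(reverse_complement("ATGGCC"))  # GGCCAT
```

Applying the pairing rule twice returns the original strand, which is the basis for copying DNA faithfully.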


DNA Transcription

DNA replication: Every time a cell divides, all of the DNA of the genome is duplicated (called replication) so that each cell after the division (called a daughter cell) has the same DNA as the original cell (called the mother cell).

process of DNA replication. DNA arrow to DNA

DNA transcription: For the genetic code to become a protein, it goes through a transcription step. DNA is transcribed into RNA (a single-strand nucleic acid). The RNA is then shuttled away from the DNA to the region of protein synthesis.

process of transcription. DNA arrow to RNA.
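Transcription follows the same base-pairing logic as the double helix, except that RNA uses uracil (U) where DNA uses thymine (T). The sketch below is a simplified model, assuming a short hypothetical DNA template strand written 5′ to 3′; the names are illustrative and the many enzymes and signals involved in real transcription are ignored.

```python
# Pairing between a DNA template base and the RNA base placed opposite it.
RNA_PAIR = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template_strand):
    """Build the RNA (5' -> 3') that pairs with a DNA template strand given 5' -> 3'."""
    # The RNA runs in the opposite direction to its template, so read it in reverse.
    return "".join(RNA_PAIR[base] for base in reversed(template_strand.upper()))

template = "GGCCAT"          # hypothetical template strand
print(transcribe(template))  # AUGGCC

# Shortcut: the RNA matches the other (non-template) DNA strand with T replaced by U.
print("ATGGCC".replace("T", "U"))  # AUGGCC
```

Both routes give the same RNA, which is why sequences are often reported as the non-template strand.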

RNA translation: RNA is translated from a nucleic acid code into the amino acid sequence of a protein.

process of translation. RNA arrow to Protein
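Translation can be sketched as a table lookup, one triplet at a time. The example below uses the standard genetic code, written compactly as a 64-letter string in the conventional U, C, A, G order with "*" marking stop signals. The RNA fragment is the stretch of the insulin listing earlier on this page that begins at its first "atg", with T written as U; the function name and the choice to stop at the first stop signal are assumptions of this sketch.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code in UCAG order; one-letter amino acid codes, "*" = stop.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(rna):
    """Read an RNA sequence three bases at a time until a stop signal or the end."""
    rna = rna.upper()
    protein = []
    for i in range(0, len(rna) - 2, 3):   # a trailing partial triplet is ignored
        amino_acid = CODON_TABLE[rna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

fragment = "AUGGCCCUGUGGAUGCGCCUC"
print(translate(fragment))  # MALWMRL
```

The result is written N-terminus first, so this fragment begins Met-Ala-Leu-Trp-Met-Arg-Leu.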

Thus, the genetic code in DNA can be duplicated to maintain consistency throughout a person’s body and throughout that person’s life. DNA is also used to make proteins, with RNA as an intermediate.


Lipids

Lipids include fats and waxes. Several vitamins, such as A, D, E and K, are lipid soluble. Perhaps the most important role of lipids is in forming the membranes of cells and organelles. In this way, lipids enable isolation and control of chemical processes. They also play a role in energy storage and cell signaling.

Lipid molecules forming cell membranes are composed of a hydrophilic “head” and a hydrophobic “tail” (remember, “hydro” means water, “philos” means love and “phobos” means fear). A phospholipid bilayer is formed when two layers of phospholipid molecules organize with the hydrophobic tails meeting in the middle. Scientists believe that the formation of cell-like globules of lipids was a vital precursor to the origin of cellular life, since membranes physically separate intracellular components from the extracellular environment. Thus, lipid membranes enclose other macromolecules, confine volumes to increase the possibility of reaction, and protect chemical processes. Proteins with hydrophobic regions float within the lipid bilayer. These proteins govern the transport of charged or lipophobic molecules, such as energy molecules and waste products, in and out of the cell. Some membrane lipids also have attached carbohydrate molecules jutting out of the membrane, which are important for cell recognition, as mentioned previously.

Lipids are also vital energy storage molecules. Carbohydrates can be used right away, whereas lipids provide long-term energy storage. Lipids accumulate in adipose cells (fat cells) in the body. As part of a metabolic process dating from the days when humans had to forage for food, excess carbohydrates can be converted into lipids, which are then stored in fatty tissue. Ultimately, ingesting too many carbohydrates and lipids can lead to obesity.