Storing Genetic Information

What you’ll learn to do: Explain how DNA stores genetic information

The unique structure of DNA is key to its ability to store and replicate genetic information:

Illustration of a segment of DNA. The molecule is composed of two strands that wind around each other to form a double helix. The two strands are connected by "ladder rungs" made of the paired bases adenine, thymine, cytosine, and guanine.

Figure 1

In this outcome, you will learn to describe the double helix structure of DNA: a twisted ladder whose sides are the sugar-phosphate backbone and whose “rungs” are pairs of nitrogenous bases.

Learning Outcomes

  • Diagram the structure of DNA
  • Relate the structure of DNA to the storage of genetic information

Structure of DNA

The building blocks of DNA are nucleotides. The important components of each nucleotide are a nitrogenous base, deoxyribose (5-carbon sugar), and a phosphate group (see Figure 2). Each nucleotide is named depending on its nitrogenous base. The nitrogenous base can be a purine, such as adenine (A) and guanine (G), or a pyrimidine, such as cytosine (C) and thymine (T). Uracil (U) is also a pyrimidine (as seen in Figure 2), but it only occurs in RNA, which we will talk more about later.

Illustration depicts the structure of a nucleoside, which is made up of a pentose with a nitrogenous base attached at the 1' position. There are two kinds of nitrogenous bases: pyrimidines, which have one six-membered ring, and purines, which have a six-membered ring fused to a five-membered ring. Cytosine, thymine, and uracil are pyrimidines, and adenine and guanine are purines. A nucleoside with a phosphate attached at the 5' position is called a mononucleotide. A nucleoside with two or three phosphates attached is called a nucleotide diphosphate or nucleotide triphosphate, respectively.

Figure 2. Each nucleotide is made up of a sugar, a phosphate group, and a nitrogenous base. The sugar is deoxyribose in DNA and ribose in RNA.

The nucleotides are joined to each other by covalent bonds known as phosphodiester bonds or linkages. The phosphate group is attached to the hydroxyl group of the 5′ carbon of the sugar of one nucleotide and to the hydroxyl group of the 3′ carbon of the sugar of the next nucleotide, thereby forming a 5′-3′ phosphodiester bond.

In the 1950s, Francis Crick and James Watson worked together to determine the structure of DNA at the University of Cambridge, England. Other scientists, such as Linus Pauling and Maurice Wilkins, were also actively exploring this field. Pauling had discovered the secondary structure of proteins using X-ray crystallography. In Wilkins’ lab, researcher Rosalind Franklin was using X-ray diffraction methods to understand the structure of DNA. Watson and Crick were able to piece together the puzzle of the DNA molecule on the basis of Franklin’s data because Crick had also studied X-ray diffraction (Figure 3). In 1962, James Watson, Francis Crick, and Maurice Wilkins were awarded the Nobel Prize in Physiology or Medicine. Unfortunately, by then Franklin had died, and Nobel prizes are not awarded posthumously.

The photo in part A shows James Watson, Francis Crick, and Maclyn McCarty. The X-ray diffraction pattern in part B is symmetrical, with dots in an X shape.

Figure 3. The work of pioneering scientists (a) James Watson, Francis Crick, and Maclyn McCarty led to our present day understanding of DNA. Scientist Rosalind Franklin discovered (b) the X-ray diffraction pattern of DNA, which helped to elucidate its double helix structure. (credit a: modification of work by Marjorie McCarty, Public Library of Science)

Watson and Crick proposed that DNA is made up of two strands that are twisted around each other to form a right-handed helix. Base pairing takes place between a purine and pyrimidine; namely, A pairs with T and G pairs with C. Adenine and thymine are complementary base pairs, and cytosine and guanine are also complementary base pairs. The base pairs are stabilized by hydrogen bonds; adenine and thymine form two hydrogen bonds and cytosine and guanine form three hydrogen bonds. The two strands are anti-parallel in nature; that is, the 3′ end of one strand faces the 5′ end of the other strand. The sugar and phosphate of the nucleotides form the backbone of the structure, whereas the nitrogenous bases are stacked inside. Each base pair is separated from the other base pair by a distance of 0.34 nm, and each turn of the helix measures 3.4 nm. Therefore, ten base pairs are present per turn of the helix. The diameter of the DNA double helix is 2 nm, and it is uniform throughout. Only the pairing between a purine and pyrimidine can explain the uniform diameter. The twisting of the two strands around each other results in the formation of uniformly spaced major and minor grooves (Figure 4).
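The numbers in this paragraph can be checked with a little arithmetic. The following Python sketch is only an illustration: the function name `helix_stats` and the ten-base example sequence are made up for this purpose, and the length calculation is an approximation that simply multiplies the number of base pairs by the 0.34 nm spacing.

```python
# Geometry figures quoted in the text for the DNA double helix.
RISE_PER_BASE_PAIR_NM = 0.34   # distance between adjacent base pairs
PITCH_PER_TURN_NM = 3.4        # length of one full turn of the helix

# Hydrogen bonds formed by each base with its partner, as stated in the text.
HYDROGEN_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def helix_stats(strand):
    """Return (approximate length in nm, number of turns, hydrogen bonds).

    `strand` is one strand of a double helix; every base is assumed to be
    paired with its complement on the other strand.
    """
    length_nm = len(strand) * RISE_PER_BASE_PAIR_NM
    turns = length_nm / PITCH_PER_TURN_NM
    h_bonds = sum(HYDROGEN_BONDS[base] for base in strand)
    return length_nm, turns, h_bonds


# 3.4 nm per turn divided by 0.34 nm per base pair gives ten base pairs per turn.
base_pairs_per_turn = PITCH_PER_TURN_NM / RISE_PER_BASE_PAIR_NM
print(round(base_pairs_per_turn))  # 10

# Hypothetical ten-base sequence: about one full turn of the helix.
length_nm, turns, h_bonds = helix_stats("ATGCATGCAT")
print(round(length_nm, 2), round(turns, 2), h_bonds)
```

Because G-C pairs contribute three hydrogen bonds and A-T pairs only two, two stretches of DNA with the same length can differ in their total number of hydrogen bonds.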

Part A shows an illustration of a DNA double helix, which has a sugar-phosphate backbone on the outside and nitrogenous base pairs on the inside. Part B shows base pairing between thymine and adenine, which form two hydrogen bonds, and between guanine and cytosine, which form three hydrogen bonds. Part C shows a molecular model of the DNA double helix. The outside of the helix alternates between wide gaps, called major grooves, and narrow gaps, called minor grooves.

Figure 4. DNA has (a) a double helix structure and (b) phosphodiester bonds. The (c) major and minor grooves are binding sites for DNA binding proteins during processes such as transcription (the copying of RNA from DNA) and replication.

Genetic Information

The genetic information of an organism is stored in DNA molecules. How can one kind of molecule contain all the instructions for making complicated living beings like ourselves? What component or feature of DNA can contain this information? It has to come from the nitrogenous bases, because, as you already know, the backbone of all DNA molecules is the same. But there are only four bases found in DNA: G, A, C, and T. The sequence of these four bases can provide all the instructions needed to build any living organism. It might be hard to imagine that four different “letters” can communicate so much information. But think about the English language, which can represent a huge amount of information using just 26 letters. Even more striking is the binary code used to write computer programs. This code contains only ones and zeros, and think of all the things your computer can do. The DNA alphabet can encode very complex instructions using just four letters, though the messages end up being really long. For example, the E. coli bacterium carries its genetic instructions in a DNA molecule that contains more than five million nucleotides. The human genome (a genome is all the DNA of an organism) consists of around three billion base pairs divided among 23 chromosomes, which most human cells carry as 23 pairs.
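To get a feel for how much a four-letter alphabet can express, it helps to count the possible sequences. The short Python sketch below is an illustration only; the function names are invented for this example. It shows that each additional base multiplies the number of possible sequences by four, and that one base carries as much information as two binary digits.

```python
import math

BASES = "GACT"  # the four DNA bases named in the text


def possible_sequences(length):
    """Number of distinct DNA sequences with the given number of bases."""
    return len(BASES) ** length


def bits_of_information(length):
    """Maximum information, in binary digits (bits), held by such a sequence."""
    return length * math.log2(len(BASES))


print(possible_sequences(3))    # 64
print(possible_sequences(10))   # 1048576
print(bits_of_information(10))  # 20.0
```

A sequence of only ten bases already has more than a million possible forms, which is why sequences millions or billions of bases long can hold the instructions for a whole organism.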

The information stored in the order of bases is organized into genes: each gene contains information for making a functional product. The genetic information is first copied to another nucleic acid polymer, RNA (ribonucleic acid), preserving the order of the nucleotide bases. Genes that contain instructions for making proteins are transcribed into messenger RNA (mRNA). Some specialized genes contain instructions for making functional RNA molecules that are not used to make proteins. These RNA molecules function by affecting cellular processes directly; for example, some of these RNA molecules regulate the expression of mRNA. Other genes produce the RNA molecules that are required for protein synthesis: transfer RNA (tRNA) and ribosomal RNA (rRNA).

In order for DNA to function effectively at storing information, two key processes are required. First, information stored in the DNA molecule must be copied, with minimal errors, every time a cell divides. This ensures that both daughter cells inherit the complete set of genetic information from the parent cell. Second, the information stored in the DNA molecule must be read, or expressed. In order for the stored information to be useful, cells must be able to access the instructions for making specific proteins, so the correct proteins are made in the right place at the right time.

Structure of DNA double helix. Sugar-phosphate backbone is shown in yellow, specific base pairings via hydrogen bonds (red lines) are colored in green and purple (A-T pair) and red and blue (C-G).

Figure 5. DNA’s double helix. Graphic modified from “DNA chemical structure,” by Madeleine Price Ball, CC-BY-SA-2.0

Both copying and reading the information stored in DNA rely on base pairing between two nucleic acid polymer strands. Recall that DNA structure is a double helix (see Figure 5).

The sugar deoxyribose with the phosphate group forms the scaffold or backbone of the molecule (highlighted in yellow in Figure 5). Bases point inward. Complementary bases form hydrogen bonds with each other within the double helix. See how the bigger bases (purines) pair with the smaller ones (pyrimidines). This keeps the width of the double helix constant. More specifically, A pairs with T and C pairs with G. As we discuss the function of DNA in subsequent sections, keep in mind that there is a chemical reason for specific pairing of bases.
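Because each base has exactly one partner, the sequence of one strand fully determines the sequence of the other. The Python sketch below illustrates this; the function name `complementary_strand` and the example sequence are hypothetical, chosen only for demonstration.

```python
# Pairing rules from the text: A pairs with T, and C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complementary_strand(strand):
    """Return the partner strand, aligned base by base with the input.

    If the input is written 5'->3', the result reads 3'->5', because the
    two strands of the double helix are anti-parallel.
    """
    return "".join(PAIRS[base] for base in strand)


template = "GATTACA"  # hypothetical sequence, written 5'->3'
partner = complementary_strand(template)
print(partner)                        # CTAATGT (reads 3'->5')
print(complementary_strand(partner))  # GATTACA
```

Taking the complement of the complement gives back the original sequence. This is why either strand of the double helix can serve as a template for rebuilding the other.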

To illustrate the connection between information in DNA and an observable characteristic of an organism, let’s consider a gene that provides the instructions for building the hormone insulin. Insulin is responsible for regulating blood sugar levels. The insulin gene contains instructions for assembling the protein insulin from individual amino acids. Changing the sequence of nucleotides in the DNA molecule can change the amino acids in the final protein, leading to protein malfunction. If insulin does not function correctly, it might be unable to bind to another protein (insulin receptor). On the organismal level of organization, this molecular event (change of DNA sequence) can lead to a disease state—in this case, diabetes.
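The effect of a single changed nucleotide can be sketched in a few lines of Python. This is an illustration under stated assumptions: the five table entries are taken from the standard genetic code, which reads DNA three bases at a time, but the twelve-base sequence is hypothetical and is not the real insulin gene. The function name `translate` is also made up for this example.

```python
# A small subset of the standard genetic code (coding-strand DNA triplets).
CODON_TABLE = {
    "ATG": "Met",
    "TTT": "Phe",
    "GTG": "Val",
    "GAG": "Glu",
    "AAC": "Asn",
}


def translate(coding_sequence):
    """Read the sequence three bases at a time and look up each amino acid."""
    triplets = [coding_sequence[i:i + 3]
                for i in range(0, len(coding_sequence), 3)]
    return [CODON_TABLE[triplet] for triplet in triplets]


original = "ATGTTTGTGAAC"  # hypothetical sequence, not the real insulin gene
mutated = "ATGTTTGAGAAC"   # same sequence with base 8 changed from T to A

print(translate(original))  # ['Met', 'Phe', 'Val', 'Asn']
print(translate(mutated))   # ['Met', 'Phe', 'Glu', 'Asn']
```

Changing one base out of twelve swaps one amino acid for another. In a real protein, a swap like this can alter the protein's shape and its ability to bind its partners.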

Practice Questions

The order of nucleotides in a gene (in DNA) is the key to how information is stored. For example, consider these two words: stable and tables. Both words are built from the same letters (subunits), but the different order of these subunits results in very different meanings. In DNA, the information is stored in units of 3 letters. Use the following key to decode the encrypted message. This should help you to see how information can be stored in the linear order of nucleotides in DNA.

ABC = a DEF = d GHI = e JKL = f
MNO = h PQR = i STU = m VWX = n
YZA = o BCD = r EFG = s HIJ = t
KLM = w NOP = j QRS = p TUV = y

Encrypted Message: HIJMNOPQREFG – PQREFG – MNOYZAKLM – DEFVWXABC – EFGHIJYZABCDGHIEFG – PQRVWXJKLYZABCDSTUABCHIJPQRYZAVWX
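If you would like to check your decoding, the same procedure can be written as a short Python sketch. The function name `decode` is invented for this example, and the demonstration uses a different, made-up message so that the exercise is not given away.

```python
# The decoding key from the exercise: each three-letter unit stands for one letter.
KEY = {
    "ABC": "a", "DEF": "d", "GHI": "e", "JKL": "f",
    "MNO": "h", "PQR": "i", "STU": "m", "VWX": "n",
    "YZA": "o", "BCD": "r", "EFG": "s", "HIJ": "t",
    "KLM": "w", "NOP": "j", "QRS": "p", "TUV": "y",
}


def decode(message):
    """Decode a message by reading each word three letters at a time."""
    words = []
    for token in message.split():
        if not token.isalpha():
            continue  # skip the dashes that separate the encoded words
        units = [token[i:i + 3] for i in range(0, len(token), 3)]
        words.append("".join(KEY[unit] for unit in units))
    return " ".join(words)


print(decode("DEFABCHIJABC"))  # data
```

Notice that the decoder only works if it starts reading at the right letter and always moves in steps of three. Shifting the starting point by one letter would produce units that are not in the key at all.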

Where in the DNA is information stored?

  1. The shape of the DNA
  2. The sugar-phosphate backbone
  3. The sequence of bases
  4. The presence of two strands

Which statement is correct?

  1. The sequence of DNA bases is arranged into chromosomes, most of which contain the instructions to build an amino acid.
  2. The sequence of DNA strands is arranged into chromosomes, most of which contain the instructions to build a protein.
  3. The sequence of DNA bases is arranged into genes, most of which contain the instructions to build a protein.
  4. The sequence of DNA phosphates is arranged into genes, most of which contain the instructions to build a cell.

Check Your Understanding

For the 5′-3′ DNA strand with the sequence ATGGAATT, choose the complementary sequence.

  • 3′-CGAACCGG-5′
  • 3′-TACCTTAA-5′
  • 3′-TTAAGGTA-5′

Answer: 3′-TACCTTAA-5′