Genes to proteins: Central Dogma

Genes specify functional products (such as proteins)

A DNA molecule is divided up into functional units called genes. Each gene provides instructions for a functional product, that is, a molecule needed to perform a job in the cell. In many cases, the functional product of a gene is a protein. For example, Mendel’s flower color gene provides instructions for a protein that helps make colored molecules (pigments) in flower petals.
Diagram of how a gene can dictate a phenotype (observable feature) of an organism. The flower color gene that Mendel studied consists of a stretch of DNA found on a chromosome. The DNA has a particular sequence; part of it, shown in this diagram, is 5'-GTAAATCG-3' (upper strand), paired with the complementary sequence 3'-CATTTAGC-5' (lower strand). The DNA of the gene specifies production of a protein that helps make pigments. When the protein is present and functional, pigments are produced, and the flowers of a plant have a purple color.
The functional products of most known genes are proteins, or, more accurately, polypeptides. Polypeptide is just another word for a chain of amino acids. Although many proteins consist of a single polypeptide, some are made up of multiple polypeptides. Genes that specify polypeptides are called protein-coding genes. Not all genes specify polypeptides. Instead, some provide instructions to build functional RNA molecules, such as the transfer RNAs and ribosomal RNAs that play roles in translation. 
How does the DNA sequence of a gene specify a particular protein?
Many genes provide instructions for building polypeptides. How, exactly, does DNA direct the construction of a polypeptide? This process involves two major steps: transcription and translation.
  • In transcription, the DNA sequence of a gene is copied to make an RNA molecule. This step is called transcription because it involves rewriting, or transcribing, the DNA sequence in a similar RNA “alphabet.” In eukaryotes, the RNA molecule must undergo processing to become a mature messenger RNA (mRNA).
  • In translation, the sequence of the mRNA is decoded to specify the amino acid sequence of a polypeptide. The name translation reflects that the nucleotide sequence of the mRNA must be translated into the completely different “language” of amino acids.

Simplified schematic of the central dogma, showing the sequences of the molecules involved. The two strands of DNA have the following sequences:
5'-ATGATCTCGTAA-3'
3'-TACTAGAGCATT-5'
Transcription of one of the strands of DNA produces an mRNA that nearly matches the other strand of DNA in sequence. However, due to a biochemical difference between DNA and RNA, the Ts of DNA are replaced with Us in the mRNA. The mRNA sequence is:
5'-AUGAUCUCGUAA-3'
Translation involves reading the mRNA nucleotides in groups of three; each group specifies an amino acid (or provides a stop signal indicating that translation is finished).
5'-AUG AUC UCG UAA-3'
AUG → Methionine
AUC → Isoleucine
UCG → Serine
UAA → "Stop"
Polypeptide sequence: (N-terminus) Methionine-Isoleucine-Serine (C-terminus)
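The two steps in the schematic can be sketched in a few lines of Python. This is only an illustration: the function names `transcribe` and `translate` are made up for this example, and the codon table holds just the four codons that appear in the schematic, not the full genetic code.

```python
# Partial codon table: only the four codons used in the schematic (assumption:
# a real table would cover all 64 codons).
CODON_TABLE = {
    "AUG": "Methionine",
    "AUC": "Isoleucine",
    "UCG": "Serine",
    "UAA": "Stop",
}

def transcribe(coding_strand):
    """Return the mRNA (5'->3') that matches a coding strand written 5'->3'."""
    return coding_strand.replace("T", "U")

def translate(mrna):
    """Read the mRNA three bases at a time until a stop codon is reached."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        residue = CODON_TABLE[mrna[i:i + 3]]
        if residue == "Stop":
            break
        peptide.append(residue)
    return peptide

mrna = transcribe("ATGATCTCGTAA")
print(mrna)                        # AUGAUCUCGUAA
print("-".join(translate(mrna)))   # Methionine-Isoleucine-Serine
```

Note that `transcribe` takes a shortcut: it copies the coding strand and swaps T for U, whereas the cell builds the mRNA by pairing bases against the other (template) strand.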

Transcription

In transcription, one strand of the DNA that makes up a gene, called the template strand (also known as the non-coding strand), acts as a template for the synthesis of a matching (complementary) RNA strand by an enzyme called RNA polymerase. This RNA strand is the primary transcript.

The two strands of DNA have the following sequences:
5'-ATGATCTCGTAA-3'
3'-TACTAGAGCATT-5'
The DNA opens up to form a bubble, and the lower strand serves as a template for the synthesis of a complementary RNA strand. This strand is called the template strand. Transcription of the template strand produces an mRNA that nearly matches the other strand (coding strand) of DNA in sequence. However, due to a biochemical difference between DNA and RNA, the Ts of DNA are replaced with Us in the mRNA. The mRNA sequence is:
5'-AUGAUCUCGUAA-3'
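As a rough check of the relationship described above, the sketch below pairs each base of the template strand with its RNA partner and compares the result with the coding strand. The name `transcribe_template` is hypothetical, and the template is assumed to be written 3' to 5', as in the figure.

```python
# RNA base that pairs with each DNA template base.
COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe_template(template_3to5):
    """Build the RNA (5'->3') complementary to a template written 3'->5'."""
    return "".join(COMPLEMENT[base] for base in template_3to5)

coding = "ATGATCTCGTAA"      # upper strand, written 5'->3'
template = "TACTAGAGCATT"    # lower strand, written 3'->5'

mrna = transcribe_template(template)
print(mrna)                               # AUGAUCUCGUAA
print(mrna == coding.replace("T", "U"))   # True
```

The second line of output confirms that the transcript matches the coding strand apart from U replacing T.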

RNA polymerase

The main enzyme involved in transcription is RNA polymerase, which uses a single-stranded DNA template to synthesize a complementary strand of RNA. Specifically, RNA polymerase builds an RNA strand in the 5' to 3' direction, adding each new nucleotide to the 3' end of the strand.

RNA polymerase synthesizes an RNA strand complementary to a template DNA strand. It synthesizes the RNA strand in the 5' to 3' direction, while reading the template DNA strand in the 3' to 5' direction. The template DNA strand and RNA strand are antiparallel. RNA transcript: 5'-UGGUAGU...-3' (dots indicate where nucleotides are still being added at 3' end) DNA template: 3'-ACCATCAGTC-5'
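The directionality can be made concrete with a small sketch that grows the transcript one nucleotide at a time. The generator name `elongate_transcript` is invented for this example; the template is the one from the figure, written 3' to 5'.

```python
# RNA base added opposite each DNA template base.
PAIRING = {"A": "U", "T": "A", "G": "C", "C": "G"}

def elongate_transcript(template_3to5):
    """Yield the growing RNA (written 5'->3') after each nucleotide is added."""
    rna = ""
    for base in template_3to5:     # the template is read 3' -> 5'
        rna += PAIRING[base]       # each new nucleotide joins the RNA's 3' end
        yield rna

steps = list(elongate_transcript("ACCATCAGTC"))
print(steps[6])    # UGGUAGU    (the partial transcript shown in the figure)
print(steps[-1])   # UGGUAGUCAG (the transcript once the template is fully read)
```

Because the two strands are antiparallel, reading the template left to right in its 3' to 5' orientation produces the RNA left to right in its 5' to 3' orientation.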

Stages of transcription

Transcription of a gene takes place in three stages: initiation, elongation, and termination. Here, we will briefly see how these steps happen.
  1. Initiation. RNA polymerase binds to a sequence of DNA called the promoter, found near the beginning of a gene. Each gene (or, in bacteria, each group of genes transcribed together) has its own promoter. Once bound, RNA polymerase separates the DNA strands, providing the single-stranded template needed for transcription.

    The promoter region comes before (and slightly overlaps with) the transcribed region whose transcription it specifies. It contains recognition sites for RNA polymerase or its helper proteins to bind to. The DNA opens up in the promoter region so that RNA polymerase can begin transcription.

  2. Elongation. One strand of DNA, the template strand, acts as a template for RNA polymerase. As it “reads” this template one base at a time, the polymerase builds an RNA molecule out of complementary nucleotides, making a chain. The RNA transcript carries the same information as the non-template (coding) strand of DNA, but it contains the base uracil (U) instead of thymine (T).

  3. Termination. Sequences called terminators signal that the RNA transcript is complete. Once they are transcribed, they cause the transcript to be released from the RNA polymerase. An example of a termination mechanism involving formation of a hairpin in the RNA is shown below.

    The terminator DNA encodes a region of RNA that forms a hairpin structure followed by a string of U nucleotides. The hairpin structure in the transcript causes the RNA polymerase to stall. The U nucleotides that come after the hairpin form weak bonds with the A nucleotides of the DNA template, allowing the transcript to separate from the template and ending transcription.
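To illustrate the hairpin-plus-U-run pattern described above, here is a toy scan for it in an RNA string. Everything specific in it is an assumption made for the example: the function name `find_hairpin_terminator`, the fixed stem and loop lengths, the minimum U-run, and the test sequence, which is invented rather than taken from a real gene. Real terminators vary in size and tolerate imperfect pairing.

```python
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(rna):
    """Return the sequence that would base-pair with `rna`, written 5'->3'."""
    return "".join(RNA_PAIR[base] for base in reversed(rna))

def find_hairpin_terminator(rna, stem=6, loop=4, u_run=4):
    """Return the start index of a stem-loop followed by a run of Us, or -1.

    Toy model: the stem must pair perfectly and have exactly `stem` bases
    per side, separated by exactly `loop` unpaired bases.
    """
    span = 2 * stem + loop
    for i in range(len(rna) - span - u_run + 1):
        left = rna[i:i + stem]
        right = rna[i + stem + loop:i + span]
        tail = rna[i + span:i + span + u_run]
        if right == reverse_complement(left) and tail == "U" * u_run:
            return i
    return -1

# Invented transcript: prefix + stem + loop + matching stem + U-run.
transcript = "AUG" + "GCCGCC" + "GAAA" + "GGCGGC" + "UUUUUU"
print(find_hairpin_terminator(transcript))       # 3
print(find_hairpin_terminator("AUGAUCUCGUAA"))   # -1
```

The stem halves are reverse complements of each other, which is what lets the RNA fold back on itself to form the hairpin.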

Translation

The genetic code

During translation, a cell “reads” the information in a messenger RNA (mRNA) and uses it to build a protein. Actually, to be a little more technical, an mRNA doesn’t always encode—provide instructions for—a whole protein. Instead, what we can confidently say is that it always encodes a polypeptide, or chain of amino acids.
Genetic code table. Each three-letter sequence of mRNA nucleotides corresponds to a specific amino acid, or to a stop codon. UGA, UAA, and UAG are stop codons. AUG is the codon for methionine, and is also the start codon.
In an mRNA, the instructions for building a polypeptide are RNA nucleotides (As, Us, Cs, and Gs) read in groups of three. These groups of three are called codons. Of the 64 possible codons, 61 specify amino acids. One of these, AUG, specifies the amino acid methionine and also acts as a start codon to signal the start of protein construction. The three remaining codons do not specify amino acids. These stop codons, UAA, UAG, and UGA, tell the cell when a polypeptide is complete. Altogether, this collection of codon-amino acid relationships is called the genetic code, because it lets cells “decode” an mRNA into a chain of amino acids.
Each mRNA contains a series of codons (nucleotide triplets), each of which specifies an amino acid or a stop signal. The correspondence between mRNA codons and amino acids is called the genetic code. For example, reading from 5' to 3':
  • AUG – Methionine
  • ACG – Threonine
  • GAG – Glutamate
  • CUU – Leucine
  • CGG – Arginine
  • AGC – Serine
  • UAG – Stop
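The full genetic code is small enough to build as a lookup table. The sketch below does so from the standard code written as a 64-letter string of one-letter amino acid abbreviations (with `*` for stop), ordered with the bases taken as U, C, A, G. The names `GENETIC_CODE` and `translate` are choices made for this example.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, one letter per codon in UUU, UUC, UUA, UUG, UCU, ... order.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

GENETIC_CODE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def translate(mrna):
    """Translate codon by codon from the first base, stopping at a stop codon."""
    peptide = ""
    for i in range(0, len(mrna) - 2, 3):
        aa = GENETIC_CODE[mrna[i:i + 3]]
        if aa == "*":
            break
        peptide += aa
    return peptide

print(translate("AUGACGGAGCUUCGGAGCUAG"))                    # MTELRS
print(sorted(c for c, aa in GENETIC_CODE.items() if aa == "*"))
# ['UAA', 'UAG', 'UGA']
```

Here `MTELRS` is the one-letter form of Methionine-Threonine-Glutamate-Leucine-Arginine-Serine, the same chain as in the example above.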
Transfer RNAs (tRNAs)
Transfer RNAs, or tRNAs, are molecular “bridges” that connect mRNA codons to the amino acids they encode. One end of each tRNA has a sequence of three nucleotides called an anticodon, which can bind to specific mRNA codons. The other end of the tRNA carries the amino acid specified by the codons. There are many different types of tRNAs. Each type reads one or a few codons and brings the right amino acid matching those codons.
Ribosomes are composed of a small and a large subunit and have three sites where tRNAs can bind to an mRNA (the A, P, and E sites). Each tRNA carries a specific amino acid and binds to an mRNA codon that is complementary to its anticodon.
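Codon-anticodon pairing follows the same complementary, antiparallel logic as other base pairing, which a short sketch can show. The function name `anticodon_for` is invented here, and the sketch assumes strict one-to-one pairing; it does not model the relaxed pairing that lets some tRNAs read more than one codon.

```python
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

def anticodon_for(codon):
    """Return the perfectly complementary anticodon, written 5'->3'.

    The anticodon pairs antiparallel to the codon, so the complement
    is reversed to keep the usual 5'->3' writing direction.
    """
    return "".join(RNA_PAIR[base] for base in reversed(codon))

print(anticodon_for("AUG"))   # CAU  (pairs as 3'-UAC-5' against 5'-AUG-3')
```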
Ribosomes
Ribosomes are the structures where polypeptides (proteins) are built. They are made up of protein and RNA (ribosomal RNA, or rRNA). Each ribosome has two subunits, a large one and a small one, which come together around an mRNA—kind of like the two halves of a hamburger bun coming together around the patty. The ribosome provides a set of handy slots where tRNAs can find their matching codons on the mRNA template and deliver their amino acids. These slots are called the A, P, and E sites. Not only that, but the ribosome also acts as an enzyme, catalyzing the chemical reaction that links amino acids together to make a chain.
Steps of translation
Your cells are making new proteins every second of the day. And each of those proteins must contain the right set of amino acids, linked together in just the right order. That may sound like a challenging task, but luckily, your cells (along with those of other animals, plants, and bacteria) are up to the job.
To see how cells make proteins, let’s divide translation into three stages: initiation (starting off), elongation (adding on to the protein chain), and termination (finishing up).

Getting started: Initiation

In initiation, the ribosome assembles around the mRNA to be read and the first tRNA (carrying the amino acid methionine, which matches the start codon, AUG). This setup, called the initiation complex, is needed in order for translation to get started.

Extending the chain: Elongation

Elongation is the stage where the amino acid chain gets longer. In elongation, the mRNA is read one codon at a time, and the amino acid matching each codon is added to a growing protein chain.
Each time a new codon is exposed:
  • A matching tRNA binds to the codon
  • The existing amino acid chain (polypeptide) is linked onto the amino acid of the tRNA via a chemical reaction
  • The mRNA is shifted one codon over in the ribosome, exposing a new codon for reading
    Elongation has three stages:
    1) The anticodon of an incoming tRNA pairs with the mRNA codon exposed in the A site.
    2) A peptide bond is formed between the new amino acid (in the A site) and the previously-added amino acid (in the P site), transferring the polypeptide from the P site to the A site.
    3) The ribosome moves one codon down on the mRNA. The tRNA in the A site (carrying the polypeptide) shifts to the P site. The tRNA in the P site shifts to the E site and exits the ribosome.
During elongation, tRNAs move through the A, P, and E sites of the ribosome, as shown above. This process repeats many times as new codons are read and new amino acids are added to the chain.
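The three-step cycle can be traced with a simplified model of the ribosome's sites. In this sketch each site just holds the codon its tRNA is paired with, the codon table covers only the example mRNA used earlier, and the name `elongate` is made up for illustration.

```python
# Partial table for the example mRNA only; None marks the stop codon.
CODON_TABLE = {"AUG": "Met", "ACG": "Thr", "GAG": "Glu", "CUU": "Leu",
               "CGG": "Arg", "AGC": "Ser", "UAG": None}

def elongate(mrna):
    """Return the finished chain and a log of site occupancy after each cycle."""
    codons = [mrna[i:i + 3] for i in range(0, len(mrna) - 2, 3)]
    # Initiation has already placed the first tRNA (Met) in the P site.
    sites = {"E": None, "P": codons[0], "A": None}
    chain = [CODON_TABLE[codons[0]]]
    log = []
    for codon in codons[1:]:
        if CODON_TABLE[codon] is None:      # stop codon in the A site: done
            break
        sites["A"] = codon                  # 1) incoming tRNA pairs in the A site
        chain.append(CODON_TABLE[codon])    # 2) peptide bond: chain moves to A-site tRNA
        # 3) translocation: A -> P, P -> E (the E-site tRNA then exits)
        sites = {"E": sites["P"], "P": sites["A"], "A": None}
        log.append(dict(sites))
    return chain, log

chain, log = elongate("AUGACGGAGCUUCGGAGCUAG")
print("-".join(chain))   # Met-Thr-Glu-Leu-Arg-Ser
print(log[0])            # {'E': 'AUG', 'P': 'ACG', 'A': None}
```

Each entry in `log` is a snapshot taken right after translocation, when the A site is empty and ready for the next tRNA.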
For more details on the steps of elongation, see the stages of translation article.

Finishing up: Termination

Termination is the stage in which the finished polypeptide chain is released. It begins when a stop codon (UAG, UAA, or UGA) enters the ribosome, triggering a series of events that separate the chain from its tRNA and allow it to drift out of the ribosome. After termination, the polypeptide may still need to fold into the right 3D shape, undergo processing (such as the removal of amino acids), get shipped to the right place in the cell, or combine with other polypeptides before it can do its job as a functional protein.
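Putting initiation and termination together, the stretch of an mRNA that is actually translated runs from the start codon to the first stop codon in the same reading frame. The sketch below finds that stretch under a simplifying assumption: translation begins at the first AUG in the sequence, which is not always true in real cells. The name `find_coding_region` and the example mRNA, with a few extra bases on either side, are invented for illustration.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

def find_coding_region(mrna):
    """Return the sequence from the first AUG through the first in-frame stop.

    Returns None if there is no AUG or no in-frame stop codon after it.
    """
    start = mrna.find("AUG")
    if start == -1:
        return None
    for i in range(start, len(mrna) - 2, 3):
        if mrna[i:i + 3] in STOP_CODONS:
            return mrna[start:i + 3]
    return None

print(find_coding_region("GGCAUGAUCUCGUAAGCC"))   # AUGAUCUCGUAA
print(find_coding_region("GGCAUGAUC"))            # None (no stop codon)
```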