Eukaryotic Gene Regulation

The Promoter and the Transcription Machinery

When transcription factors bind to the promoter region, RNA polymerase is placed in an orientation that allows transcription to begin.

Learning Objectives

Describe the role of promoters in RNA transcription

Key Takeaways

Key Points

  • The purpose of the promoter is to bind transcription factors that control the initiation of transcription.
  • The promoter region can be short or quite long; the longer the promoter is, the more available space for proteins to bind.
  • To initiate transcription, a transcription factor (TFIID) binds to the TATA box, which causes other transcription factors to subsequently bind to the TATA box.
  • Once the transcription initiation complex is assembled, RNA polymerase can bind to its upstream sequence and is then phosphorylated.
  • Phosphorylation of RNA polymerase releases part of the protein from the DNA to activate the transcription initiation complex and places RNA polymerase in the correct orientation to begin transcription.
  • Transcription factors respond to environmental stimuli that cause the proteins to find their binding sites and initiate transcription of the gene that is needed.

Key Terms

  • TATA box: a DNA sequence (cis-regulatory element) found in the promoter region of genes in archaea and eukaryotes
  • transcription factor: a protein that binds to specific DNA sequences, thereby controlling the flow (or transcription) of genetic information from DNA to mRNA
  • promoter: the section of DNA that controls the initiation of RNA transcription

The Promoter and the Transcription Machinery

Genes are organized to make the control of gene expression easier. The promoter region is immediately upstream of the coding sequence. This region can be short (only a few nucleotides in length) or quite long (hundreds of nucleotides long). The longer the promoter, the more available space for proteins to bind. This also adds more control to the transcription process. The length of the promoter is gene-specific and can differ dramatically between genes. Consequently, the level of control of gene expression can also differ quite dramatically between genes. The purpose of the promoter is to bind transcription factors that control the initiation of transcription.

Within the promoter region, just upstream of the transcriptional start site, resides the TATA box. This box is simply a repeat of thymine and adenine dinucleotides (literally, TATA repeats). RNA polymerase binds to the transcription initiation complex, allowing transcription to occur. To initiate transcription, a transcription factor (TFIID) is the first to bind to the TATA box. Binding of TFIID recruits other transcription factors, including TFIIB, TFIIE, TFIIF, and TFIIH, to the TATA box. Once this transcription initiation complex is assembled, RNA polymerase can bind to its upstream sequence. When bound along with the transcription factors, RNA polymerase is phosphorylated. This releases part of the protein from the DNA to activate the transcription initiation complex and places RNA polymerase in the correct orientation to begin transcription. A DNA-bending protein brings the enhancer, which can be quite a distance from the gene, into contact with transcription factors and mediator proteins.
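As a rough illustration of a TATA box sitting just upstream of the transcriptional start site, the Python sketch below scans a promoter sequence for a TATA-like motif. The pattern used (TATA followed by A/T, A, A/T), the example sequence, and the name find_tata_boxes are assumptions made for illustration and do not come from the text above; real promoters vary considerably.

```python
import re

# Assumed consensus for illustration: TATA followed by A/T, A, A/T.
# The lookahead lets overlapping candidates be reported as well.
TATA_PATTERN = re.compile(r"(?=(TATA[AT]A[AT]))")

def find_tata_boxes(promoter, tss_index):
    """Return (position relative to the start site, motif) for each TATA-like match.

    promoter  -- DNA sequence written 5' to 3'
    tss_index -- 0-based index of the transcriptional start site in `promoter`
    """
    promoter = promoter.upper()
    hits = []
    for match in TATA_PATTERN.finditer(promoter):
        start = match.start(1)
        if start < tss_index:  # keep only motifs upstream of the start site
            hits.append((start - tss_index, match.group(1)))
    return hits

# Hypothetical 40-nucleotide promoter fragment; the start site is at index 35.
example = "GGCCGCTCTATAAAAGGCGCGGCCATTGCAGTCCGACTGC"
print(find_tata_boxes(example, tss_index=35))
```

Negative positions indicate nucleotides upstream of the start site, which matches the way promoter elements are usually described.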

Promoters: A generalized promoter of a gene transcribed by RNA polymerase II is shown. Transcription factors recognize the promoter. RNA polymerase II then binds and forms the transcription initiation complex.

In addition to the general transcription factors, other transcription factors can bind to the promoter to regulate gene transcription. These transcription factors bind to the promoters of a specific set of genes. They are not general transcription factors that bind to every promoter complex, but are recruited to a specific sequence on the promoter of a specific gene. There are hundreds of transcription factors in a cell that each bind specifically to a particular DNA sequence motif. The promoter sequences to which these transcription factors bind, just upstream of the encoded gene, are referred to as cis-acting elements because they are on the same chromosome, just next to the gene. The region that a particular transcription factor binds to is called the transcription factor binding site. Transcription factors respond to environmental stimuli that cause the proteins to find their binding sites and initiate transcription of the gene that is needed.

Transcriptional Enhancers and Repressors

Enhancers increase the rate of transcription of genes, while repressors decrease the rate of transcription.

Learning Objectives

Explain how enhancers and repressors regulate gene expression

Key Takeaways

Key Points

  • Enhancers can be located upstream of a gene, within the coding region of the gene, downstream of a gene, or thousands of nucleotides away.
  • When a DNA-bending protein binds to the enhancer, the shape of the DNA changes, which allows interactions between the activators and transcription factors to occur.
  • Repressors respond to external stimuli to prevent the binding of activating transcription factors.
  • Corepressors can repress transcriptional initiation by recruiting histone deacetylase.
  • Histone deacetylation increases the positive charge on histones, which strengthens the interaction between the histones and DNA, making the DNA less accessible to transcription.

Key Terms

  • enhancer: a short region of DNA that can increase transcription of genes
  • repressor: any protein that binds to DNA and thus regulates the expression of genes by decreasing the rate of transcription
  • activator: any chemical or agent which regulates one or more genes by increasing the rate of transcription

Enhancers and Transcription

In some eukaryotic genes, there are regions that help increase or enhance transcription. These regions, called enhancers, are not necessarily close to the genes they enhance. They can be located upstream of a gene, within the coding region of the gene, downstream of a gene, or may be thousands of nucleotides away.

Enhancer regions are binding sequences, or sites, for transcription factors. When a DNA-bending protein binds to an enhancer, the shape of the DNA changes. This shape change allows the interaction between the activators bound to the enhancers and the transcription factors bound to the promoter region and the RNA polymerase to occur. Whereas DNA is generally depicted as a straight line in two dimensions, it is actually a three-dimensional object. Therefore, a nucleotide sequence thousands of nucleotides away can fold over and interact with a specific promoter.

Enhancers: An enhancer is a DNA sequence that promotes transcription. Each enhancer is made up of short DNA sequences called distal control elements. Activators bound to the distal control elements interact with mediator proteins and transcription factors.

Turning Genes Off: Transcriptional Repressors

Like prokaryotic cells, eukaryotic cells also have mechanisms to prevent transcription. Transcriptional repressors can bind to promoter or enhancer regions and block transcription. Like the transcriptional activators, repressors respond to external stimuli to prevent the binding of activating transcription factors.

A corepressor is a protein that decreases gene expression by binding to a transcription factor that contains a DNA-binding domain. The corepressor is unable to bind DNA by itself. The corepressor can repress transcriptional initiation by recruiting histone deacetylase, which catalyzes the removal of acetyl groups from lysine residues. This increases the positive charge on histones, which strengthens the interaction between the histones and DNA, making the DNA less accessible to the process of transcription.

Epigenetic Control: Regulating Access to Genes within the Chromosome

Both the packaging of DNA around histone proteins, as well as chemical modifications to the DNA or proteins, can alter gene expression.

Learning Objectives

Discuss how eukaryotic gene regulation occurs at the epigenetic level and the various epigenetic changes that can be made to DNA

Key Takeaways

Key Points

  • DNA is packaged by wrapping around histone proteins into structures called nucleosomes, which resemble beads on a string.
  • When DNA is to be transcribed, the nucleosomes can slide away from that region of DNA, opening it up to the transcription machinery of the cell.
  • Chemical modifications to either the histone proteins or the DNA itself signal whether or not a particular region of the genome should be “open” or “closed” to the transcription machinery.
  • Modifications such as acetylation or methylation of the histones can alter how tightly DNA is wrapped around them, while methylation of DNA changes how the DNA interacts with proteins, including the histone proteins that control access to the region.
  • This type of genetic regulation is called epigenetic regulation (“above genetics”) as it does not change the nucleotide sequence of the DNA.

Key Terms

  • nucleosome: any of the subunits that repeat in chromatin; a coil of DNA surrounding a histone core
  • epigenetics: the study of heritable changes caused by the activation and deactivation of genes without any change in DNA sequence
  • histone: any of various simple water-soluble proteins that are rich in the basic amino acids lysine and arginine and are complexed with DNA in the nucleosomes of eukaryotic chromatin

Epigenetic Control: Regulating Access to Genes within the Chromosome

The human genome encodes over 20,000 genes; each of the 23 pairs of human chromosomes encodes thousands of genes. The DNA in the nucleus is precisely wound, folded, and compacted into chromosomes so that it will fit into the nucleus. It is also organized so that specific segments can be accessed as needed by a specific cell type.

The first level of organization, or packing, is the winding of DNA strands around histone proteins. Histones package and order DNA into structural units called nucleosome complexes, which can control the access of proteins to the DNA regions. Under the electron microscope, this winding of DNA around histone proteins to form nucleosomes looks like small beads on a string. These beads (histone proteins) can move along the string (DNA) and change the structure of the molecule.

DNA Packaging: DNA is folded around histone proteins to create (a) nucleosome complexes. These nucleosomes control the access of proteins to the underlying DNA. When viewed through an electron microscope (b), the nucleosomes look like beads on a string.

If DNA encoding a specific gene is to be transcribed into RNA, the nucleosomes surrounding that region of DNA can slide down the DNA to open that specific chromosomal region and allow for the transcriptional machinery (RNA polymerase) to initiate transcription. Nucleosomes can move to open the chromosome structure to expose a segment of DNA, but do so in a very controlled manner.

Nucleosomes can change position to allow transcription of genes: Nucleosomes can slide along DNA. When nucleosomes are spaced closely together (top), transcription factors cannot bind and gene expression is turned off. When the nucleosomes are spaced far apart (bottom), the DNA is exposed. Transcription factors can bind, allowing gene expression to occur. Modifications to the histones and DNA affect nucleosome spacing.

How the histone proteins move is dependent on signals found on both the histone proteins and on the DNA. These signals are tags, or modifications, added to histone proteins and DNA that tell the histones if a chromosomal region should be open or closed. These tags are not permanent, but may be added or removed as needed. They are chemical modifications (phosphate, methyl, or acetyl groups) that are attached to specific amino acids in the protein or to the nucleotides of the DNA. The tags do not alter the DNA base sequence, but they do alter how tightly wound the DNA is around the histone proteins. DNA is a negatively-charged molecule; therefore, changes in the charge of the histone will change how tightly wound the DNA molecule will be. When unmodified, the histone proteins have a large positive charge; by adding chemical modifications, such as acetyl groups, the charge becomes less positive.
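The charge argument above can be put into numbers with a toy calculation. The sketch below assumes, purely for illustration, that each unmodified lysine on a histone tail contributes a charge of +1 and that an acetylated lysine contributes 0; the lysine count and the name net_tail_charge are hypothetical.

```python
def net_tail_charge(lysines, acetylated):
    """Toy estimate of the positive charge carried by lysines on a histone tail.

    Assumes each unmodified lysine contributes +1 and each acetylated
    lysine contributes 0; all other residues are ignored.
    """
    if not 0 <= acetylated <= lysines:
        raise ValueError("acetylated must be between 0 and lysines")
    return lysines - acetylated

# Hypothetical tail with 8 lysines: more acetyl groups, less positive charge,
# and therefore a weaker attraction to the negatively charged DNA.
for n_acetyl in (0, 4, 8):
    print(n_acetyl, "acetyl groups -> charge", net_tail_charge(8, n_acetyl))
```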

Modifications to histones and DNA can alter gene expression: Histone proteins and DNA nucleotides can be modified chemically. Modifications affect nucleosome spacing and gene expression.

The DNA molecule itself can also be modified. This occurs within very specific regions called CpG islands. These are stretches with a high frequency of cytosine and guanine dinucleotide DNA pairs (CG) found in the promoter regions of genes. When this configuration exists, the cytosine member of the pair can be methylated (a methyl group is added). This modification changes how the DNA interacts with proteins, including the histone proteins that control access to the region. Highly-methylated (hypermethylated) DNA regions with deacetylated histones are tightly coiled and transcriptionally inactive. These changes to DNA are inherited from parent to offspring, such that while the DNA sequence is not altered, the pattern of gene expression is passed to the next generation.
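One way to see what "a high frequency of CG dinucleotides" means is to compare how many CG pairs a sequence contains with how many would be expected from its C and G content alone. The sketch below does this in Python. The thresholds (GC fraction above 0.5, observed/expected ratio above 0.6) are commonly used rules of thumb rather than values from this text, and the function names and test sequences are hypothetical.

```python
def cpg_stats(seq):
    """Return (GC fraction, observed/expected CpG ratio) for a DNA sequence."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # CG cannot overlap itself, so count() is exact
    gc_fraction = (c + g) / n if n else 0.0
    # Expected number of CG pairs if C and G were placed independently: c * g / n
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_fraction, obs_exp

def looks_like_cpg_island(seq, min_gc=0.5, min_obs_exp=0.6):
    """Apply the assumed rule-of-thumb thresholds to a sequence."""
    gc_fraction, obs_exp = cpg_stats(seq)
    return gc_fraction > min_gc and obs_exp > min_obs_exp

print(cpg_stats("CGCGCGCGAT"))
print(looks_like_cpg_island("CGCGCGCGAT"))
print(looks_like_cpg_island("ATATATATGC"))
```

Real analyses are run over windows of a few hundred nucleotides; the ten-letter strings here only keep the arithmetic easy to follow.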

This type of gene regulation is called epigenetic regulation. Epigenetics means “above genetics.” The changes that occur to the histone proteins and DNA do not alter the nucleotide sequence and are not permanent. Instead, these changes are temporary (although they often persist through multiple rounds of cell division) and alter the chromosomal structure (open or closed) as needed. A gene can be turned on or off depending upon the location and modifications to the histone proteins and DNA. If a gene is to be transcribed, the histone proteins and DNA are modified surrounding the chromosomal region encoding that gene. This opens the chromosomal region to allow access for RNA polymerase and other proteins, called transcription factors, to bind to the promoter region, located just upstream of the gene, and initiate transcription. If a gene is to remain turned off, or silenced, the histone proteins and DNA have different modifications that signal a closed chromosomal configuration. In this closed configuration, the RNA polymerase and transcription factors do not have access to the DNA and transcription cannot occur.

RNA Splicing

RNA splicing allows for the production of multiple protein isoforms from a single gene by removing introns and combining different exons.

Learning Objectives

Explain the role of RNA splicing in regulating gene expression

Key Takeaways

Key Points

  • Introns are intervening sequences within a pre-mRNA molecule that do not code for proteins and are removed during RNA processing by a spliceosome.
  • Exons are expressing sequences within a pre-mRNA molecule that are spliced together once introns are removed to form mature mRNA molecules that are translated into proteins.
  • Alternative splicing allows for the production of various protein isoforms from a single gene.
  • A spliceosome is a complex composed of both RNA molecules and proteins that determines which introns to leave out and which exons to keep and bind together.

Key Terms

  • intron: a portion of a split gene that is included in pre-mRNA transcripts but is removed during RNA processing and rapidly degraded
  • exon: a region of a transcribed gene present in the final functional RNA molecule
  • spliceosome: a dynamic complex of RNA and protein subunits that removes introns from precursor mRNA

RNA splicing, the first stage of post-transcriptional control

Gene expression is the process that transfers genetic information from a gene made of DNA to a functional gene product made of RNA or protein. Genetic information flows from DNA to RNA by the process of transcription and then from RNA to protein by the process of translation. In order to ensure that the proper products are produced, gene expression is regulated at many different stages during and in between transcription and translation. In eukaryotes, the gene contains extra sequences that do not code for protein. In these organisms, transcription of DNA produces pre-mRNA. These pre-mRNA transcripts often contain regions, called introns, that are intervening sequences that must be removed by the process of splicing prior to translation. The regions of RNA that code for protein are called exons. Splicing can be regulated so that different mRNAs can contain or lack exons, in a process called alternative splicing. Alternative splicing allows more than one protein to be produced from a gene and is an important regulatory step in determining which functional proteins are produced from gene expression. Thus, splicing is the first stage of post-transcriptional control.

Alternative Splicing: There are five basic modes of alternative splicing.

Alternative Splicing: Pre-mRNA can be alternatively spliced to create different proteins.

Alternative Splicing

Alternative splicing is a process that occurs during gene expression and allows for the production of multiple proteins (protein isoforms) from a single gene. Alternative splicing can occur due to the different ways in which an exon can be excluded from or included in the messenger RNA. It can also occur if portions of an exon are excluded or included, or if introns are retained. For example, if a pre-mRNA has four exons (A, B, C, and D), these can be spliced and translated in a number of different combinations: exons A, B, and C can be translated together, or exons A, C, and D can be translated. The pattern of splicing and production of alternatively-spliced messenger RNA is controlled by the binding of regulatory proteins (trans-acting proteins) to cis-acting sites that are found on the pre-mRNA. Some of these regulatory proteins include splicing activators (proteins that promote the use of certain splice sites) and splicing repressors (proteins that reduce the use of certain sites). Some common splicing repressors include heterogeneous nuclear ribonucleoproteins (hnRNPs) and polypyrimidine tract binding protein (PTB). Proteins that are translated from alternatively-spliced messenger RNAs differ in the sequence of their amino acids, which can result in altered function of the protein. This is one reason why the human genome can encode a wide diversity of proteins. Alternative splicing is a common process that occurs in eukaryotes; most of the multi-exonic genes in humans are spliced alternatively. Unfortunately, abnormal variations in splicing are also implicated in many genetic diseases and disorders.
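The four-exon example can be sketched in a few lines of Python. The exon sequences below are hypothetical placeholders; only the labels A to D come from the text. The sketch lists every order-preserving choice of three exons, which is a simplification: in a cell, the combinations that actually appear depend on the regulatory proteins described above.

```python
from itertools import combinations

def splice(exons, keep):
    """Join the chosen exons in their original 5'-to-3' order."""
    return "".join(seq for name, seq in exons if name in keep)

def isoforms(exons, size):
    """Yield (label, spliced sequence) for every order-preserving choice of `size` exons."""
    names = [name for name, _ in exons]
    for chosen in combinations(names, size):
        yield "".join(chosen), splice(exons, set(chosen))

# Hypothetical exon sequences for a pre-mRNA with exons A, B, C, and D.
pre_mrna_exons = [("A", "AUGGCU"), ("B", "GAAUUC"), ("C", "CCGGGA"), ("D", "UACUAA")]

print(splice(pre_mrna_exons, {"A", "B", "C"}))  # exons A, B, and C
print(splice(pre_mrna_exons, {"A", "C", "D"}))  # exons A, C, and D
for label, seq in isoforms(pre_mrna_exons, 3):
    print(label, seq)
```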

Mechanism of Splicing: Alternative splicing can result in protein isoforms.

Spliceosome

The splicing of messenger RNA is catalyzed by a macromolecular complex known as the spliceosome. The sites of cleavage and ligation are marked by sequence elements in the pre-mRNA, which include the branch site (A) and the 5′ and 3′ splice sites. Interactions between these elements and the small nuclear ribonucleoproteins (snRNPs) found in the spliceosome create a spliceosome A complex, which helps determine which introns to leave out and which exons to keep and bind together. Once the introns are cleaved and removed, the exons are joined together by a phosphodiester bond.

Regulatory Proteins

As noted above, splicing is regulated by repressor proteins and activator proteins, which are also known as trans-acting proteins. Equally important are the silencers and enhancers that are found on the messenger RNAs, also known as cis-acting sites. These regulatory elements work together to create a splicing code that determines alternative splicing.

The Initiation Complex and Translation Rate

The first step of translation is ribosome assembly, which requires initiation factors.

Learning Objectives

Discuss how eukaryotes assemble ribosomes on the mRNA to begin translation

Key Takeaways

Key Points

  • The components involved in ribosome assembly are brought together by the help of proteins called initiation factors which bind to the small ribosomal subunit.
  • Initiator tRNA is used to locate the start codon AUG (which codes for the amino acid methionine); this codon establishes the reading frame for the mRNA strand.
  • GTP carried by eIF2 is the energy source used for loading the initiator tRNA carried by the small ribosomal subunit on the correct start codon in the mRNA.
  • GTP carried by eIF5 is the energy source for assembling the large and small ribosomal subunits together.

Key Terms

  • reading frame: any of the three possible ways of dividing a nucleotide sequence into consecutive triplet codons
  • phosphorylation: the addition of a phosphate group to a compound; often catalyzed by enzymes

Ribosome Assembly and Translation Rate

Like transcription, translation is controlled by proteins that bind and initiate the process. In translation, before protein synthesis can begin, ribosome assembly has to be completed. This is a multi-step process.

In ribosome assembly, the large and small ribosomal subunits and an initiator tRNA (tRNAi) containing the first amino acid of the final polypeptide chain all come together at the translation start codon on an mRNA to allow translation to begin. First, the small ribosomal subunit binds to the tRNAi which carries methionine in eukaryotes and archaea and carries N-formyl-methionine in bacteria. (Because the tRNAi is carrying an amino acid, it is said to be charged.) Next, the small ribosomal subunit with the charged tRNAi still bound scans along the mRNA strand until it reaches the start codon AUG, which indicates where translation will begin. The start codon also establishes the reading frame for the mRNA strand, which is crucial to synthesizing the correct sequence of amino acids. A shift in the reading frame results in mistranslation of the mRNA. The anticodon on the tRNAi then binds to the start codon via basepairing. The complex consisting of mRNA, charged tRNAi, and the small ribosomal subunit attaches to the large ribosomal subunit, which completes ribosome assembly. These components are brought together by the help of proteins called initiation factors which bind to the small ribosomal subunit during initiation and are found in all three domains of life. In addition, the cell spends GTP energy to help form the initiation complex. Once ribosome assembly is complete, the charged tRNAi is positioned in the P site of the ribosome and the empty A site is ready for the next aminoacyl-tRNA. The polypeptide synthesis begins and always proceeds from the N-terminus to the C-terminus, called the N-to-C direction.
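The role of the start codon in setting the reading frame can be illustrated with a short Python sketch. It assumes, as a simplification, that the first AUG encountered from the 5′ end is the start codon; the transcript and the function names are hypothetical.

```python
def find_start_codon(mrna):
    """Return the index of the first AUG scanning 5' to 3', or -1 if there is none."""
    return mrna.upper().find("AUG")

def codons_from(mrna, start):
    """Split the mRNA into triplets beginning at `start`, dropping any incomplete tail."""
    mrna = mrna.upper()
    return [mrna[i:i + 3] for i in range(start, len(mrna) - 2, 3)]

# Hypothetical transcript: a short 5' leader, AUG, three more codons, and a stop codon.
mrna = "GGACCAUGGCUGAAUUCUAAGG"
start = find_start_codon(mrna)
print(codons_from(mrna, start))      # reading frame set by the AUG
print(codons_from(mrna, start + 1))  # same nucleotides read one position later
```

Shifting the start by a single nucleotide changes every codon that follows, which is why a shift in the reading frame results in mistranslation of the mRNA.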

In eukaryotes, several eukaryotic initiation factor proteins (eIFs) assist in ribosome assembly. The eukaryotic initiation factor-2 (eIF-2) is active when it binds to guanosine triphosphate (GTP). With GTP bound to it, eIF-2 protein binds to the small 40S ribosomal subunit. Next, the initiator tRNA charged with methionine (Met-tRNAi) associates with the GTP-eIF-2/40S ribosome complex, and once all these components are bound to each other, they are collectively called the 43S complex.

Eukaryotic initiation factors eIF1, eIF3, eIF4, and eIF5 help bring the 43S complex to the 5′-m7G cap of an mRNA to be translated. Once bound to the mRNA’s 5′ m7G cap, the 43S complex starts travelling down the mRNA until it reaches the initiation AUG codon at the start of the mRNA’s reading frame. Sequences around the AUG may help ensure the correct AUG is used as the initiation codon in the mRNA.

Once the 43S complex is at the initiation AUG, the tRNAi-Met is positioned over the AUG. The anticodon on tRNAi-Met basepairs with the AUG codon. At this point, the GTP bound to eIF2 in the 43S complex is hydrolyzed to GDP + phosphate, and energy is released. This energy is used to release the eIF2 (with GDP bound to it) from the 43S complex, leaving the 40S ribosomal subunit and the tRNAi-Met at the translation start site of the mRNA.

Next, eIF5 with GTP bound binds to the 40S ribosomal subunit complexed to the mRNA and the tRNAi-Met. The eIF5-GTP allows the 60S large ribosomal subunit to bind. Once the 60S ribosomal subunit arrives, eIF5 hydrolyzes its bound GTP to GDP + phosphate, and energy is released. This energy powers assembly of the two ribosomal subunits into the intact 80S ribosome, with tRNAi-Met in its P site while also basepaired to the initiation AUG codon on the mRNA. Translation is ready to begin.

The binding of eIF-2 to the 40S ribosomal subunit is controlled by phosphorylation. If eIF-2 is phosphorylated, it undergoes a conformational change and cannot bind to GTP. Therefore, the 43S complex cannot form properly and translation is impeded. When eIF-2 remains unphosphorylated, it can bind GTP and the 40S ribosomal subunit, and translation proceeds.

Translation Initiation Complex: Gene expression can be controlled by factors that bind the translation initiation complex.

The ability to fully assemble the ribosome directly affects the rate at which translation occurs. But protein synthesis is regulated at various other levels as well, including mRNA synthesis, tRNA synthesis, rRNA synthesis, and eukaryotic initiation factor synthesis. Alteration in any of these components affects the rate at which translation can occur.

Regulating Protein Activity and Longevity

A cell can rapidly change the levels of proteins in response to the environment by adding specific chemical groups to alter gene regulation.

Learning Objectives

Explain how chemical modifications affect protein activity and longevity

Key Takeaways

Key Points

  • Proteins can be chemically modified by adding methyl, phosphate, acetyl, and ubiquitin groups.
  • Protein longevity can be affected by altering stages of gene regulation, including but not limited to altering: accessibility to chromosomal DNA for transcription, rate of translation, nuclear shuttling, RNA stability, and post-translational modifications.
  • Ubiquitin is added to a protein to mark it for degradation by the proteasome.

Key Terms

  • ubiquitin: a small polypeptide present in the cells of all eukaryotes; it plays a part in modifying and degrading proteins
  • proteasome: a complex protein, found in bacterial, archaeal and eukaryotic cells, that breaks down other proteins via proteolysis

Chemical Modifications, Protein Activity, and Longevity

Proteins can be chemically modified with the addition of methyl, phosphate, acetyl, and ubiquitin groups. The addition or removal of these groups from proteins regulates their activity or the length of time they exist in the cell. Sometimes these modifications can regulate where a protein is found in the cell; for example, in the nucleus, the cytoplasm, or attached to the plasma membrane.

Chemical modifications occur in response to external stimuli such as stress, the lack of nutrients, heat, or ultraviolet light exposure. These changes can alter protein function, epigenetic accessibility, transcription, mRNA stability, or translation, all resulting in changes in expression of various genes. This is an efficient way for the cell to rapidly change the abundance levels of specific proteins in response to the environment. Because proteins are involved in every stage of gene regulation, the phosphorylation of a protein (depending on the protein that is modified) can alter accessibility to the chromosome, can alter transcription (by altering transcription factor binding or function), can change nuclear shuttling (by influencing modifications to the nuclear pore complex), can alter RNA stability (by binding or not binding to the RNA to regulate its stability), can modify translation (increase or decrease), or can change post-translational modifications (add or remove phosphates or other chemical modifications). All of these protein activities are affected by the phosphorylation process. The enzymes responsible for phosphorylation are known as protein kinases. The addition of a phosphate group to a protein can result in either activation or deactivation, depending on the protein.

Another example of a chemical modification affecting protein activity is the addition or removal of methyl groups. Methyl groups are added to proteins via the process of methylation, a common form of post-translational modification. The addition of methyl groups to a protein can result in protein-protein interactions that allow for transcriptional regulation, response to stress, protein repair, nuclear transport, and even differentiation processes. Methylation on side chain nitrogens is considered largely irreversible, while methylation of the carboxyl groups is potentially reversible. Methylation on carboxylate side chains covers up a negative charge and increases the hydrophobicity of the protein. The addition of this chemical group changes the properties of the protein and, thus, affects its activity.

The addition of a ubiquitin group to a protein marks that protein for degradation. Ubiquitin acts like a flag indicating that the protein lifespan is complete. These proteins are moved to the proteasome, a large protein complex that breaks down the tagged proteins. One way to control gene expression is to alter the longevity of the protein: ubiquitination shortens a protein’s lifespan.

Ubiquitin Tags: Proteins with ubiquitin tags are marked for degradation within the proteasome.