Genes

Bacterial Genomes

Bacterial genomes are generally smaller than eukaryotic genomes and vary less in size between species, ranging from 139 kbp to 13,000 kbp.

Learning Objectives

Explain the basic features of bacterial genomes

Key Takeaways

Key Points

  • In prokaryotes, most of the genome (85-90%) is non-repetitive, coding DNA, while the remaining DNA is non-coding.
  • In some contexts, such as sequencing the genome of a pathogenic microbe, “genome” is meant to include information stored on auxiliary genetic material, which is carried in plasmids.
  • The lifestyles of bacteria play an integral role in their respective genome sizes. Of the three lifestyle groups (free-living bacteria, facultative or recently evolved pathogens, and obligate symbionts or pathogens), free-living bacteria have the largest genomes; however, they have fewer pseudogenes than bacteria that have recently acquired pathogenicity.

Key Terms

  • genome: The complete genetic information (either DNA or, in some viruses, RNA) of an organism, typically expressed as a number of base pairs.

Bacterial genomes are generally smaller and less variable in size between species than the genomes of animals and single-celled eukaryotes. They range in size from 139 kbp to 13,000 kbp. Advances in sequencing technology led to the discovery of a high correlation between the number of genes and the genome size of bacteria, suggesting that bacteria have relatively small amounts of junk DNA.

Studies have since shown that a large number of bacterial species have undergone genomic degradation, resulting in a decrease in genome size from their ancestral state. Over the years, researchers have proposed several theories to explain the general trend of bacterial genome decay and the relatively small size of bacterial genomes. Compelling evidence indicates that the apparent degradation of bacterial genomes is due to a deletional bias, that is, a tendency for deletions to outweigh insertions.

In prokaryotes, most of the genome (85-90%) is non-repetitive DNA, meaning that it consists mainly of coding DNA, with non-coding regions making up only a small part. Most biological entities that are more complex than a virus sometimes or always carry additional genetic material besides that which resides in their chromosomes. In some contexts, such as sequencing the genome of a pathogenic microbe, “genome” is meant to include information stored on this auxiliary material, which is carried in plasmids. In such circumstances, “genome” describes all of the genes and information on non-coding DNA that have the potential to be present.
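As a rough illustration of the 85-90% figure, the following sketch converts a genome size into a range of coding base pairs. The function name `coding_range_bp` is hypothetical, and the calculation simply treats the non-repetitive fraction quoted above as coding DNA; it is not a gene-prediction method.

```python
def coding_range_bp(genome_size_bp, low_percent=85, high_percent=90):
    """Return (low, high) estimates of coding DNA in base pairs.

    Assumption: the 85-90% non-repetitive fraction quoted in the text
    is treated as the coding fraction. Integer arithmetic avoids
    floating-point rounding surprises.
    """
    if genome_size_bp < 0:
        raise ValueError("genome size must be non-negative")
    low = genome_size_bp * low_percent // 100
    high = genome_size_bp * high_percent // 100
    return low, high


# Smallest and largest bacterial genome sizes quoted in this section.
for size_bp in (139_000, 13_000_000):
    low, high = coding_range_bp(size_bp)
    print(f"{size_bp:,} bp genome: roughly {low:,} to {high:,} bp coding")
```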

Among species of bacteria, there is relatively little variation in genome size when compared with the genome sizes of other major groups of life. Genome size is of little relevance when considering the number of functional genes in eukaryotic species. In bacteria, however, the strong correlation between the number of genes and the genome size makes the size of bacterial genomes an interesting topic for research and discussion. The general trends of bacterial evolution indicate that bacteria started as free-living organisms. Evolutionary paths led some bacteria to become pathogens and symbionts.

Graph of variation in estimated genome sizes in base pairs: Unlike eukaryotes, bacteria show a strong correlation between genome size and the number of functional genes in a genome.

The lifestyles of bacteria play an integral role in their respective genome sizes. Three groups can be distinguished. Free-living bacteria have the largest genomes of the three; however, they have fewer pseudogenes than bacteria that have recently acquired pathogenicity. Facultative and recently evolved pathogenic bacteria have smaller genomes than free-living bacteria, yet they have more pseudogenes than any other form of bacteria. Obligate bacterial symbionts or pathogens have the smallest genomes and the fewest pseudogenes of the three groups. The relationship between the lifestyles of bacteria and genome size raises questions as to the mechanisms of bacterial genome evolution.

Researchers have developed several theories to explain the patterns of genome size evolution among bacteria. One theory predicts that bacteria have smaller genomes because selective pressure on genome size favors faster replication. The theory rests on the premise that smaller bacterial genomes take less time to replicate. Consequently, smaller genomes would be selected preferentially because they enhance fitness.
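The premise can be made concrete with a back-of-the-envelope calculation. This is a minimal sketch, assuming a single origin, two replication forks that each move at the 1000 nucleotides per second quoted for the replication fork elsewhere in this chapter, and no pauses or overlapping rounds of replication. The function name `replication_time_s` is hypothetical.

```python
def replication_time_s(genome_size_bp, fork_rate_nt_per_s=1000, forks=2):
    """Idealized time, in seconds, to copy one circular chromosome.

    Assumptions (not measurements): a single origin, `forks` replication
    forks moving at a constant `fork_rate_nt_per_s`, and no stalling.
    """
    if fork_rate_nt_per_s <= 0 or forks <= 0:
        raise ValueError("fork rate and fork count must be positive")
    return genome_size_bp / (fork_rate_nt_per_s * forks)


# Smallest and largest bacterial genome sizes quoted in this section.
for label, size_bp in (("139 kbp", 139_000), ("13,000 kbp", 13_000_000)):
    seconds = replication_time_s(size_bp)
    print(f"{label}: {seconds:,.1f} s (about {seconds / 60:.1f} min)")
```

Under these assumptions the smallest genome is copied in roughly 70 seconds and the largest in 6,500 seconds, which is the kind of difference in replication time that the theory appeals to.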

Selection is but one process involved in evolution. Two other major processes, mutation and genetic drift, can be used to explain the genome sizes of various types of bacteria.

Evidence of a deletional bias is present in the respective genome sizes of free-living bacteria, facultative and recently derived parasites, and obligate parasites and symbionts. Free-living bacteria tend to have large population sizes and more opportunity for gene transfer. Selection can therefore operate effectively on free-living bacteria to remove deleterious sequences, resulting in a relatively small number of pseudogenes. Further selective pressure arises because free-living bacteria must produce all gene products independently of a host. Given that there is sufficient opportunity for gene transfer and that selection acts against even slightly deleterious deletions, it is intuitive that free-living bacteria should have the largest genomes of all bacterial types.

Recently formed parasites undergo severe population bottlenecks and can rely on host environments to provide gene products. As a result, recently formed and facultative parasites accumulate pseudogenes and transposable elements, because there is little selective pressure against deletions. The population bottlenecks also reduce gene transfer, so deletional bias ensures the reduction of genome size in parasitic bacteria.
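The effect of a deletional bias on genome size can be sketched with a toy random-walk model. Everything here is an illustrative assumption rather than data: the function name `simulate_genome_size`, the fixed event size, the event counts, and the probabilities. The model also ignores selection entirely; it only shows that when deletions outnumber insertions (for example, because gene transfer is rare), the genome shrinks.

```python
import random


def simulate_genome_size(initial_bp, events, deletion_probability,
                         event_size_bp=1000, seed=0):
    """Apply random insertion/deletion events and return the final size.

    Each event is a deletion with probability `deletion_probability`;
    otherwise it is an insertion (standing in for gene transfer).
    All parameter values are hypothetical.
    """
    rng = random.Random(seed)  # seeded so the run is repeatable
    size = initial_bp
    for _ in range(events):
        if rng.random() < deletion_probability:
            size = max(0, size - event_size_bp)  # a genome cannot be negative
        else:
            size += event_size_bp
    return size


start = 5_000_000
# Deletions favoured over insertions: the genome tends to shrink.
biased = simulate_genome_size(start, events=2000, deletion_probability=0.7)
# Deletions and insertions balanced: the genome stays near its start.
balanced = simulate_genome_size(start, events=2000, deletion_probability=0.5)
print("biased run shrank:", biased < start)
print("balanced run moved less:", abs(balanced - start) < abs(biased - start))
```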

DNA Replication in Prokaryotes

Prokaryotic DNA is replicated by DNA polymerase III in the 5′ to 3′ direction at a rate of 1000 nucleotides per second.

Learning Objectives

Explain the functions of the enzymes involved in prokaryotic DNA replication

Key Takeaways

Key Points

  • Helicase separates the DNA to form a replication fork at the origin of replication where DNA replication begins.
  • Replication forks extend bi-directionally as replication continues.
  • Okazaki fragments are formed on the lagging strand, while the leading strand is replicated continuously.
  • DNA ligase seals the nicks between the Okazaki fragments.
  • Primase synthesizes an RNA primer with a free 3′-OH, which DNA polymerase III uses to synthesize the daughter strands.

Key Terms

  • DNA replication: a biological process occurring in all living organisms that is the basis for biological inheritance
  • helicase: an enzyme that unwinds the DNA helix ahead of the replication machinery
  • origin of replication: a particular sequence in a genome at which replication is initiated

DNA replication employs a large number of proteins and enzymes, each of which plays a critical role during the process. One of the key players is the enzyme DNA polymerase, which adds nucleotides complementary to the template strand, one by one, to the growing DNA chain. The addition of nucleotides requires energy; this energy comes from the nucleotides themselves, which have three phosphate groups attached, as ATP does. When the bond between the phosphates is broken, the energy released is used to form the phosphodiester bond between the incoming nucleotide and the growing chain. In prokaryotes, three main types of polymerases are known: DNA pol I, DNA pol II, and DNA pol III. DNA pol III is the enzyme required for DNA synthesis; DNA pol I and DNA pol II are primarily required for repair.

Replication begins at specific nucleotide sequences called origins of replication. In E. coli, which has a single origin of replication on its one chromosome (as do most prokaryotes), the origin is approximately 245 base pairs long and is rich in AT sequences. The origin of replication is recognized by certain proteins that bind to this site. An enzyme called helicase unwinds the DNA by breaking the hydrogen bonds between the nitrogenous base pairs; ATP hydrolysis is required for this process. As the DNA opens up, Y-shaped structures called replication forks are formed. Two replication forks form at the origin of replication and extend bi-directionally as replication proceeds. Single-strand binding proteins coat the strands of DNA near the replication fork to prevent the single-stranded DNA from winding back into a double helix.

DNA polymerase is able to add nucleotides only in the 5′ to 3′ direction (a new DNA strand can be extended only in this direction). It also requires a free 3′-OH group, to which it adds nucleotides by forming a phosphodiester bond between the 3′-OH end and the 5′ phosphate of the next nucleotide. This means that it cannot add nucleotides if a free 3′-OH group is not available. Another enzyme, RNA primase, solves this problem by synthesizing an RNA primer that is about five to ten nucleotides long and complementary to the DNA. The primer provides the free 3′-OH end needed to start replication. DNA polymerase then extends this RNA primer, adding nucleotides one by one that are complementary to the template strand.
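The priming and extension steps can be sketched in a few lines. This is a simplified illustration, not a model of the enzymes: the function name `synthesize` and the example template are hypothetical, the template is supplied already read in the 3′ to 5′ direction, and the primer length is fixed at five nucleotides, the low end of the range given above.

```python
# Base-pairing rules; the RNA primer uses U where DNA would use T.
DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
RNA_COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}


def synthesize(template_3_to_5, primer_length=5):
    """Copy a template (read 3'->5') into a new strand (written 5'->3').

    Returns (primer, dna): the RNA primer laid down by primase, followed
    by the DNA that polymerase adds to the primer's free 3'-OH end.
    """
    if primer_length < 1:
        raise ValueError("DNA polymerase needs a primer with a free 3'-OH end")
    template = template_3_to_5.upper()
    primer = "".join(RNA_COMPLEMENT[base] for base in template[:primer_length])
    dna = "".join(DNA_COMPLEMENT[base] for base in template[primer_length:])
    return primer, dna


primer, dna = synthesize("TACGGATTCAGC")  # hypothetical template
print("RNA primer: 5'-" + primer + "-3'")  # AUGCC
print("DNA added:  5'-" + dna + "-3'")     # TAAGTCG
```

Setting `primer_length` to zero raises an error, mirroring the point that polymerase cannot start a strand without a free 3′-OH group.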

DNA Replication in Prokaryotes: A replication fork is formed when helicase separates the DNA strands at the origin of replication. The DNA tends to become more highly coiled ahead of the replication fork. Topoisomerase breaks and re-forms DNA’s phosphate backbone ahead of the replication fork, thereby relieving the stress that results from this supercoiling. Single-strand binding proteins bind to the single-stranded DNA to prevent the helix from re-forming. Primase synthesizes an RNA primer. DNA polymerase III uses this primer to synthesize the daughter DNA strand. On the leading strand, DNA is synthesized continuously, whereas on the lagging strand, DNA is synthesized in short stretches called Okazaki fragments. DNA polymerase I replaces the RNA primer with DNA. DNA ligase seals the nicks between the Okazaki fragments, joining the fragments into a single DNA molecule.

The replication fork moves at the rate of 1000 nucleotides per second. DNA polymerase can only extend in the 5′ to 3′ direction, which poses a slight problem at the replication fork. As we know, the DNA double helix is anti-parallel; that is, one strand is in the 5′ to 3′ direction and the other is oriented in the 3′ to 5′ direction. One strand (the leading strand), complementary to the 3′ to 5′ parental DNA strand, is synthesized continuously towards the replication fork because the polymerase can add nucleotides in this direction. The other strand (the lagging strand), complementary to the 5′ to 3′ parental DNA, is extended away from the replication fork in small fragments known as Okazaki fragments, each requiring a primer to start the synthesis. Okazaki fragments are named after the Japanese scientist who first discovered them.

The leading strand can be extended from one primer alone, whereas the lagging strand needs a new primer for each of the short Okazaki fragments. Although each fragment is synthesized 5′ to 3′, the lagging strand as a whole grows in the 3′ to 5′ direction, while the leading strand grows 5′ to 3′. The sliding clamp (a ring-shaped protein that binds to the DNA) holds the DNA polymerase in place as it continues to add nucleotides. Topoisomerase prevents the over-winding of the DNA double helix ahead of the replication fork as the DNA is opening up; it does so by causing temporary nicks in the DNA helix and then resealing them.

As synthesis proceeds, the RNA primers are replaced by DNA. The primers are removed by the exonuclease activity of DNA pol I, and the resulting gaps are filled in with deoxyribonucleotides. The nicks that remain between the newly synthesized DNA (which replaced the RNA primer) and the previously synthesized DNA are sealed by the enzyme DNA ligase, which catalyzes the formation of a phosphodiester linkage between the 3′-OH end of one fragment and the 5′ phosphate end of the other.
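Discontinuous synthesis on the lagging strand can be sketched as follows. This is an illustrative simplification: the function names, the example template, and the fixed fragment length are hypothetical, and the RNA primers are left out. The template is written 5′ to 3′ in the direction the fork moves, so each newly exposed stretch is copied backwards, away from the fork.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary strand, written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def okazaki_fragments(lagging_template_5_to_3, fragment_length):
    """Return fragments in the order they are made, each written 5'->3'.

    The fork exposes the template from its 5' end onward; every exposed
    stretch is copied as a separate fragment.
    """
    if fragment_length < 1:
        raise ValueError("fragment length must be at least 1")
    template = lagging_template_5_to_3.upper()
    chunks = [template[i:i + fragment_length]
              for i in range(0, len(template), fragment_length)]
    return [reverse_complement(chunk) for chunk in chunks]


def ligate(fragments):
    """Join fragments as ligase would: the 3' end of each newer fragment
    is linked to the 5' end of the fragment made before it."""
    return "".join(reversed(fragments))


template = "ATGCCGTAAGCT"  # hypothetical lagging-strand template
fragments = okazaki_fragments(template, fragment_length=4)
print(fragments)          # ['GCAT', 'TACG', 'AGCT']
print(ligate(fragments))  # AGCTTACGGCAT
```

The ligated product equals the reverse complement of the whole template, which is what continuous synthesis would have produced had it been possible in that direction.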

The table summarizes the enzymes involved in prokaryotic DNA replication and the functions of each.

Prokaryotic DNA Replication: Enzymes and Their Function

| Enzyme/protein | Specific function |
| --- | --- |
| DNA pol I | Exonuclease activity removes the RNA primer and replaces it with newly synthesized DNA |
| DNA pol II | Repair function |
| DNA pol III | Main enzyme that adds nucleotides in the 5′ to 3′ direction |
| Helicase | Opens the DNA helix by breaking hydrogen bonds between the nitrogenous bases |
| Ligase | Seals the nicks between the Okazaki fragments to create one continuous DNA strand |
| Primase | Synthesizes RNA primers needed to start replication |
| Sliding clamp | Helps to hold the DNA polymerase in place when nucleotides are being added |
| Topoisomerase | Helps relieve the stress on DNA when unwinding by causing breaks and then resealing the DNA |
| Single-strand binding proteins (SSB) | Bind to single-stranded DNA to prevent it from rewinding |

Gene Inversion

Gene inversion uses recombinases to invert DNA sequences, switching the gene located within or next to the inverted sequence from ON to OFF, or vice versa.

Learning Objectives

Explain how gene sequence inversions can have a regulatory effect

Key Takeaways

Key Points

  • Recombining sequences in site-specific reactions are usually short, and recombination occurs at a single target site. This typically requires one or more cofactors (for example, DNA-binding proteins and the presence or absence of DNA binding sites) and a site-specific recombinase.
  • Many bacterial species can utilize inversion to change the expression of certain genes for the benefit of the bacterium during infection.
  • The inversion event can be simple, toggling the expression of one gene, as in E. coli pilin expression, or more complicated, involving multiple genes, as in the expression of multiple types of flagellin by S. typhimurium.

Key Terms

  • recombinase: Any of several enzymes that mediate the recombination of DNA fragments; in prokaryotes, site-specific recombinases can invert defined DNA sequences.

Phase variation by site-specific inversion: Phase and antigenic variation in bacteria. pA is the promoter for fimA, pB is the promoter for fimB, and pE is the promoter for fimE. IRR is the inverted repeat right and IRL is the inverted repeat left. FimB and FimE are recombinases that can change the orientation of the fimA promoter by inverting the DNA segment flanked by IRR and IRL.

Recombining sequences in site-specific reactions are usually short, and recombination occurs at a single target site within the recombining sequence. This typically requires one or more cofactors (for example, DNA-binding proteins and the presence or absence of DNA binding sites) and a site-specific recombinase. The reaction changes the orientation of the DNA, which affects gene expression or the structure of the gene product by changing the spatial arrangement of the promoter or the regulatory elements.

Through the action of specific recombinases, a particular DNA sequence is inverted, switching the gene located within or next to the invertible element from ON to OFF, or vice versa. Many bacterial species can use inversion to change the expression of certain genes for the benefit of the bacterium during infection. The inversion event can be simple, toggling the expression of one gene, as in E. coli pilin expression, or more complicated, involving multiple genes, as in the expression of multiple types of flagellin by S. typhimurium.

In E. coli, adhesion by type I fimbriae is regulated by site-specific inversion, which controls the expression of fimA, the major subunit of the pili, depending on the stage of infection. The invertible element contains a promoter that, depending on its orientation, turns transcription of fimA on or off. The inversion is mediated by two recombinases, FimB and FimE, and by the regulatory proteins H-NS, integration host factor (IHF), and leucine-responsive protein (LRP). FimE can invert the element in only one direction, turning expression from on to off, while FimB can mediate inversion in both directions.
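The behavior of the fim switch can be sketched as a two-state toggle, together with a helper that inverts a DNA segment in place. This is a toy model under stated assumptions: the class name `FimSwitch`, the function `invert_segment`, and the example sequence are hypothetical, and the regulatory proteins (H-NS, IHF, LRP) are not modeled.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def invert_segment(sequence, start, end):
    """Replace sequence[start:end] with its reverse complement,
    which is what inverting a double-stranded segment amounts to."""
    inverted = sequence[start:end].translate(COMPLEMENT)[::-1]
    return sequence[:start] + inverted + sequence[end:]


class FimSwitch:
    """Toy ON/OFF model of the invertible element controlling fimA."""

    def __init__(self, state="ON"):
        if state not in ("ON", "OFF"):
            raise ValueError("state must be 'ON' or 'OFF'")
        self.state = state

    def apply(self, recombinase):
        """Attempt an inversion; return True if the element inverted."""
        if recombinase == "FimB":
            # FimB mediates inversion in both directions.
            self.state = "OFF" if self.state == "ON" else "ON"
            return True
        if recombinase == "FimE":
            # FimE only turns expression from ON to OFF.
            if self.state == "ON":
                self.state = "OFF"
                return True
            return False
        raise ValueError("unknown recombinase: " + recombinase)


switch = FimSwitch("ON")
print(switch.apply("FimE"), switch.state)  # True OFF
print(switch.apply("FimE"), switch.state)  # False OFF
print(switch.apply("FimB"), switch.state)  # True ON
print(invert_segment("AAAGATTACACCC", 3, 10))  # AAATGTAATCCCC
```

Inverting the same segment twice restores the original sequence, which is why the switch is reversible.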

Slipped-Strand Mispairing

Slipped strand mispairing (SSM) is a process that produces mispairing of short repeat sequences during DNA synthesis.

Learning Objectives

Explain how slipped-strand mispairing can be used as a mechanism to regulate gene expression

Key Takeaways

Key Points

  • SSM alters gene expression. Depending on where the gain or loss of short repeat sequences occurs relative to the promoter, regulation takes place at the level of either transcription or translation. The outcome is an ON or OFF phase of a gene or genes.
  • SSM can result in an increase or decrease in the number of short repeat sequences. The repeat units are 1 to 7 nucleotides long and can form homogeneous or heterogeneous repetitive DNA sequences.
  • Transcriptional regulation can occur if the repeats are located in the promoter region at the RNA polymerase binding site, the -10 and -35 sequences upstream of the gene(s).
  • SSM can also induce transcriptional regulation by changing short repeat sequences located outside the promoter. A change in the short repeat sequence can affect the binding of a regulatory protein, such as an activator or repressor.

Key Terms

  • Slipped strand mispairing: a process that produces mispairing of short repeat sequences between the mother and daughter strands during DNA synthesis.

Slipped strand mispairing (SSM) is a process that produces mispairing of short repeat sequences between the mother and daughter strands during DNA synthesis. This RecA-independent mechanism can occur during either DNA replication or DNA repair, on either the leading or the lagging strand, and can increase or decrease the number of short repeat sequences. The repeat units are 1 to 7 nucleotides long and can form homogeneous or heterogeneous repetitive DNA sequences.
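The net outcome of a slippage event, a repeat tract that gains or loses whole units, can be sketched at the sequence level. The function names `count_repeats` and `slip` and the example sequence are hypothetical; the sketch models only the resulting change in repeat number, not the mispairing itself.

```python
def count_repeats(sequence, unit, start):
    """Count consecutive copies of `unit` beginning at index `start`."""
    if not unit:
        raise ValueError("repeat unit must not be empty")
    count = 0
    position = start
    while sequence.startswith(unit, position):
        count += 1
        position += len(unit)
    return count


def slip(sequence, unit, start, delta):
    """Return the sequence after the tract at `start` gains (delta > 0)
    or loses (delta < 0) repeat units."""
    repeats = count_repeats(sequence, unit, start)
    new_repeats = repeats + delta
    if new_repeats < 0:
        raise ValueError("cannot remove more units than the tract contains")
    tract_end = start + repeats * len(unit)
    return sequence[:start] + unit * new_repeats + sequence[tract_end:]


locus = "GGC" + "TA" * 4 + "CCG"  # hypothetical locus with four TA repeats
print(count_repeats(locus, "TA", 3))  # 4
print(slip(locus, "TA", 3, +1))       # GGCTATATATATACCG
print(slip(locus, "TA", 3, -1))       # GGCTATATACCG
```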

SSM alters gene expression. Depending on where the gain or loss of short repeat sequences occurs relative to the promoter, regulation takes place at the level of either transcription or translation. The outcome is an ON or OFF phase of a gene or genes.

Transcriptional regulation occurs in several ways. One possibility arises when the repeats are located in the promoter region at the RNA polymerase binding site, the -10 and -35 sequences upstream of the gene(s). The opportunistic pathogen H. influenzae has two divergently oriented promoters for the fimbriae genes hifA and hifB. The overlapping promoter regions have repeats of the dinucleotide TA in the -10 and -35 sequences. Through SSM, the TA repeat region can gain or lose TA dinucleotides, which results in a reversible ON or OFF phase of transcription of hifA and hifB. The second way that SSM induces transcriptional regulation is by changing short repeat sequences located outside the promoter. A change in the short repeat sequence can affect the binding of a regulatory protein, such as an activator or repressor. It can also lead to differences in the post-transcriptional stability of mRNA.
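One way to picture how a change in TA repeat number could switch a promoter ON or OFF is through the spacing between the -35 and -10 elements. The sketch below is a toy model under loudly stated assumptions: it places the TA repeats in the spacer between the two elements, and it treats the promoter as ON only when the spacer is 16 to 18 bp long, a window commonly cited for E. coli promoters. Neither the window nor the repeat counts are taken from the text or measured for hifA and hifB, and the function names are hypothetical.

```python
def spacer_length(ta_repeats, flanking_bp=0):
    """Spacer length in bp: two bases per TA repeat plus any
    non-repeat bases (`flanking_bp`, hypothetical)."""
    if ta_repeats < 0 or flanking_bp < 0:
        raise ValueError("counts must be non-negative")
    return 2 * ta_repeats + flanking_bp


def promoter_phase(ta_repeats, flanking_bp=0, allowed=(16, 18)):
    """Return 'ON' if the spacer falls in the assumed workable window
    for RNA polymerase binding, otherwise 'OFF'."""
    shortest, longest = allowed
    length = spacer_length(ta_repeats, flanking_bp)
    return "ON" if shortest <= length <= longest else "OFF"


# Gaining or losing one TA unit shifts the spacing by 2 bp.
for repeats in range(7, 11):
    print(repeats, "TA repeats:", spacer_length(repeats), "bp spacer,",
          promoter_phase(repeats))
```

In this model a single slippage event at the edge of the window is enough to flip the phase, and a second event in the opposite direction flips it back, which is consistent with the reversible ON and OFF phases described above.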

Slipped-strand mispairing: Purple ovals represent either a transcription factor (TF) or RNA polymerase (RNAP). Black boxes are short sequence repeats. Start (ATG) is the start codon, at which the ribosome initiates translation of the nucleotide sequence into amino acids, and (-10 -35) is the promoter, the binding site at which RNAP initiates transcription of DNA into RNA.