Physiological, Biochemical, and Molecular-Genetic Events Governing the Recognition and Resolution of RNA/DNA Hybrids
- Judith A. Kassis, PhD, Head, Section on Gene Expression
- J. Lesley Brown, PhD, Staff Scientist
- Yuzhong Cheng, PhD, Senior Research Technician
- Sandip De, PhD, Postdoctoral Fellow
- Payal Ray, PhD, Postdoctoral Fellow
- Victoria Blake, BS, Postbaccalaureate Fellow
During development and differentiation, genes become competent to be expressed or are stably silenced in an epigenetically heritable manner. The selective activation/repression of genes leads to differentiation of tissue types. Much evidence supports the model that modifications of histones in chromatin contribute substantially to determining whether a gene is expressed. Two groups of genes, the Polycomb group (PcG) and Trithorax group (TrxG), are important for inheritance of the silenced and active chromatin state, respectively. In Drosophila, regulatory elements called Polycomb group response elements (PREs) are required for the recruitment of chromatin-modifying PcG protein complexes. TrxG proteins may act through the same or overlapping cis-acting sequences. Our group aims to understand how PcG and TrxG proteins are recruited to DNA. Toward that end, one major project in the lab has been to determine all sequences and DNA–binding proteins required for PRE activity. In the Drosophila genome, there are hundreds of PREs that regulate a similar number of genes, and it was not known whether all PREs are alike. Our recent data showed that there is functional and architectural diversity among PREs, suggesting that PREs adapt to the environment of the gene they regulate. A second major project in the lab is to determine how the PREs of the engrailed/invected gene complex act to control these genes in their native location. Surprisingly, we found that not all PREs are required in vivo, suggesting a redundancy in PRE function. To understand the interplay between PREs and enhancers (sequences important for activation of gene expression), we completed an analysis of the regulatory DNA of the engrailed/invected gene complex. We found that regulatory sequences are spread throughout at least a 79kb region in that gene complex and that the same enhancers activate both engrailed and invected expression. 
The finding lays the groundwork for future studies aimed at understanding how distant regulatory sequences coordinately regulate gene activity. We also conducted a genetic screen to find genes that regulate PRE activity and found an interesting cohesin-Polycomb connection, a project that has now been moved into a collaborative endeavor in the mouse. The aim of these studies is to probe the regulation of gene expression more deeply so as to permit an understanding of how gene expression can malfunction and lead to developmental abnormalities and disease.
Polycomb group response elements (PREs)
PcG proteins act in protein complexes that repress gene expression by modifying chromatin. The best-studied PcG protein complexes are PRC1 and PRC2. PRC2 contains the histone methyltransferase Enhancer of Zeste, which tri-methylates lysine 27 on histone H3 (H3K27me3). The chromatin mark H3K27me3 is the signature of PRC2 function. At most well-studied genes, PRC2 acts with PRC1, which binds to H3K27me3, mono-ubiquitinates histone H2A at lysine 119, and inhibits chromatin remodeling. In Drosophila, PRC1 and PRC2 are recruited to the DNA by PREs (Reference 1). We are interested in determining how this occurs, and, to that end, we defined all the DNA sequences and are finding all DNA–binding proteins required for the activity of a single 181–bp PRE of the Drosophila engrailed gene (PRE2). Binding sites for seven different proteins are required for the activity of PRE2 (Figure 1) (Reference 2), and some of these proteins have several binding sites. Our laboratory identified four of the proteins: Pho (the first PRE–binding protein identified); the related protein Pho-like; Spps; and a protein we call Protein A (manuscript in preparation). Genome-wide studies show that Pho and Spps bind to hundreds of PREs located throughout the Drosophila genome.
Figure 1. en PRE1 and PRE2 are from the engrailed gene; the iab-7/Fab-7 PRE is from the Abd-B gene; the eve PRE is from the even-skipped gene. The symbols represent consensus binding sites for the proteins indicated below. Figure reprinted from Reference 2.
Studies designed to test the function of PREs in transgenes showed that PREs are largely interchangeable in some assays, with subtle activity differences. To determine how similar PREs are, we compared the binding-site arrangements and requirements in two closely linked engrailed PREs, PRE1 and PRE2, and compared them with two other PREs in the genome (Figure 1). All these PREs mediate transcriptional repression of the reporter gene mini-white in transgenic Drosophila, but the arrangement, number, and order of the binding sites vary dramatically among the different PREs. We tested the engrailed PREs in another reporter vector, one that gives β-galactosidase expression in embryos, larval salivary glands, and brains (Figure 2). In this vector, PRE1 but not PRE2 is able to repress expression in the anterior part of the embryo, an indication of PRE activity (Figure 2A). In contrast, both PREs are able to repress β-galactosidase expression (Figure 2B) and cause the deposition of the H3K27me3 repressive mark over the PRE and lacZ gene in salivary glands (Figure 3). However, only PRE1 is able to silence β-galactosidase expression in a subset of cells in the brain (Figure 2C). The data show that PREs are a diverse group of elements that share some but not all activities. Because PREs regulate many different genes, in different tissues and at different times of development, the differences may be important for the fine-tuning of PcG repression. It is also possible that different PREs recruit different PcG protein complexes. We are currently conducting experiments to test this hypothesis.
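The comparison of binding-site arrangement, number, and order among PREs can be illustrated with a small script. The following is only a minimal sketch, not the analysis used in Reference 2: it assumes exact string matches rather than degenerate consensus sites, and the motif strings, factor names (`factor_X`, `factor_Y`), and both sequences are made-up placeholders rather than real PRE sequences or the consensus sites shown in Figure 1.

```python
from typing import Dict, List, Tuple

# Translation table used to complement a DNA string.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(seq: str, motifs: Dict[str, str]) -> List[Tuple[int, str, str]]:
    """Return (start, strand, name) for every exact motif match, sorted by start.

    Both strands are searched; overlapping matches are reported.
    """
    seq = seq.upper()
    hits = []
    for name, motif in motifs.items():
        motif = motif.upper()
        rc = reverse_complement(motif)
        for strand, pattern in (("+", motif), ("-", rc)):
            if strand == "-" and rc == motif:
                continue  # palindromic motif: do not count the same site twice
            start = seq.find(pattern)
            while start != -1:
                hits.append((start, strand, name))
                start = seq.find(pattern, start + 1)
    return sorted(hits)


def site_order(seq: str, motifs: Dict[str, str]) -> List[str]:
    """Return only the order of site names along the sequence."""
    return [name for _, _, name in find_sites(seq, motifs)]


# Hypothetical placeholder motifs and toy sequences (NOT real data).
MOTIFS = {"factor_X": "GCCAT", "factor_Y": "GAGAG"}
toy_pre_a = "TTGCCATAAGAGAGTT"
toy_pre_b = "AACTCTCTTATGGCAAGCCATT"

if __name__ == "__main__":
    for label, seq in (("toy_pre_a", toy_pre_a), ("toy_pre_b", toy_pre_b)):
        print(label, find_sites(seq, MOTIFS))
```

Comparing the lists returned for two elements makes differences in site number, order, and strand explicit. A realistic analysis would likely use position weight matrices instead of exact strings.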
Figure 2. A. Drosophila embryos (anterior left, dorsal up) stained with antibody against β-galactosidase (β-gal) show that PRE1 but not PRE2 is able to repress expression of β-gal in the anterior part of the embryos (denoted by the double-headed arrow). B. β-gal activity stain in salivary glands: β-gal is expressed from the vector alone (no PRE), but is repressed by either PRE1 or PRE2. C. Expression in the larval brain is partially repressed by PRE1 but not by PRE2.
Figure 3. Chromatin immunoprecipitation with anti-H3K27me3 antibodies on salivary glands from transgenic larvae with vector alone or with vector plus PRE1 or PRE2. A Ubx PRE is used as a positive control. Both PRE1 and PRE2 cause accumulation of the repressive H3K27me3 mark over the transgene.
The role of PREs at the en gene
The Drosophila engrailed (en) gene encodes a homeodomain protein that plays an important role in the development of many parts of the embryo, including formation of the segments, nervous system, head, and gut. By specifying the posterior compartment of each imaginal disc, en also plays a significant role in the development of the adult. Accordingly, en is expressed in a highly specific and complex manner in the developing organism. The en gene exists in a gene complex with an adjacent gene, invected (inv); inv encodes a protein with a nearly identical homeodomain, and en and inv are co-regulated and express proteins with largely redundant functions.
The en and inv genes exist in a 113kb domain that is covered by the H3K27me3 chromatin mark. Within the en/inv domain there are four major PREs, strong peaks of PcG protein binding. One popular model posits that DNA–binding proteins bound to the PREs recruit PcG protein complexes and that PRC2 tri-methylates histone H3 throughout the domain until it comes to either an insulator or an actively transcribed gene. As discussed above, there are two PREs upstream of the en transcription unit, PRE1 and PRE2. Both PREs reside within a 1.5kb fragment located from −1.9kb to −400bp upstream of the major en transcription start site. There are also two major inv PREs, one located at the promoter and another about 6kb upstream of that. Our laboratory showed that all these PREs have the functional properties attributed to PREs in transgenic assays. To test their function at the intact en/inv domain, we set out to delete these PREs from the genome. Given that PREs work as repressive elements, the predicted phenotype of a PRE deletion is a gain-of-function ectopic expression phenotype. Unexpectedly, when we made a 1.5kb deletion removing PRE1 and PRE2, flies were viable and had a partial loss-of-function phenotype in the wing. Similarly, deletion of inv PREs yielded viable flies with no mis-expression of en or inv. Importantly, the H3K27me3 en/inv domain is not disrupted in either of these mutants. We hypothesize that en and inv PREs are redundant with each other, so that either is sufficient to recruit PRC2, which tri-methylates histone H3 throughout the en/inv domain.
In Drosophila, PREs are easily recognizable as discrete peaks of binding of PcG proteins in chromatin-immunoprecipitation experiments, but the H3K27me3 mark spreads throughout large regions. PcG proteins are conserved in mammals; however, PcG binding usually does not occur in sharp peaks, and PREs have been much harder to identify. We recently created a chromosome in which both the en and inv PREs are deleted. Surprisingly, the flies are viable, and preliminary results suggest that there is no mis-expression of en or inv. The question arises as to how PcG proteins are recruited to the en/inv domain in the absence of these PREs. We performed chromatin immunoprecipitation followed by next-generation sequencing (ChIP-Seq) on the PcG proteins Pho and Polyhomeotic (Ph). The data showed that, in addition to the large Pho/Ph peaks at the known PREs, there are many smaller Pho/Ph peaks within the inv/en domain. Our data suggest that these smaller peaks may also function as PREs. Thus, rather than a few PREs, there may be many PREs controlling inv/en expression, and some of them may act in tissue-specific ways.
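The distinction between peaks at known PREs and additional peaks elsewhere in the domain can be expressed as a simple interval-overlap step. The sketch below is illustrative only and is not the pipeline used for the ChIP-Seq analysis: it assumes peaks have already been called by some peak caller, and every coordinate, signal value, and name (`Peak`, `split_peaks`) is a hypothetical placeholder, not data from the en/inv locus.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Interval = Tuple[int, int]  # half-open interval [start, end)


@dataclass(frozen=True)
class Peak:
    start: int
    end: int
    signal: float  # e.g. peak height or enrichment score from a peak caller


def overlaps(a_start: int, a_end: int, b_start: int, b_end: int) -> bool:
    """True if two half-open intervals share at least one base."""
    return a_start < b_end and b_start < a_end


def split_peaks(
    peaks: Sequence[Peak], domain: Interval, known_pres: Sequence[Interval]
) -> Tuple[List[Peak], List[Peak]]:
    """Split peaks inside `domain` into (at known PREs, additional peaks).

    Peaks that do not overlap the domain are ignored.
    """
    d_start, d_end = domain
    at_known, additional = [], []
    for peak in peaks:
        if not overlaps(peak.start, peak.end, d_start, d_end):
            continue
        if any(overlaps(peak.start, peak.end, s, e) for s, e in known_pres):
            at_known.append(peak)
        else:
            additional.append(peak)
    return at_known, additional


# Toy inputs with made-up coordinates (NOT real genomic positions).
toy_domain = (0, 113_000)
toy_known_pres = [(1_000, 2_500), (60_000, 61_000)]
toy_peaks = [
    Peak(1_200, 1_800, 50.0),
    Peak(30_000, 30_400, 8.0),
    Peak(60_200, 60_900, 45.0),
    Peak(90_000, 90_300, 6.5),
    Peak(120_000, 120_500, 30.0),  # lies outside the toy domain
]

known, additional = split_peaks(toy_peaks, toy_domain, toy_known_pres)

if __name__ == "__main__":
    print("at known PREs:", known)
    print("additional:", additional)
```

The peaks in the second list would be the candidates for functional testing, since binding alone does not establish PRE activity.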
Increasing cohesin binding stability counteracts PcG silencing in Drosophila
Cohesin consists of the proteins Smc1, Smc3, Rad21, and Stromalin (SA) and is important for sister chromatid cohesion and proper chromosome segregation during mitosis. In addition, cohesin and cohesin-associated proteins play an important role in regulating gene expression. In a recent study, others found that the cohesin subunits Smc1, Smc3, and Rad21 co-purify with the PcG protein Polycomb, suggesting that the protein complexes may physically interact at some loci. Wapl protein regulates binding of the cohesin complex to chromosomes during interphase and helps remove cohesin from chromosomes at mitosis. We isolated a dominant mutation in wapl (waplAG) in a screen for mutations that counteract silencing mediated by an engrailed PRE (Reference 3). waplAG hemizygotes die as pharate adults and have an extra-sex-comb phenotype characteristic of males with mutations in PcG genes (Figure 4). The wapl gene encodes two proteins, a long form and a short form. waplAG introduces a stop codon at amino acid 271 of the long form and produces a truncated protein. The expression of a transgene encoding the truncated Wapl-AG protein causes an extra-sex-comb phenotype similar to that seen in the waplAG mutant. Mutations in the cohesin-associated genes Nipped-B and pds5 suppress and enhance waplAG phenotypes, respectively. A Pds5–Wapl complex (releasin) removes cohesin from DNA, while Nipped-B loads cohesin, suggesting that Wapl-AG might exert its effects through changes in cohesin binding. Consistent with this model, Wapl-AG was found to increase the stability of cohesin binding to polytene chromosomes. Our data suggest that increasing cohesin stability interferes with PcG silencing at genes that are co-regulated by cohesin and PcG proteins. In collaboration with Karl Pfeifer, we are making a conditional mutant in mouse Wapl. We will investigate whether mutations in mouse Wapl similarly disrupt PcG–regulated silencing at some loci.
Genome-wide studies in Drosophila show that cohesin and PRC1 components co-localize at many locations throughout the genome (Reference 4). Functional studies suggest that cohesin binding may control the availability of PRC1 components for gene silencing (Reference 4). Our Wapl mouse mutant will provide a valuable reagent to test whether similar PRC1–cohesin interactions are important regulators of gene expression in mammals.
Figure 4. This waplAG pharate adult male has sex comb teeth on all three legs (arrows). The second leg has eight sex comb teeth, and the third leg has two sex comb teeth.
Enhancers are often located tens or even hundreds of kb away from their promoter, sometimes even closer to promoters of genes other than the one they activate. We showed that en enhancers can act over large distances, even skipping over other transcription units, choosing the en promoter over those of neighboring genes (Kwon et al., Development 2009;136:3067). Such specificity is achieved in at least three ways. First, early-acting en stripe enhancers exhibit promoter specificity. Second, a proximal promoter-tethering element is required for the action of the imaginal disc enhancer(s). Our data point to two partially redundant promoter-tethering elements. Third, the long-distance action of en enhancers requires a combination of the en promoter and sequences within or closely linked to the promoter-proximal PREs. The data show that several mechanisms ensure proper enhancer-promoter specificity at the Drosophila en locus, providing one of the first detailed views of how promoter-enhancer specificity is achieved.
As a follow-up to these studies, we located all the enhancers that regulate the transcription of engrailed (en) and the closely linked, co-regulated gene invected (inv) (Reference 5). Our dissection of inv/en–regulatory DNA showed that most enhancers are spread throughout a 62kb region. We used two types of constructs to analyze the function of this DNA: P-element–based reporter constructs with small pieces of DNA fused to the en promoter driving lacZ expression (Figure 5); and large constructs with HA–tagged en and inv inserted in the genome with the phiC31 system. In addition, we generated deletions of inv and en DNA in situ and assayed their effects on inv/en expression. Our results support and extend our knowledge of inv/en regulation. First, inv and en share regulatory DNA, most of which flanks the en transcription unit. In support of this finding, a 79-kb HA-en transgene can rescue inv en double mutants to viable, fertile adults. In contrast, an 84-kb HA-inv transgene lacks most of the enhancers for inv/en expression. Second, there are multiple enhancers for inv/en stripes in embryos; some may be redundant, but others play discrete roles at different stages of embryonic development. Finally, no small reporter construct gave expression in the posterior compartment of imaginal discs, a hallmark of inv/en expression. Robust expression of HA-en in the posterior compartment of imaginal discs is evident from the 79-kb HA-en transgene, while a 45-kb HA-en transgene gives weaker, variable imaginal disc expression. We suggest that the activity of the imaginal disc enhancer(s) is dependent on the chromatin structure of the inv/en domain. We are currently investigating the properties of the inv/en imaginal disc enhancer(s) using a variety of methods, including deleting them from the endogenous inv/en domain using CRISPR/Cas9.
Figure 5. A. The P-element vector (P[en]), used to assay the function of en regulatory DNA, contains the en promoter, 396bp of upstream sequences, and an untranslated leader fusion between the en transcript and the Adh-lacZ reporter gene. inv/en DNA fragments were added to this vector at the location of the triangle. B. The extent of each fragment cloned into P[en] is shown as a black line with a letter above the inv/en genomic DNA map (indicated by a long black line with hatch marks at 10kb intervals; numbers are coordinates on chromosome 2R, genome release v5). Expression patterns in embryos or the wing imaginal disc (wd) are shown above or below the genomic DNA, with arrows pointing to the fragment(s) that generate the pattern (Reference 5).
1. Kassis JA, Brown JL. Polycomb group response elements in Drosophila and vertebrates. Adv Genet 2013; 81:83-118.
2. Brown JL, Kassis JA. Architectural and functional diversity of polycomb group response elements in Drosophila. Genetics 2013; 195:433-441.
3. Cunningham MD, Gause M, Cheng Y, Noyes A, Dorsett D, Kennison JA, Kassis JA. Wapl antagonizes cohesin binding and promotes Polycomb group silencing in Drosophila. Development 2012; 139:4172-4179.
4. Dorsett D, Kassis JA. Checks and balances between cohesin and Polycomb in gene silencing and transcription. Curr Biol 2014; 24:R535-R539.
5. Cheng Y, Brunner AL, Kremer S, DeVido SK, Stefaniuk CM, Kassis JA. Co-regulation of invected and engrailed by a complex array of regulatory sequences in Drosophila. Dev Biol 2014; 395:131-143.
- Dale Dorsett, PhD, Saint Louis University, St. Louis, MO
- Maria Gause, PhD, Saint Louis University, St. Louis, MO
- James A. Kennison, PhD, Program in Genomics of Differentiation, NICHD, Bethesda, MD
- Karl Pfeifer, PhD, Program in Genomics of Differentiation, NICHD, Bethesda, MD