Retrotransposons as Models for the Replication of Retroviruses
- Henry L. Levin, PhD, Head, Section on Eukaryotic Transposable Elements
- Angela Atwood-Moore, BA, Senior Research Assistant
- Tracy Ripmaster, PhD, Research Assistant
- Atreyi Chatterjee, PhD, Visiting Fellow
- Hirotaka Ebina, PhD, Visiting Fellow
- Yabin Guo, PhD, Visiting Fellow
- Young-Eun Leem, PhD, Visiting Fellow
- Anasua Majumdar, PhD, Visiting Fellow
- Robin Cutler, BA, Postbaccalaureate Fellow
- Adam Evertts, BS, Postbaccalaureate Fellow
- Jung Min Park, BA, Postbaccalaureate Fellow
- Yujin Cui, BA, Volunteer
- Peter Mueller, High School Volunteer
The increasing prevalence of retroviral diseases such as AIDS and leukemia has intensified the need to understand the mechanisms of retrovirus replication. Our primary objective is to investigate how retroviral cDNAs are integrated into the genome of infected cells. Given their similarities to retroviruses, long terminal repeat (LTR) retrotransposons are important models for retrovirus replication. The retrotransposon under study in our laboratory is the Tf1 element of the fission yeast Schizosaccharomyces pombe. We are particularly interested in Tf1 because of its strong preference for integrating into pol II promoters. This choice of target site is similar to the strong preference of human immunodeficiency virus 1 (HIV-1) and murine leukemia virus (MLV) for integration into pol II transcription units. Little is known about how these viruses recognize their target sites. We therefore study the integration of Tf1 as a model system from which we hope to uncover mechanisms general to the selection of target sites. Understanding the mechanisms responsible for targeted integration may lead to new approaches for blocking the replication of HIV-1.
GPY/F is a highly conserved domain of integrase that mediates multimerization and catalytic activity
Ebina, Chatterjee, Judson,1 Levin
The C-terminal domains in the integrases (IN) of LTR-retrotransposons and retroviruses are not well conserved. However, close examination of C-termini revealed one motif that is present in a wide variety of INs. This module, termed the GPY/F motif, is found in the INs of a diverse set of LTR-retrotransposons in the Metaviridae family (formerly Ty3/gypsy) and in the gamma class of retroviruses. The function of the motif has not been studied. Given that the IN of Tf1, as a recombinant protein, is highly soluble and possesses robust catalytic activity, we decided to use it as a model for studying the function of the GPY/F domain.
In solution, the INs of HIV-1, MLV, and avian sarcoma virus (ASV) form a dimer-tetramer equilibrium. To measure the ability of Tf1 IN to multimerize, we tested full-length IN for interactions with the individual portions of the protein. Using a precipitation procedure and recombinant proteins, we found that the N-terminal domain, the central core, and a 71–amino acid fragment containing the GPY/F domain all bound to the full-length IN. To test directly for stable multimers, we performed gel filtration with Superdex™ 200. IN eluted as a single peak with an observed molecular weight of 126.5 kDa, a size consistent with a dimer. We also observed that the central core by itself formed a stable dimer. The ability of the full-length proteins and the central core to dimerize was typical of other INs.
To investigate the contribution of the C-terminal domain to the multimerization of IN, we subjected the GPY/F fragment to gel filtration with Superdex™ 75. The profile produced three major peaks with apparent sizes of monomer, dimer, and trimer. To test whether the GPF residues that define the motif contributed to multimerization, we generated alanine substitutions. Both substitutions G364A and P365A completely disrupted multimerization of the GPY/F fragment. The data indicate that the GPY/F residues play a role in promoting multimerization. In separate experiments to test for multimers, we subjected the GPY/F fragment to the chemical cross-linker dithiobis-(succinimidyl propionate). The results indicated that the protein was in an equilibrium of monomers, dimers, trimers, and tetramers. We propose that the full-length IN in its synaptic complex with the donor and target DNA may form a tetramer. Our hypothesis is supported by the recent finding of R. Craigie and colleagues that HIV-1 IN does form a tetramer when purified in a synaptic complex.
The C-terminal domains of INs are known to bind to DNA without sequence specificity. To map which sections of Tf1 IN interact with DNA, we assayed each of the individual domains for DNA binding by mixing labeled oligonucleotides with the individual domains and cross-linking the mixtures with UV. The full-length IN, core, and GPY/F fragment exhibited substantial DNA binding activity, which corresponded well to what has been described for other INs. In additional experiments, the G364A and P365A substitutions in the GPF residues of the GPY/F fragment did not reduce DNA binding, indicating that other sequences in the GPY/F fragment mediated the DNA-binding activity and that the GPY/F residues appear to be specific for promoting multimerization. Our finding that the G364A and P365A mutations did not reduce DNA binding also indicated that multimerization was not required for DNA binding.
We tested the contribution to catalysis of the GPY/F residues by generating recombinant IN with the substitution G364A. This mutation significantly reduced strand transfer activity—a result consistent with the model that tetramer formation is required for strand transfer activity. The widespread conservation of the GPY/F motif suggests that the motif may also promote multimerization and strand transfer activity in other INs.
- Ebina H, Judson R, Levin H. The GP(Y/F) domain of Tf1 integrase multimerizes when present in a fragment, and substitutions in this domain reduce enzymatic activity of the full-length protein. J Biol Chem 2008;283:15965-15974.
Integration preference of Tf1 for Pol II promoters
Leem, Ripmaster, Kelly,2 Ebina, Heincelman,3 Levin; in collaboration with Grewal, Hoffman, Zhang
Our analysis of the genome sequence of S. pombe revealed large numbers of pre-existing Tf transposons that are positioned 200 to 400 bp upstream of open reading frames (ORFs). The production of new integration events revealed that the position of Tf1 upstream of ORFs results from integration preference. The similarity of Tf1’s integration pattern to HIV-1’s preference for integration into pol II transcription units and the MLV’s preference for transcription start sites motivated us to study Tf1 as a model for retrovirus integration. To define the determinants of the target sites recognized by Tf1, we developed an in vivo assay for integration, using a plasmid that contained potential targets of integration. In addition to the target plasmids, the assay strains of S. pombe contained a plasmid with an inducible copy of Tf1. We expressed a version of Tf1 that contained neo, a gene that causes cells with integration events to be resistant to the antibiotic G418.

Figure 2.2
The positions of insertions in target plasmids are shown with the plasmid coordinates. The plasmids contained the genes bub1-ade6 (A), fbp1 (B), nup124 (C), SPCC4F11.03c (D), SPBC365.14c (E), and the 160 bp insertion window of bub1-ade6 (F).
To determine which type of sequence would function as targets of integration, we placed five genes on target plasmids and introduced each into the S. pombe strain expressing Tf1. In each target plasmid, the insertions of Tf1 clustered at specific positions upstream of the ORFs (Figure 2.2). For example, 50 insertions were isolated in a plasmid containing ade6, with 95 percent of these insertions occurring within a 160 nt window upstream of the ORF. To determine which sequences of ade6 were required for efficient integration, we created a series of deletions within the target plasmid. Our analysis revealed that the 160 nt region was the only sequence required for efficient integration. To determine whether promoter activity was essential for integration, we measured transcript levels of ade6. Deletions of sequence on either side of the 160 nt region caused 5- to 10-fold reductions in ade6 mRNA. However, the deletions caused no reduction in integration efficiency. Our results indicate that the transcription activity of the promoters per se did not play an important role in targeted integration.
Another plasmid tested for integration activity contained the gene fbp1. Eighty-six percent of the insertions in the plasmid occurred in sequence upstream of the ORF (Figure 2.2B). Of the 18 inserts upstream of the ORF, 15 clustered within a 90 bp window adjacent to UAS1, the binding site for the transcription activator Atf1p. The proximity of these insertion sites to UAS1 suggested that Tf1 recognized the promoters upstream of the ORFs. To test whether the integration upstream of fbp1 required the function of UAS1, we mutated six nucleotides in UAS1 and observed that the mutations caused a nine-fold reduction in integration in the promoter of fbp1. To test whether Atf1p, the protein that binds to UAS1, contributed to integration, we deleted the atf1 gene. Strikingly, the absence of Atf1p caused a 15.5-fold reduction in integration at UAS1, indicating that Atf1p was required to position integration at the promoter of fbp1. One model for the specific role of Atf1p in positioning integration at the promoter of fbp1 holds that Atf1p bound to UAS1 mediates integration by forming a complex with IN. We tested such a possibility by immunoprecipitating Atf1p from cell extracts and showed that FLAG-Atf1p precipitated a substantial amount of IN, indicating that the IN formed a specific interaction with Atf1p. The results support the model that Atf1p mediates integration at the promoter of fbp1. Interestingly, integration in the promoter of ade6 was independent of Atf1p. Results of chromatin immunoprecipitation experiments confirmed the binding of Atf1p upstream of fbp1 but revealed that Atf1p did not bind to the sequences upstream of ade6. As a result, a distinct factor or factors likely mediate integration upstream of ade6. Nevertheless, the role of a DNA-bound transcription factor in the targeting of integration is strong evidence that Tf1 recognizes the pol II promoters. This finding is significant in that previous work demonstrated only that integration occurred upstream of ORFs.
- Ebina H, Levin H. Stress management: how cells take control of their transposons. Mol Cell 2007;27:180-181.
- Leem Y, Ripmaster T, Kelly F, Ebina H, Heincelman M, Zhang K, Grewal S, Hoffman C, Levin H. Retrotransposon Tf1 is targeted to pol II promoters by transcription activators. Mol Cell 2008;30:98-107.
- Levin H. Metaviruses. In: Mahy BWJ, Van Regenmortel M, eds. Encyclopedia of Virology, Third Edition. Elsevier, 2008;3:301-311.
The integration of retrotransposons into heterochromatin
Ebina, Levin; in collaboration with Gao, Voytas
Mobile elements constitute a major portion of the genetic material in eukaryotic cells. To protect cellular genes from the disruption caused by transposition, cells have evolved strategies, such as RNA interference (RNAi), to restrict the amplification of mobile elements. These surveillance systems lead to epigenetic inactivation of transposable elements through post-transcriptional silencing and through histone and DNA modifications. The genetic and epigenetic impact of mobile elements is largely determined by the elements’ choice of integration sites. Many mobile elements target integration to specific chromosomal sites by forming interactions between integrase and chromatin factors. Such is the case for the Ty5 transposon, which relies on a specific interaction between the chromatin factor SIR4p and the C-terminal end of its integrase to direct integration to regions of heterochromatin. An interesting lineage of retrotransposons referred to as the chromoviruses (Metaviridae family) possesses C-terminal regions of integrase with a chromodomain (CHD), a small, 40- to 50-amino acid sequence motif that can interact with diverse targets, including proteins, RNA, and DNA. The best-characterized CHD partners are methylated histone residues. The heterochromatin protein 1 (HP1) possesses a CHD that interacts with histone H3 methyl-K9. Consequently, when CHDs were identified in the retrotransposon integrases, it was proposed that they play a role in target specificity. In a collaborative study, we sought to understand the role of the CHD in the integrases of these chromovirus retrotransposons. We described three groups of chromoviruses based on amino acid–sequence relationships of their integrase C-termini.
Genome sequence analysis indicated that representative chromoviruses from each group are enriched in gene-poor regions of the genome relative to other retrotransposons, and, when fused to fluorescent marker proteins, the chromodomains target proteins to specific subnuclear foci coincident with heterochromatin. The chromodomain of the fungal element MAGGY interacts with histone H3 dimethyl- and trimethyl-K9, and, when the MAGGY chromodomain is fused to integrase of the Schizosaccharomyces pombe Tf1 retrotransposon, new Tf1 insertions are directed to sites of H3 K9 methylation. Repetitive sequences such as transposable elements trigger the RNAi pathway resulting in the elements’ epigenetic modification. The results suggest a dynamic interplay between retrotransposons and heterochromatin, whereby mobile elements recognize heterochromatin at the time of integration and then perpetuate the heterochromatic mark by triggering epigenetic modification.
- Gao X, Hou Y, Ebina H, Levin H, Voytas D. Chromodomains direct integration of retrotransposons to heterochromatin. Genome Res 2008;18:359-369.
Host genome surveillance of retrotransposons
Ebina, Levin; in collaboration with Grewal, Cam, Noma
Transposable elements and their fragments constitute a substantial portion of eukaryotic genomes. Host genomes have evolved defense mechanisms, including the use of chromatin modifications and RNA interference, to inhibit transposable elements. In a collaborative study, we identified a genome surveillance mechanism for retrotransposons that is mediated by the transposase-derived centromeric protein CENP-B homologues of S. pombe. CENP-B homologues of S. pombe silence Tf2 retrotransposons by localizing to the transposon sequences and recruiting histone deacetylases to them. CENP-Bs also repressed the transcription of solo LTRs and LTR-associated genes. Tf2 elements were clustered into “Tf” bodies, discrete nuclear structures whose organization depended on CENP-Bs. Furthermore, CENP-Bs inactivate the Tf1 retrotransposon by blocking its recombination with pre-existing copies of Tf2; CENP-Bs silenced and immobilized a Tf1 integrant that became sequestered into Tf bodies. The results reveal a probable ancient retrotransposon surveillance pathway important for host genome integrity and highlight potential conflicts between DNA transposons and retrotransposons.
- Cam H, Noma K, Ebina H, Levin H, Grewal S. Host genome surveillance for retrotransposons by transposon-derived proteins. Nature 2008;451:431-436.
The activity of the hermes transposon in S. pombe
Evertts, Guo, Levin
Tf1 integrates primarily into pol II promoters. Although we currently believe that such preference results from a mechanism that actively targets Tf1, it is possible that the insertion bias is attributable to greater accessibility at the promoter sequences. We are currently testing such a possibility by studying the integration pattern in S. pombe cells of hermes, a “cut and paste” transposon that was isolated from the house fly. Given that hermes propagates in a host that is evolutionarily distant from S. pombe, it is unlikely that an existing mechanism would actively position insertion sites. Thus, any integration of hermes in S. pombe would likely occur at positions accessible to the transposase. In addition, unbiased activity of a transposon in S. pombe could be widely adapted as a tool for random mutagenesis. As no methods currently exist for transposon mutagenesis of S. pombe, a method for insertional mutagenesis would be a significant contribution to the field.
We expressed the transposase of hermes in S. pombe by fusing its gene to the promoter of nmt1, using three versions of the nmt1 promoter to express various levels of the transposase. Immunoblots of cell extracts demonstrated that the transposase was expressed in a stable form and in amounts corresponding to the strength of the promoter. To permit measurement of transposition activity, the cells expressing the transposase also contained a plasmid-encoded copy of neo flanked by the terminal inverted repeats (TIRs) of hermes. We tested the ability of the transposase to cut out neo with the TIRs and insert this plasmid DNA into the pombe genome. Once the transposase was expressed, we grew cells on medium containing 5-fluoro-orotic acid, a treatment that removed the plasmid carrying the hermes TIRs and neo. We then grew the strains on agar plates containing G418 to select for cells with transposed copies of hermes. The strains expressing the transposase generated surprisingly high levels of resistance to G418. We analyzed 26 independent G418-resistant strains for potential insertion events. Significantly, each strain had acquired a copy of hermes as the result of a bona fide integration event. Analysis of the inserted copies revealed that 54 percent of them disrupted ORFs. Given that 60 percent of the pombe genome is coding sequence, our results indicate that the insertion of hermes did not discriminate between coding and noncoding sequences, a finding that stands in strong contrast to the integration of Tf1, where virtually none of the inserts occurs in ORFs.
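The comparison between the observed share of ORF disruptions and the coding fraction of the genome can be illustrated with a short calculation. The sketch below is not the analysis used in the study; it assumes that 54 percent of 26 inserts corresponds to 14 ORF hits and that each insertion lands independently in coding sequence with probability 0.60. Under those assumptions, an exact two-sided binomial test gives a p-value well above 0.05, in line with the conclusion that hermes does not discriminate between coding and noncoding sequence. All function and variable names are illustrative.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_two_sided_p(k, n, p):
    """Exact two-sided p-value: total probability of all outcomes
    that are no more likely than the observed count k."""
    observed = binom_pmf(k, n, p)
    tolerance = observed * (1 + 1e-9)  # guard against float ties
    return sum(
        binom_pmf(i, n, p)
        for i in range(n + 1)
        if binom_pmf(i, n, p) <= tolerance
    )


# Values taken from the text.
n_inserts = 26          # independent G418-resistant strains analyzed
coding_fraction = 0.60  # share of the pombe genome that is coding sequence

# Assumption: "54 percent" of 26 inserts is 14 ORF disruptions.
orf_hits = round(0.54 * n_inserts)

# Expected ORF hits if insertion were random with respect to coding sequence.
expected_hits = n_inserts * coding_fraction

p_value = binom_two_sided_p(orf_hits, n_inserts, coding_fraction)

print(f"observed ORF hits: {orf_hits} of {n_inserts}")
print(f"expected under random insertion: {expected_hits:.1f}")
print(f"two-sided exact binomial p-value: {p_value:.3f}")
```

With only 26 inserts the test has limited power, so it can show consistency with random insertion but cannot rule out a modest bias.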
To measure accurately the level of transposition of the hermes system, we conducted a quantitative assay that allowed us to calculate the number of transposition events per generation. We observed similar transposition frequencies in three experiments. After approximately 25 cell generations, 1.5 to 2.75 percent of cells contained insertions. Such a rate of transposition is ideal for generating mutation libraries because it produces a substantial number of cells with insertions while keeping the frequency of double insertions to a minimum.
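The relationship between the fraction of cells carrying insertions and the per-generation rate can be sketched as follows. This is an illustration, not the quantitative assay itself. It assumes a constant probability of transposition per cell per generation, no growth difference between cells with and without insertions, and a Poisson distribution of insertions per cell. Under those assumptions, 1.5 to 2.75 percent of cells with insertions after 25 generations corresponds to roughly 0.06 to 0.11 percent per generation, and only about 1 percent of insertion-bearing cells would be expected to carry more than one insertion. Function and variable names are illustrative.

```python
from math import exp, log


def per_generation_rate(fraction_with_insert, generations):
    """Constant per-generation probability r that satisfies
    1 - (1 - r)**generations == fraction_with_insert."""
    return 1.0 - (1.0 - fraction_with_insert) ** (1.0 / generations)


def multi_insert_share(fraction_with_insert):
    """Among cells with at least one insertion, the share expected to
    carry two or more, assuming insertions per cell are Poisson."""
    mean_inserts = -log(1.0 - fraction_with_insert)  # from P(0) = exp(-mean)
    p_zero = exp(-mean_inserts)
    p_one = mean_inserts * p_zero
    return (1.0 - p_zero - p_one) / (1.0 - p_zero)


# Values taken from the text.
GENERATIONS = 25
OBSERVED_FRACTIONS = (0.015, 0.0275)  # 1.5 and 2.75 percent of cells

rates = [per_generation_rate(f, GENERATIONS) for f in OBSERVED_FRACTIONS]
multi_shares = [multi_insert_share(f) for f in OBSERVED_FRACTIONS]

for fraction, rate, share in zip(OBSERVED_FRACTIONS, rates, multi_shares):
    print(
        f"{fraction:.2%} of cells with insertions: "
        f"{rate:.3%} per generation, "
        f"{share:.2%} of those cells with multiple insertions"
    )
```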
In pilot experiments, we screened cultures of mutagenized cells for mutations in two genes, ade6 and ade7. Of 106,000 cells screened, seven produced red colonies, the phenotype expected for mutations in either ade7 or ade6. This number of ade− colonies was consistent with randomly distributed integration. Together, the data indicate that hermes can readily be used as a tool for the random disruption of pombe genes.
- Evertts A, Plymire C, Craig N, Levin H. The hermes transposon of Musca domestica has robust activity in Schizosaccharomyces pombe that disrupts open reading frames. Genetics 2007;177:2519-2523.
1 Robert Judson, BA, former Postbaccalaureate Fellow
2 Felice Kelly, BA, former Postbaccalaureate Fellow
3 Marc Heincelman, BA, former Postbaccalaureate Fellow
Collaborators
- Hugh P. Cam, Laboratory of Biochemistry and Molecular Biology, NCI, Bethesda, MD
- Nancy Craig, PhD, The Johns Hopkins University School of Medicine, Baltimore, MD
- Xiang Gao, PhD, Iowa State University, Ames, IA
- Shiv Grewal, PhD, Laboratory of Biochemistry and Molecular Biology, NCI, Bethesda, MD
- Charles Hoffman, PhD, Boston College, Boston, MA
- Ken-ichi Noma, PhD, Wistar Institute, Philadelphia, PA
- Daniel Voytas, PhD, University of Minnesota, Minneapolis, MN
- Ke Zhang, PhD, Laboratory of Biochemistry and Molecular Biology, NCI, Bethesda, MD
For further information, contact henry_levin@nih.gov or visit http://sete.nichd.nih.gov

