Endogenous retroviruses (ERVs) are long terminal repeat (LTR)-retroelements of the Retroviridae family. ERVs reside in vertebrate genomes, are particularly abundant in mammals, and still actively retrotranspose in mice. This entry describes their relationship to other LTR-retroelements and how they replicate. A brief introduction to epigenetic reprogramming of ERVs is followed by a few examples of how reprogramming of ERVs assists embryonic development in mouse and human.
Endogenous retroviruses (ERVs) are one type of long terminal repeat (LTR)-retroelement. LTR-retroelements include Pseudoviridae (Ty1, Copia), Metaviridae (Ty3/Gypsy), and Retroviridae such as ERVs, human immunodeficiency virus (HIV), Rous sarcoma virus (RSV), mouse mammary tumor virus (MMTV), human T cell leukemia virus 1 (HTLV-1), and other retroviruses. With few exceptions, LTR-retroelements use host tRNAs to prime reverse transcription of their RNA into DNA within viral particles (infectious) or virus-like particles (endogenous) before integrating into the genome. ERVs are endogenous Retroviridae that reside in vertebrates. They are particularly abundant in mammals and are actively mutagenizing mouse genomes.
Reverse transcription and long terminal repeat (LTR) retroelements are ancient components of eukaryotic genomes [1]. In fact, reverse transcriptase (RT) is one of the most abundant genes in organisms with high copy numbers of retroelements such as mammals [1][2][3]. LTR-retroviruses encode envelope proteins to form virus particles and infect neighboring cells or other organisms, while LTR-retrotransposons that lack functional envelope proteins replicate within virus-like particles (VLPs) to integrate into the genome of the same cell. The majority of LTR-retrotransposons in mammals are closely related to known infectious LTR-retroviruses and are therefore called endogenous retroviruses (ERVs). Based on the phylogenetic relationship of their RT genes, mammalian ERVs belong to the Retroviridae family, while LTR-retrotransposons prevalent in other phyla, such as the Gypsy and Copia superfamilies, are Metaviridae and Pseudoviridae, respectively [4][5][6]. All three families include infectious, viral elements with an envelope gene as well as endogenous transposons that proliferate in a strictly intracellular fashion. Endogenous LTR-retrotransposons that have lost a functional envelope gene are inherited vertically but may in principle, at low frequency, spread to other organisms by horizontal transfer, a process used by all transposable elements to enter new host species [7][8]. ERVs have become resident aliens in mammalian genomes, and many of them were co-opted by their hosts to fulfill essential cellular functions, for example, during placentation and imprinting [9][10][11]. ERV promoter and enhancer activities as well as their protein domains have been useful building blocks during evolution, while their repetitive ends induce recombination, and the mobility of intact, full-length elements is highly mutagenic [12][13][14][15][16][17]. Hence, their expression needs to be carefully monitored by the cell.
With few exceptions, LTR-retroelements use host tRNA to prime reverse transcription and copy their RNA into DNA for insertion into the genome (Figure 1) [18][19][20][21]. Retroviral proteins bind specific tRNAs with high affinity and recruit them to the virus particle or VLP, where their 3′-end initiates reverse transcription at the tRNA primer binding site (PBS). Small RNAs derived from the 3′-end of tRNAs (3′-tRFs) target LTR-retroelements at the PBS and control their mobility and expression [22][23][24]. These highly conserved sequence motifs are a prerequisite for replication and allow host defense mechanisms to identify active LTR-retroelements. ERV sequences make up ~10% of the mouse and human genomes. While they are no longer mobile in humans, a number of murine ERVs are highly active and retrotranspose themselves as well as non-autonomous, non-coding family members [2].
Figure 1. Model of reverse transcription of long terminal repeat (LTR)-retrotransposons and -viruses. LTRs encode promoter elements and termination signals. The RNA transcript contains a region repeated at either end (R), a 5′ unique segment (U5), and a segment only included at the 3′-end of the RNA (U3). The 3′-end of cellular tRNAs (blue cloverleaf) primes reverse transcription by hybridizing to the primer binding site (PBS). While this segment is being copied into first-strand cDNA (light blue line), also called minus (−) strong stop DNA, the RNaseH activity of reverse transcriptase (RT) degrades the template RNA. The elongating cDNA is transferred to the 3′-end of the retrotransposon transcript hybridizing to the R region. The remaining RNA is partially degraded by RNaseH leaving behind primers for second-strand, plus (+) cDNA synthesis. In Retroviridae, the plus strand PBS is a copy of the tRNA primer, while the minus strand is a copy of the original PBS sequence. After another transfer event, first (−) and second (+) strand synthesis are completed to result in a full-length, double-stranded retroviral DNA that will be integrated into the host genome.
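The priming step in Figure 1 rests on base pairing between the 3′-end of a tRNA and the PBS, which is also what lets 3′-tRFs recognize the same site. The following Python sketch illustrates that relationship only. The sequences are invented for illustration and are not real tRNA or ERV sequences; the function names, the 18-nucleotide default length, and the requirement of an exact match are simplifying assumptions, not part of the source.

```python
# Toy illustration only: all sequences here are invented, not real tRNA or ERV sequences.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def find_pbs(element: str, trna_3p_end: str, pbs_len: int = 18) -> int:
    """Return the 0-based start of the PBS in `element`, or -1 if absent.

    Assumption: the PBS is an exact reverse complement of the last
    `pbs_len` nucleotides of the tRNA (18 nt is used as a default here).
    """
    # Convert RNA to DNA letters and keep only the 3'-terminal stretch.
    primer = trna_3p_end.upper().replace("U", "T")[-pbs_len:]
    return element.upper().find(reverse_complement(primer))


# Hypothetical tRNA 3'-end (RNA letters) ending in the CCA tail.
trna_tail = "ACGGAUUCGAACCGUCCA"

# Hypothetical element: 10 nt of upstream sequence, the matching PBS, then flank.
element = "GGTCTCTCTG" + "TGGACGGTTCGAATCCGT" + "AAGGGAAACC"

print(find_pbs(element, trna_tail))  # 10
```

Real PBS matches may tolerate mismatches and differ in length between element families, so a practical search would normally use an alignment tool rather than exact string matching.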
ERVs are usually embedded in repressive heterochromatin but, importantly, become active during epigenetic reprogramming in development and disease. Mammals undergo genome-wide epigenetic reprogramming in the embryo right after fertilization and in the germline to obtain totipotency and set aside cells for the next generation [25][26][27]. ERV transcription is repressed by DNA and histone methylation. The histone methyltransferases G9a/GLP, SETDB1, and EZH2, the histone demethylase KDM1A, and the de novo DNA methyltransferases DNMT3a/b, DNMT3L, and DNMT3C establish heterochromatin at different classes of ERVs, as discussed in detail elsewhere [28][29]. DNA methylation status can directly correlate with ERV transcription [29][30][31][32][33]. However, absence of DNA methylation does not necessarily lead to ERV expression, as long as histone H3 lysine 9 tri-methylation (H3K9me3) can be maintained [34][35][36][37]. The H3K9me3 methyltransferase SETDB1 acts in complex with KAP1/TRIM28 on fully methylated or fully unmethylated DNA, but not on hemi-methylated DNA that is occupied by NP95 [37]. This finding explains why ERV expression is not always observed in stable methylation knock-outs but is observed in inducible knock-outs that undergo temporary hemi-methylation and, most importantly, during epigenetic reprogramming in vivo, which includes a hemi-methylated state [32][33][37]. Like any gene, transposon expression depends on multiple layers of repressive and permissive control at the RNA, DNA, and protein level. Removal of silent chromatin marks allows transcription factors (TFs) to bind DNA and promote or inhibit ERV transcription [38][39][40]. LTR sequences, for example, contain species-specific TF binding sites that promote temporary expression of ERVs and neighboring genomic sequences during development [38]. After reprogramming, chromatin patterns at transposable elements need to be re-established through DNA and RNA recognition.
KRAB zinc finger proteins (ZFPs) have co-evolved with their transposon targets and guide heterochromatin formation by SETDB1 and TRIM28/KAP1 through binding to highly conserved DNA sequence motifs in ERVs [41]. For example, some KRAB-ZFPs bind to the PBS of select ERVs or to the polypurine tract that primes second-strand reverse transcription during ERV replication [42].
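The reasoning above about why ERVs escape silencing in the hemi-methylated state can be written out as a small truth table. The Python sketch below is a deliberately simplified toy model of the logic stated in the text, not a quantitative description of chromatin regulation. The names `Methylation`, `h3k9me3_maintained`, and `erv_permissive` are hypothetical, and the model assumes only two repressive inputs: full DNA methylation and H3K9me3 maintained by SETDB1 with KAP1/TRIM28.

```python
from enum import Enum


class Methylation(Enum):
    """DNA methylation state at an ERV locus (toy model)."""
    FULL = "fully methylated"
    HEMI = "hemi-methylated"
    NONE = "unmethylated"


def h3k9me3_maintained(state: Methylation, setdb1_present: bool = True) -> bool:
    """SETDB1/KAP1 acts on fully methylated or unmethylated DNA, not hemi-methylated DNA."""
    return setdb1_present and state is not Methylation.HEMI


def erv_permissive(state: Methylation, setdb1_present: bool = True) -> bool:
    """Assumption: an ERV is permissive only when neither repressive layer is in place."""
    repressed_by_dna = state is Methylation.FULL
    return not (repressed_by_dna or h3k9me3_maintained(state, setdb1_present))


for state in Methylation:
    print(f"{state.value}: permissive={erv_permissive(state)}")
```

In this toy model, a stable methylation knock-out corresponds to `Methylation.NONE`, where H3K9me3 still holds the element silent, whereas an inducible knock-out or in vivo reprogramming passes through `Methylation.HEMI`, the only state in which both layers are absent.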
Once ERVs are transcribed, their expression, translation, and reverse transcription must be restricted by the cell, and small RNAs have the ability to recognize and target transposon RNA for silencing. PIWI-interacting RNAs (piRNAs) guide silencing of ERVs in the male germline of mammals [72][73][74]. In the female germline of Muridae, endogenous small interfering RNAs (endo-siRNAs) target transposon mRNA and protect oocytes [26][79][80][81]. Small RNAs derived from the 3′-end of mature tRNAs (3′-tRFs) are expressed in pre-implantation mouse stem cells and potentially protect tissues by targeting the highly conserved tRNA primer binding site (PBS) of ERVs [22]. Small RNA-mediated silencing not only prevents mutagenic damage from transposition but also, importantly, regulates repetitive elements that have been co-opted by the host to serve essential functions. For example, silencing of the paternally imprinted Rasgrf1 locus in mouse is mediated by piRNAs that target an ERV sequence [43], and a microRNA (miRNA) regulates the retrotransposon-like 1 (Rtl1) imprinted gene in mouse placenta [44].
The propensity of ERVs to attract diverse silencing machineries that act upon specific transposon families at different stages of development makes them ideal epigenetic switches [17]. An estimated 6–30% of transcripts in mouse and human embryonic and somatic tissues are driven by retrotransposon promoters in a highly tissue-specific manner [45]. ERV families define gene-regulatory networks throughout development [12][13]. Transcription of murine MERV-L elements marks the totipotent two-cell stage in early embryos [46]. Human HERV-H expression is indicative of the naive embryonic stem cell state and essential for pluripotency [47][48]. In addition, ERV LTR promoter-enhancer activity drives non-coding, stem-cell-specific transcripts that maintain the undifferentiated state and are crucial for cell identity [49][50][51][52]. More than 800 LTRs from the ERV-L and mammalian apparent LTR-retrotransposon (MaLR) families act as alternative promoters and first exons to drive stage-specific gene expression in mammalian oocytes and the developing zygote [25][53]. Taken together, temporary release of transposon silencing during reprogramming affects the transcriptome through (i) expression of potentially mobile, mutagenic, intact transposons, (ii) expression of transposon-derived, long non-coding regulatory RNAs (lncRNAs), and (iii) expression of neighboring genes or lncRNAs driven by promoter-enhancer activities of the LTRs.
The epigenetic state of ERVs, and of transposable elements in general, can not only lead to developmental stage- and cell-type-specific expression but also establish epigenetic alleles or “epialleles” that result in differential expression between isogenic offspring [54][55]. Epialleles can be stable and inherited if they entirely escape reprogramming, or “metastable” and lead to stochastic changes of the epigenetic state in the offspring [55]. The most famous example of an ERV-induced metastable epiallele is the differential methylation of an intracisternal A-particle (IAP) insertion upstream of the mouse Agouti gene, which results in varying fur color and obesity in siblings [30]. In fact, such metastable epialleles of IAP are extremely abundant genome-wide, but few of them affect neighboring gene expression [31]. Select ERVs, particularly a set of IAP elements, are protected from reprogramming in the early embryo and the germline, and therefore inherit their epigenetic state as stable epialleles [56][57]. Human ERV (HERV) methylation varies between individuals, which could reflect metastable epialleles, but genetic variation is hard to exclude as a cause [55][58]. Notably, many imprinted genes are derived from LTR-retrotransposons. Imprinted genes of the sushi-ichi-related retrotransposon homologs (SIRH) are common to placental mammals and derived from Metaviridae gypsy-elements [59]. Lineage-specific Retroviridae ERV insertions mediate imprint establishment at murine loci such as retrotransposon-like 1 (Rtl1), Rasgrf1, Impact, and Slc38a4 [60][44][43]. The murine ERVK family drives non-canonical, histone-dependent imprinting in the extraembryonic lineage [61]. Imprinted loci are established during epigenetic reprogramming of the germline and persist in the early embryo [56][62][63]. In contrast to epialleles, heterochromatin induction at imprinted loci is not stochastic but established at either the paternal or the maternal allele, and is essential for proper development.
Similar to epigenetic reprogramming in development, ERV reactivation has been observed in other tissues with high epigenetic plasticity, particularly in the course of disease [17][64][65]. The role of ERVs in cancer extends beyond their value as diagnostic markers for aberrant reprogramming. They are frequently epigenetically reactivated as cryptic promoters in cancer and drive oncogene expression [49][66][67][68]. Indeed, LTR-retroviruses were originally identified as the causative agents of transmissible tumors in chickens, mice, and humans [69]. Those ‘RNA tumor viruses’ include Rous sarcoma virus (RSV), mouse mammary tumor virus (MMTV), and human T cell leukemia virus 1 (HTLV-1). However, expression of endogenous HERV proteins can also tip the scales and trigger an immune response that drives tumor cells into apoptosis [70].
This entry is adapted from the peer-reviewed paper 10.3390/v12080792