Long Noncoding RNA HCP5

Subjects: Cell & Developmental Biology
Created by: Jerzy Kulski

The HCP5 RNA gene (NCBI ID: 10866) is located centromeric of the HLA-B gene and between the MICA and MICB genes within the major histocompatibility complex (MHC) class I region. It is a human species-specific gene that codes for a long noncoding RNA (lncRNA), composed mostly of an ancient ancestral endogenous antisense 3′ long terminal repeat (LTR) and part of the internal pol antisense sequence of endogenous retrovirus (ERV) type 16, linked to a human leukocyte antigen (HLA) class I promoter and leader sequence at the 5′-end. Since its discovery in 1993, many disease association and gene expression studies have shown that HCP5 is a regulatory lncRNA involved in adaptive and innate immune responses and associated with the promotion of some autoimmune diseases and cancers. The gene sequence acts as a genomic anchor point for binding transcription factors, enhancers, and chromatin remodeling enzymes in the regulation of transcription and chromatin folding. The HCP5 antisense retroviral transcript also interacts with regulatory microRNA and immune and cellular checkpoints in cancers, suggesting its potential as a drug target for novel antitumor therapeutics.


The human major histocompatibility complex (MHC), also known as the human leukocyte antigen (HLA), covers 0.13% of the human genome and spans ~4 Mbp on the short arm of chromosome six at position 6p21 within a region that contains more than 250 annotated genes and pseudogenes [1,2]. The classical class I and class II regions within the MHC have extensive patterns of linkage disequilibrium (LD), and a high degree of single nucleotide polymorphisms (SNPs) at the HLA genes can differentiate worldwide populations [1,3-5]. HLA polymorphisms are a crucial determinant of the adaptive immune response to infectious agents, of allograft success or rejection, and of self/nonself immune recognition, and they can contribute to more autoimmune diseases than any other region of the genome [1,2,6-8]. Apart from the adaptive immune response, MHC class I molecules have a role in brain development, synaptic plasticity, axonal regeneration, and immune-mediated neurodegeneration [9-12]. At least half of the molecules encoded by this highly polymorphic locus are involved in antigen processing and presentation, inflammation regulation, the complement system, and the innate and adaptive immune responses, highlighting the importance of the MHC in immune-mediated autoimmune and infectious diseases [1,2]. Polymorphisms expressed by the MHC genomic region influence many critical biological traits and individuals’ susceptibility to the development of chronic autoimmune diseases such as type I diabetes, rheumatoid arthritis, celiac disease, psoriasis, ankylosing spondylitis, multiple sclerosis, Graves’ disease, schizophrenia, bipolar disorder, inflammatory bowel disease, and dermatomyositis [2,6-8]. Furthermore, different viral infections and cancers are associated strongly with the suppression of MHC genomic expression activity, particularly in the region of the MHC class I and class II loci [6,13-15].

There are tens of thousands of genomic loci that express microRNA (miRNA) [16] and lncRNA [17,18,19,20], but only about 50 have been investigated in any great detail with respect to their role in the regulation of the immune system and disease [21,22,23,24]. Although there are many miRNA and lncRNA loci within the MHC genomic region, they have been ignored largely in favor of studies on polymorphisms of the HLA class I and class II gene loci in health, disease, and transplantation cell/tissue/organ typing [25,26]. This review focuses on the structure and function of only one of these HLA lncRNA, the HCP5 lncRNA, which is located between the MICA and MICB genes and ~105 kb centromeric of the HLA-B gene.

In 1993, Vernet et al. [27] discovered a novel coding sequence belonging to a new multicopy pseudogene family P5 that they mapped within the HLA class I region and named P5-1 (alias for HCP5). They found that it expressed a 2.5-kb transcript in human B-cells, phytohemagglutinin-activated lymphocytes, a natural killer-like cell line, normal spleen, hepatocellular carcinoma, neuroblastoma, and other non-lymphoid tissue, but not in T-cells. HCP5 (P5-1) appeared to be a hybrid sequence created by nonhomologous recombination between two pseudogenes or nonmobile genetic elements that possibly produced a protein comprising 219 amino acids (aa’s) [28]. A few years later, the HCP5 (P5-1) gene was mapped precisely to a region between the MICA and MICB genes and downstream, at the centromeric end, of the two classical HLA class I genes HLA-B and HLA-C (Figure 1) [29]. In 1999, Kulski and Dawkins [30] used the computer programs Censor and RepeatMasker and dot-plot DNA and RNA sequence analyses to demonstrate that the HCP5 gene sequence and its transcripts were composed mainly of the 3′LTR and pol sequences of an ancient HERV16 insertion, which was a member of the HERVL or class III category of endogenous retroviruses (ERVs) in the human and mammalian genomes [31,32].

Figure 1. Location of the HCP5 gene (d) within the ERV16 element (c) and the human leukocyte antigen (HLA) class I region of the beta-block (b) on chromosome 6 at 6p21.33 (a). The HLA class I promoter region (the green rectangle labeled as Pr) for the 2630 bp HCP5 gene (d) initiates transcription of the 2547 bp lnc HCP5 RNA (e). The 91 bp intron in the HCP5 gene is represented by the thin grey line between the violet rectangular lines (d). The AluSp, THE1B, and LTR162B insertions within the 6173 bp ERV16 sequence [30] (c) are indicated by the labeled circle, triangle, and rectangles, respectively (see Table 1 for more details). The H3K27Ac binding (orange curve) and DNase I hypersensitivity region (grey horizontal rectangle) associated with the HCP5 gene sequence (d) and sourced from the University of California, Santa Cruz (UCSC) genomic browser (Table S1) are shown on line (f).

Table 1. The chromosomal location of HCP5 and the ERV16 (LTR16B and ERV3-16A3) retroelement between the pseudogene HLA-X and the HCG26 lncRNA gene within the HLA class I region [30].
Sequence Name | Location on Chr6 * | Length (bp) | Orient. | Feature
MERC21 | 31,461,669–31,462,057 | 389 | + | LTR fragment
HLA-X | 31,461,846–31,462,490 | 645 | + | Silent pseudogene
MERC21 | 31,462,544–31,462,625 | 82 | + | LTR fragment
HLA-UTR | 31,462,625–31,463,419 | 794 | + | HLA-5′UTR + exon 1
HCP5-5′ncr | 31,462,464–31,463,413 | 950 | + | HLA class I promoter homologs
DNase region | 31,463,001–31,463,190 | 190 | + | 100% reliability score
HCP5 | 31,463,180–31,465,809 | 2630 | + | lncRNA gene
3′LTR16B | 31,463,420–31,463,715 | 296 |  | 3′LTR (nt 150–446 in HCP5 RNA)
ERV3-16A3 | 31,463,716–31,464,933 | 1218 |  | Internal (nt 447–2547 in HCP5 RNA)
AluSp | 31,465,920–31,466,225 | 306 |  | Insertion within ERV3-16A3
ERV3-16A3 | 31,466,237–31,467,707 | 1471 |  | Internal
THE1B | 31,467,708–31,468,050 | 343 | + | Fragmented insertion within ERV3-16A3
ERV3-16A3 | 31,468,051–31,468,201 | 151 |  | Internal
ERV3-16A3 | 31,468,362–31,468,644 | 283 |  | Internal
5′LTR16B2 | 31,468,881–31,469,043 | 163 |  | 5′LTR
AluSx | 31,469,044–31,469,282 | 239 |  | Insertion within LTR16B
5′LTR16B2 | 31,469,283–31,469,592 | 310 |  | 5′LTR
L2 | 31,469,593–31,470,361 | 769 |  | L2 LINE
MLT1E2 | 31,470,734–31,471,317 | 584 |  | ERL-MaLR, LTR fragment
L2 | 31,471,325–31,472,047 | 723 |  | L2 LINE
HCG26 | 31,471,229–31,472,408 | 1180 | + | LncRNA
* Chr6:31461669-31472408, GRCh38.p12 assembly (annotation release 109 in March 2018). 5′ncr is 5′ noncoding region. Orientation of the sequence is on the +ve DNA strand or the −ve (complementary) DNA strand.
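The lengths listed in Table 1 appear to follow from inclusive coordinate arithmetic (end minus start plus one). The short Python sketch below illustrates that calculation for a few rows and reports how many base pairs each feature shares with the HCP5 gene interval. The coordinates are copied from Table 1; the function and variable names are hypothetical, and the code assumes 1-based inclusive positions rather than reflecting any tool used in the original analysis.

```python
# Coordinates copied from Table 1 (GRCh38.p12); assumed 1-based and inclusive.
FEATURES = {
    "DNase region": (31_463_001, 31_463_190),
    "HCP5": (31_463_180, 31_465_809),
    "3'LTR16B": (31_463_420, 31_463_715),
    "ERV3-16A3 (first segment)": (31_463_716, 31_464_933),
}


def inclusive_length(start: int, end: int) -> int:
    """Length in bp of an interval whose two end positions are both counted."""
    return end - start + 1


def overlap_bp(a: tuple, b: tuple) -> int:
    """Number of positions shared by two inclusive intervals (0 if disjoint)."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return max(0, end - start + 1)


# Length of every listed feature.
lengths = {name: inclusive_length(*span) for name, span in FEATURES.items()}

# Overlap of every other feature with the HCP5 gene interval.
hcp5 = FEATURES["HCP5"]
overlaps_with_hcp5 = {
    name: overlap_bp(span, hcp5)
    for name, span in FEATURES.items()
    if name != "HCP5"
}

for name in FEATURES:
    shared = overlaps_with_hcp5.get(name, lengths[name])
    print(f"{name}: {lengths[name]} bp, {shared} bp within HCP5")
```

With the Table 1 values, the 3′LTR16B and the first ERV3-16A3 segment fall entirely inside the HCP5 gene interval, and the DNase region overlaps its 5′ end by 11 bp.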
Because HCP5 expressed an antisense transcript that was complementary to retrovirus pol mRNA sequences and a 3′LTR, Kulski and Dawkins [30] suggested that it might have a role in immunity to retrovirus infection. They considered that the lncRNA of HCP5 might hybridize with retroviral sense mRNA sequences to suppress viral transcription, translation, and transport. Eight years later, a single-nucleotide polymorphism (rs2395029) in the HCP5 gene was associated with HLA-B*57:01 and correlated with a lower HIV-1 viral set point [33], indicating that these two alleles within a particular haplotype may have a role in viral control [34]. However, when Yoon et al. [35] tested the antisense/antiviral hypothesis for HCP5 by infecting TZM-bl cells in vitro with HIV-1 and plasmids expressing high levels of HCP5 transcripts, they observed no restriction of infectivity throughout the viral life cycle. They concluded from their findings that the HCP5 gene had no direct antiviral effect, and that the association of an HCP5 variant with viral control most likely was due to an HLA-B*57:01-related effect or other functional variants in the haplotype or both. In fact, it appears that the role of HCP5 in immunity and human disease is far more complex than previously envisioned, and that its antiviral effects might occur by way of some secondary mechanisms such as the possible involvement of miRNA inhibition rather than by hybridization of the HCP5 transcript with the complementary viral pol transcripts.
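The antisense hypothesis described above rests on base-pair complementarity: an antisense RNA can hybridize with a sense RNA when one is the reverse complement of the other. The sketch below illustrates only that principle in Python. The ten-nucleotide sequence is a made-up toy example, not an HCP5 or retroviral pol sequence, and the function names are hypothetical.

```python
# Watson-Crick pairing for RNA: A-U and G-C.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement_rna(seq: str) -> str:
    """Return the reverse complement of an RNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_fully_complementary(antisense: str, sense: str) -> bool:
    """True if the antisense strand could pair with the sense strand along its whole length."""
    return reverse_complement_rna(antisense) == sense.upper()


# Toy sequences for illustration only (hypothetical, not real HCP5 or pol sequence).
toy_sense = "AUGGCCAUUG"
toy_antisense = reverse_complement_rna(toy_sense)

print(toy_antisense)
print(is_fully_complementary(toy_antisense, toy_sense))
```

Perfect complementarity of this kind is only a necessary condition for the proposed hybridization; as the results of Yoon et al. [35] indicate, it does not by itself demonstrate an antiviral effect.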
During the last two decades, HCP5 SNPs have been associated with many different diseases in genome-wide association studies, gene expression studies, and cancer studies investigating tissue and cellular biomarkers of tumor progression and inhibition. To better understand the genetics, molecular biology, and functions of HCP5, this paper reviewed the available data and literature on the genomic organization, structure, and function of the HCP5 gene (HLA complex P5 (non-protein coding), HGNC:21659) in health and disease (MIM:604676), particularly its association with autoimmune diseases, cancer, and infections by way of its endogenous interactions with miRNA and various gene targets. 
HCP5 is a unique human-specific gene within the MHC class I genomic region that encodes a hybrid HLA class I endogenous retroviral lncRNA with peptide coding potential. It has functional relationships with many other genes within or outside the MHC genomic region that are involved with antigen processing and presentation, the interferon regulatory pathway, and epigenomic and ceRNA networks; however, many of these functional interactions between multiple genetic variants are still poorly understood. HCP5 gene SNVs and neighboring upstream and downstream SNVs have been associated with HIV viral load, HPV infection, autoimmune diseases, disease relapse after transplantation, and various cancers. Much still needs to be determined about the defensive and pathological functions of HCP5 and its structural and functional role in RNA editing and signaling to enable epigenetic plasticity and immune response pathways. Judging from recent findings about the possible oncogenic role of HCP5 as a ‘sponge’ for sequestering regulatory miRNA in cancer, new insights about its diverse mechanisms and functions in health and disease undoubtedly will continue to emerge and surprise in the near future.