Long Noncoding RNA HCP5
The HCP5 RNA gene (NCBI ID: 10866) is located centromeric of the HLA-B gene and between the MICA and MICB genes within the major histocompatibility complex (MHC) class I region. It is a human species-specific gene that codes for a long noncoding RNA (lncRNA), composed mostly of an ancient ancestral endogenous antisense 3′ long terminal repeat (LTR) and part of the internal pol antisense sequence of endogenous retrovirus (ERV) type 16, linked to a human leukocyte antigen (HLA) class I promoter and leader sequence at the 5′-end. Since its discovery in 1993, many disease association and gene expression studies have shown that HCP5 is a regulatory lncRNA involved in adaptive and innate immune responses and associated with the promotion of some autoimmune diseases and cancers. The gene sequence acts as a genomic anchor point for binding transcription factors, enhancers, and chromatin remodeling enzymes in the regulation of transcription and chromatin folding. The HCP5 antisense retroviral transcript also interacts with regulatory microRNA and with immune and cellular checkpoints in cancers, suggesting its potential as a drug target for novel antitumor therapeutics.
The human major histocompatibility complex (MHC), also known as the human leukocyte antigen (HLA), covers 0.13% of the human genome and spans ~4 Mbp on the short arm of chromosome six at position 6p21 within a region that contains more than 250 annotated genes and pseudogenes [1,2]. The classical class I and class II regions within the MHC have extensive patterns of linkage disequilibrium (LD), and a high degree of single nucleotide polymorphisms (SNPs) at the HLA genes can differentiate worldwide populations [1,3-5]. HLA polymorphisms are a crucial determinant of the adaptive immune response to infectious agents, of allograft success or rejection, and of self/nonself immune recognition, and they can contribute to more autoimmune diseases than any other region of the genome [1,2,6-8]. Apart from the adaptive immune response, MHC class I molecules have a role in brain development, synaptic plasticity, axonal regeneration, and immune-mediated neurodegeneration [9-12]. At least half of the molecules encoded by this highly polymorphic locus are involved in antigen processing and presentation, inflammation regulation, the complement system, and the innate and adaptive immune responses, highlighting the importance of the MHC in immune-mediated autoimmune and infectious diseases [1,2]. Polymorphisms within the MHC genomic region influence many critical biological traits and individuals’ susceptibility to the development of chronic autoimmune diseases such as type I diabetes, rheumatoid arthritis, celiac disease, psoriasis, ankylosing spondylitis, multiple sclerosis, Graves’ disease, schizophrenia, bipolar disorder, inflammatory bowel disease, and dermatomyositis [2,6-8]. Furthermore, different viral infections and cancers are associated strongly with the suppression of MHC genomic expression activity, particularly in the region of the MHC class I and class II loci [6,13-15].
There are tens of thousands of genomic loci that express microRNA (miRNA) and lncRNA [17-20], but only about 50 have been investigated in any great detail with respect to their role in the regulation of the immune system and disease [21-24]. Although there are many miRNA and lncRNA loci within the MHC genomic region, they have been largely ignored in favor of studies on polymorphisms of the HLA class I and class II gene loci in health, disease, and cell, tissue, and organ typing for transplantation [25,26]. This review focuses on the structure and function of only one of these HLA lncRNAs, the HCP5 lncRNA, which is located between the MICA and MICB genes and ~105 kb centromeric of the HLA-B gene.
In 1993, Vernet et al. discovered a novel coding sequence belonging to a new multicopy pseudogene family, P5, that they mapped within the HLA class I region and named P5-1 (an alias for HCP5). They found that it expressed a 2.5-kb transcript in human B-cells, phytohemagglutinin-activated lymphocytes, a natural killer-like cell line, normal spleen, hepatocellular carcinoma, neuroblastoma, and other non-lymphoid tissues, but not in T-cells. HCP5 (P5-1) appeared to be a hybrid sequence created by nonhomologous recombination between two pseudogenes or nonmobile genetic elements that possibly produced a protein comprising 219 amino acids (aa). A few years later, the HCP5 (P5-1) gene was mapped precisely to a region between the MICA and MICB genes and downstream, at the centromeric end, of the two classical HLA class I genes HLA-B and HLA-C (Figure 1). In 1999, Kulski and Dawkins used the computer programs Censor and RepeatMasker, together with dot-plot DNA and RNA sequence analyses, to demonstrate that the HCP5 gene sequence and its transcripts were composed mainly of the 3′LTR and pol sequences of an ancient HERV16 insertion, which was a member of the HERVL or class III category of endogenous retroviruses (ERVs) in the human and mammalian genomes [31,32].
Figure 1. Location of the HCP5 gene (d) within the ERV16 element (c) and the human leukocyte antigen (HLA) class I region of the beta-block (b) on chromosome 6 at 6p21.33 (a). The HLA class I promoter region (the green rectangle labeled as Pr) for the 2630 bp HCP5 gene (d) initiates transcription of the 2547 bp HCP5 lncRNA (e). The 91 bp intron in the HCP5 gene is represented by the thin grey line between the violet rectangular lines (d). The AluSp, THE1B, and LTR162B insertions within the 6173 bp ERV16 sequence (c) are indicated by the labeled circle, triangle, and rectangles, respectively (see Table 1 for more details). The H3K27Ac binding (orange curve) and DNase I hypersensitivity region (grey horizontal rectangle) associated with the HCP5 gene sequence (d), sourced from the University of California, Santa Cruz (UCSC) genome browser (Table S1), are shown on line (f).
| Sequence Name | Location on Chr6 * | Length (bp) | Orientation | Feature |
|---|---|---|---|---|
| HLA-UTR | 31,462,625–31,463,419 | 794 | + | HLA-5′UTR + exon 1 |
| HCP5-5′ncr | 31,462,464–31,463,413 | 950 | + | HLA class I promoter homologs |
| DNase region | 31,463,001–31,463,190 | 190 | + | 100% reliability score |
| 3′LTR16B | 31,463,420–31,463,715 | 296 | − | 3′LTR (nt 150–446 in HCP5 RNA) |
| ERV3-16A3 | 31,463,716–31,464,933 | 1218 | − | Internal (nt 447–2547 in HCP5 RNA) |
| AluSp | 31,465,920–31,466,225 | 306 | − | Insertion within ERV3-16A3 |
| THE1B | 31,467,708–31,468,050 | 343 | + | Fragmented insertion within ERV3-16A3 |
| AluSx | 31,469,044–31,469,282 | 239 | − | Insertion within LTR16B |
| MLT1E2 | 31,470,734–31,471,317 | 584 | − | ERL-MaLR, LTR fragment |
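
The coordinates listed in the table lend themselves to a simple consistency check. The Python sketch below is illustrative only and is not part of the original analysis. It copies the table rows into a list, assumes that start and end positions are 1-based and inclusive (an assumption, because the table does not state its coordinate convention), and reports any row whose listed length differs from end − start + 1. The names `FEATURES`, `inclusive_length`, `length_mismatches`, and `contains` are hypothetical helpers introduced here for illustration.

```python
# Rows copied from the table: (name, start, end, listed length in bp, orientation).
FEATURES = [
    ("HLA-UTR",      31_462_625, 31_463_419,  794, "+"),
    ("HCP5-5′ncr",   31_462_464, 31_463_413,  950, "+"),
    ("DNase region", 31_463_001, 31_463_190,  190, "+"),
    ("3′LTR16B",     31_463_420, 31_463_715,  296, "-"),
    ("ERV3-16A3",    31_463_716, 31_464_933, 1218, "-"),
    ("AluSp",        31_465_920, 31_466_225,  306, "-"),
    ("THE1B",        31_467_708, 31_468_050,  343, "+"),
    ("AluSx",        31_469_044, 31_469_282,  239, "-"),
    ("MLT1E2",       31_470_734, 31_471_317,  584, "-"),
]


def inclusive_length(start, end):
    """Length of a span, ASSUMING 1-based inclusive coordinates."""
    return end - start + 1


def length_mismatches(features):
    """Return (name, listed, computed) for rows whose listed length
    differs from the inclusive span of their coordinates."""
    mismatches = []
    for name, start, end, listed, _orientation in features:
        computed = inclusive_length(start, end)
        if computed != listed:
            mismatches.append((name, listed, computed))
    return mismatches


def contains(outer, inner):
    """True if the inner feature lies entirely within the outer feature."""
    _, outer_start, outer_end, _, _ = outer
    _, inner_start, inner_end, _, _ = inner
    return outer_start <= inner_start and inner_end <= outer_end


if __name__ == "__main__":
    for name, listed, computed in length_mismatches(FEATURES):
        print(f"{name}: listed {listed} bp, inclusive span {computed} bp")

    by_name = {row[0]: row for row in FEATURES}
    inside = contains(by_name["HCP5-5′ncr"], by_name["DNase region"])
    print("DNase region within HCP5-5′ncr:", inside)
```

Under the inclusive assumption, the listed lengths agree with the coordinates for every row except HLA-UTR, whose span works out to 795 bp against the listed 794 bp; a half-open coordinate convention for that row would account for the difference. The check also shows that the DNase region falls entirely within the HCP5-5′ncr interval.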