Long Non-Coding RNAs at the GWAS Risk Loci: Comparison
Please note this is a comparison between Version 1 by Jyotsna Batra and Version 2 by Yvaine Wei.

Long non-coding RNAs (lncRNAs) are emerging as key players in a variety of cellular processes. Deregulation of lncRNAs has been implicated in prostate and breast cancers. Recently, germline genetic variations associated with cancer risk have been correlated with lncRNA expression and/or function. In addition, single nucleotide polymorphisms (SNPs) at well-characterized cancer-associated lncRNAs have been analyzed for their association with cancer risk. These SNPs may occur within lncRNA transcripts or in spanning regions, where they may alter the structure, function, and expression of the lncRNA molecules and contribute to cancer progression. Such lncRNAs may have potential as therapeutic targets for cancer treatment. Additionally, some of these lncRNAs have a tissue-specific expression profile, suggesting them as biomarkers for specific cancers.

  • long non-coding RNA
  • prostate cancer
  • breast cancer
  • single nucleotide polymorphisms
  • genome-wide association studies

1. Introduction

Genetic predisposition has been identified as one of the factors contributing to the risk of prostate and breast cancers. Genome-wide association studies (GWASs), analyzing common low-penetrance variants, have identified specific risk loci for these cancers [1][2][2,3]. Most of the risk-associated single nucleotide polymorphisms (SNPs) identified through GWAS are present in non-protein-coding DNA [3][4][5][6][4,5,6,7]. This non-coding DNA can regulate the expression of protein-coding genes and maintain the 3D structure of the genome by serving as a scaffold for transcription factors. Alternatively, high-throughput next-generation sequencing platforms have now shown that some non-coding DNA is transcribed as non-protein-coding RNA (ncRNA) [7][8].

Both ncRNAs and protein machinery involved in the development of diseases have become targets of novel therapeutic approaches [8][9][10][13,14,15]. Based on transcript size, these ncRNAs are grouped into two major classes: small non-coding RNAs (<200 nt) and long non-coding RNAs (lncRNAs) (>200 nt). The small ncRNA class comprises miRNAs, tRNAs, snRNAs, siRNAs, and piRNAs [2][11][12][3,16,17]. LncRNAs have recently been identified as important mediators in many diseases, including cancer [11][13][14][16,18,19]. LncRNAs are RNA transcripts that lack the potential to be translated into functional proteins. The biogenesis of lncRNAs is similar to that of mRNAs. Most lncRNAs are transcribed by RNA polymerase II, while some are transcribed by RNA polymerase III. Most lncRNAs undergo post-transcriptional modifications, such as splicing, polyadenylation, and 5′ capping, like protein-coding RNAs [15][20]. However, these molecules have several short open reading frames (sORFs) and very little protein-coding potential, which distinguishes them from mRNA [16][21]. Based on their origin, lncRNAs can be classified as intronic, exonic, intergenic, intragenic, antisense, 3′ and 5′ UTR, promoter-associated (paRNA), and enhancer-associated (eRNA) [17][22].

Evidence shows that many lncRNAs are deregulated in prostate and breast cancer, and the expression of some of them has been significantly associated with different stages of cancer. These lncRNAs are proposed to be involved in cancer development by playing functional roles in chromatin remodeling, transcriptional regulation, or post-transcriptional regulation. They show tumor-suppressive or oncogenic potential, emphasizing their potential in targeted therapeutics for prostate and breast cancer [18][19][26,27]. In addition, lncRNAs show tissue- and cancer-specific expression patterns, making them promising diagnostic and prognostic tools for cancer [18][20][26,28]. Moreover, SNPs could affect the expression and molecular function of lncRNAs, for instance by disrupting their secondary structure, and thereby play critical roles in tumorigenesis [21][29].

2. Prostate Cancer Risk-Associated SNPs Modulating lncRNAs

As a multifactorial disease, prostate cancer has several aspects contributing to its etiology, comprising both modifiable and non-modifiable factors [22][31]. Diet and exposure to environmental disruptors, such as bisphenol A, chlordecone, and pesticides [22][31], are reported as modifiable prostate cancer risk factors. Age is a well-known non-modifiable risk factor: the risk of developing prostate cancer increases with age [23][24][32,33].
There is considerable evidence for a genetic basis (up to ~57%) contributing to the risk of prostate cancer [25][26][34,35]. Recently, a large prostate cancer GWAS identified novel risk loci, bringing the total to 269 risk loci to date [27][36], and the study led to the identification of a genetic risk score for prostate cancer predisposition. Nevertheless, identification of the causal genes has been a major challenge, given that a large proportion of these variants lie in non-coding regions. Functional studies can complement GWAS results by identifying specific genes whose expression is associated with disease phenotype. Two such approaches are expression quantitative trait locus (eQTL) analysis, which identifies associations between risk genotype and gene expression, and transcriptome-wide association studies (TWASs), which assess associations between gene expression and disease risk across the transcriptome.
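
The eQTL idea mentioned above can be illustrated with a minimal sketch: expression of a gene is regressed on risk-allele dosage (0, 1, or 2 copies) by ordinary least squares, and a non-zero slope suggests a genotype-expression association. The function name `eqtl_slope` and the toy data are assumptions made for illustration only; they are not taken from any cited study, and real eQTL analyses also adjust for covariates and correct for multiple testing.

```python
from statistics import mean

def eqtl_slope(dosages, expression):
    """Return (slope, intercept) of an OLS fit of expression on allele dosage."""
    if len(dosages) != len(expression):
        raise ValueError("dosages and expression must have the same length")
    mean_x = mean(dosages)
    mean_y = mean(expression)
    # Sum of squared deviations of the genotype dosages
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    if sxx == 0:
        raise ValueError("no genotype variation; slope is undefined")
    # Sum of cross-deviations between dosage and expression
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, expression))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical toy data: six individuals, risk-allele dosage and lncRNA expression
dosages = [0, 0, 1, 1, 2, 2]
expression = [1.0, 1.2, 2.0, 2.2, 3.0, 3.2]

slope, intercept = eqtl_slope(dosages, expression)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
```

In this toy data set, each additional copy of the risk allele is associated with roughly one unit higher expression; a slope near zero would indicate no detectable eQTL effect.
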
One of the few studies to explore lncRNAs associated with prostate cancer GWAS SNPs found that prostate cancer-associated SNPs are less polymorphic in the flanking regions, whereas SNP density was similar in protein-coding and lncRNA gene regions, indicating that lncRNA sequences are evolutionarily conserved [28][37]. This study reported that 52 loci were located within lncRNA genes, including a new prostate cancer risk-related SNP, rs3787016, in a predicted lncRNA, AC1127096.1 [28][37].
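
Determining whether a risk locus is "located within" an lncRNA gene amounts to an interval-containment check on genomic coordinates. The following is a minimal sketch under stated assumptions: the `Gene` and `Snp` types, the function `snps_within_genes`, and all identifiers and coordinates are hypothetical and chosen for illustration; they do not reproduce the cited study's method or data.

```python
from typing import NamedTuple

class Gene(NamedTuple):
    name: str
    chrom: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive

class Snp(NamedTuple):
    rsid: str
    chrom: str
    pos: int    # 1-based

def snps_within_genes(snps, genes):
    """Map each SNP ID to the names of the genes whose interval contains it."""
    hits = {}
    for snp in snps:
        names = [
            g.name
            for g in genes
            if g.chrom == snp.chrom and g.start <= snp.pos <= g.end
        ]
        if names:
            hits[snp.rsid] = names
    return hits

# Hypothetical identifiers and coordinates, for illustration only
genes = [
    Gene("lncRNA_demo_A", "chr1", 1000, 5000),
    Gene("lncRNA_demo_B", "chr2", 200, 900),
]
snps = [
    Snp("rs_demo_1", "chr1", 2500),  # inside lncRNA_demo_A
    Snp("rs_demo_2", "chr1", 7000),  # outside every gene
    Snp("rs_demo_3", "chr2", 900),   # on the inclusive end of lncRNA_demo_B
]

hits = snps_within_genes(snps, genes)
print(hits)
```

The nested scan is adequate for small lists; genome-scale annotation would normally use a sorted index or interval tree instead.
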
Prostate cancer risk-associated SNPs rs11672691 and rs887391 were found to regulate two PCAT19 lncRNA isoforms with distinct transcription start sites, PCAT19-short and PCAT19-long, through a promoter-to-enhancer switching mechanism [29][63]. The rs11672691 SNP on chromosome 19 is associated with both non-aggressive and aggressive prostate cancer risk [30][64], prostate cancer-specific mortality [31][65], and poor prognosis after diagnosis [29][63]. PCAT19-long promoted prostate cancer progression by interacting with a nuclear ribonucleoprotein, Heterogeneous Nuclear Ribonucleoprotein A/B (HNRNPAB), to upregulate a subset of cell-cycle genes [29][63], suggesting a novel mechanism for the role of HNRNPAB in prostate cancer progression.

3. Breast Cancer Risk-Associated SNPs Modulating lncRNAs

Breast cancer is the most commonly diagnosed cancer in females worldwide. It is a heterogeneous disease at the molecular and clinical level, and has four distinct subtypes: Luminal A, Luminal B, human epidermal growth factor receptor 2 (HER2) overexpression, and triple-negative, based on the status of estrogen receptor (ER), progesterone receptor (PR), and HER2 [32][33][69,70]. Breast cancer GWASs have identified more than 200 risk loci, including differential associations with ER+, ER−, or triple-negative breast cancer [6][34][35][7,71,72]. A transcriptome-wide association study by Wu et al. identified 26 lncRNAs through eQTL analysis of breast cancer risk loci [36][73]. The functional role of three of these lncRNAs, RP11-218M22.1, RP11-467J12.4, and CTD-3032H12.1, was confirmed by a significant reduction in cell proliferation upon lncRNA knockdown in three breast cancer cell lines, 184A1, MCF7, and T47D, and by reduced colony-forming efficiency in MCF7 cells. RP11-467J12.4, also known as PR-lncRNA-1, is mainly localized in the nucleus and is regulated by P53 in human and mouse cells [37][74]. Using a tissue-specific co-expression regulatory network model, lncRNA CTD-3032H12.1 is predicted to interact with another lncRNA, RP11-20F24.2, and with the mRNA of ANKRD30A, a transcription factor implicated in breast cancer progression [38][75]. Another well-known cancer-associated lncRNA, H19, a maternally inherited imprinted gene, is reported to be overexpressed in breast cancer and associated with poor prognosis in breast cancer patients, especially in the triple-negative molecular subtype [39][40][89,90].

4. Conclusions

Recently, there has been remarkable progress in understanding the multifaceted role of lncRNAs and the genetic variants impacting lncRNA expression and function, recognizing them as critical players in prostate and breast cancer progression. Although a majority of these cancer risk-associated genetic variants are found in non-coding RNA loci, only a few studies have focused on uncovering the role of these SNPs in modulating the structure and function of lncRNAs in cancer progression. Emerging sequencing techniques and bioinformatic analyses are helpful in predicting the putative function of lncRNAs. Databases such as lncRNASNP2 [41][94] and LincSNP 3.0 [42][95] provide information on how these SNPs modulate lncRNA structure and function. Some of these lncRNAs are differentially expressed in disease progression models and cancer subtypes, highlighting their potential as diagnostic and prognostic biomarkers.