Genomics of Endometriosis: History

Endometriosis is a gynecologic disease affecting women of reproductive age. Its precise prevalence is in fact unknown, but classically estimated at around 10%. It is characterized by two major clinical manifestations: pain and infertility.

  • endometriosis
  • genome-wide association studies
  • exome sequencing
  • missing heritability
  • infertility

1. Introduction

Endometriosis is a gynecologic disease affecting women of reproductive age. Its precise prevalence is in fact unknown, but classically estimated at around 10%. It is characterized by two major clinical manifestations: pain and infertility. Among patients consulting for chronic pelvic pain or for infertility, endometriosis is detected in one fourth and one third of the patients, respectively [1].

Pain associated with endometriosis generally increases in intensity at specific moments (during menstruation: dysmenorrhea, dyschezia, lower urinary tract symptoms; and/or during intercourse: dyspareunia) or occurs continuously, although pain can also be absent [2].

The ovarian follicles of women with endometriosis could themselves be less capable of normal zygotic development. This could be due to a specific accumulation of inflammatory molecules in the oocytes of these women. A recent metabolomic study using proton nuclear magnetic resonance (1H-NMR) compared 50 patients with Deep Infiltrating Endometriosis (DIE) with patients with tubal obstruction infertility as controls, and highlighted a molecular composition pointing to mitochondrial anomalies and oxidative stress [4]. Finally, pelvic inflammation caused by the presence of endometriotic lesions may impair sperm progression through the fallopian tubes as well as fertilization [3].

Histologically, endometriosis is characterized by the presence of endometrial glands and stromal tissue that develop as endometrium-like structures outside the uterus. Generally, the organs affected are the ovary (endometrioma, OMA), the peritoneum (superficial endometriosis, SUP), the retroperitoneum and the anatomic structures located near the uterus, for instance the uterosacral ligaments, bladder, rectum and ureters (Figure 1). In addition, the lesions can affect the myometrium and then constitute adenomyosis lesions, which share many features with ‘classical’ endometriosis and were once called endometriosis interna.

Three well-recognized phenotypes occur: superficial peritoneal lesions (SUP), ovarian endometriomas (OMA) and deep infiltrating endometriosis (DIE). DIE is defined as subperitoneal lesions that penetrate deeper than 5 mm under the peritoneal surface (such as the uterosacral ligaments) or as lesions that infiltrate the muscularis propria of the organs that surround the uterus (for example bladder or intestine). Less frequently, endometriosis can occur at extragenital locations. Endometriosis is stratified by the American Society for Reproductive Medicine (ASRM) classification into four stages (I, II, III and IV) according to surgical evaluation of the size, location and severity of endometriotic lesions (superficial or deep) and the extent of adhesions [5,6].

The heterogeneity of the disease, with three endometriosis phenotypes, the possibility of asymptomatic disease and the potential comorbid presence of adenomyosis can complicate diagnosis. History-taking by patient interviews is essential for diagnosing endometriosis [11]. Pelvic pain is the cardinal symptom of endometriosis in different forms (for example, dysmenorrhea, dyspareunia or chronic pelvic pain), with the potential for overlapping symptoms [12]. Moreover, such pain can be associated with non-gynaecological symptoms (particularly urinary and/or digestive).

Of note, the cyclic nature of the pain is a key feature of the disease [14]. Moreover, during clinical examination, health-care professionals should check for the following abnormalities: visible bluish lesions on the vaginal fornix and palpable sensitive nodules. However, a normal physical examination does not rule out endometriosis [15]. Physical examination during menstruation may improve detection, but is no longer performed today [16].

Medical imaging has led to substantial improvements in the diagnosis of endometriosis. Importantly, transvaginal ultrasound (TVUS) and MRI are suitable not only for diagnosing the disease but also for distinguishing at least two phenotypes of endometriosis (OMA and DIE). TVUS should be the first-line imaging approach for the evaluation of suspected endometriosis [21,22]. An updated view of endometriosis diagnosis tends to consider that history-taking and medical imaging are sufficient to propose a therapeutic strategy [11].

The origin of endometriosis lesions is debated, but the most classical hypothesis was posited in 1927 by J.A. Sampson [24], who also coined the term ‘endometriosis’. Today, it can be interpreted as the presence, in the retrograde menses, of progenitor cells that have ‘memorized’ the uterine development program and are able to ‘restart’ it at ectopic positions. Notably, while it is estimated that >90% of women have retrograde menstruation, only 10% develop endometriosis lesions, suggesting that specific mechanisms differentiate the patients. Also, in rare cases, endometriosis lesions occur in organs located above the diaphragm, such as the lungs or even the brain [25].

The heritability of endometriosis has been estimated at 50%, indicating that genes are an important explanation of the disease etiology [26,27]. Identifying such genes and gene variants is a prerequisite for understanding the physiopathology and paves the way to a personalized medicine approach. Finding these genes remains a considerable challenge for many complex human diseases.

Seemingly, the most straightforward approach to identifying such genes is to analyze the variation occurring in candidate genes, that is, genes whose function is presumed to be related to the pathology in a broad sense. Another approach is familial positional cloning: the idea is to screen families with polymorphic markers covering the chromosomes. These markers were for a long time microsatellites, but have now been almost completely supplanted by Single Nucleotide Polymorphisms (SNPs), which, while harboring a much lower PIC (Polymorphic Information Content), are considerably more numerous (the human genome is estimated to encompass several tens of thousands of dinucleotide repeat microsatellites, while SNPs number in the millions). For about ten years now, the drop in genotyping costs (less than $0.0001 per SNP in 2020) has triggered the idea of applying these approaches to non-familial situations, leading to the concept of GWAS (Genome-Wide Association Study), in which two large populations (cases versus controls) are compared statistically for a large set of SNPs (in the range of one million). Besides the SNPs accessible in GWAS, other sources of variation include Copy Number Variations (CNVs); indeed, three CNVs were reported as associated with endometriosis, but they will not be the topic of this review [28,29].
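To make the case-versus-control comparison concrete, the following is a minimal sketch of a single-SNP allelic association test (Pearson chi-square on a 2x2 table of allele counts, with the corresponding odds ratio). It is only an illustration: the function name and the allele counts are hypothetical, and published GWAS generally rely on dedicated software and regression models that adjust for covariates such as ancestry.

```python
import math

def allelic_association(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Chi-square test (1 degree of freedom) and odds ratio for one SNP.

    Inputs are allele counts (two alleles per individual) in cases and controls.
    """
    n = case_alt + case_ref + ctrl_alt + ctrl_ref
    row_case = case_alt + case_ref
    row_ctrl = ctrl_alt + ctrl_ref
    col_alt = case_alt + ctrl_alt
    col_ref = case_ref + ctrl_ref
    # Pearson chi-square for a 2x2 table, without continuity correction
    chi2 = n * (case_alt * ctrl_ref - case_ref * ctrl_alt) ** 2 / (
        row_case * row_ctrl * col_alt * col_ref
    )
    # Survival function of a chi-square distribution with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    odds_ratio = (case_alt * ctrl_ref) / (case_ref * ctrl_alt)
    return chi2, p_value, odds_ratio

# Hypothetical counts: 2000 cases (4000 alleles) and 5000 controls (10,000 alleles)
chi2, p_value, odds_ratio = allelic_association(1200, 2800, 2500, 7500)
print(f"chi2={chi2:.2f}, p={p_value:.2e}, OR={odds_ratio:.2f}")
```

In a GWAS, a test of this kind is repeated for every genotyped SNP, which is why the significance threshold has to be corrected for the very large number of tests.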

2. Genome-wide Association Studies

A list of significant SNPs from the different studies is provided as Table 2.

This study was based upon 1907 cases and 5292 controls and revealed an association between endometriosis and CDKN2B-AS1, on chromosome 9p21, as well as a trend towards association with WNT4, a gene of the WNT-beta catenin cascade previously shown to be directly involved in female sex determination [50]. A second one, starting from Caucasian women (Australia and UK) with 3194 cases and 7060 controls, led to the discovery of rs12700667 on chromosome 7p15.2, in an intergenic region near NFE2L3. At p < 10−5, several SNPs were found inside and near the gene IL1A, and downstream of RHOU (Ras Homolog Family Member U); despite the limited size of the sample set in this paper, at the threshold chosen, there was a slight increase in the number of putatively significant SNPs compared to that expected by mere chance (36 vs. 28). The increased size of later meta-analyses made it possible to systematically find SNPs that were robust enough to reach genome-wide significance (generally established at p < 5 × 10−8).

A meta-analysis of 11 GWAS of endometriosis was performed in 2017 [41] and made it possible (through the mechanical increase in sample size: 191,596 controls and 17,045 endometriosis cases) to enrich the list of genes with five additional loci (FN1, CCDC170, ESR1, SYNE1 and FSHB), leading in 2017 to a list of 19 SNPs associated with the risk of endometriosis at the genome-wide level (i.e., a p value below 5 × 10−8). For the five novel genes, the relative risks were 1.06, 1.09, 1.10, 1.15, and 1.11, respectively. When the analysis was carried out on stage III-IV endometriosis, the relative risks ranged between 1.15 and 1.35. Again, each variant explained only a small portion of the variance.
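The genome-wide threshold of 5 × 10−8 is commonly understood as a Bonferroni correction of a 0.05 significance level for roughly one million independent tests. The short sketch below only illustrates that arithmetic; the number of independent tests is a conventional assumption, not a figure taken from the studies cited here, and the helper name is hypothetical.

```python
ALPHA = 0.05                   # family-wise error rate targeted
N_INDEPENDENT_TESTS = 1_000_000  # assumed number of independent common variants

# Bonferroni correction: divide the target error rate by the number of tests
genome_wide_threshold = ALPHA / N_INDEPENDENT_TESTS

def is_genome_wide_significant(p_value, threshold=genome_wide_threshold):
    """Return True when a SNP p value passes the corrected threshold."""
    return p_value < threshold

print(genome_wide_threshold)               # approximately 5e-08
print(is_genome_wide_significant(1e-9))    # passes the threshold
print(is_genome_wide_significant(1e-5))    # suggestive only
```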

Amongst the large GWAS, starting from partially consanguineous populations is an interesting alternative, since isolation and genetic drift may combine the advantages of linkage studies in families with those of large-scale GWAS based upon the genetic analysis of populations. As such, the Icelandic population has been a paradigmatic tool for deciphering the fundamental genetic bases of single-gene as well as polygenic diseases, with the systematic collection of DNA samples from the entire population, which was maintained at less than 50,000 individuals for more than 1000 years [51]. One large study of endometriosis was undertaken in the Icelandic population [52], with 1840 cases and 129,016 control women. Interestingly, 9 out of 11 of the loci identified in the GWAS published up to that date were confirmed at a nominal p-value of p < 0.05 (WNT4, GREB1, ETAA1, NFE2L3 [2 SNPs], CDKN2B-AS1 [2 SNPs], VEZT). This is far below genome-wide significance, but suggestive as a replication finding, since it is based upon much less multiple testing. RND3 and ID4 could not be confirmed in this 2016 study.
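Why 9 nominal replications out of 11 is suggestive can be illustrated with a simple binomial calculation. The sketch below asks how likely it would be to see at least 9 of 11 loci reach p < 0.05 if none were truly associated. It assumes the 11 tests are independent, which is a simplification, and the function name is hypothetical.

```python
from math import comb

def prob_at_least(k, n, p):
    """Probability of at least k successes in n independent trials of probability p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Chance of >= 9 of 11 loci passing p < 0.05 under the null hypothesis
chance = prob_at_least(9, 11, 0.05)
print(f"{chance:.2e}")
```

Under these assumptions the probability is vanishingly small, which is why nominal significance is informative in a replication setting even though it would be meaningless in a genome-wide scan.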

The Sardinian population is another isolated human population of particular interest for genetic studies, theoretically allowing relevant associations to be found with a limited number of samples. In the Angioni study [53], the DNA of 72 women was collected (41 with symptomatic endometriosis and 31 controls). A more systematic search for coding variants (11) of VEZT was undertaken in Australian women in 2016 [54], in connection with the expression of this gene (located in 12q22). The expression level was also examined in endometrial glands, showing an increase connected to the secretory phase of the uterus in the glandular epithelium.

Curiously, besides the large-scale GWAS carried out on tens or hundreds of thousands of DNA samples, some small-scale studies were performed with only hundreds of DNA samples; these may nevertheless be of interest, as they are based upon specific populations or specific phenotypic criteria. On chromosome 6, a series of SNPs in strong linkage disequilibrium were associated with endometriosis, especially in C2 and HLA-DRA. Strikingly, none of the 22 SNPs previously found in larger-scale studies and other previous studies [34,35,41,55,56,57,58,59,60] could be retrieved in this specific paper. A pooled approach on Brazilian samples pointed to KAZN and LAMA5 (394 infertile women with endometriosis and 650 fertile controls) and was reproduced in a relatively limited number of human samples [45].

Since endometriosis symptoms are not specific (pain, infertility), one can hypothesize that the underlying genetic risk factors are shared with other diseases (although this may also be due to shared environmental effects). This has recently been explored by crossing GWAS analyses for other phenotypes, especially pregnancy disorders, including infertility manifestations such as Recurrent Implantation Failure (RIF) or Recurrent Pregnancy Loss (RPL), as shown in a recent meta-analysis of gene profiling studies [61]. There are common alterations of endometrial functions, such as cell-cycle alterations, but also alterations in ciliogenesis and, in RIF and RPL, anomalies in the expression of genes involved in mitochondrial function. While this study is not a GWAS, it shows that common deregulations could be at work to explain, at least partly, the connections between infertility and endometriosis.

Nevertheless, obesity (which, given this observation, could be considered protective), which was found to be a causal risk factor in uterine endometrial cancer through Mendelian Randomization [63], was not directly associated (negatively or positively) with endometriosis, and gene expression alterations did not differ between obese women with endometriosis and others [64]. The association between endometriosis and maternal BMI has also been addressed through the analysis of common susceptibility loci [36]. Another study, based upon two-sample Mendelian Randomization, recently attempted to identify associations of specific biological parameters with endometriosis, using three methods (weighted median, WM; MR-Egger, MRE; and inverse-variance weighting, IVW). Besides, significant SNPs linking endometriosis with other phenotypes were found for sex-hormone levels, age at menopause, age at menarche and, again, the length of the menstrual cycle.
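As an illustration of the inverse-variance weighted method mentioned above, the sketch below combines per-SNP effects on an exposure and on an outcome into a single causal estimate, using the standard fixed-effect IVW formula. The function name and the summary statistics are hypothetical and are not taken from the cited study; the weighted median and MR-Egger methods are not shown.

```python
import math

def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance weighted estimate for two-sample MR.

    Each SNP j gives a ratio estimate beta_outcome[j] / beta_exposure[j],
    weighted by beta_exposure[j]**2 / se_outcome[j]**2.
    """
    numerator = sum(
        bx * by / se**2 for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome)
    )
    denominator = sum(bx**2 / se**2 for bx, se in zip(beta_exposure, se_outcome))
    estimate = numerator / denominator
    standard_error = math.sqrt(1.0 / denominator)
    return estimate, standard_error

# Hypothetical summary statistics for three instrument SNPs
beta_x = [0.10, 0.20, 0.15]     # SNP effects on the exposure
beta_y = [0.05, 0.10, 0.075]    # SNP effects on the outcome
se_y = [0.01, 0.02, 0.015]      # standard errors of the outcome effects

estimate, standard_error = ivw_estimate(beta_x, beta_y, se_y)
print(f"IVW estimate={estimate:.3f}, SE={standard_error:.4f}")
```

IVW assumes that all instruments are valid; MR-Egger and the weighted median relax this assumption in different ways, which is presumably why several methods are reported side by side.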

Amongst the unexpected associations that have now been found through the cross analysis of various GWAS data, endometriosis and depression were found to share similar risk loci, linked with gastric mucosa abnormality [66]. Other associations link endometriosis with migraine [46]. In this case, the analysis of concordant SNPs revealed highly significant overlaps, especially for the 1% most significant SNPs present in both endometriosis and migraine (85) compared to those that were discordant (43), revealing an odds ratio of 3.61 compared to the null hypothesis (p = 7.2 × 10−4).

In some respects, leiomyomas resemble adenomyosis lesions, in that the lesions arise through the uterine wall rather than from material transiting through the Fallopian tubes. The systematic analysis by GWAS of leiomyoma-associated variants was undertaken in 2019 [48], starting with 16,595 cases and 523,330 controls, and led to the identification of 21 variants in 16 loci associated with the disease. Among 208 patients with symptomatic, histologically proven leiomyoma, 181 had concomitant endometriosis lesions. More recently, a GWAS analysis including 35,474 leiomyoma cases and 267,505 controls identified 8 loci at a genome-wide significant level (p < 5 × 10−8), adding to the 21 reported loci.

The association between cancer and endometriosis has been addressed systematically in [68], strengthening the idea of an association with clear cell ovarian carcinoma (OR = 3.44), endometrioid cancer (OR = 2.33), thyroid cancer (OR = 1.39), and marginally with breast cancer (OR = 1.04), while no association was found with endometrial cancer or other cancers. Concerning the genetic link with ovarian cancer, the top 38 endometriosis-associated SNPs identified in the Nyholt study in 18 regions were tentatively associated with this type of cancer [44]. The strongest burden statistic was on chromosome 1p36 (the region encompassing ZBTB40, WNT4 and CDC42), with two types of ovarian cancer (clear cell carcinoma and high-grade serous carcinoma, although epidemiologically, endometriosis does not constitute a risk factor for the latter type of cancer). Interestingly, a Loss Of Heterozygosity (LOH) was detected in the PTEN region at 10q23.3 in endometriosis lesions, which revealed somatic variants, as well as preexisting germline variants, putatively associated with decreased expression and with the development of a common risk situation for ovarian cancer and endometriosis [70].

Another recent paper attempted to link gynecologic diseases and endometriosis in the Japanese population without finding any significant SNP in endometriosis, probably due to the limited number of endometriosis cases in this study [72]. The other gynecologic diseases analyzed included fibroids, ovarian cancer, and uterine endometrial cancer, for which this study did identify significant SNPs, despite relatively limited sample sizes as well.

In 2016, a study by Horikoshi and coworkers [73] tried to connect birth weight-influencing genes (60 genes identified) with various human diseases or parameters, such as blood pressure, diabetes, coronary heart disease, but also endometriosis. No association with endometriosis was found at the genome-wide threshold (~5 × 10−8), suggesting that birth weight is not connected to the risk of developing endometriosis later. A similar GWAS aiming to identify genes influencing reproductive behaviour likewise failed to find genes shared with endometriosis risk, although GATAD2B and ESR1 were found to be the most likely causal genes and are indeed increased in expression in endometriosis lesions compared to the eutopic endometrium [74]. The ESR1 region is well known to be associated with reproductive disorders; genes of the ESR1 region are correlated with ESR1/PGR gene expression levels and PG concentration [75].

Another study determined associations between endometriosis significant GWAS SNPs and other reproductive phenotypes [76]. This study found significant SNPs associated with dysmenorrhea that were also identified in endometriosis. This was true for CCDC170-ESR1 (rs6557160), a locus previously identified with the SNP rs1971256 by Sapkota and coworkers [41] and confirmed IL1A

Recently, in the Icelandic population, Olafsdottir and coworkers [77] revealed an association of rs3820282 (in an intron of WNT4, probably encompassing the response to estrogen signaling) with pelvic organ prolapse, thus linking this gene with endometriosis, leiomyoma, gestational duration and, as mentioned above, the early stages of female sex determination [50].

These attempts to connect endometriosis with other diseases therefore led to varied results in terms of finding actual intersections (strong for leiomyomas and weak for gynecologic cancer predisposition). This suggests that a clear phenotypic definition is warranted for GWAS to perform better in finding relevant genes. Conversely, it clearly suggests that ‘endometriosis’ is rather a compendium of conditions with similar manifestations, hiding subtle, different and perhaps complementary underlying genetic causes (as gene or genome variations).

In complex traits and complex diseases, variants associated with alterations of gene expression, located either near the gene whose expression is modified (in cis) or at long distances or even on other chromosomes (in trans), have been systematically studied when expression data were available together with the genotypes. One recent study in Taiwan [78] presented novel candidate genes (PTPRD on chromosome 9 and two other non-coding regions on chromosomes 14 and 15). Associating expression profiling with SNPs also led to the identification of eQTL, and this information, collated with the position of genes identified by GWAS, made it possible to connect genetic variation with expression, to classify the patients according to gene expression levels and to find eQTL located in cis or in trans (45,923 and 2,968, respectively, corresponding to 417 genes and 82 genes, respectively [79]). In this paper, endometrial eQTL signals (associated with expression alterations during the menstrual cycle) were tentatively connected with endometriosis, but also with PCOS and endometrial cancer.
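The distinction between cis and trans eQTL can be sketched as a simple positional rule. The example below labels a SNP-gene pair as cis when both lie on the same chromosome within a fixed window around the gene's transcription start site, and trans otherwise. The 1 Mb window is a commonly used convention assumed here for illustration, not necessarily the definition used in the cited studies, and the function name and coordinates are hypothetical.

```python
def classify_eqtl(snp_chrom, snp_pos, gene_chrom, gene_tss, cis_window=1_000_000):
    """Label a SNP-gene pair as a 'cis' or 'trans' eQTL.

    cis: same chromosome and within cis_window base pairs of the gene's
    transcription start site (TSS); trans: everything else.
    """
    same_chromosome = snp_chrom == gene_chrom
    if same_chromosome and abs(snp_pos - gene_tss) <= cis_window:
        return "cis"
    return "trans"

# Hypothetical coordinates, for illustration only
print(classify_eqtl("chr1", 22_100_000, "chr1", 22_400_000))   # cis
print(classify_eqtl("chr1", 22_100_000, "chr1", 80_000_000))   # trans (too distant)
print(classify_eqtl("chr1", 22_100_000, "chr9", 22_000_000))   # trans (other chromosome)
```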

Recent progress in determining the gene/protein cascades specifically altered in disease offers a novel ‘systems biology’-based approach to enrich the knowledge base, for instance in endometriosis. Interestingly, a 2017 study revealed a relatively strong association of rs144240142, inside the MAP3K4 gene (OR = 1.71), but specifically with the mildest forms of the disease (rAFS I/II), and MAP3K4 was differentially expressed according to the stage of the endometriosis. This signalling cascade plays multiple roles in cell physiology (cell division, gene expression, cell movement and survival) and had not been pointed out before; it was detected despite the relatively limited size of the experimental setting (3194 cases and 7060 controls) compared to the original dataset encompassing almost 200,000 controls [40].

3. Functional Studies

Finding significant SNPs using GWAS approaches is currently straightforward, given the huge number of SNPs that can be genotyped simultaneously (in the million range), provided that a collection of DNA samples is available from enough controls and cases. More precisely, it was estimated in 2012 that the total variation tagged by frequent SNPs as used in GWAS was 0.26, i.e., about half of the total genetic variation [55]. Missing heritability is currently explained by various hypotheses. One of the most prominent rests on the idea that GWAS are carried out using microarray platforms encompassing SNPs with a relatively high Minor Allele Frequency (MAF) and hence will miss rare alleles that may be the ones actually associated with the disease. The question of functional follow-up was systematically addressed in a 2015 review by Fung and co-workers, which showed that once the variant is identified, a long and tedious journey commences: refined mapping with additional SNPs in the region identified; a study of existing functional annotations (which is difficult when non-coding or intergenic SNPs are found, a case encountered for the chromosome 7 rs12700667 in endometriosis); measurement of cell-type specific gene expression and protein levels; analysis of the cell-type specific local epigenetic regulation; and eventually cell models and animal models, if available, which is a complicated issue in the case of a human-specific disease such as endometriosis [92].

In this context, vezatin (VEZT) has been validated in several replication studies through the validation and further analyses of the rs10859871 and rs14121 SNPs. In 2016, Holdsworth-Carson and coworkers demonstrated that this SNP acts as an eQTL, especially in endometrial tissue of endometriosis patients, with the A allele at rs14121 VIIA. Adherens junctions genes are generally strongly up-regulated in endometriosis compared with eutopic endometrium [10], suggesting that it could contribute to a relatively low potential of development of the endometriotic lesions. This has sometimes led to endometriosis being described as a benign metastatic disease, an oxymoronic formulation.

For GREB1, however, the protein quantification did not reveal obvious differences between endometriosis and control patients. Several explanations are proposed by the authors, such as the presence of mixed populations in the tissue sample. The figures in the paper show a massive dispersion of the normalized signals (mRNA and proteins), probably barring the identification of statistical differences.

One of the most thorough mechanistic analyses in endometriosis has been published for the locus proximal to 9p21 and encompassing the gene CDKN2B-AS1, first identified by Uno and coworkers and duly confirmed later [30,34,42,52]. from the original study [30], and 7 SNPs and 1 indel in total linkage disequilibrium with rs1537377, 49 kb downstream [55], this one being informative in both Japanese and European populations. The DHS overlap with binding sites for TCF7L2, as well as H3K27ac, EP300 and TBP, and these interactions were validated by ChIP-seq experiments, which also made it possible to pinpoint an excess of binding of the G versus the T allele at rs17761446. Further, induction of the WNT cascade by pharmacological treatment with CHIR99021 led to an induction of ANRIL and to a decrease of cell cycle inhibitors (p16INK4A and p15INK4B).

The locus has been robustly and repeatedly confirmed in endometriosis. The SNP rs3820282, located in the first WNT4 intron, acts as a cis-eQTL strongly affecting the expression of LINC00339 in the blood and the endometrium. In the Powell study, the authors studied two putative regulatory elements located inside CDC42 (PRE2) and inside WNT4 (PRE1) through 3C approaches, which make it possible to detect distant interacting regions. In the case of PRE2, the insertion of specific variants at rs12038474 induced either a super-activation of CDC42 promoter activity or a decreased activity of the same promoter.

In summary, the validation of SNPs is only starting in the endometriosis context. Many approaches involving genome editing, systematic sequencing, and the use of animal models or organoids [96,97] will in the future be consistently used to solve the mechanistic issues raised by the identification of the SNPs.

4. Exome Sequencing

In 2016, this original approach made it possible to detect hemizygous deletions in two genes, UGT2B28 and USP17L2, using three-generation families [98]. The two genes harbour hemizygous deletions that were traced to the grandmother of the family. The first gene is involved in reactions in which glucuronic acid is conjugated to lipophilic substrates, while the second is a de-ubiquitinase and therefore probably acts by reversing the trajectory of proteins destined for degradation by the proteasome. The ultimate validation of these approaches is extremely challenging, since proving the involvement of genes requires the use of animal models or at least strong cell biology cues from cell culture experiments.

Matalliotakis attempted to evaluate five GWAS-identified variants in a familial structure (inside WNT4, VEZT, FSHB and two inside IL16). These data suggest that the identification of SNPs through GWAS approaches may point to genes or SNPs that are not so important in familial cases. It could be hypothesized that in familial forms of endometriosis (which are in fact the basis of the estimation of heritability), the risk variants have a major determining effect and are located high in the upstream cascade leading to the disease. On the contrary, GWAS point to robustly identified (often replicated) genes that may nevertheless have a marginal effect in the aetiology.

In 2014, Li and coworkers undertook an analysis of endometriosis patients using an exome-seq approach [100]. The authors used blood, eutopic endometrium and ectopic endometrium DNA from 16 endometriosis patients, and normal endometrium from 5 healthy women. Given its limited sample size, the aim of this study was not to discover predisposing genes inherited through the germline but rather to identify genes that are prone to somatic mutations associated with the pathogenesis. Apparently, no overlap could be found with the GWAS-identified genes so far.

This entry is adapted from the peer-reviewed paper 10.3390/ijms22147297
