To increase the efficiency of assisted reproductive techniques (ART), molecular studies have been performed to identify the best predictive biomarkers for selecting the most suitable germ cells for fertilization and the best embryo for intra-uterine transfer.
Infertility is a global concern affecting an estimated 10% of reproductive-age couples 
. When sperm motility, morphology, or counts are affected, intracytoplasmic sperm injection (ICSI) must be used 
. Despite technological advances, successful live birth rates are far from 100%. Several markers/parameters are investigated to monitor the entire process and to select the best embryos for intra-uterine transfer to increase the rate of live births. These potential markers are linked to the biological process of follicular and oocyte development, maturation, and ovulation and include a variety of types: (A) endometrium wall thickness and volume 
, (B) hormone levels 
, (C) follicle number and size 
, (D) gene expression in granulosa 
or mural granulosa cells 
, (E) gene expression in cumulus cells 
, (F) microRNA expression in cumulus cells 
, (G) gene expression in follicular fluid cells 
, (H) zona pellucida morphology parameters 
, (I) analysis of polar body chromosome number and structure 
, and (J) morphological appearance of the oocyte 
Of the above-mentioned approaches, gene expression analysis (categories D to G) offers a promising methodology for monitoring oocyte development due to its relative simplicity. However, different studies have failed to identify universal markers linked with good developmental potentials (Supplementary Table S1
). This is possibly due to the heterogeneity of samples and to differences in their preparation because sample sources could be mural cells, cumulus cells (inner or outer), granulosa cells (precursor of mural and cumulus cells), or follicular fluid derived cells. The heterogeneity of samples in biological developmental stage and purity of sample preparation, further contributes to the inconsistency of results. Moreover, the degree of maturity of the retrieved oocyte has been found to be associated with a specific expression signature that could further cause discrepancies when comparing results between different studies 
. Likewise, differences in methodology also contribute to the variability of results, as a number of studies have applied broad genomic-based approaches to identify genome-wide expression using microarrays or RNA-seq, while others have focused on the expression of a limited number of genes related to specific pathway(s) that were shown to play important roles in oogenesis. Additionally, different treatment protocols further increase the variability of results.
Several studies performed controlled experiments to study the effect of parameters/factors that could introduce systematic bias into the results. Indeed, age 
, stimulated in vitro fertilization cycles rather than natural cycles 
, differences in treatment protocols, as well as medications used during ovarian stimulation 
were seen to have an effect. Additionally, a patient’s genetic constituents also affect the clinical output and success of treatment. For example, the influence of polymorphism in the follicle stimulating hormone receptor (FSHR) and the luteinizing hormone (LH) beta subunit 
and alternative skipping of exon 2 or 3 in the FSHR 
have been reported to affect the ovarian responses to the FSH.
2. Current Insights
One of the aims of improving the clinical results of ICSI cycles is to predict the developmental potential of the transferred embryo, thus choosing the best for transfer and leading to less psychological stress for the affected couples and cost reductions for the health system. The use of biomarkers in cumulus cells surrounding the oocyte to predict the clinical outcome is a possible non-invasive approach. To this end, we performed molecular profiling, including, for the first time, simultaneous RNA and DNA methylation profiling on outer cumulus cells retrieved from ICSI cycles.
In order to obtain informative biomarker(s) without false positive influence of interfering factors, we determined, using literature-based knowledge and our experimentally obtained data, conditions that could potentially influence the results (i.e., choice of biomarkers). The main factors influencing the expression and methylation markers were inter-individual differences and differences in treatment protocol: short vs. long.
The first could be a reflection on the genetic setup of individuals and their response to treatment. The inter-individual differences could be clearly observed (Figure 1
A) in the unsupervised PCA samples and heatmap where the samples derived from the same individual, even for two different stimulations using the same protocol, were interconnected and clustered close together. This could be explained by genetic polymorphisms affecting general expression patterns or a response to hormone stimulation treatments. It is documented that genetic variations affect the global gene expression patterns 
and that any response to induced hormonal treatment could be modulated by polymorphisms in the FSH receptor 
. Therefore, an influence of the individual genetic make-up on the treatment outcome cannot be avoided. However, it could be predicted from previously known polymorphisms, where the treatment regimens had been precisely adopted for individual genetic setup. Unfortunately, in this study, we could not investigate whether the individual-specific SNPs were directly associated with cumulus expression profile, as patient DNA (from peripheral blood) was unavailable for this study. The second observed influence was the hormonal stimulation protocol, as the differences in the unsupervised classification of samples were clearly influenced by the protocol used, short or long, which reflected the use of different stimulating/treatment regimens (Supplementary Figure S5A
). Therefore, we stratified the samples according to treatment protocol but could not do the same for the genomic/polymorphism because such information was not available.
Figure 1. Inter-individual differences in expression and methylation. (A) 3D-PCA (upper part) and heatmaps (lower part) representing expression and CpG methylation profiling of 24 individual cumulus cells derived from eight different women. Left and right parts represent expression and methylation, respectively, unsupervised and ANOVA for multiple comparisons at FDR < 5% analysis are shown. (B) Correlation between expression and methylation. Right upper part shows Venn diagram of intersection/overlap of DEG and DMC; in total 659 genes overlapped with 1043 investigated CpGs. The correlation of the overlaps is shown at the left upper part as −log10 (p-values) vs. the relative position to the transcription start site. In addition, the representative density/fitting curve is shown as a dotted line (red or blue represent negative and positive correlation, respectively). The lower panel shows correlation graphs of the highest 14 correlations (three positive and 11 negative) (The color of the dots represents the individual treatment, and only transferred samples were used in this analysis). (C) Ontology analysis of the group of genes showing strong inter-individual differences (ANOVA analysis at FDR < 0.5); both biological process and cellular component are shown. The vertical arrow indicates the TSS; others arrows are linking to the dots with the number(s). (D) Top-10 affected canonical pathways as determined by IPA. The lower x-axis represents the −log(B–H p-value) for enrichment, while the upper x-axis represents the percentages of variable genes in a given pathway. (E) Top-10 upstream regulators (predicted by IPA) are listed with the overlapping p-values and numbers of regulated genes in the data; FSH regulated proteins are represented with symbols reflecting their functions and in their subcellular localization. (F) The top categories in disease and biofunction (predicted by IPA). The top-10 significant subcategories according to p-values are listed below together with the p-values.
We identified individual single markers that were differentially expressed where NME6 and ASAP1 were under expressed in positive samples (i.e., continuing pregnancy). NME6 is the nucleoside diphosphate kinase 6, which specializes in the phosphorylation of nucleotide diphosphate, which regulates on cell growth. NME6 is ubiquitously expressed at low levels in most human tissues but is abundant in the kidney, prostate, ovary, intestine, and spleen 
. It has also been found to be particularly abundant in zebrafish ovaries 
. To date, it has not been experimentally proven whether NME6 is active in human cells. However, it seems to be localized in the mitochondria 
, suggesting a possible role in signaling by contributing to nucleoside metabolism and the ratio of different forms. The latter could itself regulate the growth of cumulus cells around the oocytes and probably the oocyte itself. ASAP1 stands for ArfGAP with SH3 Domain, Ankyrin Repeat and PH Domain 1. It plays a role in shaping the actin cytoskeleton and induces proliferation 
. It is also overexpressed in many tumors and could trigger cellular movement and the ability of cells to move and thus promote metastasis 
. As the process of oocyte maturation and ovulation necessitates cellular remodeling surrounding the oocyte it is not surprising that ASAP1 is considered one of the players indicating a proper maturation stage. However, we also saw a correlation of these two markers with endometrium size in the positive samples (Supplementary Figure S6; Supplementary Table S3
); therefore, it is questionable whether this is an effect of endometrium development with an effect on implantation that was also detected in cumulus cells or is a marker for oocyte development. It could be that endometrium development is the main determining factor to be considered and that it overwrites any oocyte factor. Future studies that group large samples based on endometrium size should be performed to determine if these two markers are endometrium or cumulus cell specific.
When comparing our gene marker results to previous results obtained from cumulus cell global expression, very little overlap could be identified (p
< 0.05) (Supplementary Figure S7 and Supplementary Table S1
). Assidi et al. (2011) 
identified seven markers as the top differentially expressed between positive and negative samples (NRP1
, DPP8, HIST1H4C
), most of which were reconfirmed by a second study by Assidi et al. (2015) 
, except for HIST1H4C
. Feuerstein et al. (2012) 
identified three markers (PLIN2
), while Xu et al. (2015) 
determined nine that were confirmed by RT-PCR (BAI2
) and Demiray et al. (2019) 
identified five markers (ZFP57
) according to clinical pregnancy and >2-fold change. Here, two observations must be highlighted: (1) among the above-mentioned top markers, there was no single marker overlap between two research sites, and (2) Assidi et al. clearly confirmed their initial study. To explain this, we proposed that laboratory-specific factors may affect results, by making single-gene markers non-transferable from one site to another. This could be related to (1) the cellular heterogeneity of isolated materials, (2) specific ethnicity of patient cohorts and genetic make-up, (3) the stimulation and triggering protocol of hormone treatment, (4) experimental procedures of isolating the tested materials, (5) data generation methods (i.e., RNA seq, microarrays or even RT-PCR), and (6) genetic heterogeneity of the patients. The three overlaps from the above-mentioned potential markers within our data are TOM1
= −0.28; p
= 0.00157), HDAC1
= 0.35; p
= 0.00239), and PTGS2
= −0.61; p
Although no significant overlap in the single-marker genes from cumulus cells across different studies was observed, the use of gene expression as a biomarker for embryo quality was also questioned by others 
. However, the increasing trend of using pathway analysis showed common affected pathways. Indeed, IPA and KEGG pathway analyses are able to show which pathways are controlled by differentially expressed genes and which distinguish positive from negative samples. In this respect, our data are in accordance with the general trend of considering apoptotic markers as predictive markers 
. Our results also showed similarly affected pathways, such as cell death and survival in the category of bio function (for inter-individual differences see Figure 1
C,D,F; for differences between positive and negative samples see Figure 2
Figure 2. Differences in expression and CpG methylation profiles between pregnant positive and pregnant negative samples of the long protocol. (A) 3D-PCA plots and heatmaps of expression and CpGs methylation profiling. Result of unsupervised and significance at p = 0.05 and FDR < 5%, when applicable, are shown. (B) Correlation between differentially expression genes and differentially methylated CpGs. The correlation of the two data overlaps is shown on the left as −log10 (p-values) vs. the relative position to transcription start site (TSS = 0; every unit is 1 Kb), representative density/fitting curve is also shown in dotted line (red or blue represents negative or positive correlation, respectively). The right panel shows individual data for the highest six correlations (three positive and three negative correlations) (the color of the dots represents the pregnancy test result with red (positive) and green (negative)). (C) Comparison between results of successive analyses of two treatments of the same individual (VS1 negative vs. VS6 positive; shown in Figure 2) on one side and comparison between all pregnancy negative and pregnancy positive samples on the other (part A above). The left part represents the 3D variables of the PCA expression data (part A above), where the overlap of the overexpressed and the underexpressed genes between the two comparisons is in a near-complete phase. The right part represents a Venn diagram showing the overlap of the differentially expressed genes for the two comparisons in question. (D) Ontology analysis of the genes with difference in expression at p = 0.05 for biological process and cellular component are shown. IPA analysis (at p < 0.05%) for top-10 affected (E) canonical pathway, (F) upstream regulators (details of FSH-affected targets are shown on the left), and (G) diseases and biofunction.