Biological systems respond to perturbations by rewiring molecular interactions, which are organised in gene regulatory networks (GRNs). Among biological networks, the increasing availability of transcriptomic data has made gene co-expression networks the most widely used. Differential co-expression networks are useful tools for identifying changes that arise in response to an external perturbation, such as mutations predisposing to cancer development, and that lead to changes in the activity of gene expression regulators or signalling. They can help explain the robustness of cancer cells to perturbations and identify promising candidates for targeted therapy, while providing higher specificity than standard co-expression methods.
Biological systems are complex in nature: their behaviour is governed by the interactions of many molecular components (e.g., coding and non-coding RNAs, proteins) across several regulatory layers [1] (e.g., promoter binding, miRNA–mRNA interaction, post-translational modifications). Cancer is no exception, being the result of multiple perturbations within a single cell that also affect cell–cell and cell–microenvironment communication. No perturbation acts in isolation; each is influenced by, and in turn influences, the whole system, with reciprocal relationships occurring between most components [2].
Therefore, accurately depicting a biological system such as a cancer cell requires knowledge of the interactions between its elements, and in particular of the regulatory layer described by gene regulatory networks (GRNs). GRNs, like all biological networks, represent biological components as nodes and their interactions, either physical or functional, as edges (Figure 1). GRNs are the ideal reconstruction of interactions between genetic elements, comprising the activity of transcription factors (TFs) on their targets’ expression, post-translational modifications influencing a protein’s impact on other elements of the network, epigenetic modifications altering transcription and many additional levels of regulation. While the overall goal of most biological network studies is the inference of GRNs, this is a complex and laborious task that is generally approached by making simplifying assumptions and by analysing one kind of relationship at a time, based on physical or other kinds of interactions, as described below.
Figure 1. Example of a network, indicating nodes, edges, centrally located genes (hubs) and groups of tightly connected genes (modules).
One caveat is that, although physical or genetic interactions are generally assumed to indicate shared functions or membership of the same molecular pathway(s), this is not necessarily true, since the information used to infer edges is not a direct measure of a functional relationship. Moreover, each network type estimates only part of the overall GRN structure (e.g., transcriptional regulation, protein–protein interactions) and misses the information that is invisible to the specific data type used for its construction; this information could be recovered by combining the results with those obtained from additional data sources.
The most frequently studied biological networks based on physical interactions are protein–protein interaction (PPI) networks, where nodes are proteins and links indicate direct binding, but the binding of a TF to its targets’ promoters can also be represented as a network. Pathways are a particular type of biological network, curated and deposited in repositories such as KEGG (Kyoto Encyclopedia of Genes and Genomes) [3], where nodes can be either proteins or small molecules and edges indicate a variety of interactions, including enzymatic reactions. Genetic interactions such as synthetic lethal interactions can also be studied as networks [4], on the assumption that they indicate that the two genes belong to the same pathway. Additionally, methods have been developed to build a variety of biological networks based on metabolic, single nucleotide polymorphism (SNP) and phenotypic data (reviewed in [5]).
High-throughput gene expression assays are often used to infer functional relationships between genes from correlations between their expression levels, building the so-called gene co-expression networks (Figure 1). The potential relevance of this method to estimate GRNs is supported by the knowledge that genes with similar transcriptional expression profiles are likely to be regulated through the same mechanisms and to participate in the same functions, or to physically interact [6][7][8][9]. This information can also be combined with other information such as, for example, transcription factor binding and/or PPIs, to obtain a more complete and accurate representation of molecular elements’ interrelationships [10]. Specific open-access databases have been created to store physical and functional networks.
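As a minimal sketch of how such a network can be derived, the example below computes pairwise Pearson correlations between gene expression profiles and keeps the pairs whose absolute correlation exceeds a cut-off. The function name `coexpression_adjacency`, the hard threshold and the toy expression matrix are illustrative assumptions, not the procedure of any specific study cited here; published methods often use soft thresholds or other similarity measures.

```python
import numpy as np

def coexpression_adjacency(expr, threshold=0.7):
    """Return (correlation matrix, boolean adjacency matrix).

    expr      : genes x samples array of expression values
    threshold : hypothetical cut-off on |Pearson r| for drawing an edge
    """
    expr = np.asarray(expr, dtype=float)
    corr = np.corrcoef(expr)            # Pearson correlation between gene rows
    adj = np.abs(corr) >= threshold     # edge if genes are strongly (anti)correlated
    np.fill_diagonal(adj, False)        # no self-loops
    return corr, adj

# Toy data (invented for illustration): 3 genes measured in 5 samples
expr = np.array([
    [1.0, 2.0, 3.0, 4.0, 5.0],   # gene A
    [2.1, 3.9, 6.2, 8.0, 9.9],   # gene B, closely tracks gene A
    [5.0, 1.0, 4.0, 2.0, 3.0],   # gene C, unrelated to A and B
])
corr, adj = coexpression_adjacency(expr, threshold=0.9)
```

In this toy example only genes A and B are connected; gene C remains isolated because its correlation with both is weak.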
From the analysis of topological differences between cancer and normal tissue co-expression networks, some general principles have emerged. In particular, network entropy (disorder), measured with different metrics, has been shown to increase in cancer [11][12][13], paralleling an overall decrease in connectivity [14]. Interestingly, entropy is even higher in tumours that metastasise, at least in the breast [15]. This observation may help to explain the principles underlying cancer adaptability and resistance to perturbations, such as drug treatments and hypoxia. In fact, entropy is correlated with a system’s robustness [16]. In line with this idea, cancer cell lines resistant to three different tyrosine kinase inhibitors have been shown to display higher network entropy than their sensitive counterparts [17]. This could be interpreted as cancer displaying a higher number of interconnections and possible regulatory relationships for each gene, which makes the whole network resistant to the disruption of single nodes and edges. Moreover, the relationship between entropy and robustness can point to promising drug targets, represented by low-entropy genes [11][18]; interestingly, entropy usually decreases for up-regulated genes in cancer networks [11]. A second important concept arising from these studies is the differential usage of nodes and edges in cancer: cancer networks tend to be less hub dependent, displaying signalling shortcuts in comparison with normal tissues [19][20]. These features, observed in 13 different cancer types, are suggestive of facilitated crosstalk between biological processes that are usually not interconnected, again supporting a higher robustness of cancer networks.
Although both higher entropy and higher connectivity between pathways can be interpreted as a weakening of tight regulatory rules that improves tumour adaptability, they could also reflect higher cellular heterogeneity. This idea has been confirmed by Park et al. [21], who assessed network entropy in relation to cell heterogeneity and the number of subclones, making use of single-cell data, tumour purity estimates and clonal evolution in xenograft models. Additionally, signalling entropy has been shown to be an estimate of tumour stemness [22] and a prognostic measure across several epithelial cancers.
Interpreting network entropy as linked to tumour heterogeneity would help to explain why it often decreases in cancer at advanced stages [17], consistent with the observation that initial tumour heterogeneity is subsequently reduced by clonal selection and expansion in the process of metastasis [23].
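Since the studies above measure entropy with different metrics, the following is only one possible illustration: a per-node Shannon entropy computed from the distribution of a gene’s edge weights, normalised so that a gene interacting equally with all other genes scores 1. The function name `local_entropy` and the toy weight matrix are assumptions made for this sketch and do not reproduce the metric of any cited work.

```python
import numpy as np

def local_entropy(weights):
    """Normalised Shannon entropy of each node's edge-weight distribution.

    weights : symmetric, non-negative genes x genes matrix
              (e.g. absolute correlations); the diagonal is ignored.
    """
    w = np.array(weights, dtype=float)   # copy, so the input is not modified
    np.fill_diagonal(w, 0.0)
    n = w.shape[0]
    entropy = np.zeros(n)
    for i in range(n):
        total = w[i].sum()
        if total == 0:
            continue                      # isolated node: entropy left at 0
        p = w[i][w[i] > 0] / total        # probability of "using" each edge
        if p.size > 1:
            # divide by log(n - 1), the maximum attainable with n - 1 neighbours
            entropy[i] = -(p * np.log(p)).sum() / np.log(n - 1)
    return entropy

# Toy star network (invented): node 0 is linked equally to nodes 1, 2 and 3
star = np.array([
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
])
ent = local_entropy(star)
```

Here the central node reaches the maximum normalised entropy, because its interactions are spread evenly, while each peripheral node has a single interaction partner and an entropy of zero.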
The detection of co-expression modules has been widely applied to retrieve gene categories relevant to cancer, identifying modules shared across cancer types [24]. Nevertheless, condition-specific modules detected through differential co-expression allow the study of features characterising, for example, a disease state or different stages of the same disease, and have been shown to outperform single-condition co-expression modules in identifying distinctive features of the studied biological system [25]. The analysis of the differential co-regulation of groups of genes (modules or pathways) has been carried out either exclusively using unbiased network properties or by mapping gene lists derived from prior knowledge onto the network structure. The latter approach, despite its higher robustness and interpretability, has been employed in only a few studies [26][27][28][29][30] and needs to be better explored, while the unbiased analysis has been widely applied to cancer, as described below.
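At the level of a single gene pair, differential co-expression between two conditions can be sketched as a comparison of correlation coefficients. The example below uses the Fisher z-transformation, a standard way to compare two Pearson correlations; it is shown here as an assumption for illustration, not as the method of the cited studies, and the function name `differential_coexpression` and the simulated data are hypothetical.

```python
import numpy as np

def differential_coexpression(expr_a, expr_b):
    """Z-score matrix for the change in correlation between two conditions.

    expr_a, expr_b : genes x samples arrays (same genes, same order);
                     each condition needs more than 3 samples.
    """
    expr_a = np.asarray(expr_a, dtype=float)
    expr_b = np.asarray(expr_b, dtype=float)
    n_a, n_b = expr_a.shape[1], expr_b.shape[1]
    # Clip so that arctanh stays finite for perfect correlations
    r_a = np.clip(np.corrcoef(expr_a), -0.999999, 0.999999)
    r_b = np.clip(np.corrcoef(expr_b), -0.999999, 0.999999)
    # Fisher z-transform, then standardise the difference
    z = (np.arctanh(r_a) - np.arctanh(r_b)) / np.sqrt(1.0 / (n_a - 3) + 1.0 / (n_b - 3))
    np.fill_diagonal(z, 0.0)
    return z

# Simulated data: genes 0 and 1 are correlated in condition A
# and anti-correlated in condition B; gene 2 is pure noise.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 20)
cond_a = np.vstack([x + rng.normal(scale=0.05, size=20),
                    x + rng.normal(scale=0.05, size=20),
                    rng.normal(size=20)])
cond_b = np.vstack([x + rng.normal(scale=0.05, size=20),
                    -x + rng.normal(scale=0.05, size=20),
                    rng.normal(size=20)])
z = differential_coexpression(cond_a, cond_b)
```

Large absolute values of `z` flag gene pairs whose relationship changes between conditions; module-level methods then look for groups of genes enriched in such pairs.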
In cancer biology, differential co-expression (DC) modules have been applied to the comparison of tumour and normal tissue, identifying tumour-specific modules in hepatocellular carcinoma, uveal melanoma and ovarian and prostate cancer [31][32][33]. In 12 cancer types, cancer-specific modules were shown to have prognostic value [34], while three independent studies reported immune response-related modules to be differentially co-regulated between ER+ and ER- breast cancer [35][36][37]. Immune-related modules are also differentially connected in non-small cell lung cancer when compared to normal tissue [38], possibly regulated through a miRNA-mediated mechanism. Interestingly, an independent report found an enrichment for targets of cancer-related miRNAs in two co-expression modules more strongly connected in lung cancer than in normal tissue [39]. Methods to identify modules that are differentially co-expressed across multiple networks find a particularly interesting application in the study of cancer stage-specific regulatory relationships [40][41][42]. In breast cancer, dynamically co-regulated modules improve the prediction of stage, and their hubs are enriched in signalling protein domains [41]. This observation supports the idea that context-specific hubs are signalling or regulatory molecules that tune the activity of constitutive hubs, grouped in modules of genes with similar functions [43]. Only in a few cases have in silico predictions been validated. Recently, however, the potential of differential co-expression to generate testable hypotheses has been demonstrated in a comprehensive analysis of astrocytoma progression that integrated mRNA expression, ChIP-seq and copy number variation (CNV) data [44]. Indeed, the authors were able to identify a cell cycle-enriched module predicted to be affected by resveratrol and to experimentally validate their prediction.
Finally, specific comparisons allow for the investigation of regulatory differences between tumours classified according to various parameters: survival time [45], angiogenic features [46], type of treatment [47] and genomic stress [48]. A comprehensive study compared high and low genomic stress tumours in 15 cancer types, identifying 101 modules activated by genomic stress based on CNV, expression data, and PPI networks [48]. Within these, up-regulated hubs have been proposed as non-oncogene addiction genes for further functional studies. In the same vein, the differential co-expression module multivariate analysis method MultiDCox has been applied to breast cancer, revealing gene sets associated with mutant p53, ER status and grade [49].
In cancer biology, “single-gene” approaches have been applied to prostate [50][51], gastric [52], liver [31][53], bladder [54][55], thyroid [56] and lung [57] tumours and glioblastoma [58], comparing the connectivity of genes between normal and cancerous tissue. This led to the identification of several gene lists not shared between different cancer types, as also confirmed by a systematic study performed on 12 cancer types [34]. Despite the gene-centred approach, in all studies the genes with the strongest evidence for differential connectivity were analysed as a whole and often corresponded to previously known and druggable targets [31][34][52]. Although no systematic study has compared the functional enrichment of DC genes across cancer types, recurrent Gene Ontology categories comprise cell cycle, apoptosis and immune system-related genes [50][31][53][56][57][59]. Depending on the researcher’s interest, the search for DC genes can be restricted to selected gene lists [60][61]. For example, focussing on metabolic genes, a signature of genes suggestive of mitochondrial dysfunction was found to be differentially connected in seven cancer types [61]. Again, this approach can reveal differences in connectivity between samples grouped according to any feature of interest. The comparison of patients responsive and non-responsive to a specific treatment is particularly promising for understanding the rearrangement of gene networks leading to drug resistance. By this means, well-known genes have been confirmed to confer platinum resistance (e.g., CCNE2, AKT1 and MYC), and an additional role for FGFR1 and TSC2 has been proposed [62].
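The comparison of gene connectivity between two groups of samples can be sketched as follows. The example counts, for each gene, its neighbours in a thresholded correlation network built separately for each group, and reports the difference. The function names, the hard threshold of 0.8 and the toy matrices are assumptions made for illustration; the cited studies use a variety of connectivity measures and statistical tests.

```python
import numpy as np

def connectivity(expr, threshold=0.8):
    """Number of co-expression partners of each gene (genes x samples input)."""
    corr = np.corrcoef(np.asarray(expr, dtype=float))
    adj = np.abs(corr) >= threshold     # hypothetical hard cut-off on |r|
    np.fill_diagonal(adj, False)        # a gene is not its own neighbour
    return adj.sum(axis=1)

def differential_connectivity(expr_normal, expr_tumour, threshold=0.8):
    """Per-gene change in connectivity (tumour minus normal)."""
    return connectivity(expr_tumour, threshold) - connectivity(expr_normal, threshold)

# Toy data (invented): three genes that co-vary perfectly in normal tissue;
# in the tumour, gene 0 is decoupled from the other two.
base = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
normal = np.vstack([base, 2 * base + 1, 10 - base])
tumour = np.vstack([[2.0, 6.0, 1.0, 5.0, 3.0, 4.0], 2 * base + 1, 10 - base])

delta = differential_connectivity(normal, tumour)
```

In this toy case gene 0 loses both of its partners in the tumour, and each of the other two genes loses one.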
The “gene-specific” approach finds a particularly interesting application in investigating the gene network neighbourhood of a pre-specified gene of interest, such as the tumour suppressors p53 or PTEN [35][51]. The opposing roles of the same gene in different cellular contexts, a widespread feature of cancer genes (e.g., the context-specific oncogenic and tumour-suppressive activity of Wnt5a, TGF-β or p63 [63][64][65]), can be elucidated by this means. Surprisingly, only NOTCH1 has been investigated in this way by differential co-expression, in lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC) [66]; it was predicted and then validated to be a pro-proliferative factor in LUAD, while acting as a tumour suppressor in LUSC.
Differential co-expression indicates changes in regulatory relationships between genes. This change can be mediated by a transcriptional regulatory mechanism that is directly observable in transcriptional data (e.g., expression levels of a transcription factor or cofactor), as assessed in a set of studies aimed at reconstructing GRNs [67][68][69][70]. Alternatively, changes can pertain to a different layer of molecular regulation (e.g., microRNA, DNA methylation, genomic mutations) and hence cannot be derived directly from transcriptional data. In either case, the integration of diverse data types covering different regulatory layers can greatly improve the discovery and interpretation of altered gene regulatory networks, as reported in a comparative study showing improved performance for almost all tested methods when mRNA and miRNA data were integrated [71].
Indeed, the most explored of these layers is the regulatory activity of miRNAs on mRNAs, lncRNAs and mRNA–lncRNA crosstalk. In fact, it has been shown that RNA molecules sharing miRNA response elements (MREs) can communicate with each other by competing for common miRNAs, acting as competing endogenous RNAs (ceRNAs) [72]. The mRNA–miRNA–lncRNA connection implies that an mRNA and a lncRNA are correlated in the presence of a common regulatory miRNA, and uncorrelated in its absence.
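This expectation can be illustrated with a minimal sketch that compares the mRNA–lncRNA correlation in samples with high and low expression of the shared miRNA. Splitting the samples at the median miRNA level as a proxy for the miRNA’s presence or absence, the function name `correlation_by_mirna_level` and the toy values are all assumptions made for illustration, not a published ceRNA inference method.

```python
import numpy as np

def correlation_by_mirna_level(mrna, lncrna, mirna):
    """Pearson r between mRNA and lncRNA in miRNA-high and miRNA-low samples.

    All inputs are 1-D arrays with one value per sample.
    Samples are split at the median miRNA expression (an assumption).
    """
    mrna, lncrna, mirna = (np.asarray(v, dtype=float) for v in (mrna, lncrna, mirna))
    high = mirna > np.median(mirna)
    r_high = np.corrcoef(mrna[high], lncrna[high])[0, 1]
    r_low = np.corrcoef(mrna[~high], lncrna[~high])[0, 1]
    return r_high, r_low

# Toy data (invented): the first four samples lack the miRNA,
# the last four express it.
mirna  = [0, 0, 0, 0, 5, 5, 5, 5]
mrna   = [1, 2, 3, 4, 1, 2, 3, 4]
lncrna = [3, 1, 4, 2, 2, 4, 6, 8]

r_high, r_low = correlation_by_mirna_level(mrna, lncrna, mirna)
```

In the toy data the two transcripts are strongly correlated only in the samples expressing the miRNA, which is the pattern a ceRNA relationship would be expected to produce.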