Exposure to endocrine-disrupting chemicals (EDCs) has been linked with several adverse outcomes. Transcriptome-wide analyses using RNA-Seq provide snapshots of cellular, tissue, and whole-organism transcriptomes under normal physiological and EDC-perturbed conditions. A global view of gene expression is highly valuable because it uncovers the gene families or, more specifically, the pathways that are affected by EDC exposures, and also reveals those that are unaffected. Hypotheses about genes with unknown functions can also be formed by comparing their expression levels with those of genes of known function.
1. Introduction
In the twentieth century, a large quantity of contaminants, both organic and inorganic, was released into the environment. Many of these industrial chemicals have sufficient structural similarity to steroid hormones to be able to bind to steroid receptors or enzymes that regulate steroid hormone concentrations and, thus, perturb normal endocrine physiology in aquatic species, animals, and humans
[1][2][3][4][5][6]. Chemicals capable of interfering with the endocrine system by mimicking the action of endogenous messengers (hormones) at their specific receptors are defined as endocrine-disrupting chemicals (EDCs)
[7]. The exposure of humans, fish, and other animals to EDCs is a global concern and the subject of regulatory activity by various environmental agencies
[3][8].
EDCs interfere with many hormone-regulated physiological pathways and have complex effects on human and fish physiology due to the diversity of the hormone receptors and enzymes that they bind
[9]. The detailed mechanisms by which EDCs act are not completely understood, but many interact with nuclear receptors, such as the retinoid X receptor (RXR)
[10][11][12][13], peroxisome proliferator-activated receptors (PPARs)
[13][14][15][16][17] and estrogen receptors (ERs)
[18][19]. The binding of EDCs to these and other receptors, as well as to steroidogenic enzymes, can perturb normal physiological processes. The wide range of health effects attributed to EDCs includes reproductive dysfunction (e.g., distorted male/female sex ratios in fish, elevated or reduced hormone levels, diminished fertility, and male and female reproductive tract abnormalities)
[20][21]; premature puberty
[22]; neurological
[23][24] and behavioral
[25][26][27][28] complications; immune dysregulation
[29][30]; cancer
[31][32][33][34][35] and metabolic diseases
[29][36][37][38][39][40][41][42][43][44][45]. EDCs include personal care products, plasticizers, detergents, insecticides, and pharmaceuticals. Many of these chemicals enter rivers and the ocean, where they directly interact with aquatic life and disrupt endocrine responses. Humans are exposed to these EDCs from drinking water and dietary consumption of fish
[46].
2. Genomics and Bioinformatics Approaches in Endocrine Disruptor Research
Over the past two decades, the field of ecotoxicology has adapted genomics technologies to better understand how exposure to compounds perturbs gene expression at the molecular level, facilitating transcriptome analysis. Historically, two tools have been used for this purpose: DNA microarrays and RNA sequencing (RNA-Seq), although microarrays are becoming obsolete
[47]. A challenge is to comprehend the effects of these chemicals on the health of humans as well as other animals. Much research has been carried out elucidating what concentrations of EDCs are toxic for regulatory purposes
[48][49]. Critical for obtaining this information is the identification of biomarkers and genomic signatures of the response to different chemicals in humans, fish, and other animals. This is a challenging task because the actions of each chemical are complex, involving the regulation of many genes and interactions between many physiological pathways. Methods for investigating the molecular effects of exposure to EDCs must therefore sample a broad molecular response. A special concern is that exposure at different stages of life (embryonic, postnatal, juvenile, and adult) is likely to have different effects. Thus, although molecular analyses performed soon after exposure are useful in diagnosing the response to a chemical, it is also necessary to perform these analyses over a generation and, if possible, over more than one generation to uncover delayed effects. A good example is diethylstilbestrol (DES) exposure, which manifests as cancer many years after the exposure and even affects later generations
[50].
Genomics technologies have enabled the assessment of the effects of xenobiotics in the environment on human health
[51]. Ecological exposure and risk assessment models alone, however, are not adequate to examine the effect of EDCs
[51]. Producing exposure-to-outcome databases from diverse ecotoxicogenomic datasets and executing systems toxicology approaches are required to comprehensively explain the risk of these toxicants and to resolve the ambiguities that currently exist in risk assessment. Recent work in sediment and dredged material assessment has advocated for the inclusion of biomarkers in acute bioassays
[52]. Although chemical analyses have enabled the development of standards for regulating contaminant occurrence in the environment, there is a need for sensitive assays that can detect their bioavailability in fish, animals, and humans. Chronic exposure to compounds with endocrine targets, particularly at low chemical levels, has been shown to disrupt vertebrate development
[2][5][53][54]. Recent advances in molecular methods have substantially improved the sensitivity for assessing the biological effects of chemical contaminants of emerging concern (CECs) on animal physiology
[1][55][56][57]; more pertinently, these tools provide chemical signatures of bioavailability and exposure and allow the development of risk assessment databases.
2.1. RNA Sequencing
From 2005, several massively parallel sequencing technologies termed Next-Generation Sequencing (NGS) emerged, resulting in increased throughput and accuracy, and reducing sequencing costs to less than a thousand dollars for a human genome. These high throughput parallel DNA sequencing platforms launched a new era of genomics and molecular biology. A key application is RNA sequencing (RNA-Seq)
[58]. With RNA-Seq, the RNA is extracted from the biological samples and converted to cDNA
[58] capturing the sequences and abundance of all mRNA transcripts at a given time point. RNA-Seq can be used to obtain per-base expression profiles
[58], with several advantages when compared with microarrays including better sensitivity for those genes that are expressed at low or very high levels, splice variants, and non-coding transcripts (such as miRNA and lncRNA)
[58][59].
Microarray analyses of the transcriptome have waned in the past five years because the method has limitations compared with RNA-Seq: the inability to detect novel transcripts that do not have specifically designed probes, the appearance of cross-hybridization artifacts in analyses of similar sequences, and decreased accuracy for transcripts present only at low levels
[60][61][62][63][64][65][66]. Sequencing-based approaches to transcriptome profiling overcome these issues and are designed to directly determine transcript sequences
[62]. In comparison to microarrays, RNA-Seq has minimal background technical noise. The sequence reads can be unequivocally mapped to unique regions in the genome and the technology has a greater dynamic range to quantify gene expression level. RNA-Seq can produce highly reproducible results, leveraging both biological and technological replicates, and requires less input RNA material
[67][68][69].
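To make the idea of quantifying gene expression from sequence reads concrete, the following is a minimal sketch of how raw read counts might be normalized to counts per million (CPM) and compared between a control and an exposed sample as a log2 fold change. The gene names, counts, pseudocount, and threshold are all hypothetical, and the sketch deliberately omits replicates and statistical testing, which published analyses handle with dedicated packages that model count variability.

```python
import math
from typing import Dict, List


def counts_per_million(counts: Dict[str, int]) -> Dict[str, float]:
    """Scale raw read counts by library size so samples are comparable."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("library has no mapped reads")
    return {gene: count * 1e6 / total for gene, count in counts.items()}


def log2_fold_change(
    control: Dict[str, int],
    exposed: Dict[str, int],
    pseudocount: float = 1.0,  # assumption: avoids division by zero
) -> Dict[str, float]:
    """Per-gene log2(exposed / control) on CPM-normalized values."""
    cpm_control = counts_per_million(control)
    cpm_exposed = counts_per_million(exposed)
    return {
        gene: math.log2(
            (cpm_exposed[gene] + pseudocount) / (cpm_control[gene] + pseudocount)
        )
        for gene in control
        if gene in exposed
    }


def flag_candidates(lfc: Dict[str, float], threshold: float = 1.0) -> List[str]:
    """Genes whose absolute log2 fold change meets a (hypothetical) cutoff."""
    return sorted(gene for gene, value in lfc.items() if abs(value) >= threshold)


# Hypothetical counts for three placeholder genes.
control_counts = {"gene_a": 100, "gene_b": 500, "gene_c": 400}
exposed_counts = {"gene_a": 400, "gene_b": 500, "gene_c": 100}

lfc = log2_fold_change(control_counts, exposed_counts)
candidates = flag_candidates(lfc)
print(candidates)
```

A fold-change cutoff alone only ranks genes; deciding which differences are statistically supported requires biological replicates and a model of count variability.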
Some of the drawbacks of RNA-Seq center on data management, ease of use, and the total number of databases provided by both the scientific community and commercial vendors
[70]. High-throughput sequencing (HTS) experiments generate FASTQ files, massive raw data files containing the nucleotide sequences and their quality scores. Although considered “raw” data, FASTQ files are subject to secondary analysis, often including alignment to a reference genome or de novo assembly, which generates secondary and intermediate files as massive as the initial raw data. These derived files, in turn, may be stored, filtered, annotated, or analyzed in several ways that generate more data; in short, the storage and organization of RNA-Seq data pose a challenge. In terms of the user experience, many current HTS tools require knowledge of intricate command-line instructions. This presents a considerable barrier that keeps nontechnical audiences, who have little experience in computer science, from effectively using RNA-Seq in their analyses. Finally, the number of databases and knowledge bases is notoriously large and increases every year; the most recent count, according to Nucleic Acids Research, is 1641 databases as of January 2021
[71]. Keeping up with these databases is a daunting challenge, and recent efforts have focused on improving these standards of knowledge sharing
[72].
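As an illustration of what a FASTQ file holds, the sketch below reads the common four-line record layout (an identifier line starting with "@", the nucleotide sequence, a "+" separator line, and one quality character per base) and decodes the quality characters into Phred scores. It assumes the widely used offset of 33 and unwrapped, uncompressed records; the function and class names are hypothetical and the example reads are invented.

```python
import io
from typing import Iterator, List, NamedTuple, TextIO


class FastqRecord(NamedTuple):
    read_id: str
    sequence: str
    qualities: List[int]  # Phred quality score for each base


def parse_fastq(handle: TextIO, offset: int = 33) -> Iterator[FastqRecord]:
    """Yield records from a four-line-per-read FASTQ stream.

    Assumes an ASCII offset of 33 for quality characters and no
    line wrapping inside sequence or quality strings.
    """
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return  # end of input (a blank line also stops parsing)
        sequence = handle.readline().rstrip("\n")
        separator = handle.readline().rstrip("\n")
        quality = handle.readline().rstrip("\n")
        if not header.startswith("@") or not separator.startswith("+"):
            raise ValueError(f"malformed FASTQ record near {header!r}")
        if len(sequence) != len(quality):
            raise ValueError(f"sequence/quality length mismatch in {header!r}")
        yield FastqRecord(
            read_id=header[1:],
            sequence=sequence,
            qualities=[ord(char) - offset for char in quality],
        )


def mean_quality(record: FastqRecord) -> float:
    """Average Phred score of a read, a common basis for filtering."""
    return sum(record.qualities) / len(record.qualities)


# Two invented reads: 'I' decodes to 40, '5' to 20, and '!' to 0.
EXAMPLE = "@read1\nACGT\n+\nIIII\n@read2\nGGCA\n+\n!!5I\n"

records = list(parse_fastq(io.StringIO(EXAMPLE)))
for record in records:
    print(record.read_id, record.sequence, mean_quality(record))
```

Because the parser is a generator, it holds one read in memory at a time, which matters for files of the size described above.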
2.2. RNAseq for the Study of Endocrine Disruption
Leet et al. investigated the effects of early life exposures to the herbicide atrazine or 17α-ethinylestradiol (EE2) on sexual differentiation and gene expression in gonadal tissue
[73]. They exposed largemouth bass (Micropterus salmoides) from 7 to 80 days post-spawn to concentrations of 1, 10, or 100 µg atrazine/L or 1 or 10 ng EE2/L and monitored histological development and transcriptomic changes in gonad tissue. They noted an almost 100% female sex ratio in fish exposed to EE2 at 10 ng/L, likely as a result of sex reversal of males
[73]. Not surprisingly, many gonad genes were differentially expressed between the sexes.
Wang et al. examined the effects of 17α-methyltestosterone (MT), a synthetic androgen used to induce masculinization of both secondary sex characteristics and gonads in aquatic studies. The stone moroko (Pseudorasbora parva) was exposed to MT, and the growth and development of the fish were delayed by exposure to MT at 200 ng/L. RNA-Seq analyses revealed 7758 and 11,543 differentially expressed genes (DEGs) in females and males, respectively. MT had more pronounced disruptive effects on males than on females, and this was primarily reflected in the immune system
[74].
Renaud et al. examined the effects of EE2 exposure on the Pacific sardine (Sardinops sagax) and chub mackerel (Scomber japonicus). RNA sequencing (RNA-Seq) was performed on liver RNA harvested from wild sardine and mackerel exposed for 5 h under laboratory conditions to a concentration of 12.5 pM EE2. This revealed that environmental levels of EE2 disrupted basic biological processes and pathways in both male sardine and mackerel, leading to molecular signatures of metabolic, hormonal, and immune dysfunction, as well as carcinogenesis in exposed fish
[75].
Bertucci and colleagues examined chronic exposure to wastewater treatment plant and stormwater effluents at the whole-transcriptome level in the Asian clam (Corbicula fluminea) and evaluated the physiological outcomes. They uncovered a set of 3181 transcripts with altered abundance in response to water quality. The largest differences in transcriptomic profiles were observed between clams from the clean reference site and those exposed to wastewater treatment plant effluents. Most of the differentially expressed transcripts belonged to signaling pathways involved in energy metabolism, suggesting an energy/nutrient deficit and hypoxic conditions in response to the pollutants in the effluents
[76].
Legrand and co-workers exposed copepods (Eurytemora affinis) to sublethal concentrations of the pesticide pyriproxyfen (PXF) and the insecticide chlordecone (CLD). After 48 h, males and females (400 individuals each) were sorted for RNA extraction. In total, 2566 genes were differentially expressed after EDC exposure compared to controls, with similar numbers of differentially expressed genes for both compounds. More genes were differentially expressed in males than in females after both exposures
[77].
Guo et al. evaluated the effects of effluents containing phenolic compounds from the Ba River on the ovary of the sharpbelly (Hemiculter leucisculus), a freshwater fish, using transcriptomic and metabolomic analyses. In fish collected near wastewater discharge, oocyte development was activated compared to fish from upstream and remote sites. Histopathological alterations were found in the fish ovaries, likely as a result of upregulated steroid hormone biosynthesis, as suggested by the differentially expressed genes identified by RNA-Seq
[78].
3. The Adverse Outcomes Framework
The Adverse Outcomes Framework (AOF), developed in 2007 with the release of a National Research Council report entitled “Toxicity Testing in the 21st Century,” has become a popular model in the study of toxicology. This model defines an Adverse Outcome Pathway (AOP) as a linear pathway composed of a molecular initiating event (MIE), key events (KE), and the Adverse Outcome (AO) itself, causally linked together (Figure 1). Using cancer as an example, the MIE would be a single mutation in a gene associated with, for example, control of the cell cycle. This alone does not cause cancer but may trigger pathway-level perturbations, so-called key events, that could reduce the expression of tumor-suppressor genes and mutate proto-oncogenes into oncogenes. The goal of this framework is to transform current toxicology testing by supporting less animal-intensive alternatives to toxicity testing and predictive ecotoxicology
[79][80].
Figure 1. The adverse outcome pathway (AOP) is a linear pathway composed of a Molecular Initiating Event (MIE), Key Events (KE), and an Adverse Outcome (AO) causally linked together. Example AOPs are illustrated.
Adverse Outcomes are not always the result of a single event. Cancer, for instance, is more common among adults as they age due to genomic damage including point mutations and insertion-deletions that disrupt normal cell function, allowing for the uncontrolled growth and division of cells. The first of these mutations is defined as the MIE, which is then followed by other mutations—KE—that are linked by KE relationships (KERs) and further alter expression in such a way that an adverse outcome is the result. Going beyond the simple, linear AOPs are AOP Networks, quantitative AOPs (qAOPs), and qAOP Networks. An AOP network is a system of interacting AOPs with shared KEs, while qAOPs are AOPs in which the KERs (qKERs) are given a weight/value that quantitatively characterizes the relationship between KEs in an AOP. The AOF identifies qAOPs as the “ideal” form to be used in risk assessment due to this quantification
[79][80].
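The relationships among AOPs, KERs, qAOPs, and AOP networks described above can be sketched as a small data structure. This is an illustrative model only, not a representation used by any AOP database: the class and function names are hypothetical, the event labels are placeholders rather than curated biology, and the numeric weights are arbitrary stand-ins for whatever quantity a qKER would carry.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple

KER = Tuple[str, str]  # (upstream event, downstream event)


@dataclass
class AOP:
    """A linear pathway: MIE first, AO last, key events in between."""
    name: str
    events: List[str]
    ker_weights: Dict[KER, float] = field(default_factory=dict)

    def __post_init__(self) -> None:
        if len(self.events) < 2:
            raise ValueError("an AOP needs at least an MIE and an AO")

    @property
    def mie(self) -> str:
        return self.events[0]

    @property
    def adverse_outcome(self) -> str:
        return self.events[-1]

    @property
    def key_events(self) -> List[str]:
        return self.events[1:-1]

    def kers(self) -> List[KER]:
        """Each consecutive pair of events is one key event relationship."""
        return list(zip(self.events, self.events[1:]))

    def is_quantitative(self) -> bool:
        """Treat the AOP as a qAOP only if every KER carries a weight."""
        return all(ker in self.ker_weights for ker in self.kers())


def shared_key_events(first: AOP, second: AOP) -> Set[str]:
    """Key events common to two AOPs, which join them into a network."""
    return set(first.key_events) & set(second.key_events)


# Placeholder pathways; labels and weights are invented for illustration.
aop_a = AOP(
    name="AOP A",
    events=["MIE_A", "KE_1", "KE_2", "AO_A"],
    ker_weights={("MIE_A", "KE_1"): 0.9, ("KE_1", "KE_2"): 0.6, ("KE_2", "AO_A"): 0.4},
)
aop_b = AOP(name="AOP B", events=["MIE_B", "KE_2", "KE_3", "AO_B"])

print(aop_a.kers())
print(aop_a.is_quantitative(), aop_b.is_quantitative())
print(shared_key_events(aop_a, aop_b))
```

Here the two pathways form a minimal AOP network because they share one key event, and only the first qualifies as a qAOP because only its KERs are weighted.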
To help make the AOF easier to comprehend, Villeneuve et al. developed a set of principles that summarize the development of this model
[81]: (1) AOPs are not specific to given chemicals, because any external stress or chemical that can trigger an MIE has the potential to activate a chain of KEs leading to the AO. (2) AOPs are composed of two fundamental units, KEs and KERs, that are usually shared between AOPs. (3) AOPs are not a thorough interpretation of biological processes, but a basic framework for organizing toxicological evidence; this defines individual AOPs as rational units of AOP development. (4) Networks composed of numerous AOPs are likely the most functional units of prediction in everyday scenarios, as the prediction of adverse outcomes will often involve consideration of multiple AOPs. Finally, (5) AOPs are not static and will evolve as new insights are offered
[79][81].
As the AOP is a developing concept, there are some limitations to the framework that must be addressed to improve its viability. One of the most critical considerations is the ability of AOPs to accurately define scientifically robust connections between the MIE, KEs, and adverse outcomes
[79]. The application of AOP networks and qAOPs, described above, will facilitate more accurate predictions, but at present, there remain gaps between these events. It has been suggested that, for AOPs to be useful as more than a categorizing tool, they must increase trust for risk decisions more than existing approaches, and must be able to reduce animal usage as well
[80]. In short, the AOP framework will need to demonstrate, beyond a shadow of a doubt, that it is effective at predicting adverse outcomes in order for it to be accepted by the scientific community
[79].
As outlined above, EDCs have become common in the environment; although there are efforts to reduce these compounds, humans are still potentially exposed to them. The studies above use two approaches for determining the toxicity of EDCs: testing toxicological concentrations of XEs or testing levels found in the environment, in other words, the concentrations to which most people are exposed daily. The AOP framework establishes that an adverse outcome does not necessarily develop overnight; rather, a linear series of key events following the initial MIE creates a stockpile of genetic damage that, later in life, alters expression in such a way that the adverse outcome results.
4. Conclusions
To conclude, endocrine disrupting compounds are ubiquitous in modern life. BPA, for example, is an industrial chemical used in the fabrication of plastics and found in the urine samples of up to 96% of Americans
[82]. It is estimated that 1000 of the 85,000 synthetic chemicals in existence may be EDCs; however, the majority of these have not undergone sufficient testing to confirm this, so the true number may be much higher
[83]. In addition, a large number of chemicals have yet to be fully profiled to determine their capacity to act as EDCs. The cost of profiling each chemical is not insignificant. Genomic and bioinformatics approaches have revolutionized toxicology testing, permitting deep insights into the transcriptomes of cells and tissues following exposure to EDCs and reducing the costs that are associated with traditional toxicity testing. Genomic profiling linked to computational resources, such as the Open TG-GATEs database, presents an opportunity to avoid the cost associated with profiling each chemical for its potential to act as an EDC and to accelerate the process using high-throughput informatics
[84][85]. The Adverse Outcome Pathway (AOP) framework integrates these genomic and computational outputs and allows a molecular initiating event, the key events resulting from this initial insult, and the Adverse Outcome (AO) itself to be causally linked together in the context of predictive toxicology assessments.