Exposome and Asthma: Comparison
Please note this is a comparison between Version 1 by Alicia Guillien and Version 2 by Vicky Zhou.

Asthma is a widespread respiratory disease caused by a complex interplay of genetic, environmental and behavioral factors. For several decades, its sensitivity to environmental factors has been investigated in single-exposure (or single family of exposures) studies, which may be too narrow an approach to tackle the etiology of such a complex multifactorial disease. The emergence of the exposome concept, introduced by C. Wild (2005), offers an alternative way to address exposure–health associations.

  • asthma
  • exposome

1. Introduction

Asthma is a heterogeneous chronic respiratory disease characterized by airway inflammation and manifested by variable respiratory symptoms (wheeze, shortness of breath, chest tightness and/or cough) and variable expiratory airflow limitation [1]. Asthma affects approximately 300 million children and adults worldwide [2]. The prevalence of asthma has dramatically increased over the last decades [3]. Extensive research efforts to identify the causes of asthma have led to the identification of genetic (such as the 17q21 ORMDL3/GSDML region for early childhood onset asthma), environmental (such as urban vs. rural area) and lifestyle risk factors (such as tobacco smoking). They have also highlighted the complex etiology of this multifactorial disease, e.g., through the identification of specific windows of susceptibility and complex gene-by-environment interactions [4]. The exposome concept, introduced in recent years to complement the genome for a better understanding of the development of complex diseases [5], offers new avenues in environmental epidemiology. The main objective of this review is to present how the new methodological framework represented by the exposome has been applied to asthma research to date. After presenting an overview of the concept of exposome, we will review different statistical approaches to study exposome–health associations. Finally, recent studies linking multiple families of exposures to asthma-related outcomes will be discussed.

2. Exposome-Health Associations in Practice

Exposome studies involve the collection of a large number of exposures. Exposures can be assessed with different methods (e.g., self-reported questionnaires, exposure biomarkers, geographic information system (GIS)-based models, personal sensors, …), for different time windows (prenatal, early postnatal, childhood, adolescence, adulthood), and, for external factors, at different locations (home, school, work) and spatial resolutions (e.g., urban indicators measured for buffers of 100, 300 and 500 m). From a methodological point of view, this large number of variables (possibly larger than the size of the study population) raises issues of statistical power and false discovery rate [6]. Indeed, the multiplicity of tests inflates the alpha (type I error) risk, and methods developed to correct the p-value of an association for multiple hypothesis testing [7][8][9] decrease statistical power. Therefore, exposome–health association studies require a sufficient sample size to achieve adequate statistical power to detect associations of low to moderate effect size, as expected for most exposures [10][11]. Several other statistical challenges specific to exposome studies have to be taken into account, such as the increased false discovery rate related to the high level of correlation between exposures and the difficulty of considering “mixture” effects [12][13]. To date, no consensus has been reached on which statistical methods should be used in exposome–health association studies [14]. However, simulation studies have compared the performance of various methods in the exposome research context under specific settings.
For example, simulation studies compared (i) the efficiency of various regression-based approaches in terms of false positive rate and sensitivity, with and without interactions between exposures [6][13]; (ii) the performance of variable selection models in case-control studies [15]; (iii) the performance of variable and function selection methods in the case of nonlinear effects of correlated exposures [14]; and (iv) methods to correct for classical-type exposure measurement error [16]. Using the findings of these studies and a review of the literature, we summarize in Table 1 the strengths and weaknesses of the main statistical approaches used in exposome studies in the field of respiratory health.
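The trade-off between multiplicity and power described above can be illustrated with a minimal Python sketch. It is not taken from the cited studies: the function names and the example p-values are hypothetical, and Bonferroni and Benjamini-Hochberg are used here only as two familiar examples of multiple-testing corrections. The first function assumes independent tests.

```python
from typing import List


def family_wise_error_rate(alpha: float, n_tests: int) -> float:
    """Probability of at least one false positive among n_tests
    independent tests, each performed at level alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


def bonferroni(p_values: List[float]) -> List[float]:
    """Bonferroni-adjusted p-values (controls the family-wise error rate)."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values (controls the false discovery rate)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest p-value so that the
    # adjusted values stay monotone in the rank.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # With 100 independent tests at alpha = 0.05, a false positive
    # somewhere is almost certain.
    print(family_wise_error_rate(0.05, 100))

    # Hypothetical raw p-values for five exposures.
    raw = [0.001, 0.008, 0.02, 0.03, 0.2]
    print(bonferroni(raw))
    print(benjamini_hochberg(raw))
```

In this toy example, four of the five raw p-values fall below 0.05; the Bonferroni adjustment retains two of them and the Benjamini-Hochberg adjustment retains four, which shows how stricter error control reduces the number of detected associations.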

Table 1. Main statistical methods used in exposome-health association studies.

Type of Analysis: Single-exposure regression-based method

Method: Exposome-Wide Association Study (ExWAS)
Strengths:
  • Standardized method
  • High sensitivity to identify true predictors
  • Simple interpretation
  • Easy to summarize the results in a figure (e.g., volcano plot)
Weaknesses:
  • Interaction between exposures is not tested
  • Results do not account for confounding effect by co-exposures
  • High false discovery rate
Reference of the Method: Patel et al., 2010 [17]
Use of the Method in Exposome or Asthma Field: Sbihi et al., 2017 [18]; North et al., 2017 [19]; Lepeule et al., 2018 [20]; Agier et al., 2019 [21]; Vrijheid et al., 2020 [22]; Agier et al., 2020 [23]; Warembourg et al., 2019 [24]; Nieuwenhuijsen et al., 2019 [25]; Granum et al. [26]

Type of Analysis: Multiple-exposures regression-based methods

Method: Deletion–Substitution–Addition (DSA) algorithm
Strengths:
  • All exposure variables are considered in a unique model with possibility to include interactions
  • The selected model is able to account for confounding effect by co-exposures
  • Low false discovery proportion to identify true predictors
Weaknesses:
  • Moderate sensitivity to identify true predictors
  • Instability
  • Time-consuming and thus not adapted for exposome of more than a few hundred variables
Reference of the Method: Sinisi and van der Laan 2004 [27]
Use of the Method in Exposome or Asthma Field: Agier et al., 2019 [21]; Vrijheid et al., 2020 [22]; Agier et al., 2020 [23]; Warembourg et al., 2019 [24]; Nieuwenhuijsen et al., 2019 [25]; Granum et al. [26]

Method: Elastic Net (ENET) and Least Absolute Shrinkage and Selection Operator (LASSO)
Strengths:
  • Able to deal with correlated variables
  • The selected model is able to account for confounding effect by co-exposures
  • Good prediction performance
Weaknesses:
  • Moderate sensitivity to identify true predictors
  • Instability
Reference of the Method: Zou and Hastie 2005 [28]; Tibshirani 1996 [29]
Use of the Method in Exposome or Asthma Field: Pries et al., 2019 [30]; Cowell et al., 2019 [31]

Method: Weighted Quantile Sum (WQS) regression
Strengths:
  • Able to deal with multicollinearity
  • The use of quantiles reduces the impact of outliers
Weaknesses:
  • Not able to consider categorical exposures
  • All exposures must be associated with the outcome in the same direction (i.e., all protective or all risk factors)
Reference of the Method: Carrico et al., 2015 [32]
Use of the Method in Exposome or Asthma Field: -

Type of Analysis: Supervised clustering approaches

Method: Latent Class Analysis (LCA)
Strengths:
  • Suitable for longitudinal data (Latent Transition Analysis [33])
  • Able to consider the outcome in a supervised approach
Weaknesses:
  • Not able to deal with continuous exposures
  • Model requires low correlation between variables
  • Interpretation of results may be difficult in case of large number of clusters
  • Limited dimension of the exposome (in relation to the sample size)
Reference of the Method: Goodman et al., 1974 [34]
Use of the Method in Exposome or Asthma Field: Buck Louis et al., 2019 [35]; Harmouche-Karaki et al., 2019 [36]

Method: Bayesian Profile Regression (BPR)
Strengths:
  • Considers all exposure variables in a unique model
  • Able to determine the number of clusters minimizing the least-squared distance to the probability matrix
  • Able to deal with combined continuous and categorical variables
Weaknesses:
  • Computing time
  • Interpretation of results may be difficult in case of large number of clusters
  • Unstable method
Reference of the Method: Molitor et al., 2020 [37]
Use of the Method in Exposome or Asthma Field: Berger et al., 2020 [38]; Belloni et al., 2020 [39]

Type of Analysis: Analysis accounting for the hierarchical structure of the data

Method: Meet-in-the-Middle (MITM)
Strengths:
  • Considers the hierarchical layers in the exposome and the causal link between them to better document the causality in exposome–health associations
Weaknesses:
  • Needs an a priori selection of intermediate layers
Reference of the Method: Chadeau-Hyam M et al., 2011 [40]
Use of the Method in Exposome or Asthma Field: Vineis et al., 2020 [41]; Jeong et al., 2018 [42]; Cadiou et al., 2020 [43]

Method: Bayesian Kernel Machine Regression (BKMR)
Strengths:
  • Use of a smooth kernel function able to deal with non-monotonic exposure-outcome relationship
  • Able to deal with a priori knowledge about group of exposures
  • Able to deal with multicollinearity
Weaknesses:
  • Not able to deal with categorical outcomes
  • The hierarchical variable selection option can select only one variable per group
Reference of the Method: Bobb et al., 2015 [44]
Use of the Method in Exposome or Asthma Field: Berger et al., 2020 [38]
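To make the single-exposure logic of an ExWAS concrete, the following Python sketch tests each exposure separately against a continuous outcome and then corrects the resulting p-values for multiplicity. It is a deliberately simplified illustration under stated assumptions, not the procedure used in the cited studies: the data are simulated, all names (simulate_exposome, exwas, exposure_3) are hypothetical, the association test is a Pearson correlation with a Fisher z normal approximation rather than a confounder-adjusted regression, and the correction is Bonferroni.

```python
import math
import random
from statistics import NormalDist, mean
from typing import Dict, List, Tuple


def pearson_r(x: List[float], y: List[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def association_p_value(r: float, n: int) -> float:
    """Two-sided p-value for r using the Fisher z normal approximation."""
    r = max(min(r, 0.999999), -0.999999)  # keep atanh finite
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def exwas(exposures: Dict[str, List[float]],
          outcome: List[float]) -> List[Tuple[str, float, float]]:
    """Test each exposure on its own, then Bonferroni-adjust the p-values.

    Returns (exposure name, correlation, adjusted p-value) tuples,
    sorted from the strongest to the weakest evidence of association.
    """
    n = len(outcome)
    m = len(exposures)
    results = []
    for name, values in exposures.items():
        r = pearson_r(values, outcome)
        p_adj = min(1.0, association_p_value(r, n) * m)
        results.append((name, r, p_adj))
    return sorted(results, key=lambda item: item[2])


def simulate_exposome(n_subjects: int, n_exposures: int, causal_index: int,
                      effect: float, seed: int):
    """Simulate independent Gaussian exposures; only one affects the outcome."""
    rng = random.Random(seed)
    exposures = {
        f"exposure_{j}": [rng.gauss(0.0, 1.0) for _ in range(n_subjects)]
        for j in range(n_exposures)
    }
    causal = exposures[f"exposure_{causal_index}"]
    outcome = [effect * x + rng.gauss(0.0, 1.0) for x in causal]
    return exposures, outcome


if __name__ == "__main__":
    exposures, outcome = simulate_exposome(500, 20, 3, 0.5, seed=0)
    for name, r, p_adj in exwas(exposures, outcome)[:5]:
        print(f"{name}: r = {r:.3f}, adjusted p = {p_adj:.3g}")
```

Because each exposure is tested in isolation, this sketch shares the weaknesses listed for ExWAS in Table 1: it tests no interactions and does not adjust for confounding by co-exposures, which is why the simulated exposures are drawn independently here.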

3. Conclusions

Asthma is a widespread multifactorial disease, which deserves a comprehensive approach to better understand its etiology and development. Although most previous studies in environmental epidemiology focused on a single exposure (or a single exposure family), with the recent emergence of the exposome concept, several studies and European projects have started to assess the effect of multiple exposures on respiratory health. These studies are expected to contribute to a better understanding of the associations between the environment and health by using various holistic approaches. The first association studies between the exposome and asthma-related outcomes conducted so far mainly rely on the ExWAS method for successive single-exposure analyses and on the DSA algorithm for multi-exposure analysis [21][23][24][25][26][45]. Further studies with larger sample sizes should attempt to apply more comprehensive statistical approaches, able to account either for the hierarchical structure of the multiple layers of the exposome or for possible mixture effects, in order to be more consistent with the complex structure of exposure data.