1.1. Human Evolution and Pathogens
Since our origin and migration out of Africa, humans have colonized many new environments and encountered different types of selective pressures to which we have adapted. Pathogens are recognized as one of the strongest selective agents that humans have faced through our recent evolutionary history. Especially since the Neolithic transition, infectious diseases have greatly influenced our innate and adaptive immune defense systems. While epidemic infectious diseases could probably not sustain themselves efficiently in small hunter-gatherer groups, evidence suggests that they started to cause major effects in sedentary and overcrowded agricultural sites [1]. In turn, animal domestication in Neolithic sites facilitated close contact with animals and a higher risk of zoonoses. Several diseases such as malaria, measles, tuberculosis and smallpox are thus likely to have spread with this cultural and environmental shift. Despite the huge health improvements facilitated by the discovery of antibiotics and the development of modern vaccination programs, several recent epidemic outbreaks, such as those caused by several coronaviruses, the Zika virus or Ebola, remind us, even nowadays, of the importance of our immune response to pathogens and of how such infectious agents continue to exert important health pressures on humans [2].
At a local scale, our past adaptation to these pathogen-driven selective pressures has generated important differences in immune response across modern human populations that often explain local differential susceptibilities to autoimmune and inflammation-related traits and to cancer [3]. Moreover, the enrichment of signatures of positive selection found in loci associated with common inflammatory disorders [4] has been interpreted as support for the hygiene hypothesis, which states that the increasing incidence of both autoimmune and allergic disorders observed today is partly due to the huge contrast between the environmental pathogen load in which our immune system evolved and the more sterile world in which modern societies live [5]. Within this context, the identification of genomic signatures of positive selection that are related to our immune system is a first step not only to elucidate potential adaptations to pathogen exposure but also to identify functional variation affecting a wide range of immune-related phenotypes.
1.2. Detecting Local Adaptation in the Human Genome
Natural selection leaves distinctive footprints or signatures in the patterns of variation around adaptive genetic variants, which can be detected by comparison with genome-wide background patterns of variation and/or by considering demographic simulations of populations. The development of new statistical tools and approaches for the detection of selection in recent years, together with the increasing availability of large catalogues of genetic variability in different human populations, has allowed the identification of hundreds of loci with signatures of selection in our genome [6][7][8][9]. Most genome-wide scans of positive selection in humans have focused on detecting the selection signatures expected from hard sweeps, where a new mutation is favored and rapidly rises in frequency, carrying its linked variation along with it. For this scenario, several statistical tools using intraspecific variation have been developed to capture high population differentiation, site frequency spectrum skews and unusually long-range linkage disequilibrium [7]. However, over the last few years, new strategies have also been developed to capture the patterns of variation of soft sweeps, resulting from either multiple de novo mutations or standing variation, as well as to detect polygenic selection and adaptive introgression [10]. Multiple immune-related genes have been reported as candidates for positive selection when applying these methodologies [4]. However, for most of the detected signatures, the individual infectious agents driving the selective pressures are not well understood, nor are the underlying adaptive variants and the immunological adaptive responses associated with them. Multidisciplinary strategies integrating functional annotations of our genome, such as those available in ENCODE, with results from genome-wide association studies (GWAS), expression quantitative trait loci (eQTLs), cell count quantitative trait loci [11], cytokine quantitative trait loci (cQTLs) [12], immune-responsive regulatory variation [13], in silico predictions from protein structure modeling [14][15], etc. [16] have been shown to facilitate the identification of adaptive variants related to our immune response.
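Two of the hard-sweep signatures mentioned above, high population differentiation and a skewed site frequency spectrum, can be illustrated with a minimal Python sketch. It uses Hudson's F_ST estimator and Tajima's D as representative statistics. The choice of these two statistics, the function names and the toy haplotypes are assumptions made for illustration and are not taken from the studies cited here; real scans compare such values against genome-wide empirical distributions or demographic simulations.

```python
import math


def hudson_fst(p1, p2, n1, n2):
    """Hudson's F_ST, computed as a ratio of sums across SNPs.

    p1, p2: per-SNP allele frequencies in populations 1 and 2.
    n1, n2: number of sampled chromosomes in each population.
    """
    num = 0.0
    den = 0.0
    for a, b in zip(p1, p2):
        # Squared frequency difference, corrected for sampling noise.
        num += (a - b) ** 2 - a * (1 - a) / (n1 - 1) - b * (1 - b) / (n2 - 1)
        # Probability that one allele drawn from each population differs.
        den += a * (1 - b) + b * (1 - a)
    return num / den if den > 0 else float("nan")


def tajimas_d(haplotypes):
    """Tajima's D from a list of 0/1 haplotypes (rows) over sites (columns)."""
    n = len(haplotypes)
    # Derived-allele count per site; keep only segregating sites.
    counts = [sum(site) for site in zip(*haplotypes)]
    counts = [k for k in counts if 0 < k < n]
    s = len(counts)
    if n < 4 or s == 0:
        return float("nan")  # variance term is zero or undefined
    # Mean number of pairwise differences (pi).
    pi = sum(2 * k * (n - k) / (n * (n - 1)) for k in counts)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    # Difference between pi and Watterson's estimate, scaled by its SD.
    return (pi - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))


# Invented toy data: six chromosomes, four sites.
rare = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0],
        [0, 0, 0, 1], [0, 0, 0, 0], [0, 0, 0, 0]]   # only singletons
common = [[1, 1, 1, 1]] * 3 + [[0, 0, 0, 0]] * 3    # intermediate frequencies

d_rare = tajimas_d(rare)
d_common = tajimas_d(common)
fst_fixed = hudson_fst([1.0, 1.0], [0.0, 0.0], 10, 10)
```

In this toy example an excess of rare variants, as expected after a sweep, gives a negative D, whereas variants at intermediate frequency give a positive D; two populations fixed for alternative alleles give the maximum F_ST.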
1.3. Examples of Positive Selection at Immune Response Receptors
Mammalian immunity relies on receptors from both the innate and adaptive immune systems. The innate immune receptors, also called pattern-recognition receptors (PRRs), are adapted to the recognition of conserved and broadly shared components of microbial surfaces that are absent from the host and essential for microbial viability, the so-called pathogen-associated molecular patterns (PAMPs). There are several functionally distinct classes of PRRs, which can be expressed in either secreted or membrane-bound form by innate immune cells (macrophages, dendritic cells or natural killer cells). The best studied PRRs are the Toll-like receptors (TLRs), but additional relevant families include C-type lectin receptors, scavenger receptors, collectins, pentraxins, NOD-like receptors (NLRs) and RIG-I-like receptors (RLRs), among others. Recognition in the adaptive immune system is mediated by highly polymorphic clonotypic receptors expressed by T and B cells (TCR and BCR, respectively). These receptors recognize highly specific microbial details, which in the case of T cells need to be presented by major histocompatibility complex (MHC) class I or II molecules. As expected, natural selection has acted on both the innate and the adaptive immune receptors, and we briefly discuss some examples.
The region encompassing a cluster of three TLRs on chromosome 4 (TLR10-TLR1-TLR6) provides a notable example of how our adaptation to microbes favored several parallel innate immune responses during our recent evolutionary history. Among others, a non-synonymous substitution (Ile602>Ser) in TLR1 has been shown to impair signaling, with a drastic decrease of NF-κB activity, and has been reported as a potential target of the positive selection detected in the TLR10-TLR1-TLR6 cluster in Europeans [17]. Convergent signals of very recent positive selection in European and Roma populations have also been suggested to result from the plague, since several non-synonymous changes in the TLR10-TLR1-TLR6 cluster present in Europeans modulate Yersinia pestis-induced cytokine responses [18]. In addition, independent events of adaptive introgression from both Neanderthals and Denisovans in non-African populations have also been described in the TLR10-TLR1-TLR6 region. Notably, the adaptive alleles resulting from these admixture events with archaic humans are associated with increased expression of TLR6, TLR1 and TLR10 in white blood cells, reduced Helicobacter pylori seroprevalence and increased susceptibility to allergies [19]. Whereas an impaired TLR-mediated response appears to have been beneficial in the case of the Ile602>Ser substitution in TLR1, the introgressed alleles of the TLR10-TLR1-TLR6 cluster probably reinforced innate immune surveillance and reactivity against certain pathogens [19]. In any case, the selected functional variation resulting from these past local adaptive events, with or without the actual presence of the driving infectious agent, undoubtedly has the potential to influence many distinct inflammatory and allergic susceptibilities across present-day populations.
It has been suggested that mutations causing deficiency of the scavenger receptor CD36 in African and East Asian populations may have been selected to protect against malaria [20]. For instance, patterns of extended haplotype homozygosity compatible with the action of recent positive selection and a frequency of up to 0.26 have been observed for a nonsense mutation in exon 10 of CD36 (1264T>G; rs3211938) in west central Africans. Although no association with severe malaria was found in that case, alternative evolutionary scenarios were suggested to explain the prevalence of the CD36 deficiency caused by 1264T>G [20]. Notably, CD36 is involved not only in immunological recognition and molecular adhesion but also in lipid metabolism, angiogenesis and metastasis in cancer [21]. Thus, adaptive variation in CD36 facilitating host survival against pathogens could in turn influence a variety of traits and conditions. Within this context, CD36 deficiency has been shown to reduce atherosclerotic lesion formation [22] but also to cause dyslipidemia, subclinical inflammation and metabolic disorders [23].
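The extended haplotype homozygosity (EHH) pattern mentioned above for CD36 can be sketched in a few lines of Python. EHH at a given distance from a core variant is the probability that two randomly chosen chromosomes carrying the same core allele are identical over the whole interval between the core and that distance. The function name `ehh` and the toy haplotypes are hypothetical and serve only as an illustration under these assumptions; they do not reproduce the analysis in [20].

```python
from collections import Counter
from math import comb


def ehh(haplotypes, core, allele, direction=1):
    """EHH decay away from a core site for chromosomes carrying `allele`.

    haplotypes: list of equal-length 0/1 lists (one per chromosome).
    core: index of the core site; direction: +1 (right) or -1 (left).
    Returns one EHH value per site, starting at the core itself.
    """
    carriers = [h for h in haplotypes if h[core] == allele]
    n = len(carriers)
    if n < 2:
        return []  # homozygosity is undefined for fewer than two carriers
    total_pairs = comb(n, 2)
    values = []
    pos = core
    while 0 <= pos < len(haplotypes[0]):
        lo, hi = min(core, pos), max(core, pos)
        # Group carriers by their haplotype over the interval core..pos.
        groups = Counter(tuple(h[lo:hi + 1]) for h in carriers)
        identical_pairs = sum(comb(c, 2) for c in groups.values())
        values.append(identical_pairs / total_pairs)
        pos += direction
    return values


# Invented toy data: eight chromosomes, five sites, core variant at index 2.
toy = [
    [0, 1, 1, 0, 1],
    [0, 1, 1, 0, 1],
    [0, 1, 1, 0, 1],
    [0, 1, 1, 0, 0],
    [1, 0, 0, 1, 0],
    [0, 0, 0, 0, 1],
    [1, 1, 0, 1, 1],
    [0, 0, 0, 1, 0],
]

derived_decay = ehh(toy, core=2, allele=1)
ancestral_decay = ehh(toy, core=2, allele=0)
```

In the toy data, haplotypes carrying the derived allele remain identical further from the core than those carrying the ancestral allele, which is the kind of asymmetry expected when an allele has risen in frequency faster than recombination could break down its haplotype background.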
The MHC is the most important region in the human genome influencing our response to infection, inflammation and autoimmunity. The extremely high levels of allelic diversity observed and the sharing of ancestral polymorphisms with other hominoid taxa at the MHC class I and II genes have been suggested to result from overdominant selection, a model in which heterozygous individuals have a higher biological fitness than homozygotes [24]. However, frequency-dependent selection and selection that varies over time and space have also been proposed to act on the human leukocyte antigen (HLA) genes [25]. Interestingly, between-population variation at the HLA class I genes was found to be positively correlated with local pathogen richness (notably for the HLA-B gene), thus supporting the hypothesis that pathogen-driven selection may have created the high polymorphism levels observed in the MHC [26]. Notably, unusually high-frequency extended haplotypes comprising variants associated with systemic lupus erythematosus (SLE), multiple sclerosis (MS) and type I diabetes were identified for the HLA-DR2, HLA-DRB1 and HLA-C loci in samples of European descent [27], illustrating again how the identification of immune loci with signatures of recent positive selection may be a good strategy to identify functionally relevant variation and to suggest candidates to test for association with immune-related diseases [4][28]. Furthermore, single nucleotide polymorphisms (SNPs) in the HLA region associated with type I diabetes, SLE, psoriasis (Ps) and rheumatoid arthritis (RA) have also been shown to present values of the integrated haplotype score (iHS) statistic indicative of recent positive selection in samples of European origin [4].
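The iHS statistic referred to above is not defined in the text; the block below gives its commonly used formulation as a sketch. The symbols are assumptions introduced here: $\mathrm{EHH}_c(x)$ is the extended haplotype homozygosity at distance $x$ from the core SNP among chromosomes carrying core allele $c$ (ancestral $A$ or derived $D$), $\mathrm{iHH}_c$ is the area under that decay curve, and $p$ is the derived allele frequency bin used for standardization.

```latex
\[
\mathrm{iHH}_c = \int \mathrm{EHH}_c(x)\,dx , \qquad c \in \{A, D\}
\]
\[
\mathrm{iHS} =
\frac{\ln\!\left(\dfrac{\mathrm{iHH}_A}{\mathrm{iHH}_D}\right)
      - \mathrm{E}_p\!\left[\ln\!\left(\dfrac{\mathrm{iHH}_A}{\mathrm{iHH}_D}\right)\right]}
     {\mathrm{SD}_p\!\left[\ln\!\left(\dfrac{\mathrm{iHH}_A}{\mathrm{iHH}_D}\right)\right]}
\]
```

Under this formulation, the expectation and standard deviation are taken over genome-wide SNPs with a similar derived allele frequency, so that iHS is approximately standard normal. Large negative values indicate unusually long haplotypes on the derived allele, and large positive values indicate unusually long haplotypes on the ancestral allele.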