Circum-Saharan Prehistory through the Lens of mtDNA Diversity

Subjects: Evolutionary Biology | Archaeology | Anthropology

Contributor: Viktor Černý

African history has been significantly influenced by the Sahara, which has represented a barrier for migrations of all living beings, including humans. Major exceptions were the gene flow events that took place between North African and sub-Saharan populations during the so-called African Humid Periods, especially in the Early Holocene (11.5 to 5.5 thousand years ago), and more recently in connection with trans-Saharan commercial routes. The research indicates that maternal gene flow must have been important in this circum-Saharan space, not only within North Africa and the Sahel/Savannah belt but also between these two regions.

Sahel/Savannah belt
North Africa
mtDNA diversity
population history

1. Introduction

The out-of-Africa event, during which a relatively small group of anatomically modern humans spread from East Africa into Eurasia [1], was a defining moment in the evolution of modern humans. Although paleoanthropology has detected several older waves of Middle Pleistocene migrations from Africa to Eurasia [2], genetic studies show that contemporary non-sub-Saharans are descendants of an ancestral population that spread from Africa only about 60 ka (thousands of years ago) [3]. Leaving aside the long-term isolation of Khoisan populations in southern Africa and of the Pygmies in the tropical rain forests of central Africa, the out-of-Africa event is nowadays considered to be the most significant restriction of gene flow between two groups of anatomically modern humans: the sub-Saharans and non-sub-Saharans had been separated by the Sahara Desert throughout most of prehistory. Differentiation between these groups is apparent in both mitochondrial (mtDNA) [4] and nuclear [5] DNA diversity. Due to this separation, we can detect different mtDNA haplogroups, which can be assigned either a sub-Saharan or Eurasian ancestry [6]. While the basis of the sub-Saharan mtDNA gene pool is classified as macro-haplogroup L, the rest of the world nowadays traces its maternal ancestry from haplogroup N or M [7].

That, however, does not mean that migration stopped after the out-of-Africa event. Researchers have analyzed the genetic structure of inhabitants of the Sahel/Savannah belt with respect to the linguistic affiliation, subsistence strategy, and geographic location of local populations [8,9,10], and both population genetic and phylogeographic studies have highlighted the significance of gene flow. The Sahel/Savannah belt has therefore been called a “bidirectional corridor of migrations” [11], and evidence of gene flow has also been detected across the Sahara in populations inhabiting regions between the Sahel/Savannah belt and North Africa [12,13].

Interestingly, while some migrations may have had an ethnic association, others did not. For instance, the origin of the Chadic-speaking peoples living in the Lake Chad Basin was traced to East Africa based on linguistic evidence [14]. According to this theory, the ancestors of current Chadic-speaking peoples migrated, still as nomadic herders, from the Nile Valley through Wadi Howar to the Ennedi Mountains, and further through Wadi Hawash up to the Lake Chad Basin. Genetically, the Chadic-speaking peoples nowadays harbor mtDNA sequences belonging to the L3f haplogroup with East African ancestry, especially a private branch, L3f3, which formed during their westward expansion about 8 ka [15]. On the other hand, another mtDNA haplogroup, L3e5, which was detected in populations living today in the Lake Chad Basin (and not only in Chadic-speaking ones), is also present in the Maghreb; its origin can be traced to an ancestral population that crossed the green Sahara during the Early Holocene, approximately 10 ka [13].

Sahelian populations also carry Eurasian mtDNA haplogroups. They are found more frequently in the nomadic pastoralists than in sedentary farmers [16] and a surprising finding showed that some sub-Saharan Africans and even Northern Eurasians share some very recent maternal ancestry. For instance, it was shown that a Saami from Scandinavia and a Yakut from Siberia share with a Berber and a Fulani mtDNA sequences belonging to haplogroup U5b1b [17]. Given the enormous geographical distances between these populations, the most plausible explanation is that the most recent common ancestor (~8.6 ka) lived probably in southwestern Europe, from where the descendants spread both to northern Eurasia and sub-Saharan Africa. A more detailed study of originally Eurasian lineages beyond the Sahara has shown that not only U5b1b but also the H1 haplogroup (which both occur mainly in the Fulani pastoralists) came to form the new and younger sub-Saharan lineages called U5b1b1b and H1cb1 [18]. Their most recent common ancestor (~4 ka) dates to the time when, according to archaeology [19,20,21], the first herders settled in the western Sahel/Savannah belt.

It may therefore seem that the pastoralist food-production strategy did not spread to sub-Saharan Africa by demic diffusion from the Near Eastern domestication center via northeastern Africa, but through the ancestors of Berbers from the Maghreb. In this context, it should be noted that the genetic architecture of the circum-Mediterranean space had undergone substantial changes since the Neolithic. For instance, ancient Near Eastern farmers are genetically better represented by the current populations of central and western Mediterranean, such as the Sardinians and the Basques [22,23], than by the current populations of the Near East.

The importance of post-Neolithic gene flow from northwestern Africa to the western part of the Sahel/Savannah was also suggested by research on lactase persistence. In fact, the Fulani pastoralists from Burkina Faso share with Europeans the extended haplotype carrying Eurasian variant −13,910*T. It was suggested that their ancestors received this haplotype via admixture with the Eurasian population two times [24]. The first event is genetically dated to ~1828 years ago and the second one to ~302 years ago, whereby it seems that the admixture involved a group related to southwestern Europeans. Moreover, the geographical distribution of lactase persistence variants in the Sahel/Savannah belt shows clear differences between the pastoralists in the east (mostly Arabs harboring variant −13,915*G) and the west (mostly Fulani harboring variant −13,910*T) [25]. In fact, a boundary between the western and eastern Sahelian genetic spaces lies somewhere near the Lake Chad Basin, as attested not only by lactase persistence but also by a genome-wide SNP study [26].

Last but not least, it was shown that Sahelian pastoralists tend to carry several mutually similar mtDNA haplotypes, which indicates either more recent origins of their diversity, isolation of their demes, lower gene flow, or a lower effective population size [10]. Interestingly, coalescence analyses made it possible to show that gene flow between the pastoralists and the farmers is asymmetric in both parts of the Sahel/Savannah belt: while the western (Fulani) pastoralists are losing their mtDNA diversity, the eastern (Arab) pastoralists are gaining it by admixture with local sub-Saharan agricultural populations [27]. This is further supported by the presence of various sub-Saharan mtDNA haplotypes in the gene pool of Arabic-speaking populations [9], mostly in non-carriers of the lactase persistence −13,915*G variant [28]. This genetic observation might correspond to a process of Arabization and/or language shift after the expansion of Arabs and their culture from North Africa into the Lake Chad Basin from the 14th century AD onwards [29].

The above-mentioned studies show that inclusion of newly collected local populations, especially from the Sahel/Savannah belt, has significantly contributed to our knowledge of the peopling of Africa north of the equator by discoveries of not only new variants—which happens quite commonly when a new dataset of a sub-Saharan population is presented [30]—but even of entire new mtDNA haplogroups. In fact, since all new sub-Saharan population studies published so far revealed new genetic variants, one ought to admit we are so far aware of merely a fraction of the genetic diversity of sub-Saharan populations [31,32].

Because sub-Saharan Africa is still underrepresented in population genetic and genomic studies [30], we compiled a large mtDNA database composed of both newly collected and previously published mtDNA sequences and produced an updated survey of migration patterns in the circum-Saharan space. Additionally, we performed complete mtDNA sequencing of the N1 haplogroup from sub-Saharan Africa, with most samples from the Sahel/Savannah belt but some also from East Africa. The southwestern Asian ancestry of the N1 haplogroup is well known and goes back as far as ~60 ka [4], but its African phylogeny is still not well understood. We selected N1 because this haplogroup was reintroduced to the Sahel/Savannah belt by migration from southwestern Asia, possibly via North Africa, as became apparent when a related basal branch was recently discovered in a North African skeleton (Takarkori rock shelter, Libya) dated to ~7 ka [33]. We can thus assume that the phylogeny of this specific haplogroup could document ancient gene flow back to Africa in the eastern circum-Saharan region.

2. Current Insights

An mtDNA dataset containing 7213 mtDNA sequences from 134 African populations, which is much more than was used in previous studies [9,56], and covering the entire circum-Saharan space has significantly contributed to our understanding of African population history north of the equator. First of all, we were able to show that North African populations have lower values of nucleotide diversity, especially in the western part of the region. This can be attributed to their lower effective population sizes, as also attested by the Bayesian coalescent approach employed in this study. When contrasting pastoralists with farmers, we found similar distributions of diversity values, which supports our previous finding of no significant structure associated with the subsistence strategy in the Sahel/Savannah belt [10].
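To make the notion of nucleotide diversity concrete, the following minimal sketch computes it as the average number of differing sites per position over all pairs of aligned sequences in a sample. It is an illustration only, not the pipeline used in the study: the function names and the toy sequences are hypothetical, and the sketch assumes gap-free, equal-length alignments without ambiguous bases.

```python
from itertools import combinations


def pairwise_differences(seq_a, seq_b):
    """Count the sites at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def nucleotide_diversity(sequences):
    """Mean per-site difference over all unordered pairs of sequences."""
    n = len(sequences)
    if n < 2:
        raise ValueError("need at least two sequences")
    length = len(sequences[0])
    total = sum(
        pairwise_differences(a, b) for a, b in combinations(sequences, 2)
    )
    n_pairs = n * (n - 1) // 2
    return total / (n_pairs * length)


# Hypothetical toy samples (not real mtDNA data): each is four aligned
# sequences of ten sites.
diverse_sample = ["ACGTACGTAC", "ACGTTCGTAC", "ACGAACGTAC", "TCGTACGTAC"]
uniform_sample = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAC", "ACGTTCGTAC"]

pi_diverse = nucleotide_diversity(diverse_sample)
pi_uniform = nucleotide_diversity(uniform_sample)
print(pi_diverse, pi_uniform)
```

In this toy case the sample with three near-identical sequences yields the lower value, which mirrors the kind of contrast described above: a population with a smaller effective size is expected to retain fewer distinct lineages and therefore shows lower diversity.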

The lower level of differentiation among the populations of North Africa than among the populations of the Sahel/Savannah belt supports the idea of higher migration activity homogenizing the gene pool and eroding the population structure in the southern Mediterranean space. It is quite possible that there was a long-range influx of population(s) from the Near East to the Maghreb already in preagricultural times [50]. In fact, this finds support in recent aDNA analyses of Iberomaurusian skeletons [57]. Interestingly, both Natufian and Iberomaurusian specimens show a high level of Basal Eurasian ancestry, that is, ancestry from a population isolated > 50 ka in a Late Pleistocene refugium of the Arabo-Persian Gulf without contacts with the Neanderthals [58]. Further immigration to North Africa took place in the Neolithic and in later times, both from the Near East [59,60,61] and from Europe via the Strait of Gibraltar [62,63,64,65]. According to our results, it seems that this expansion through populations of farmers reached all the way to the western Sahel/Savannah belt. It should also be noted that the general genetic homogeneity of North African populations is reflected in the linguistic homogeneity of Afro-Asiatic languages.

Our results, which suggest that the eastern part of North African populations received immigrants from the Sahel/Savannah belt, do not correspond with research on autosomal SNP variants, which showed that the populations of Egypt and Libya are composed predominantly of a Near Eastern genetic component with very low input from sub-Saharan Africa [50]. However, that may be because the last-mentioned study worked with limited sub-Saharan (and not really Sahelian) samples as the putative sources of migration to their North African datasets. In fact, a subsequent study revealed a higher gene flow from the sub-Saharan space in some North African populations (e.g., in Algeria), especially in maternal lineages [66]. In Egypt, the importance of a migration corridor via the Nile Valley has been described a number of times both in archaeology [67] and in genetics [68,69]. Moreover, many eastern Sahel/Savannah populations also have an admixture of sub-Saharan and Eurasian ancestries; the Arab groups in particular have an important Eurasian component. This is why the sub-Saharan input in North Africa is larger in the west than in the east. Another point is that the sub-Saharan influence in North Africa came mainly via maternal lines: it is higher in mtDNA, almost nonexistent in the Y chromosome, and intermediate in autosomal DNA [70].

Migrations in the Sahel/Savannah belt were probably less important than in North Africa. When we look at the continent-wide African mtDNA diversity, the Sahel/Savannah belt can be viewed as a corridor between the Sahara and tropical rainforests which connects eastern and western Africa but also has some distinctive genetic features, especially in the Lake Chad Basin [11]. Food-producing strategies came to play an important role in demographic expansions in this region later than in North Africa, especially in the Holocene. The first expansion may have been related to pastoralism, a strategy well adapted to Sahelian dryland ecosystems [71]. It spread through the Sahel/Savannah belt from northeastern Africa during the Holocene [72,73,74]. An expansion of herders started ~8 ka in northeastern Africa but was relatively slow, reaching the western Sahel/Savannah belt much later, at about 3 ka [75].

The cultivation of cereals and tubers, which is autochthonous in the Sahel/Savannah belt (especially in and around the Middle Niger Delta [76,77,78,79] and in the Middle Nile Valley [80]), was somewhat delayed because the first fully domesticated plants were consumed in Africa only about 4.5 ka. At present, the majority of Sahelian economies are based on mixed agro-pastoralism [81], but in many Sahelian countries we still find purely nomadic pastoralists. We can thus see that a somewhat delayed spread of a particular food-production subsistence strategy may result in lower migration activity and higher population differentiation. Despite some morphological differences detected between present-day full-time nomadic pastoralists and sedentary farmers [82], our genetic analyses show that lifestyle cannot be considered a determinative parameter of population sub-structuring in Africa, at least when considering the entire circum-Saharan region and not just the Sahel/Savannah belt, where the biological separation of lifestyles may have played a more important role.

It seems that periodically, on both long and short time scales, an increase in the size of the shallow Lake Chad in the middle of the east-west Sahelian corridor presented an obstacle to gene flow, forming a cul-de-sac. This can be documented by the spread of two different populations of nomadic pastoralists: in the west the Fulani, such as the Woɗaaɓe, and in the east the Arabs, such as the Baggara or Shuwa. While the Arabs are of Eurasian (Arabian) ancestry and received gene flow from sub-Saharan Africans [28], the Fulani are of western African ancestry and their ancestors acquired some Eurasian ancestry by admixture with a northern African population possibly related to the Berbers [24,27]. Since the level of this admixture is relatively high (analyses show a Eurasian component of around 20%), it might be responsible for the noticeable differences between local Fulani populations and the surrounding, relatively homogeneous Sahel/Savannah gene pool. On the other hand, the genetic diversity of sedentary farmers suggests that they lived in reproductively more isolated groups, which led to genetic drift and isolation by distance [10]. Interestingly, this can also be associated with the genetic diversity of the pearl millet landraces grown by different ethnolinguistic groups, especially in the western part of the Lake Chad Basin [83].

Finally, we found that the N1 haplogroup, which diversified in southwestern Asia some 55 ka [4] and whose younger lineages expanded across all of Eurasia, is also present in eastern Africa and the Sahel/Savannah belt. On the other hand, the age estimates of the N1 mitogenomes we detected in our Sahelian populations are much younger and thus congruent with population contacts via Bab el-Mandeb or the Red Sea [84], not via the southern Mediterranean space. Moreover, two ancient samples from the Takarkori rock shelter in Libya dated to ~7 ka [33] are highly distinct from our N1 Sahelian and eastern African samples because they branched off before all the current N lineages (far away from N1). We did not find any traces of local expansion of mtDNA N1a lineages into the Sahel/Savannah belt, as described for example for the Y chromosome R1b-V88 haplogroup [85,86] or for the L3f3 mtDNA haplogroup [15]. This kind of (maternal) Eurasian N1 impact is visible mainly in eastern Africa, especially in Sudan, Somalia, Ethiopia, Kenya, and Tanzania, and not further in the central or even western part of the Sahel/Savannah belt close to the Lake Chad Basin. In fact, migration linked to this N1 African diversity might be associated with the Late Pleistocene/Holocene expansions in Arabia and the neighboring region and, more recently, with the spread of Ethiosemitic languages into Ethiopia at ~3 ka (for example N1a3a+195C!), which continued further to South Africa [87] and not to the west.

This entry is adapted from the peer-reviewed paper 10.3390/genes13030533

© Text is available under the terms and conditions of the Creative Commons Attribution (CC BY) license; additional terms may apply.