The core collection is a small subset that minimizes genetic redundancy while preserving the maximum genetic diversity of the entire population. Research on the core collection is crucial for the efficient management and utilization of germplasm resources.
1. What Is a Core Collection?
Germplasm resources serve as a crucial material basis for genetic research and help in the identification and utilization of genes and traits that are of economic and ecological importance
[1]. Therefore, the preservation and utilization of germplasm resources are of significant importance for the development of new crop varieties, and a large number of germplasm banks have been established
[2]. However, because germplasm resources are vast in number, diverse in structure and often incompletely documented, the diversity that has been collected may not be fully and effectively utilized
[3][4][5].
To obtain a germplasm bank that is both practical and representative, Australian scholars Brown
[5] and Frankel
[6] proposed the concept of the core collection in 1984. A core collection is a subset selected from the entire germplasm resource by certain methods, with the goal of representing the genetic diversity of the whole with a minimum number of accessions. The concept rests on the theory of neutral mutations and the hierarchical structure model of genetic diversity
[7]. A good core collection should have the following characteristics: representativeness, low redundancy, manageability, data completeness, and usability
[8]. A core collection provides more reliable data and samples, makes it easier to optimize genotype/molecular marker-phenotype association studies, improves the utilization efficiency of the germplasm, and accelerates the breeding process
[9][10][11].
2. Progress in Core Collection Research
Drawing on worldwide experience with core collections as reported in the literature, researchers focus on the following questions:
- Have core collections been formed for a diversity of plants?
- How can researchers effectively construct a representative core collection?
- How well can the core collection be utilized?
Our responses to these questions are summarized below.
3. Diversity of Core Collections
Core collections preserve the genetic diversity of the original population as much as possible, which promotes the effective use and protection of germplasm resources
[4][5][6]. Based on this, many core collection studies have been conducted both domestically and internationally. The development of core collections for 146 plant species over the most recent 10 years is summarized in
Table 1 below. The table shows that core collections have been developed mainly in economic crops and fruit trees; more recently, core collections have also been established for forages, including
Buchloe dactyloides (Nutt.)
[12],
Cynodon Rich.
[13] and
Bromus inermis Leyss.
[14]. However, core collections of endemic afforestation tree species are still limited, although some have been reported, such as those of
Cunninghamia lanceolata (Lamb.) Hook.
[15],
Robinia pseudoacacia L.
[16],
Populus tomentosa Carrière.
[17],
Pinus massoniana Lamb.
[18], etc. In addition, only a few of these studies have focused on spice crops, and the spice core collections constructed so far are dominated by
Santalum album L.
[19].
Table 1. Plant species for which core collections have been developed in recent years.
| Species Category | Name |
| --- | --- |
| Grain crops | |
| Cereals | maize [20][21][22][23], sorghum [24], coix [25], hulless barley [26], rice [27], wheat [9][28], oat [29], buckwheat [30], pearl millet [31], foxtail millet [32], peanut [33] |
| Potatoes | sweet potato [34], cassava [35] |
| Pulses | chickpea [36], pigeonpea [37], lima bean [38], soybean [39][40], rice bean [41], common bean [42], faba bean [43], mung bean [44] |
| Horticultural crops | |
| Vegetables | cauliflower [45], rapeseed [46], cabbage [47], tomato [48][49], spinach [50], amaranth [51], bitter gourd [52], Jerusalem artichoke [53], yam [54], cucumber [55], pumpkin [56], white gourd [57], pepper [58], sweet pepper [59], eggplant [60], radish [61], turnip [62], oyster mushroom [63], perilla [64], Pyropia haitanensis [65] |
| Fruits | apricot [66][67][68][69], pear [70][71], jujube [72][73][74], grape [75], melon [76], watermelon [77], kiwifruit [78], pomegranate [79], litchi [80][81], olive [82], apple [83][84][85], peach [86], cherimoya [87], fig [88], sweet cherry [89], pomelo [90], persimmon [91], sugarcane [92] |
| Ornamental plants | Cymbidium ensifolium [93], Chrysanthemum morifolium [94][95], Prunus mume [96], Chimonanthus praecox [97], Rosa rugosa [98], Lilium brownii [99], Paeonia suffruticosa [100], Lagerstroemia indica [101], Helianthus annuus [102], Sophora moorcroftiana [103] |
| Herbs | Fallopia multiflora [104], Astragalus [105], Scutellaria baicalensis [106], Angelica biserrata [107], Glycyrrhiza [1], Cornus officinalis [108], Dalbergia odorifera [109] |
| Spice | Santalum album [19] |
| Teas | Guizhou tea [110], Chinese tea [111][112] |
| Beverages | coffee [113], Theobroma cacao [114] |
| Fibers | cotton [115], upland cotton [116], island cotton [117], ramie [118] |
| Oilseeds | safflower [119], sesame [120] |
| Forages | Buchloe dactyloides [12], Cynodon [13], Medicago truncatula [121], Bromus inermis [14] |
| Trees | Catalpa bungei [122], Catalpa fargesii [123], Saccharum spontaneum [124], Populus deltoides [125], Populus tomentosa [17], Cinnamomum camphora [126], Phoebe bournei [127], Robinia pseudoacacia [16], Torreya grandis [128], Tetracentron sinense [129], Xanthoceras sorbifolia [130], Schima superba [131], Sapium sebiferum [132], Fraxinus chinensis [133], Eucommia ulmoides [134], Saccharum arundinaceum [135], Corylus avellana [136], Juglans regia [137][138], Betula platyphylla [139], Betula luminifera [140], Sinojackia huangmeiensis [141], Castanopsis hystrix [142], Morus alba [143], Castanea mollissima [144], Castanea sativa [145], Cunninghamia lanceolata [15], Cryptomeria japonica [146], Eucalyptus cloeziana [147], Eucalyptus urophylla [148], Ceratonia siliqua [149], Argania spinosa [150], Pinus massoniana [18], Pinus yunnanensis [151], Ginkgo biloba [152], Akebia trifoliata [153], Camellia oleifera [154], Cornus wilsoniana [155] |
4. Procedure for Constructing a Core Collection
The development of core collections has been extensively studied from various perspectives, including sampling strategies, core size determination, and analysis methods. However, because growth habits and reproductive characteristics vary widely among plants, there is no universal method for constructing a core collection. Generally, construction involves four main steps: collecting and organizing data, grouping the accessions, determining the sampling strategy, and testing and evaluating the core set
[1][8].
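The grouping and sampling steps can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not a method taken from the cited studies: accessions are grouped by a single attribute (here a hypothetical `origin` field), each group contributes a proportional share of its members, and members are drawn at random within a group. The function name `build_core`, the 20% sampling fraction and the toy records are all hypothetical; published studies often use clustering on marker or phenotypic data and other allocation strategies instead.

```python
import random
from collections import defaultdict


def build_core(accessions, group_key, fraction=0.2, seed=0):
    """Stratified random sampling of a core set from grouped accessions."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible

    # Grouping step: bucket the accessions by the chosen attribute.
    groups = defaultdict(list)
    for acc in accessions:
        groups[acc[group_key]].append(acc)

    # Sampling step: proportional allocation (an assumption), keeping at
    # least one accession per group so that no group is lost entirely.
    core = []
    for name in sorted(groups):
        members = groups[name]
        k = max(1, round(fraction * len(members)))
        core.extend(rng.sample(members, k))
    return core


# Toy records standing in for the collected and organized data.
accessions = (
    [{"id": f"N{i}", "origin": "north"} for i in range(10)]
    + [{"id": f"S{i}", "origin": "south"} for i in range(5)]
    + [{"id": "W0", "origin": "west"}]
)
core = build_core(accessions, "origin", fraction=0.2)
print(len(core), sorted({acc["origin"] for acc in core}))
```

The fourth step, testing and evaluating the resulting core set, is deliberately left out of this sketch; the sampled set would still need to be checked against the full collection before being accepted.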
5. Evaluation of Core Collection
While the core collection is constructed based on available data, the important question remains: does the core set accurately represent the diversity of the original population?
Brown proposed that the core collection should represent 70% or more of the trait characteristics and genetic variations of the entire germplasm
[4]. To validate its effectiveness, a core collection should be evaluated in two respects: first, how well it represents the genetic diversity of the entire collection and, second, how practical it is in production
[58]. Generally, at the molecular level, the main genetic diversity indices include the allele number (Na), effective allele number (Ne), Shannon’s information index (I), Nei’s genetic diversity index (H), polymorphism information content (PIC), observed heterozygosity (Ho) and expected heterozygosity (He)
[156]; among these, allelic richness is considered the most relevant indicator. Maximizing allelic richness means preserving the germplasm resource with the most abundant genetic diversity. At the phenotypic level, the evaluation parameters include the mean difference percentage (MD), variance difference percentage (VD), coincidence rate of range (CR) and variation coefficient changing rate (VR)
[8][68][156]. Usually, the core collection is considered representative only when the MD is less than 20% and the CR is more than 80%
[156]. A lower MD together with higher VD, CR, and VR values can be taken to indicate a more representative core collection
[13][58]. In addition, principal component analysis (PCA) plots have been widely used to compare the distribution characteristics between the core collection and the initial population
[13]. Moreover, correlation analysis is commonly conducted to infer whether the inherent relationship between traits in the original collection is well retained in the core group
[13]. Recently, Odong et al.
[8] proposed two new criteria based on genetic distance to evaluate the quality of core collections. These criteria offer the advantage of simultaneously considering all variables describing the accessions, and they provide more intuitive and interpretable results than the univariate criteria generally used in core collection evaluations. Additionally, once a core collection has been established, it is essential to set up a comprehensive management system for breeding, seed supply, and exchange as soon as possible to ensure the distribution, sharing, and effective utilization of the core set.
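The indices named above can be made concrete with a short Python sketch. The text lists the indices but does not give their formulas, so the definitions below are assumptions: MD, VD, CR and VR follow one commonly used formulation (relative differences averaged over traits, with the core value in the denominator for MD and VD), and other definitions exist, such as counting the traits whose means differ significantly. Expected heterozygosity is computed without a small-sample correction, and PIC uses the standard allele-frequency formula. The function names `phenotypic_metrics` and `locus_indices` and the toy data are hypothetical.

```python
import math
import statistics


def phenotypic_metrics(full, core):
    """Return MD, VD, CR and VR (in percent), averaged over traits.

    full, core: dicts mapping trait name -> list of positive trait values
    for the entire collection and for the core set, respectively.
    """
    md = vd = cr = vr = 0.0
    for trait, f in full.items():
        c = core[trait]
        mean_f, mean_c = statistics.mean(f), statistics.mean(c)
        var_f, var_c = statistics.variance(f), statistics.variance(c)
        cv_f, cv_c = math.sqrt(var_f) / mean_f, math.sqrt(var_c) / mean_c
        md += abs(mean_f - mean_c) / mean_c            # mean difference
        vd += abs(var_f - var_c) / var_c               # variance difference
        cr += (max(c) - min(c)) / (max(f) - min(f))    # coincidence of range
        vr += cv_c / cv_f                              # CV changing rate
    n = len(full)
    return {"MD": 100 * md / n, "VD": 100 * vd / n,
            "CR": 100 * cr / n, "VR": 100 * vr / n}


def locus_indices(freqs):
    """Diversity indices for one locus from allele frequencies summing to 1."""
    p = [x for x in freqs if x > 0]
    homozygosity = sum(x * x for x in p)
    he = 1 - homozygosity  # expected heterozygosity, no sample-size correction
    pic = he - sum(2 * p[i] ** 2 * p[j] ** 2
                   for i in range(len(p)) for j in range(i + 1, len(p)))
    return {"Na": len(p),                                # number of alleles
            "Ne": 1 / homozygosity,                      # effective alleles
            "He": he,
            "I": -sum(x * math.log(x) for x in p),       # Shannon's index
            "PIC": pic}


# Toy data: one trait measured on five accessions, three of them in the core.
full = {"height": [1, 2, 3, 4, 5]}
core = {"height": [1, 3, 5]}
print(phenotypic_metrics(full, core))
print(locus_indices([0.5, 0.5]))
```

In this toy case the core keeps both extremes of the trait, so the range is fully retained and the mean is unchanged, while the variance and coefficient of variation are larger in the core than in the full set.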
In short, the evaluation criteria for core collections should not be fixed, and flexible evaluation methods should be adopted as circumstances change. The selection of the most suitable evaluation method should depend on the purpose of the core collection
[8]. Moreover, core collection establishment is a dynamic process
[147]: the collection needs to be updated regularly by adding new entries and removing duplicates so that it remains representative
[119].