1. The Null Allele Problem
The grapevine, one of humanity’s oldest crops, has significant economic and cultural value. Climate change is the greatest challenge confronting viticulture today, and its long-term impacts are likely to be substantially mitigated by new vine varieties that are better adapted to the environment. This underlines the importance of breeding. However, breeding success depends on an accurate understanding of the genetic diversity in the starting material.
A null allele is an allele that results in the complete absence of a gene product or function. The best-known example is the human AB0 blood group system, where allele “0” is considered a null allele because it does not produce a phenotype of its own and is masked by the presence of allele “A” or “B” (people with genotypes “A0” and “AA” have the same blood group, “A”)
[1].
Molecular genetic markers based on PCR (e.g., SSRs) often show codominant inheritance, yet some alleles at certain loci may not be detected. Such alleles can be considered null alleles. Null alleles are not exclusively associated with codominant markers, however: dominant PCR-based methods (RAPDs, etc.) may also contain null alleles. Array-based SNPs also have null alleles, and many of the same considerations apply as for SSRs. Genome-wide sequencing (GWS), on the other hand, does not have this problem unless it is conducted at low depth and the missing data are then estimated through imputation.
If Aₙ is a null allele, then the AᵢAₙ and AᵢAᵢ genotypes are indistinguishable. If an individual is homozygous for such a null allele, no product is formed in the PCR reaction and genotyping fails
[2].
The failure to detect an allele in a PCR-based genetic marker may be due to several reasons:
- The primers used in the PCR reaction fail to bind to the DNA because the target sequence differs from the conserved reference sequence on which the detection is based [3]. This problem can also be caused by inappropriate primer design.
- Alleles of different sizes may amplify with different efficiency, with “longer” alleles sometimes not amplified [4].
- The amount of DNA in the sample can also cause the lack of detection, because the same DNA extract may yield a PCR product at some loci but not at others [5][6].
Null alleles are a problem in the use of genetic markers for several reasons. In the most common applications, pedigree analyses and paternity tests, they can cause the erroneous exclusion of one or both parents by making a putative parent appear homozygous at a locus when it is in fact heterozygous for a null allele [1]. For example, crossing a mother with genotype AᵢAᵢ and a father with genotype AⱼAₙ gives a 50% chance of the offspring being of genotype AᵢAₙ, which apparently excludes the true father
[2][7].
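The false exclusion described above can be reproduced with a few lines of code. The following is a minimal sketch, not a published method: the allele labels `i`, `j`, and `n` follow the notation of the text, while the function names and the naive "shares an allele" test are assumptions made for illustration.

```python
from collections import Counter
from itertools import product

NULL = "n"  # label for the non-amplifying (null) allele


def observed(genotype):
    """Return the genotype as it would be scored after PCR.

    A null allele gives no product, so a heterozygote carrying it is
    scored as a homozygote for the visible allele, and a null
    homozygote yields no call at all (None).
    """
    visible = sorted(a for a in genotype if a != NULL)
    if not visible:
        return None
    if len(visible) == 1:
        return (visible[0], visible[0])
    return tuple(visible)


def offspring_distribution(mother, father):
    """Mendelian offspring genotypes with their probabilities."""
    counts = Counter(tuple(sorted(pair)) for pair in product(mother, father))
    total = sum(counts.values())
    return {g: c / total for g, c in counts.items()}


def shares_allele(g1, g2):
    """Naive parent-offspring test: at least one scored allele in common."""
    return bool(set(g1) & set(g2))


mother = ("i", "i")
father = ("j", NULL)
father_scored = observed(father)  # the father is scored as ('j', 'j')

for true_gt, prob in offspring_distribution(mother, father).items():
    scored = observed(true_gt)
    print(true_gt, prob, scored, shares_allele(scored, father_scored))
```

Half of the offspring carry the true genotype `i`/`n`, are scored as `i`/`i`, and share no scored allele with the father, who is scored as `j`/`j`.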
In population genetics studies, the presence of null alleles can result in an apparent reduction in the proportion of heterozygotes, which can greatly confound the assessment of the genetic diversity of a population. A null allele can also lead to an overestimation of the frequency of the detectable (non-null) alleles, which can misrepresent population structure
[2].
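Both effects follow directly from Hardy–Weinberg proportions. The sketch below is an illustration under stated assumptions (random mating, one locus, a hypothetical function name and hypothetical frequencies); it computes what a study would score when a null allele of frequency `null_freq` is present.

```python
def scored_summary(freqs, null_freq):
    """Expected scoring outcome at one locus under Hardy-Weinberg proportions.

    freqs     : dict mapping each visible allele to its true frequency
    null_freq : true frequency of the null allele
    The visible frequencies plus null_freq must sum to 1.
    """
    if abs(sum(freqs.values()) + null_freq - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")

    alleles = list(freqs)
    # True heterozygosity: every genotype with two different alleles,
    # including heterozygotes that carry the null allele.
    true_het = 1.0 - sum(p * p for p in freqs.values()) - null_freq ** 2
    # Null homozygotes give no PCR product and drop out of the data set.
    failed = null_freq ** 2
    scored = 1.0 - failed
    # Only genotypes with two different visible alleles are scored as heterozygous.
    visible_het = sum(
        2 * freqs[a] * freqs[b]
        for k, a in enumerate(alleles)
        for b in alleles[k + 1:]
    )
    # Naive allele counting among scored genotypes: an apparent homozygote
    # (true homozygote or carrier of the null) counts as two copies.
    naive = {}
    for a in alleles:
        p = freqs[a]
        apparent_hom = p * p + 2 * p * null_freq
        het_with_a = 2 * p * (1 - p - null_freq)
        naive[a] = (2 * apparent_hom + het_with_a) / (2 * scored)

    return {
        "true_het": true_het,
        "scored_het": visible_het / scored,
        "failed": failed,
        "naive_freqs": naive,
    }


summary = scored_summary({"A": 0.4, "B": 0.4}, null_freq=0.2)
print(summary)
```

In this hypothetical case the scored heterozygosity falls well below the true value, and the naive frequency of each visible allele works out to p/(1 − r), where r is the null allele frequency, so every detectable allele is overestimated.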
2. The Null Allele Problem in Pedigree Reconstruction in Viticulture
2.1. The Importance of Pedigree Reconstruction from the Grape Breeder’s Point of View
Grape variety performance is a complex, genetically based, multifactorial trait whose expression is strongly influenced by ecological and agronomic conditions and is ultimately reflected in yield development and overall varietal value
[8][9].
Crosses in grape breeding must be designed and carried out with the chosen parent varieties in mind. The breeder should know which varieties are expected to pass on the desired characteristic
[10][11].
Kozma
[9] authored a book on grape breeding that goes into great detail about inbreeding and the heterosis effect when crossing and selfing grape cultivars. He saw cross breeding of grape varieties as the best opportunity for increasing heterozygosity.
Negrul
[12][13] analyzed the influence of crossing within and across ecological groups of varieties on the variability of the progeny population. While crossing within a convarietas did not result in considerable variability, he found that crossing across ecological groups could result in a highly diverse progeny population. Crosses between convar. pontica and convar. orientalis produced mostly intermediate offspring, with the pontica traits having a modest advantage. The hybrids generally grew faster than orientalis cultivars, but they produced small, juicy berries that are suited for winemaking. The traits of the occidentalis parent predominate in the progeny of crosses between orientalis and occidentalis varieties. The hybrids produced by crossing occidentalis and pontica convarietas have intermediate features. Some of them have proven to be quite productive and produce excellent wine.
Interspecific crosses can also promote genetic variability. The resulting interspecific hybrids are mostly used in resistance breeding, as cultivated species frequently lack resistance or have lower resistance than wild species. The so-called Franco-American hybrids have been the most widely used in grapevine breeding to boost resistance to mildews and grey rot
[14].
Vitis amurensis is also a significant source of genetic diversity. Early ripening, resistance to mildews, Botrytis, and Agrobacterium, and excellent frost tolerance are all significant qualities for grape breeding. This latter feature makes it a particularly useful source of genes in continental climate grape breeding
[15].
The correct selection of parent pairs is a cornerstone of combination breeding; a wrong choice can set back the breeding programme by up to 30 years. To address this issue, so-called combining ability tests have been, and still are, carried out. For example, in 1978 the combining ability of ‘Bicane’ was assessed by crossing it with 20 different varieties from the pontica, orientalis, and occidentalis ecological groups. A total of 4659 seedlings from these 20 crossings were investigated. Studies on the inheritance of flower type and of other biological and economic characteristics revealed that ‘Bicane’ has a high combining ability and is heterozygous for flower type, allowing the homozygous varieties ‘White Riesling’ and ‘Muscat Hamburg’ to be separated. Because of the large range of variation for numerous features, as well as the heterosis seen for traits including vigor, cold hardiness of buds, cane maturity, crop level, cluster size, and berry size, highly valuable genotypes were selected for homologation
[16].
In F1 offspring of 20 crossings between seeded and seedless table grape cultivars, the combining ability and genotype-environmental interaction were investigated in relation to average cluster weight. Significant distinctions between them were required for their employment in combination breeding to be successful. The importance of genotype–environmental interactions was clearly stated, and they should be taken into consideration in breeding practice
[17].
It is clear from the foregoing that in grape breeding, the origin and pedigree of varieties is often more important than the phenotypic characteristics of a given variety. However, experiments to test combining ability are very time-consuming. Work and time can often be saved by looking at the pedigree of varieties with the valuable characteristics of interest. As recent research has shown, valuable varieties with a wide range of ecological tolerance often come from the same ancestor or ancestors, such as ‘Heunisch’
[18][19] or Pinots
[20][21]. However, the progeny found in the study of Bowers et al.
[22] are all historically associated with northeastern France and not with any other location, which suggests that the crossings took place in this region. It is obvious that ‘Pinot’ and ‘Gouais blanc’ make a good parental combination; on the other hand, any other varieties growing in the area are most likely to be relatives of ‘Pinot’ or ‘Gouais blanc’ and would be less fit as a result of inbreeding depression.
The breeder should then go back to the successful ancestor to save time and money.
2.2. Consequences of the Presence of Null Alleles in Pedigree Studies-Some Examples
The most important consequence of the presence of null alleles is that they can result in a true parent and its offspring appearing homozygous for different alleles. This can lead to a rejection of the real parent–offspring relationship
[1][23].
It is essential to maintain precise pedigree records whenever a grape breeding program is carried out, yet the breeder’s records may contain errors. Parent–offspring relationships can be identified and validated using genetic markers, an approach often known as “DNA fingerprinting”. Simple sequence repeat (SSR) markers were applied to validate or correct the pedigrees of grape varieties developed through the Cornell breeding program. In this project, ‘Ontario’ was confirmed as a parent of ‘Glenora’, ‘Himrod’, and ‘Alden’ once null alleles were scored at the VVMD25 locus. ‘Fredonia’ × ‘Black Kishmish’ were also confirmed as the parents of ‘Suffolk Red’ when the possibility of a null allele at VVMD6 was taken into account
[24].
To show that ‘Muscat of Hamburg’, a fine black table grape variety with a muscat flavor, is the progeny of a cross between ‘Schiava Grossa’ and ‘Muscat of Alexandria’, researchers used 2 isozymes (GPI and PGM), 30 nuclear, and 5 chloroplast microsatellite markers. The likelihood of null alleles was calculated and found to be extremely low or zero
[25].
Fifty microsatellite markers were used to examine ancient and closely related grape cultivars from the Alps: ‘Cornalin’, ‘Humagne Rouge’, and ‘Goron’ from Valais (Switzerland); and ‘Cornalin’, ‘Petit Rouge’, and ‘Mayolet’ from the Aosta Valley (Italy).
The findings supported earlier research demonstrating the distinction between the Italian and Swiss ‘Cornalin’ cultivars and the identity between ‘Humagne Rouge’ and ‘Cornalin’ from the Aosta Valley
[26]. ‘Goron’, ‘Petit Rouge’, ‘Mayolet’, and ‘Cornalin d’Aoste’ all share at least one allele with ‘Cornalin du Valais’, suggesting parent/offspring relationships. Forty-nine out of fifty microsatellite loci support ‘Cornalin du Valais’ as the offspring of ‘Petit Rouge’ and ‘Mayolet’, but ‘Humagne Rouge’ has genotype 257–241 at locus VVMD8, whereas ‘Cornalin d’Aoste’ has 241–241. This clonal variation is likely caused by a null allele in ‘Cornalin d’Aoste’. This was the first grapevine parentage study to deal with a discrepancy at a microsatellite locus, demonstrating that the use of progressively larger numbers of loci in parentage decisions leads to a proportional rise in the risk of encountering a locus with intra-cultivar variability. It should be assumed that a single discrepancy involving a multiple of the repeat unit is not sufficient to invalidate a parentage hypothesis.
At first, a parent–offspring link was assumed between ‘Sangiovese’, the most common red grape cultivar in Italy, and the ancient Tuscan variety ‘Ciliegiolo’
[27]. When ‘Sangiovese’ was tested as a parent of ‘Ciliegiolo’, the putative other parent was searched for in a large, private, standardized database, but no candidate was found. When ‘Ciliegiolo’ was tested as a potential parent of ‘Sangiovese’, four candidate cultivars were found. Only one of the fifty microsatellites was inconsistent with this parentage, leading the researchers to conclude with a high level of confidence that ‘Sangiovese’ is the offspring of ‘Ciliegiolo’ and ‘Calabrese di Montenuovo’
[28]. In the same year, Staraz et al., on the basis of their studies, suggested that ‘Ciliegiolo’ was not the parent but the offspring of ‘Sangiovese’
[29]. This hypothesis was later confirmed, with the addition that the ‘Ciliegiolo’ variety was probably the offspring of the ‘Sangiovese’ × ‘Moscato violetto’ varieties
[30].
Bergamini et al. reported two possible parents of ‘Sangiovese’, a parental pair that had not been proposed earlier. The first putative parent is ‘Ciliegiolo’, already discussed as a relative of ‘Sangiovese’. The second is ‘Negrodolce’, an old local variety that was considered lost over the course of the last century but was recovered by the authors. The newly postulated parentage held up well after a comprehensive molecular examination, with the exception of a single inconsistency at one of the 57 microsatellite markers examined. This discrepancy is certainly due to a null allele and therefore should not undermine the hypothesis. However, it does point out the limitations of microsatellite profiling as a pedigree research method, considering that this was the third different kinship proposed so far for ‘Sangiovese’
[31][32].
In contrast to the modest number of markers required to establish the identity or non-identity of two grapevine samples, a substantially higher number of markers is required to reconstruct parentage and pedigrees across cultivars and to prevent incorrect relationship assignment. The majority of parentage and kinship reconstruction studies used more than 25, and in some cases more than 50, markers
[33].
Another example where a null allele caused a dilemma is the study of Bowers et al. Microsatellite loci in 300 grape cultivars were used to determine parentage relationships. Sixteen wine grapes grown in northeastern France, including ‘Chardonnay’, ‘Gamay noir’, ‘Aligoté’, and ‘Melon’, have microsatellite genotypes compatible with their being the progeny of a single pair of parents, ‘Pinot’ and ‘Gouais blanc’, both of which were widespread in this region in the Middle Ages. ‘Romorantin’ does not share an allele with ‘Pinot’ at locus VVS2, showing a 129-bp allele instead. ‘Pinot fin teinturier’, a red-juiced variety of ‘Pinot’, has this allele, but no other cultivars do. ‘Dameron’ does not share an allele with ‘Gouais blanc’ at locus VVMD36, suggesting a mutation to a 254-bp or null allele
[22].
SSR markers were used to identify muscadine cultivars and validate their pedigrees. Using 20 SSRs from 13 linkage groups, 89 Vitis accessions were genotyped. Five SSR markers could differentiate all 81 subgenus Muscadinia accessions. The profiles of twelve cultivars did not match their previously reported parent–offspring relationships. MicroChecker v2.2.3 [34] was used to identify genotyping errors related to null alleles (nonamplified alleles), short allele dominance (large allele dropout), and stutter peaks
[35].
With the goal of analyzing genetic diversity and examining parentages, a collection of 1005 grapevine accessions was genotyped at 34 microsatellite (SSR) loci
[30]. After a preliminary simulation that permitted the estimation of critical values of likelihood ratios, the parentage analysis was carried out using the CERVUS program. To account for genotyping errors, the occurrence of null alleles, and mutations, mismatches at a maximum of two loci in each trio were permitted. The majority of mismatches occurred at locus VChr9b because of its high frequency of null alleles. As a result, this locus was eliminated and the analysis was rerun. In most cases, incompatible profiles were found at loci with a high frequency of null alleles, and could thus be explained by the presence of a null allele in either parent or offspring.
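The idea of tolerating a limited number of mismatching loci in a trio, and of explaining a mismatch by a hidden null allele, can be sketched as follows. This is a simple exclusion count written for illustration, not the likelihood-based procedure implemented in CERVUS; the function names, locus labels, and allele sizes are hypothetical, and only the default limit of two mismatches is taken from the text.

```python
def locus_compatible(child, parent1, parent2, allow_null=False):
    """Can the child's scored genotype arise from one allele of each parent?

    Genotypes are pairs of allele sizes as scored. With allow_null=True,
    any apparent homozygote is treated as a possible carrier of a hidden
    null allele (represented by None).
    """
    child_alleles = set(child)

    def gametes(parent):
        alleles = set(parent)
        if allow_null and len(alleles) == 1:
            alleles.add(None)  # the parent may transmit an undetected null
        return alleles

    for a in gametes(parent1):
        for b in gametes(parent2):
            visible = {x for x in (a, b) if x is not None}
            # A child that received a null is scored as homozygous
            # for the one allele that did amplify.
            if visible and visible == child_alleles:
                return True
    return False


def mismatching_loci(trio, allow_null=False):
    """Names of loci at which the trio is incompatible."""
    return [
        name
        for name, (child, p1, p2) in trio.items()
        if not locus_compatible(child, p1, p2, allow_null)
    ]


def trio_accepted(trio, max_mismatches=2, allow_null=False):
    """Accept the trio if it mismatches at no more than max_mismatches loci."""
    return len(mismatching_loci(trio, allow_null)) <= max_mismatches


# Hypothetical data: locus -> (child, candidate parent 1, candidate parent 2)
trio = {
    "locus1": ((120, 124), (120, 130), (124, 128)),
    "locus2": ((200, 200), (200, 204), (198, 198)),
    "locus3": ((150, 152), (150, 150), (152, 156)),
}
print(mismatching_loci(trio))                   # strict Mendelian check
print(mismatching_loci(trio, allow_null=True))  # null-aware check
```

At `locus2` the child and the second candidate parent appear homozygous for different alleles, which is exactly the pattern a shared null allele produces; the null-aware check no longer counts it as a mismatch.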
In Croatia, analyses of 36 nuclear SSR, 4 cpSSR, and 47 SNP markers revealed a large number of admixed varieties and synonyms, which was attributed to complex pedigrees and migrations. The highest fixation index, divergence from the Hardy–Weinberg equilibrium, and highest prevalence of null alleles were determined for the VChr8b and VChr14b loci, and hence they were removed from subsequent parentage analyses. The remaining set of markers revealed 24 full parentages and 113 half-kinships
[36].
In identification and parentage analysis, Hardy–Weinberg (HW) equilibrium is a key underlying assumption
[30]. Minor deviations from HW equilibrium, or deviations at only a few loci, are not expected to distort likelihood estimates; deviations at many loci, however, may do so. In that case, the certainty of identification and parentage assignments should be interpreted with extreme caution. However, it is important to keep in mind that the discrimination power of the loci that are in HW equilibrium may be sufficiently strong to ensure the validity of the study as a whole.
2.3. Solutions for the Correct Pedigree Reconstruction
A novel approach was suggested by Mark R. Christie to detect parent–offspring pairs in large data sets. To allow for genotyping errors, null alleles, and mutations, it is necessary to estimate quantitatively how many loci should be allowed to mismatch on the basis of the study-specific error rate. The approach was proposed for methods that determine the probability of identity among genotypes, and the author suggested that one can additionally account for null alleles, missing data, and mutation simply by adding estimates of those rates to the study-specific error rate
[37].
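One way to turn a study-specific error rate into a number of tolerated mismatches is sketched below. This is an assumption made for illustration and not the published procedure: it treats each locus as mismatching independently in a true parent–offspring pair with the same combined rate, and picks the smallest number of mismatches that covers a chosen share of true pairs under a binomial model. Only the additive combination of rates is taken from the text; all function names and rate values are hypothetical.

```python
from math import comb


def combined_rate(genotyping_error, null_rate=0.0, mutation_rate=0.0,
                  missing_rate=0.0):
    """Add the separate per-locus rates into one study-specific rate."""
    return genotyping_error + null_rate + mutation_rate + missing_rate


def mismatches_to_allow(n_loci, per_locus_rate, coverage=0.99):
    """Smallest k such that a true pair mismatches at <= k loci
    with probability >= coverage, under an independent binomial model."""
    cumulative = 0.0
    for k in range(n_loci + 1):
        cumulative += (
            comb(n_loci, k)
            * per_locus_rate ** k
            * (1 - per_locus_rate) ** (n_loci - k)
        )
        if cumulative >= coverage:
            return k
    return n_loci


# Hypothetical rates for a 34-locus panel
rate = combined_rate(genotyping_error=0.01, null_rate=0.02, mutation_rate=0.001)
print(mismatches_to_allow(34, rate))
```

The design choice here is deliberately conservative in one direction only: raising `coverage` retains more true pairs but also admits more false ones, so the threshold should be read together with the exclusion power of the marker set.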
It was suggested that well-established maximum likelihood approaches for estimating relationship and relatedness could be modified to take null alleles into account. This is accomplished by distinguishing between an observed genotype and the set of true genotypes that could have produced that observation. For example, the probability of observing the genotype pair ii/ii is calculated by adding the probabilities that the true genotypes are ii/ii, in/ii, ii/in, or in/in, the four true genotype pairs that would be observed as ii/ii
[7].
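The summation over true genotypes can be written out for the simplest case. The sketch below assumes two unrelated individuals and Hardy–Weinberg proportions, which is only one of the relationships such a likelihood method would consider; the function names and allele frequencies are hypothetical.

```python
from itertools import product

NULL = "n"  # label for the null allele


def true_genotypes(observed_allele):
    """True genotypes that are scored as a homozygote for observed_allele."""
    return [(observed_allele, observed_allele), (observed_allele, NULL)]


def genotype_prob(genotype, freqs):
    """Hardy-Weinberg probability of an unordered genotype."""
    a, b = genotype
    p = freqs[a] * freqs[b]
    return p if a == b else 2 * p


def prob_observed_pair_unrelated(allele, freqs):
    """P(both individuals are scored as allele/allele), summed over the
    four combinations of true genotypes: ii/ii, in/ii, ii/in, in/in."""
    total = 0.0
    for g1, g2 in product(true_genotypes(allele), repeat=2):
        total += genotype_prob(g1, freqs) * genotype_prob(g2, freqs)
    return total


freqs = {"i": 0.3, "j": 0.6, NULL: 0.1}   # hypothetical frequencies
with_null = prob_observed_pair_unrelated("i", freqs)
ignoring_null = genotype_prob(("i", "i"), freqs) ** 2
print(with_null, ignoring_null)
```

With these hypothetical frequencies, ignoring the null allele understates the probability of the observation, because apparent homozygotes are more common than true homozygotes.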
Genetic data can be used to estimate the genealogical relationship or relatedness of individuals of unknown ancestry.
ML-Relate is a computer program that calculates maximum likelihood estimates of relatedness and relationship. The program can accommodate null alleles and is designed for microsatellite data. It uses simulation to identify which relationships are consistent with the genetic data and to compare putative relationships with alternatives
[38].
Pedigrees are used in many areas of genetic research because they enable a precise resolution of genealogical ties between individuals. The estimation of the short-term effective population size (Ne), which is important in domains such as conservation genetics, is one example of how pedigree information might be used. Despite their usefulness, pedigrees are frequently unknown and must be inferred from genetic data. A Bayesian technique using Markov chain Monte Carlo was proposed for jointly estimating pedigrees and Ne from genetic markers. By using a composite likelihood, this method allows for the examination of a large number of markers and individuals within a single generation, considerably increasing computational efficiency. Simulated data were used to demonstrate that the approach can accurately determine Ne and relationships up to first cousins
[39].
This entry is adapted from the peer-reviewed paper 10.3390/horticulturae8070658