3. Genomic Selection in Polyploids
GS requires high-quality genome-wide markers to estimate GEBVs. Two types of high-throughput genotyping methods can be employed: SNP arrays and GBS. SNP arrays with different marker densities are available in potato [39][13] and wheat [40][14]. Alfalfa also has an array with 9277 SNPs [41][15]. However, this array has not been widely adopted, and GBS is currently the best option for obtaining genome-wide markers. Genotyping by GBS can yield different types of markers, such as single nucleotide polymorphisms (SNPs), insertions/deletions (indels), or short tandem repeats (STRs). Genome-wide markers can then be arranged in a genotypic matrix of m samples and n markers. The genotypic matrix can be filtered to retain only biallelic SNPs, which are the most abundant and stable markers for identifying QTLs associated with traits of interest [42][16].
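The filtering step described above can be illustrated with a minimal sketch. The marker IDs, alleles, and dosage values below are made-up placeholders, and `is_biallelic_snp` is a hypothetical helper rather than part of any genotyping package; real pipelines would typically apply this filter while parsing the VCF file.

```python
import numpy as np

# Hypothetical toy markers: (marker ID, REF allele, list of ALT alleles).
markers = [
    ("chr1_101", "A", ["G"]),        # biallelic SNP
    ("chr1_250", "AT", ["A"]),       # deletion (indel)
    ("chr2_077", "C", ["T", "G"]),   # multiallelic SNP
    ("chr3_412", "G", ["C"]),        # biallelic SNP
]

# Genotypic matrix with m = 3 samples (rows) and n = 4 markers (columns).
# The values are placeholders used only to show the column filtering.
geno = np.array([
    [0, 1, 2, 4],
    [2, 0, 1, 3],
    [4, 2, 0, 1],
])

def is_biallelic_snp(ref, alts):
    """True when there is exactly one ALT allele and both alleles are single bases."""
    bases = set("ACGT")
    return len(alts) == 1 and ref in bases and alts[0] in bases

# Boolean mask over markers, then keep only the matching columns.
keep = np.array([is_biallelic_snp(ref, alts) for _, ref, alts in markers])
kept_ids = [marker_id for (marker_id, _, _), k in zip(markers, keep) if k]
snp_geno = geno[:, keep]

print(kept_ids)
print(snp_geno.shape)
```

The filtered matrix keeps all m samples and only the columns that correspond to biallelic SNPs.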
Allele dosage is the number of copies of the alternative allele at each biallelic SNP. In diploid species the genotypic matrix is coded as {0, 1, 2}, reflecting whether a given marker is in the homozygous reference (AA), heterozygous (AB), or homozygous alternate (BB) allelic state. For biallelic SNPs in polyploid species with ploidy N, there are N+1 possible dosage classes and the genotypic matrix is coded as {0,…,N}. Genotype calling in autotetraploids therefore requires bioinformatics tools that distinguish among five possible genotypes (AAAA, AAAB, AABB, ABBB, BBBB), with biallelic SNPs coded as {0, 1, 2, 3, 4}. Several tools, most of them R packages, such as polyRAD [43][17], superMASSA [44][18], FitTetra 2.0 [45][19] or Updog [46][20], can be used to obtain allele dosage in numeric format from variant call format (VCF) files. Some of these tools, such as Updog, require users to specify genotype priors [46][20] to accurately estimate the allele dosage and distinguish among all possible genotypes. However, the most common option is to use high-depth sequence reads (e.g., ~60×), which leads to 98.4% accuracy in genotype calls [47][21].

The effects of marker allele dosage on phenotype in genomic selection have been reported previously. Slater et al. (2016) described three different models for GS in autopolyploids: additive autotetraploid, pseudodiploid, and full autotetraploid. In the additive autotetraploid model, the allele dosage has an additive effect, and the codes 0, 1, 2, 3, 4 correspond to AAAA, AAAB, AABB, ABBB, BBBB, respectively. In the pseudodiploid model, all heterozygous genotypes (AAAB, AABB, ABBB) are coded as 1 and therefore share the same effect, while the two homozygotes AAAA and BBBB are coded as 0 and 2, respectively. Finally, the full autotetraploid model assumes that each genotype has its own effect, giving five possible effects per marker, with the markers fitted as random effects [15][22]. In addition, Rosyara et al. (2016) developed GWASpoly, a software package for genome-wide association studies in autopolyploids. GWASpoly makes different assumptions about allele dosage effects and conducts hypothesis tests for each marker using six models (general, diploidized general, diploidized additive, additive, simplex dominant, and duplex dominant models) (Table 31).
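As a rough illustration of the three codings described by Slater et al. (2016), the sketch below recodes a tetraploid dosage matrix in three ways. The function names are hypothetical, and representing the full autotetraploid model with one indicator column per genotype class is an assumption made here for illustration; it is not taken from the original publication or from any package.

```python
import numpy as np

def additive_coding(dosage):
    """Additive autotetraploid: use the dosage (0..4) directly as the covariate."""
    return np.asarray(dosage, dtype=float)

def pseudodiploid_coding(dosage, ploidy=4):
    """Pseudodiploid: AAAA -> 0, any heterozygote -> 1, BBBB -> 2."""
    d = np.asarray(dosage)
    return np.where(d == 0, 0.0, np.where(d == ploidy, 2.0, 1.0))

def full_coding(dosage, ploidy=4):
    """Full model (assumed representation): one 0/1 indicator per genotype class."""
    d = np.asarray(dosage)
    # A trailing axis of length ploidy + 1 marks which genotype class is observed.
    return (d[..., None] == np.arange(ploidy + 1)).astype(float)

# Toy dosage matrix: 2 samples (rows) x 3 markers (columns).
dosage = np.array([[0, 1, 4],
                   [2, 3, 4]])

print(additive_coding(dosage))
print(pseudodiploid_coding(dosage))
print(full_coding(dosage).shape)
```

With the indicator representation, each marker contributes five columns, which matches the five possible effects per marker in the full autotetraploid model.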
Table 31. Coding effect assumptions of GWASpoly models according to allele dosage in biallelic SNPs.
| Allele Dosage ¶ | AAAA | AAAB | AABB | ABBB | BBBB |
|---|---|---|---|---|---|
| Numerical Code | 0 | 1 | 2 | 3 | 4 |
| GWASpoly Models | Phenotypic Effect § | | | | |
| Diplo-additive | 0.00 | 0.50 | 0.50 | 0.50 | 1.00 |
| Diplo-general ‡ | 0.00 | 0.00 < x < 1.00 | 0.00 < x < 1.00 | 0.00 < x < 1.00 | 1.00 |
| Additive | 0.00 | 0.25 | 0.50 | 0.75 | 1.00 |
| 1-dom-ref (A > B simplex) | 1.00 | 1.00 | 1.00 | 1.00 | 0.00 |
| 2-dom-ref (A > B duplex) | 1.00 | 1.00 | 1.00 | 0.00 | 0.00 |
| 1-dom-alt (B > A simplex) | 0.00 | 1.00 | 1.00 | 1.00 | 1.00 |
| 2-dom-alt (B > A duplex) | 0.00 | 0.00 | 1.00 | 1.00 | 1.00 |
| General † | No restrictions | No restrictions | No restrictions | No restrictions | No restrictions |
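The fixed codings in the table can be expressed as a simple lookup from dosage to covariate value. The sketch below is only an illustration of the table and is not the GWASpoly API; `MODEL_CODES` and `recode` are hypothetical names. The diplo-general and general models are omitted because their heterozygote effects are estimated from the data rather than fixed in advance.

```python
import numpy as np

# Covariate value for dosage 0..4 (AAAA, AAAB, AABB, ABBB, BBBB) under each
# fixed-coding model, transcribed from the table above.
MODEL_CODES = {
    "diplo-additive": [0.00, 0.50, 0.50, 0.50, 1.00],
    "additive":       [0.00, 0.25, 0.50, 0.75, 1.00],
    "1-dom-ref":      [1.00, 1.00, 1.00, 1.00, 0.00],
    "2-dom-ref":      [1.00, 1.00, 1.00, 0.00, 0.00],
    "1-dom-alt":      [0.00, 1.00, 1.00, 1.00, 1.00],
    "2-dom-alt":      [0.00, 0.00, 1.00, 1.00, 1.00],
}

def recode(dosage, model):
    """Map tetraploid dosages (0..4) to the covariate values of the chosen model."""
    codes = np.array(MODEL_CODES[model])
    # Integer dosages index directly into the per-model code vector.
    return codes[np.asarray(dosage, dtype=int)]

# Toy dosage matrix: 2 samples (rows) x 2 markers (columns).
dosage = np.array([[0, 2],
                   [4, 3]])

print(recode(dosage, "additive"))
print(recode(dosage, "2-dom-alt"))
```

Because the lookup indexes the code vector by dosage, the same function works for a single marker vector or for the whole genotypic matrix.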