Genomic prediction for targeted populations of environments in oat (Avena sativa)
Pablo Sandro A, Madhav Bhatta A B, Alisha Bower C, Sarah Carlson C, Jean-Luc Jannink D, David J. Waring D, Clay Birkett D, Kevin Smith E, Jochum Wiersma E, Melanie Caffe F, Jonathan Kleinjan F, Michael S. McMullen G, Lydia English C and Lucia Gutierrez A *
Abstract
Long-term multi-environment trials (METs) could improve genomic prediction models for plant breeding programs by better representing the target population of environments (TPE). However, METs are generally highly unbalanced because genotypes are routinely dropped from trials after a few years. Furthermore, in the presence of genotype × environment interaction (GEI), selection of the environments to include in a prediction set becomes critical to represent specific TPEs.
The goals of this study were to compare strategies for modelling GEI in genomic prediction, using large METs from oat (Avena sativa L.) breeding programs in the Midwest United States, and to develop a variety decision tool for farmers and plant breeders.
The performance of genotypes in TPEs was predicted by using different strategies for handling GEI in genomic prediction models including systematic and/or random GEI components. These strategies were also used to build the variety decision tool for farmers.
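As an illustration of the kind of model involved, the following is a minimal GBLUP sketch on simulated data, not the models fitted in this study. It includes a genotype main effect only; a GEI component would add a further covariance term. The function names, the simulated marker data and the heritability value `h2` are all assumptions made for the example.

```python
import numpy as np

def vanraden_g(markers):
    """Genomic relationship matrix from an n x m matrix of 0/1/2 allele counts."""
    p = markers.mean(axis=0) / 2.0           # allele frequency per marker
    z = markers - 2.0 * p                    # centre each marker column
    return z @ z.T / (2.0 * np.sum(p * (1.0 - p)))

def gblup_predict(g, y_train, train_idx, test_idx, h2=0.5):
    """Predict genetic values of untested genotypes from tested ones.

    h2 is an assumed heritability; it sets the shrinkage ratio (1 - h2) / h2.
    """
    lam = (1.0 - h2) / h2
    g_tt = g[np.ix_(train_idx, train_idx)]   # relationships among tested genotypes
    g_st = g[np.ix_(test_idx, train_idx)]    # untested versus tested genotypes
    mu = y_train.mean()
    alpha = np.linalg.solve(g_tt + lam * np.eye(len(train_idx)), y_train - mu)
    return mu + g_st @ alpha

# Simulated example (hypothetical data): 400 genotypes, 200 markers.
rng = np.random.default_rng(0)
markers = rng.integers(0, 3, size=(400, 200))
g_true = markers @ rng.normal(size=200)
y = g_true + rng.normal(scale=g_true.std(), size=400)   # roughly h2 = 0.5

g = vanraden_g(markers)
train_idx, test_idx = np.arange(300), np.arange(300, 400)
pred = gblup_predict(g, y[train_idx], train_idx, test_idx)

# Predictive ability: correlation between predictions and observed phenotypes.
ability = np.corrcoef(pred, y[test_idx])[0, 1]
```

In this sketch the variance ratio is fixed in advance for brevity; in practice the variance components would be estimated from the data, for example by REML.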
Genomic prediction for unknown genotypes, locations and years within TPEs had moderate to high predictive ability, accuracy and reliability. Modelling GEI was beneficial in small, but not in large, mega-environments. The latest 3 years were highly predictive of performance in the upcoming year in most years, but not in years with unusual weather patterns. High predictive ability, accuracy and reliability were obtained when large datasets were used in TPEs.
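The idea of predicting an upcoming year from the latest 3 years can be sketched as a rolling forward-validation split. This is an illustrative assumption about how such a scheme could be set up, not the study's actual procedure; `rolling_year_splits` and the example years are hypothetical.

```python
def rolling_year_splits(years, window=3):
    """Yield (training_years, target_year) pairs.

    Each target year is predicted from the `window` years immediately before it.
    """
    ordered = sorted(set(years))
    for i in range(window, len(ordered)):
        yield ordered[i - window:i], ordered[i]

# Hypothetical trial years; duplicates stand for several trials within a year.
splits = list(rolling_year_splits([2015, 2016, 2017, 2018, 2019, 2017]))
```

Each pair would then define which trials train the prediction model and which year is held out for validation.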
Deployment of historical datasets can be accomplished through meaningful delineation and prediction for TPEs.
We have shown the performance of a simple modelling strategy for handling prediction for TPEs when deploying large historical datasets.
Keywords: genomic best linear unbiased predictions (GBLUP), genomic prediction, genomic selection, genotype by environment interaction (GEI), genotypic performance, multi-environment trials (METs), targeted populations of environments (TPE), unbalanced dataset.