Animal Production Science Animal Production Science Society
Food, fibre and pharmaceuticals from animals
RESEARCH ARTICLE (Open Access)

The impact of reference composition and genome build on the accuracy of genotype imputation in Australian Angus cattle

Hassan Aliloo https://orcid.org/0000-0002-5587-6929 A B and Samuel A. Clark A

A School of Environmental and Rural Science, University of New England, Armidale, NSW 2350, Australia.

B Corresponding author. Email: haliloo@une.edu.au

Animal Production Science - https://doi.org/10.1071/AN21098
Submitted: 19 February 2021  Accepted: 13 July 2021   Published online: 23 September 2021

Journal Compilation © CSIRO 2021 Open Access CC BY-NC

Abstract

Context: Genotype imputation is an effective method to increase the number of SNP markers available for an animal and thereby increase the overall power of genome-wide associations and accuracy of genomic predictions. It is also the key to achieving a common set of markers for all individuals when the original genotypes are obtained using multiple genotyping platforms. High accuracy of imputed genotypes is crucial to their utility.

Aims: In this study, we propose a method for the construction of a common set of medium density markers for imputation, which relies on keeping as much information as possible. We also investigated the impact of changing marker coordinates on the basis of the new bovine genome assembly, ARS-UCD 1.2, on imputation accuracy.

Methods: In total, 49 754 animals with 45 364 single nucleotide polymorphism markers were used in a 10-fold cross-validation to compare four different imputation scenarios. The four scenarios combined two alternative designs for the reference dataset with two genome assemblies. The reference designs were (1) a traditional reference panel created using the SNPs that overlap across five medium density arrays and (2) a composite reference panel created by combining SNPs across the five arrays. Each reference dataset was used to test imputation accuracy with the SNPs aligned on the basis of each of the two genome assemblies (UMD 3.1 and ARS-UCD 1.2).
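
As an illustration only, the two reference designs can be read as set operations on the marker lists of the arrays: the traditional panel as their intersection and the composite panel as their union. The minimal Python sketch below assumes that reading and uses toy array and SNP names. The function names, the three-array example and the random fold assignment are all hypothetical and are not taken from the study, which may have applied further marker filtering.

```python
import random


def overlap_panel(arrays):
    """Markers present on every array (traditional overlap reference)."""
    return set.intersection(*(set(snps) for snps in arrays.values()))


def composite_panel(arrays):
    """Markers present on at least one array (composite reference)."""
    return set.union(*(set(snps) for snps in arrays.values()))


def k_fold_split(animal_ids, k=10, seed=0):
    """Shuffle animals and deal them into k disjoint validation folds."""
    ids = list(animal_ids)
    random.Random(seed).shuffle(ids)
    return [ids[i::k] for i in range(k)]


# Toy marker lists; real arrays carry tens of thousands of SNPs.
arrays = {
    "array_A": ["snp1", "snp2", "snp3", "snp4"],
    "array_B": ["snp2", "snp3", "snp4", "snp5"],
    "array_C": ["snp3", "snp4", "snp6"],
}

overlap = overlap_panel(arrays)      # only SNPs shared by all arrays
composite = composite_panel(arrays)  # every SNP seen on any array
folds = k_fold_split(range(25), k=10)

print(sorted(overlap))
print(sorted(composite))
print([len(fold) for fold in folds])
```

In this toy case the composite panel retains markers that the overlap panel discards, which is the sense in which the composite design keeps more information.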

Key results: Our results showed that a composite reference panel can achieve higher imputation accuracies than does a traditional overlap reference. Incorporating mapping information on the basis of the recent genome build slightly improved the imputation accuracies, especially for lower density chips.
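
The abstract does not state how imputation accuracy was scored. Two measures commonly used for this purpose are the proportion of genotypes imputed correctly (concordance) and the Pearson correlation between true and imputed genotypes coded as 0, 1 or 2 allele copies. The sketch below shows both on made-up genotypes, assuming masked true genotypes are available for the validation animals. The function names and data are hypothetical and may not match the measure used in the study.

```python
from math import sqrt


def concordance(true_g, imputed_g):
    """Proportion of genotypes that were imputed exactly right."""
    matches = sum(t == i for t, i in zip(true_g, imputed_g))
    return matches / len(true_g)


def pearson_r(x, y):
    """Pearson correlation between two equal-length genotype vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Made-up genotypes for one animal, coded as allele counts (0, 1, 2).
true_g = [0, 1, 2, 1, 0, 2, 1, 1]
imputed_g = [0, 1, 2, 1, 0, 1, 1, 1]

acc_concordance = concordance(true_g, imputed_g)
acc_correlation = pearson_r(true_g, imputed_g)
print(acc_concordance, acc_correlation)
```

Either measure can be averaged per animal or per marker across the validation folds. The correlation is undefined when a vector has no variation, so monomorphic cases would need separate handling.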

Conclusions: Markers with unreliable mapping information and animals with low connectedness to the imputation reference dataset benefited the most from the ARS-UCD 1.2 assembly and the composite reference, respectively.

Implications: The presented method is straightforward and can be used to set up an optimal imputation for accurate inference of genotypes in Australian Angus cattle.

Keywords: beef cattle, imputation accuracy, ARS-UCD1.2, composite reference.

