Assessment of the performance of hidden Markov models for imputation in animal breeding
Background: In this paper, we review the performance of various hidden Markov model-based imputation methods in animal breeding populations. Traditionally, pedigree and heuristic-based imputation methods have been used for imputation in large animal populations due to their computational efficiency, scalability, and accuracy. Recent advances in the area of human genetics have increased the ability of probabilistic hidden Markov model methods to perform accurate phasing and imputation in large populations. These advances may enable these methods to be useful for routine use in large animal populations, particularly in populations where pedigree information is not readily available.

Methods: To test the performance of hidden Markov model-based imputation, we evaluated the accuracy and computational cost of several methods in a series of simulated populations and a real animal population without using a pedigree. First, we tested single-step (diploid) imputation, which performs both phasing and imputation. Second, we tested pre-phasing followed by haploid imputation. Overall, we used four available diploid imputation methods (fastPHASE, Beagle v4.0, IMPUTE2, and MaCH), three phasing methods (SHAPEIT2, HAPI-UR, and Eagle2), and three haploid imputation methods (IMPUTE2, Beagle v4.1, and Minimac3).

Results: We found that performing pre-phasing and haploid imputation was faster and more accurate than diploid imputation. In particular, among all the methods tested, pre-phasing with Eagle2 or HAPI-UR and imputing with Minimac3 or IMPUTE2 gave the highest accuracies with both simulated and real data.

Conclusions: The results of this study suggest that hidden Markov model-based imputation algorithms are an accurate and computationally feasible approach for performing imputation without a pedigree when pre-phasing and haploid imputation are used. Of the algorithms tested, the combination of Eagle2 and Minimac3 gave the highest accuracy across the simulated and real datasets.
Is part of: Genetics Selection Evolution, 2018, vol. 50, num. 44
Except where otherwise noted, this item's license is described as cc-by (c) Whalen, Andrew et al., 2018