A method for the allocation of sequencing resources in genotyped livestock populations
Background: This paper describes a method, called AlphaSeqOpt, for the allocation of sequencing resources in livestock populations with existing phased genomic data to maximise the ability to phase and impute sequenced haplotypes into the whole population.
Methods: We present two algorithms. The first selects focal individuals that collectively represent the maximum possible portion of the haplotype diversity in the population. The second allocates a fixed sequencing budget among the families of focal individuals to enable phasing of their haplotypes at the sequence level. We tested the performance of the two algorithms in simulated pedigrees. For each pedigree, we evaluated the proportion of population haplotypes that are carried by the focal individuals and compared our results to a variant of the widely-used key ancestors approach and to two haplotype-based approaches. We calculated the expected phasing accuracy of the haplotypes of a focal individual at the sequence level given the proportion of the fixed sequencing budget allocated to its family.
Results: AlphaSeqOpt maximises the ability to capture and phase the most frequent haplotypes in a population in three ways. First, it selects focal individuals that collectively represent a larger portion of the population haplotype diversity than existing methods. Second, it selects focal individuals from across the pedigree whose haplotypes can be easily phased using family-based phasing and imputation algorithms, thus maximises the ability to impute sequence into the rest of the population. Third, it allocates more of the fixed sequencing budget to focal individuals whose haplotypes are more frequent in the population than to focal individuals whose haplotypes are less frequent. Unlike existing methods, we additionally present an algorithm to allocate part of the sequencing budget to the families (i.e. immediate ancestors) of focal individuals to ensure that their haplotypes can be phased at the sequence level, which is essential for enabling and maximising subsequent sequence imputation.
Conclusions: We present a new method for the allocation of a fixed sequencing budget to focal individuals and their families such that the final sequenced haplotypes, when phased at the sequence level, represent the maximum possible portion of the haplotype diversity in the population that can be sequenced and phased at that budget.
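The two steps named in the abstract (choosing focal individuals, then splitting a fixed budget among their families) can be illustrated with a small sketch. This is not the AlphaSeqOpt implementation: the abstract does not give the algorithms in detail, so the sketch assumes a simple greedy selection weighted by haplotype frequency and a frequency-proportional budget split. The function names `select_focal` and `allocate_budget`, the single-region haplotype sets, and the toy data are all hypothetical.

```python
from collections import Counter


def select_focal(haplotypes, n_focal):
    """Greedily pick individuals that add the most not-yet-covered haplotype copies.

    haplotypes: dict mapping individual id -> set of haplotype ids it carries
    (assumption: one genomic region, haplotypes already phased and labelled).
    """
    # Population frequency of each haplotype (number of carriers).
    freq = Counter(h for carried in haplotypes.values() for h in carried)
    covered, focal = set(), []
    for _ in range(n_focal):
        best, best_gain = None, 0
        for ind in sorted(haplotypes):  # sorted for a deterministic tie-break
            if ind in focal:
                continue
            gain = sum(freq[h] for h in haplotypes[ind] - covered)
            if gain > best_gain:
                best, best_gain = ind, gain
        if best is None:  # no remaining individual adds new haplotypes
            break
        focal.append(best)
        covered |= haplotypes[best]
    return focal, covered, freq


def allocate_budget(focal, haplotypes, freq, budget):
    """Split the budget among focal families in proportion to the summed
    population frequency of each focal individual's haplotypes (assumed rule)."""
    weights = {ind: sum(freq[h] for h in haplotypes[ind]) for ind in focal}
    total = sum(weights.values())
    return {ind: budget * w / total for ind, w in weights.items()}


# Toy population: five individuals, five distinct haplotypes.
toy = {
    "A": {"h1", "h2"},
    "B": {"h3", "h4"},
    "C": {"h1", "h3"},
    "D": {"h1", "h2"},
    "E": {"h1", "h5"},
}
focal, covered, freq = select_focal(toy, n_focal=2)
allocation = allocate_budget(focal, toy, freq, budget=90)
coverage = sum(freq[h] for h in covered) / sum(freq.values())
print(focal, allocation, coverage)
```

In this toy case the focal individual carrying the more frequent haplotypes receives the larger share, mirroring the third point of the Results. How a family's share would be divided among the focal individual and its immediate ancestors is left out, since the abstract does not specify it.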
Is part of Genetics Selection Evolution, 2017, vol. 49, article number 47
Except where otherwise noted, this item's license is described as cc-by (c) Gonen, Serap et al., 2017
Showing items related by title, author, creator and subject.
Accuracy of whole-genome sequence imputation using hybrid peeling in large pedigreed livestock populations. Ros Freixedes, Roger; Whalen, Andrew; Chen, Ching-Yi; Gorjanc, Gregor; Herring, William O.; Mileham, Alan J.; Hickey, John M. (BMC (part of Springer Nature), 2020-04-06). Background: The coupling of appropriate sequencing strategies and imputation methods is critical for assembling large whole-genome sequence datasets from livestock populations for research and breeding. In this paper, we ...
Impact of index hopping and bias towards the reference allele on accuracy of genotype calls from low-coverage sequencing. Ros Freixedes, Roger; Battagin, Mara; Johnsson, Martin; Gorjanc, Gregor; Mileham, Alan J.; Rounsley, Steve D.; Hickey, John M. (BMC (part of Springer Nature), 2018-12-13). Background: Inherent sources of error and bias that affect the quality of sequence data include index hopping and bias towards the reference allele. The impact of these artefacts is likely greater for low-coverage data ...
A method for allocating low-coverage sequencing resources by targeting haplotypes rather than individuals. Ros Freixedes, Roger; Gonen, Serap; Gorjanc, Gregor; Hickey, John M. (BMC (part of Springer Nature), 2017-10-25). Background: This paper describes a heuristic method for allocating low-coverage sequencing resources by targeting haplotypes rather than individuals. Low-coverage sequencing assembles high-coverage sequence information ...