High Performance computing improvements on bioinformatics consistency-based multiple sequence alignment tools
Lladós Segura, Jordi
Multiple Sequence Alignment (MSA) is essential for a wide range of applications in Bioinformatics. Traditionally, alignment accuracy was the main metric used to evaluate MSA tools. However, with the growth of sequencing data, other features, such as performance and the capacity to align larger datasets, are gaining importance. To meet these new requirements without affecting accuracy, the use of high-performance computing (HPC) resources and techniques is crucial. In this paper, we apply HPC techniques to T-Coffee, one of the more accurate but less scalable MSA tools. We integrate three innovative solutions into T-Coffee: the Balanced Guide Tree, to increase parallelism and performance; the Optimized Library Method, to enhance scalability; and the Multiple Tree Alignment, which explores different alignments in parallel to improve accuracy. The results show that the resulting tool, MTA-TCoffee, improves scalability in both execution time and the number of sequences that can be aligned. Furthermore, not only is alignment accuracy unaffected by these improvements, as would be expected, but it improves significantly. Finally, we emphasize that the presented methods are not restricted to T-Coffee, but may be implemented in any other alignment tool that uses similar algorithms (progressive alignment, consistency or guide trees).
Is part of: Parallel Computing, 2014, vol. 42, p. 18-34
Showing items related by title, author, creator and subject.
Cloud-Coffee: implementation of a parallel consistency-based multiple alignment algorithm in the T-Coffee package and its benchmarking on the Amazon Elastic-Cloud. Di Tommaso, Paolo; Orobitg Cortada, Miquel; Guirado Fernández, Fernando; Cores Prado, Fernando; Espinosa, Toni; Notredame, Cedric (Oxford University Press, 2010). Summary: We present the first parallel implementation of the T-Coffee consistency-based multiple aligner. We benchmark it on the Amazon Elastic Cloud (EC2) and show that the parallelization procedure is reasonably ...
Orobitg Cortada, Miquel; Cores Prado, Fernando; Guirado Fernández, Fernando; Roig Mateu, Concepció; Notredame, Cedric (Springer, 2013). Accuracy on multiple sequence alignments (MSA) is of great significance for such important biological applications as evolution and phylogenetic analysis, homology and domain structure prediction. In such analyses, alignment ...
Orobitg Cortada, Miquel; Guirado Fernández, Fernando; Notredame, Cedric; Cores Prado, Fernando (Springer Verlag, 2011). Multiple Sequence Alignment (MSA) constitutes an extremely powerful tool for important biological applications such as phylogenetic analysis, identification of conserved motifs and domains and structure prediction. In ...