Accurate consistency-based MSA reducing the memory footprint
Lladós Segura, Jordi
Background and Objective: The emergence of next-generation sequencing has created a push for faster and more accurate multiple sequence alignment tools. The growing number and length of sequences pose a serious challenge to these applications, because they demand more system resources and yield less accurate results. Consistency-based methods have the most intensive CPU and memory requirements. We hypothesize that library reductions can enhance the scalability and performance of consistency-based multiple sequence alignment tools; however, we have previously shown a noticeable impact on accuracy when extreme reductions were performed.
Methods: In this study, we propose Matrix-Based T-Coffee, a consistency-based method that uses library reductions in conjunction with a complementary objective function. The proposed method, implemented in T-Coffee, can mitigate the accuracy loss caused by low memory resources.
Results: The use of a complementary objective function with a library reduction of 30% improved the accuracy of T-Coffee. Interestingly, a 50% library reduction achieved lower execution times and better overall scalability.
Conclusions: Matrix-Based T-Coffee produces accurate alignments while achieving better scalability, which reduces both memory footprint and execution time. In addition, these enhancements could be applied to other consistency-based aligners.
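The idea of a library reduction can be illustrated with a minimal Python sketch. It is not the authors' implementation: the representation of the library as a dictionary of weighted residue pairs, the function name `reduce_library`, and the choice to discard the lowest-weight pairs first are all assumptions made for illustration, since the abstract does not state which entries Matrix-Based T-Coffee discards.

```python
from typing import Dict, Tuple

# Hypothetical representation: a residue is (sequence index, position),
# and the library maps a pair of residues to a consistency weight.
Residue = Tuple[int, int]
Library = Dict[Tuple[Residue, Residue], float]


def reduce_library(library: Library, reduction: float) -> Library:
    """Return a copy of the library with a fraction of its entries removed.

    Assumption: the entries with the lowest weights are dropped first.
    The actual selection criterion of the published method may differ.
    """
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in the range [0, 1)")
    n_drop = round(len(library) * reduction)
    n_keep = len(library) - n_drop
    # Rank entries from highest to lowest weight and keep the top ones.
    ranked = sorted(library.items(), key=lambda item: item[1], reverse=True)
    return dict(ranked[:n_keep])


# Toy library: ten residue pairs between sequence 0 and sequence 1,
# with weights 1.0 through 10.0.
toy_library: Library = {
    ((0, i), (1, i)): float(i + 1) for i in range(10)
}

reduced_30 = reduce_library(toy_library, 0.30)
reduced_50 = reduce_library(toy_library, 0.50)
print(len(toy_library), len(reduced_30), len(reduced_50))
```

In this toy setting, a smaller library means fewer residue pairs to store and to consult during alignment, which is the source of the memory and time savings the abstract describes. The complementary objective function that compensates for the discarded entries is not modelled here.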
Is part of: Computer Methods and Programs in Biomedicine, 2021, vol. 208, p. 106237-1-106237-9
Except where otherwise noted, this item's license is described as cc-by-nc-nd (c) Jordi Lladós et al., 2021
Showing items related by title, author, creator and subject.
High Performance computing improvements on bioinformatics consistency-based multiple sequence alignment tools. Orobitg Cortada, Miquel; Guirado Fernández, Fernando; Cores Prado, Fernando; Lladós Segura, Jordi; Notredame, Cedric (Elsevier, 2014-10-08). Multiple Sequence Alignment (MSA) is essential for a wide range of applications in Bioinformatics. Traditionally, the alignment accuracy was the main metric used to evaluate the goodness of MSA tools. However, with the ...
Lladós Segura, Jordi; Cores Prado, Fernando; Guirado Fernández, Fernando (Springer, 2019). With the advent of new high-throughput next-generation sequencing technologies, the volume of genetic data processed has increased significantly. It is becoming essential for these applications to achieve large-scale alignments ...
Lladós Segura, Jordi; Guirado Fernández, Fernando; Cores Prado, Fernando; Lérida Monsó, Josep Lluís; Notredame, Cedric (Springer Verlag, 2015-05-01). Multiple sequence alignment (MSA) is crucial for high-throughput next generation sequencing applications. Large-scale alignments with thousands of sequences are necessary for these applications. However, the quality of the ...