Universitat de Lleida
Repositori Obert UdL

Scalable Consistency in T-Coffee Through Apache Spark and Cassandra Database

Issue date
2018
Author
Lladós Segura, Jordi
Cores Prado, Fernando
Guirado Fernández, Fernando
Suggested citation
Lladós Segura, Jordi; Cores Prado, Fernando; Guirado Fernández, Fernando (2018). Scalable Consistency in T-Coffee Through Apache Spark and Cassandra Database. Journal of Computational Biology, vol. 25, no. 8, pp. 894-906. https://doi.org/10.1089/cmb.2018.0084
Abstract
Next-generation sequencing, also known as high-throughput sequencing, has increased the volume of genetic data processed by sequencers. In bioinformatics, highly rated multiple sequence alignment tools such as MAFFT, ProbCons, and T-Coffee (TC) apply probabilistic consistency as a step prior to the progressive alignment stage in order to improve final accuracy. However, such methods are severely limited by the memory required to store the consistency information. Big data processing and persistence techniques are designed to manage and store the huge amount of information that is generated. Although these techniques have significant advantages, few biological applications have adopted them. This article presents a novel approach named big data tree-based consistency objective function for alignment evaluation (BDT-Coffee). BDT-Coffee integrates into TC, through the Cassandra database, consistency information previously generated with the MapReduce processing paradigm, so that large data sets can be processed, with the aim of improving the performance and scalability of the original algorithm.
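As a rough illustration of the idea summarized in the abstract, the sketch below expresses a triplet-style consistency extension as map and reduce steps in plain Python. It is not the authors' implementation and uses neither the Spark nor the Cassandra API: the function names (`pair_key`, `map_phase`, `shuffle`, `reduce_triplets`, `extend_library`) are hypothetical, the in-memory dictionary is only a stand-in for a persistent key-value store, and the weighting rule (primary weight plus the sum over intermediate residues of the minimum of the two connecting weights) is assumed from the classic T-Coffee library extension.

```python
from collections import defaultdict
from itertools import combinations

# Assumption: a residue is a (sequence_id, position) tuple, and the primary
# library maps an unordered pair of residues to a numeric weight.

def pair_key(a, b):
    """Return a canonical (sorted) key for an unordered residue pair."""
    return (a, b) if a <= b else (b, a)

def map_phase(primary):
    """Map step: emit every library entry once under each of its residues."""
    for (a, b), weight in primary.items():
        yield a, (b, weight)
        yield b, (a, weight)

def shuffle(records):
    """Group emitted values by key, as a MapReduce shuffle would."""
    groups = defaultdict(list)
    for key, value in records:
        groups[key].append(value)
    return groups

def reduce_triplets(groups):
    """Reduce step: for each intermediate residue z aligned to both x and y,
    emit min(W(x, z), W(z, y)) as a contribution to the pair (x, y)."""
    for _z, neighbours in groups.items():
        for (x, wx), (y, wy) in combinations(neighbours, 2):
            if x[0] != y[0]:  # only pair residues from different sequences
                yield pair_key(x, y), min(wx, wy)

def extend_library(primary):
    """Sum primary weights and triplet contributions per residue pair.
    The dict is a hypothetical stand-in for a key-value store."""
    store = defaultdict(float)
    for (a, b), weight in primary.items():
        store[pair_key(a, b)] += weight
    for key, weight in reduce_triplets(shuffle(map_phase(primary))):
        store[key] += weight
    return dict(store)
```

Under these assumptions, three mutually aligned residues A1, B1, and C1 with primary weights A1-B1 = 88, A1-C1 = 77, and B1-C1 = 100 would give an extended A1-B1 weight of 88 + min(77, 100) = 165. Grouping by the intermediate residue is what lets each group be processed independently, which is the property a distributed engine would exploit.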
URI
http://hdl.handle.net/10459.1/69308
DOI
https://doi.org/10.1089/cmb.2018.0084
Is part of
Journal of Computational Biology, 2018, vol. 25, no. 8, pp. 894-906
Collections
  • Articles publicats (Informàtica i Enginyeria Industrial) [990]
  • Publicacions de projectes de recerca del Plan Nacional [2958]

Contact Us | Send Feedback | Legal Notice
© 2023 BiD. Universitat de Lleida