Search CORE

1,957 research outputs found

Concurrent and Accurate Short Read Mapping on Multicore Processors

Author: Barrachina Mir Sergio
Castillo Catalán Maribel
Dopazo Joaquín
Martínez Pérez Héctor
Medina Ignacio
Quintana-Orti Enrique S.
Tárraga Joaquín
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2015
Field of study

We introduce a parallel aligner with a work-flow organization for fast and accurate mapping of RNA sequences on servers equipped with multicore processors. Our software, HPG Aligner SA1, exploits a suffix array to rapidly map a large fraction of the RNA fragments (reads), as well as leverages the accuracy of the Smith-Waterman algorithm to deal with conflictive reads. The aligner is enhanced with a careful strategy to detect splice junctions based on an adaptive division of RNA reads into small segments (or seeds), which are then mapped onto a number of candidate alignment locations, providing crucial information for the successful alignment of the complete reads. The experimental results on a platform with Intel multicore technology report the parallel performance of HPG Aligner SA, on RNA reads of 100–400 nucleotides, which excels in execution time/sensitivity to state-of-the-art aligners such as TopHat 2+Bowtie 2, MapSplice, and STAR.This work has been supported by the Bull-CIPF Chair for Computational Genomics. The researchers from the Jaume I University were supported by project TIN2011-23283 and FEDER

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Repositori Institucional de la Universitat Jaume I

A framework for genomic sequencing on clusters of multicore and manycore processors

Author: Barrachina Sergio
Castillo Maribel
Dopazo Joaquín
Martínez Héctor
Medina Ignacio
Quintana Ortí Enrique Salvador
Tárraga Joaquín
Publication venue: 'SAGE Publications'
Publication date: 01/01/2016
Field of study

[EN] The advances in genomic sequencing during the past few years have motivated the development of fast and reliable software for DNA/RNA sequencing on current high performance architectures. Most of these efforts target multicore processors, only a few can also exploit graphics processing units, and a much smaller set will run in clusters equipped with any of these multi-threaded architecture technologies. Furthermore, the examples that can be used on clusters today are all strongly coupled with a particular aligner. In this paper we introduce an alignment framework that can be leveraged to coordinately run any single-node aligner, taking advantage of the resources of a cluster without having to modify any portion of the original software. The key to our transparent migration lies in hiding the complexity associated with the multi-node execution (such as coordinating the processes running in the cluster nodes) inside the generic-aligner framework. Moreover, following the design and operation in our Message Passing Interface (MPI) version of HPG Aligner RNA BWT, we organize the framework into two stages in order to be able to execute different aligners in each one of them. With this configuration, for example, the first stage can ideally apply a fast aligner to accelerate the process, while the second one can be tuned to act as a refinement stage that further improves the global alignment process with little cost.The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: The researchers from the University Jaume I were supported by the MINECO/CICYT (grant numbers TIN2011-23283 and TIN2014-53495-R) and FEDER.Martínez, H.; Barrachina, S.; Castillo, M.; Tárraga, J.; Medina, I.; Dopazo, J.; Quintana Ortí, ES. (2018). A framework for genomic sequencing on clusters of multicore and manycore processors. International Journal of High Performance Computing Applications. 32(3):393-406. https://doi.org/10.1177/1094342016653243S393406323Biesecker, L. G. (2010). Exome sequencing makes medical genomics a reality. Nature Genetics, 42(1), 13-14. doi:10.1038/ng0110-13Burrows M, Wheeler D (1994) A block sorting lossless data compression algorithm. Technical report 124, Palo Alto: Digital Equipment Corporation.Cock, P. J. A., Fields, C. J., Goto, N., Heuer, M. L., & Rice, P. M. (2009). The Sanger FASTQ file format for sequences with quality scores, and the Solexa/Illumina FASTQ variants. Nucleic Acids Research, 38(6), 1767-1771. doi:10.1093/nar/gkp1137Dobin, A., Davis, C. A., Schlesinger, F., Drenkow, J., Zaleski, C., Jha, S., … Gingeras, T. R. (2012). STAR: ultrafast universal RNA-seq aligner. Bioinformatics, 29(1), 15-21. doi:10.1093/bioinformatics/bts635Ferragina, P., & Manzini, G. (s. f.). Opportunistic data structures with applications. Proceedings 41st Annual Symposium on Foundations of Computer Science. doi:10.1109/sfcs.2000.892127Garber, M., Grabherr, M. G., Guttman, M., & Trapnell, C. (2011). Computational methods for transcriptome annotation and quantification using RNA-seq. Nature Methods, 8(6), 469-477. doi:10.1038/nmeth.1613Grant, G. R., Farkas, M. H., Pizarro, A. D., Lahens, N. F., Schug, J., Brunk, B. P., … Pierce, E. A. (2011). Comparative analysis of RNA-Seq alignment algorithms and the RNA-Seq unified mapper (RUM). Bioinformatics, 27(18), 2518-2528. doi:10.1093/bioinformatics/btr427Kim, D., Pertea, G., Trapnell, C., Pimentel, H., Kelley, R., & Salzberg, S. L. (2013). TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biology, 14(4), R36. doi:10.1186/gb-2013-14-4-r36Langmead, B., & Salzberg, S. L. (2012). Fast gapped-read alignment with Bowtie 2. Nature Methods, 9(4), 357-359. doi:10.1038/nmeth.1923Langmead, B., Trapnell, C., Pop, M., & Salzberg, S. L. (2009). Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biology, 10(3), R25. doi:10.1186/gb-2009-10-3-r25Li, H., Handsaker, B., Wysoker, A., Fennell, T., Ruan, J., … Homer, N. (2009). The Sequence Alignment/Map format and SAMtools. Bioinformatics, 25(16), 2078-2079. doi:10.1093/bioinformatics/btp352Li, H., & Homer, N. (2010). A survey of sequence alignment algorithms for next-generation sequencing. Briefings in Bioinformatics, 11(5), 473-483. doi:10.1093/bib/bbq015Yongchao Liu, & Schmidt, B. (2014). CUSHAW2-GPU: Empowering Faster Gapped Short-Read Alignment Using GPU Computing. IEEE Design & Test, 31(1), 31-39. doi:10.1109/mdat.2013.2284198Liu, Y., Popp, B., & Schmidt, B. (2014). CUSHAW3: Sensitive and Accurate Base-Space and Color-Space Short-Read Alignment with Hybrid Seeding. PLoS ONE, 9(1), e86869. doi:10.1371/journal.pone.0086869Manber, U., & Myers, G. (1993). Suffix Arrays: A New Method for On-Line String Searches. SIAM Journal on Computing, 22(5), 935-948. doi:10.1137/0222058Martinez, H., Barrachina, S., Castillo, M., Tarraga, J., Medina, I., Dopazo, J., & Quintana-Orti, E. S. (2015). Scalable RNA Sequencing on Clusters of Multicore Processors. 2015 IEEE Trustcom/BigDataSE/ISPA. doi:10.1109/trustcom.2015.631Martínez, H., Tárraga, J., Medina, I., Barrachina, S., Castillo, M., Dopazo, J., & Quintana-Ortí, E. S. (2013). A dynamic pipeline for RNA sequencing on multicore processors. Proceedings of the 20th European MPI Users’ Group Meeting on - EuroMPI ’13. doi:10.1145/2488551.2488581Martinez, H., Tarraga, J., Medina, I., Barrachina, S., Castillo, M., Dopazo, J., & Quintana-Orti, E. S. (2015). Concurrent and Accurate Short Read Mapping on Multicore Processors. IEEE/ACM Transactions on Computational Biology and Bioinformatics, 12(5), 995-1007. doi:10.1109/tcbb.2015.2392077Smith, T. F., & Waterman, M. S. (1981). Identification of common molecular subsequences. Journal of Molecular Biology, 147(1), 195-197. doi:10.1016/0022-2836(81)90087-5Tárraga, J., Arnau, V., Martínez, H., Moreno, R., Cazorla, D., Salavert-Torres, J., … Medina, I. (2014). Acceleration of short and long DNA read mapping without loss of accuracy using suffix array. Bioinformatics, 30(23), 3396-3398. doi:10.1093/bioinformatics/btu553Wang, K., Singh, D., Zeng, Z., Coleman, S. J., Huang, Y., Savich, G. L., … Liu, J. (2010). MapSplice: Accurate mapping of RNA-seq reads for splice junction discovery. Nucleic Acids Research, 38(18), e178-e178. doi:10.1093/nar/gkq62

Crossref

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Repositori Institucional de la Universitat Jaume I

RiuNet

Performance and programmability comparison of the thick control flow architecture and current multicore processors

Author: Forsell Martti
Leppänen Ville
Nikula Sara
Roivainen Jussi
Träff Jesper Larsson
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 20/07/2021
Field of study

VTT Research System

Concurrent and Accurate RNA Sequencing on Multicore Platforms

Author: Barrachina Sergio
Castillo Maribel
Dopazo Joaquín
Martínez Héctor
Medina Ignacio
Quintana-Ortí Enrique S.
Tárraga Joaquín
Publication venue
Publication date: 01/01/2013
Field of study

In this paper we introduce a novel parallel pipeline for fast and accurate mapping of RNA sequences on servers equipped with multicore processors. Our software, named HPG-Aligner, leverages the speed of the Burrows-Wheeler Transform to map a large number of RNA fragments (reads) rapidly, as well as the accuracy of the Smith-Waterman algorithm, that is employed to deal with conflictive reads. The aligner is complemented with a careful strategy to detect splice junctions based on the division of RNA reads into short segments (or seeds), which are then mapped onto a number of candidate alignment locations, providing useful information for the successful alignment of the complete reads. Experimental results on platforms with AMD and Intel multicore processors report the remarkable parallel performance of HPG-Aligner, on short and long RNA reads, which excels in both execution time and sensitivity to an state-of-the-art aligner such as TopHat 2 built on top of Bowtie and Bowtie 2

arXiv.org e-Print Archive

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Repositori Institucional de la Universitat Jaume I

Efficiently mapping high-performance early vision algorithms onto multicore embedded platforms

Author: Apewokin Senyo
Publication venue: Georgia Institute of Technology
Publication date: 09/01/2009
Field of study

The combination of low-cost imaging chips and high-performance, multicore, embedded processors heralds a new era in portable vision systems. Early vision algorithms have the potential for highly data-parallel, integer execution. However, an implementation must operate within the constraints of embedded systems including low clock rate, low-power operation and with limited memory. This dissertation explores new approaches to adapt novel pixel-based vision algorithms for tomorrow's multicore embedded processors. It presents : - An adaptive, multimodal background modeling technique called Multimodal Mean that achieves high accuracy and frame rate performance with limited memory and a slow-clock, energy-efficient, integer processing core. - A new workload partitioning technique to optimize the execution of early vision algorithms on multi-core systems. - A novel data transfer technique called cat-tail dma that provides globally-ordered, non-blocking data transfers on a multicore system. By using efficient data representations, Multimodal Mean provides comparable accuracy to the widely used Mixture of Gaussians (MoG) multimodal method. However, it achieves a 6.2x improvement in performance while using 18% less storage than MoG while executing on a representative embedded platform. When this algorithm is adapted to a multicore execution environment, the new workload partitioning technique demonstrates an improvement in execution times of 25% with only a 125 ms system reaction time. It also reduced the overall number of data transfers by 50%. Finally, the cat-tail buffering technique reduces the data-transfer latency between execution cores and main memory by 32.8% over the baseline technique when executing Multimodal Mean. This technique concurrently performs data transfers with code execution on individual cores, while maintaining global ordering through low-overhead scheduling to prevent collisions.Ph.D.Committee Chair: Wills, Scott; Committee Co-Chair: Wills, Linda; Committee Member: Bader, David; Committee Member: Davis, Jeff; Committee Member: Hamblen, James; Committee Member: Lanterman, Aaro

Scholarly Materials And Research @ Georgia Tech

DReAM: An approach to estimate per-Task DRAM energy in multicore systems

Author: Abella Ferrer Jaume
Cazorla Almeida Francisco Javier
Liu Qixiao
Moreto Planas Miquel
Valero Cortés Mateo
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2016
Field of study

Accurate per-task energy estimation in multicore systems would allow performing per-task energy-aware task scheduling and energy-aware billing in data centers, among other applications. Per-task energy estimation is challenged by the interaction between tasks in shared resources, which impacts tasks’ energy consumption in uncontrolled ways. Some accurate mechanisms have been devised recently to estimate per-task energy consumed on-chip in multicores, but there is a lack of such mechanisms for DRAM memories. This article makes the case for accurate per-task DRAM energy metering in multicores, which opens new paths to energy/performance optimizations. In particular, the contributions of this article are (i) an ideal per-task energy metering model for DRAM memories; (ii) DReAM, an accurate yet low cost implementation of the ideal model (less than 5% accuracy error when 16 tasks share memory); and (iii) a comparison with standard methods (even distribution and access-count based) proving that DReAM is much more accurate than these other methods.Peer ReviewedPostprint (author's final draft

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

UPCommons. Portal del coneixement obert de la UPC

DReAM: Per-task DRAM energy metering in multicore systems

Author: Abella Ferrer Jaume
Cazorla Almeida Francisco Javier
Liu Qixiao
Moreto Planas Miquel
Valero Cortés Mateo
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014
Field of study

Interaction across applications in DRAM memory impacts its energy consumption. This paper makes the case for accurate per-task DRAM energy metering in multicores, which opens new paths to energy/performance optimizations, such as per-task energy-aware task scheduling and energy-aware billing in datacenters. In particular, the contributions of this paper are (i) an ideal per-task energy metering model for DRAM memories; (ii) DReAM, an accurate, yet low cost, implementation of the ideal model (less than 5% accuracy error when 16 tasks share memory); and (iii) a comparison with standard methods (even distribution and access-count based) proving that DReAM is more accurate than these other methods.This work has been partially supported by the Spanish Ministry of Science and Innovation under grant TIN2012-34557, the HiPEAC Network of Excellence, by the European Research Council under the European Union’s 7th FP, ERC Grant Agreement n. 321253, and by a joint study agreement between IBM and BSC (number W1361154). Qixiao Liu has also been funded by the Chinese Scholarship Council under grant 2010608015.Postprint (published version

UPCommons. Portal del coneixement obert de la UPC

Exploiting managed language semantics to optimize for hardware heterogeneity

Author: Akram Shoaib
Publication venue
Publication date: 01/01/2019
Field of study

Ghent University Academic Bibliography

Programming MPSoC platforms: Road works ahead

Author: Bekooij Marco
Domer Rainer
Leupers Rainer
Nohl Achim
Soonhoi Ha
Vajda Andras
Publication venue: IEEE Computer Society Press
Publication date: 01/01/2009
Field of study

This paper summarizes a special session on multicore/multi-processor system-on-chip (MPSoC) programming challenges. The current trend towards MPSoC platforms in most computing domains does not only mean a radical change in computer architecture. Even more important from a SW developer´s viewpoint, at the same time the classical sequential von Neumann programming model needs to be overcome. Efficient utilization of the MPSoC HW resources demands for radically new models and corresponding SW development tools, capable of exploiting the available parallelism and guaranteeing bug-free parallel SW. While several standards are established in the high-performance computing domain (e.g. OpenMP), it is clear that more innovations are required for successful\ud deployment of heterogeneous embedded MPSoC. On the other hand, at least for coming years, the freedom for disruptive programming technologies is limited by the huge amount of certified sequential code that demands for a more pragmatic, gradual tool and code replacement strategy

Publikationsserver der RWTH Aachen University

University of Twente Research Information

Parallelism-Aware Memory Interference Delay Analysis for COTS Multicore Systems

Author: Yun Heechul
Publication venue
Publication date: 25/07/2014
Field of study

In modern Commercial Off-The-Shelf (COTS) multicore systems, each core can generate many parallel memory requests at a time. The processing of these parallel requests in the DRAM controller greatly affects the memory interference delay experienced by running tasks on the platform. In this paper, we model a modern COTS multicore system which has a nonblocking last-level cache (LLC) and a DRAM controller that prioritizes reads over writes. To minimize interference, we focus on LLC and DRAM bank partitioned systems. Based on the model, we propose an analysis that computes a safe upper bound for the worst-case memory interference delay. We validated our analysis on a real COTS multicore platform with a set of carefully designed synthetic benchmarks as well as SPEC2006 benchmarks. Evaluation results show that our analysis is more accurately capture the worst-case memory interference delay and provides safer upper bounds compared to a recently proposed analysis which significantly under-estimate the delay.Comment: Technical Repor

arXiv.org e-Print Archive

CiteSeerX