601 research outputs found

    SMusket: Spark-based DNA error correction on distributed-memory systems

    Get PDF
    ©2020 Elsevier B.V. All rights reserved. This manuscript version is made available under the CC-BY-NC-ND 4.0 license https://creativecommons.org/licenses/bync-nd/4.0/. This version of the article has been accepted for publication in Future Generation Computer Systems. The Version of Record is available online at https://doi.org/10.1016/j.future.2019.10.038This is the accepted version of: R. R. Expósito, J. González-Domínguez, and J. Touriño, "SMusket: Sparkbased DNA error correction on distributed-memory systems", Future Generation Computer Systems, vol. 111, pp. 698-713, 2020, https://doi.org/10.1016/j.future.2019.10.038[Abstract]: Next-Generation Sequencing (NGS) technologies have revolutionized genomics research over the last decade, bringing new opportunities for scientists to perform groundbreaking biological studies. Error correction in NGS datasets is considered an important preprocessing step in many workflows as sequencing errors can severely affect the quality of downstream analysis. Although current error correction approaches provide reasonably high accuracies, their computational cost can be still unacceptable when processing large datasets. In this paper we propose SparkMusket (SMusket), a Big Data tool built upon the open-source Apache Spark cluster computing framework to boost the performance of Musket, one of the most widely adopted and top-performing multithreaded correctors. Our tool efficiently exploits Spark features to implement a scalable error correction algorithm intended for distributed-memory systems built using commodity hardware. The experimental evaluation on a 16-node cluster using four publicly available datasets has shown that SMusket is up to 15.3 times faster than previous state-of-the-art MPI-based tools, also providing a maximum speedup of 29.8 over its multithreaded counterpart. SMusket is publicly available under an open-source license at https://github.com/rreye/smusketThis work was supported by the Ministry of Economy, Industry and Competitiveness of Spain and FEDER, Spain funds of the European Union (project TIN2016-75845-P, AEI/FEDER/EU); and by Xunta de Galicia, Spain (projects ED431G/01 and ED431C 2017/04).Xunta de galicia; ED431G/01Xunta de Galicia; ED431C 2017/0

    Hybrid PolyLingual Object Model: An Efficient and Seamless Integration of Java and Native Components on the Dalvik Virtual Machine

    Get PDF
    Copyright © 2014 Yukun Huang et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. JNI in the Android platform is often observed with low efficiency and high coding complexity. Although many researchers have investigated the JNI mechanism, few of them solve the efficiency and the complexity problems of JNI in the Android platform simultaneously. In this paper, a hybrid polylingual object (HPO) model is proposed to allow a CAR object being accessed as a Java object and as vice in the Dalvik virtual machine. It is an acceptable substitute for JNI to reuse the CAR-compliant components in Android applications in a seamless and efficient way. The metadata injection mechanism is designed to support the automatic mapping and reflection between CAR objects and Java objects. A prototype virtual machine, called HPO-Dalvik, is implemented by extending the Dalvik virtual machine to support the HPO model. Lifespan management, garbage collection, and data type transformation of HPO objects are also handled in the HPO-Dalvik virtual machine automatically. The experimental result shows that the HPO model outweighs the standard JNI in lower overhead on native side, better executing performance with no JNI bridging code being demanded. 1

    Enhancing in-memory Efficiency for MapReduce-based Data Processing

    Get PDF
    This is a post-peer-review, pre-copyedit version of an article published in Journal of Parallel and Distributed Computing. The final authenticated version is available online at: https://doi.org/10.1016/j.jpdc.2018.04.001[Abstract] As the memory capacity of computational systems increases, the in-memory data management of Big Data processing frameworks becomes more crucial for performance. This paper analyzes and improves the memory efficiency of Flame-MR, a framework that accelerates Hadoop applications, providing valuable insight into the impact of memory management on performance. By optimizing memory allocation, the garbage collection overheads and execution times have been reduced by up to 85% and 44%, respectively, on a multi-core cluster. Moreover, different data buffer implementations are evaluated, showing that off-heap buffers achieve better results overall. Memory resources are also leveraged by caching intermediate results, improving iterative applications by up to 26%. The memory-enhanced version of Flame-MR has been compared with Hadoop and Spark on the Amazon EC2 cloud platform. The experimental results have shown significant performance benefits reducing Hadoop execution times by up to 65%, while providing very competitive results compared to Spark.Ministerio de Economía, industria y Competitividad; TIN2016-75845-P, AEI/FEDER/EUMinisterio de Educación; FPU14/0280

    Tachyon: Reliable, Memory Speed Storage for Cluster Computing Frameworks

    Get PDF
    Tachyon is a distributed file system enabling reliable data sharing at memory speed across cluster computing frameworks. While caching today improves read workloads, writes are either network or disk bound, as replication is used for fault-tolerance. Tachyon eliminates this bottleneck by pushing lineage, a well-known technique, into the storage layer. The key challenge in making a long-running lineage-based storage system is timely data recovery in case of failures. Tachyon addresses this issue by introducing a checkpointing algorithm that guarantees bounded recovery cost and resource allocation strategies for recomputation under commonly used resource schedulers. Our evaluation shows that Tachyon outperforms in-memory HDFS by 110x for writes. It also improves the end-to-end latency of a realistic workflow by 4x. Tachyon is open source and is deployed at multiple companies.National Science Foundation (U.S.) (CISE Expeditions Award CCF-1139158)Lawrence Berkeley National Laboratory (Award 7076018)United States. Defense Advanced Research Projects Agency (XData Award FA8750-12-2-0331

    Intel Concurrent Collections for Haskell

    Get PDF
    Intel Concurrent Collections (CnC) is a parallel programming model in which a network of steps (functions) communicate through message-passing as well as a limited form of shared memory. This paper describes a new implementation of CnC for Haskell. Compared to existing parallel programming models for Haskell, CnC occupies a useful point in the design space: pure and deterministic like Evaluation Strategies, but more explicit about granularity and the structure of the parallel computation, which affords the programmer greater control over parallel performance. We present results on 4, 8, and 32-core machines demonstrating parallel speedups over 20x on non-trivial benchmarks

    GeantV: Results from the prototype of concurrent vector particle transport simulation in HEP

    Full text link
    Full detector simulation was among the largest CPU consumer in all CERN experiment software stacks for the first two runs of the Large Hadron Collider (LHC). In the early 2010's, the projections were that simulation demands would scale linearly with luminosity increase, compensated only partially by an increase of computing resources. The extension of fast simulation approaches to more use cases, covering a larger fraction of the simulation budget, is only part of the solution due to intrinsic precision limitations. The remainder corresponds to speeding-up the simulation software by several factors, which is out of reach using simple optimizations on the current code base. In this context, the GeantV R&D project was launched, aiming to redesign the legacy particle transport codes in order to make them benefit from fine-grained parallelism features such as vectorization, but also from increased code and data locality. This paper presents extensively the results and achievements of this R&D, as well as the conclusions and lessons learnt from the beta prototype.Comment: 34 pages, 26 figures, 24 table
    corecore