92 research outputs found

    Accelerating binary biclustering on platforms with CUDA-enabled GPUs

    Get PDF
    © 2018 Elsevier B.V. All rights reserved. This manuscript version is made available under the CC-BY-NC-ND 4.0 license https://creativecommons.org/licenses/bync-nd/4.0/. This version of the article has been accepted for publication in Information Sciences. The Version of Record is available online at https://doi.org/10.1016/j.ins.2018.05.025This is a version of: J. González-Domínguez and R. R. Expósito, "Accelerating binary biclustering on platforms with CUDA-enabled GPUs", Information Sciences, Vol. 496, Sept. 2019, pp. 317-325, https://doi.org/10.1016/j.ins.2018.05.025[Abstract]: Data mining is nowadays essential in many scientific fields to extract valuable information from large input datasets and transform it into an understandable structure. For instance, biclustering techniques are very useful in identifying subsets of two-dimensional data where both rows and columns are correlated. However, some biclustering techniques have become extremely time-consuming when processing very large datasets, which nowadays prevents their use in many areas of research and industry (such as bioinformatics) that have experienced an explosive growth on the amount of available data. In this work we present CUBiBit, a tool that accelerates the search for relevant biclusters on binary data by exploiting the computational capabilities of CUDA-enabled GPUs as well as the several CPU cores available in most current systems. The experimental evaluation has shown that CUBiBit is up to 116 times faster than the fastest state-of-the-art tool, BiBit, in a system with two Intel Sandy Bridge processors (16 CPU cores) and three NVIDIA K20 GPUs. CUBiBit is publicly available to download from https://sourceforge.net/projects/cubibitThis work was supported by the Ministry of Economy, Industry and Competitiveness of Spain and FEDER funds of the European Union [grant TIN2016-75845-P (AEI/FEDER/UE)], as well as by Xunta de Galicia (Centro Singular de Investigacion de Galicia accreditation 2016-2019, ref. EDG431G/01).Xunta de Galicia; EDG431G/0

    ParBiBit: Parallel tool for binary biclustering on modern distributed-memory systems

    Get PDF
    [Abstract]: Biclustering techniques are gaining attention in the analysis of large-scale datasets as they identify two-dimensional submatrices where both rows and columns are correlated. In this work we present ParBiBit, a parallel tool to accelerate the search of interesting biclusters on binary datasets, which are very popular on different fields such as genetics, marketing or text mining. It is based on the state-of-the-art sequential Java tool BiBit, which has been proved accurate by several studies, especially on scenarios that result on many large biclusters. ParBiBit uses the same methodology as BiBit (grouping the binary information into patterns) and provides the same results. Nevertheless, our tool significantly improves performance thanks to an efficient implementation based on C++11 that includes support for threads and MPI processes in order to exploit the compute capabilities of modern distributed-memory systems, which provide several multicore CPU nodes interconnected through a network. Our performance evaluation with 18 representative input datasets on two different eight-node systems shows that our tool is significantly faster than the original BiBit. Source code in C++ and MPI running on Linux systems as well as a reference manual are available at https://sourceforge.net/projects/parbibit/.This work was supported by the Ministry of Economy, Industry and Competitiveness of Spain and FEDER funds of the European Union [grant TIN2016-75845-P (AEI/FEDER/UE)], as well as by Xunta de Galicia (Centro Singular de Investigacion de Galicia accreditation 2016-2019, ref. EDG431G/01).Xunta de Galicia; EDG431G/0

    SMusket: Spark-based DNA error correction on distributed-memory systems

    Get PDF
    ©2020 Elsevier B.V. All rights reserved. This manuscript version is made available under the CC-BY-NC-ND 4.0 license https://creativecommons.org/licenses/bync-nd/4.0/. This version of the article has been accepted for publication in Future Generation Computer Systems. The Version of Record is available online at https://doi.org/10.1016/j.future.2019.10.038This is the accepted version of: R. R. Expósito, J. González-Domínguez, and J. Touriño, "SMusket: Sparkbased DNA error correction on distributed-memory systems", Future Generation Computer Systems, vol. 111, pp. 698-713, 2020, https://doi.org/10.1016/j.future.2019.10.038[Abstract]: Next-Generation Sequencing (NGS) technologies have revolutionized genomics research over the last decade, bringing new opportunities for scientists to perform groundbreaking biological studies. Error correction in NGS datasets is considered an important preprocessing step in many workflows as sequencing errors can severely affect the quality of downstream analysis. Although current error correction approaches provide reasonably high accuracies, their computational cost can be still unacceptable when processing large datasets. In this paper we propose SparkMusket (SMusket), a Big Data tool built upon the open-source Apache Spark cluster computing framework to boost the performance of Musket, one of the most widely adopted and top-performing multithreaded correctors. Our tool efficiently exploits Spark features to implement a scalable error correction algorithm intended for distributed-memory systems built using commodity hardware. The experimental evaluation on a 16-node cluster using four publicly available datasets has shown that SMusket is up to 15.3 times faster than previous state-of-the-art MPI-based tools, also providing a maximum speedup of 29.8 over its multithreaded counterpart. SMusket is publicly available under an open-source license at https://github.com/rreye/smusketThis work was supported by the Ministry of Economy, Industry and Competitiveness of Spain and FEDER, Spain funds of the European Union (project TIN2016-75845-P, AEI/FEDER/EU); and by Xunta de Galicia, Spain (projects ED431G/01 and ED431C 2017/04).Xunta de galicia; ED431G/01Xunta de Galicia; ED431C 2017/0

    SeQual: Big Data Tool to Perform Quality Control and Data Preprocessing of Large NGS Datasets

    Get PDF
    [Abstract] This paper presents SeQual, a scalable tool to efficiently perform quality control of large genomic datasets. Our tool currently supports more than 30 different operations (e.g., filtering, trimming, formatting) that can be applied to DNA/RNA reads in FASTQ/FASTA formats to improve subsequent downstream analyses, while providing a simple and user-friendly graphical interface for non-expert users. Furthermore, SeQual takes full advantage of Big Data technologies to process massive datasets on distributed-memory systems such as clusters by relying on the open-source Apache Spark cluster computing framework. Our scalable Spark-based implementation allows to reduce the runtime from more than three hours to less than 20 minutes when processing a paired-end dataset with 251 million reads per input file on an 8-node multi-core cluster.10.13039/501100004837-Ministry of Science and Innovation of Spain (Grant Number: TIN2016-75845-P and PID2019-104184RB-I00) 10.13039/501100004837-AEI/FEDER/EU (Grant Number: 10.13039/501100011033) 10.13039/501100010801-Xunta de Galicia and FEDER funds (Centro de Investigación de Galicia accreditation 2019–2022 and the Consolidation Program of Competitive Reference Groups) (Grant Number: ED431G 2019/01 and ED431C 2017/04)Xunta de Galicia; ED431G 2019/01Xunta de Galicia; ED431C 2017/0

    CUDA-JMI: Acceleration of feature selection on heterogeneous systems

    Get PDF
    ©2019 Elsevier B.V. All rights reserved. This manuscript version is made available under the CC-BY-NC-ND 4.0 license https://creativecommons.org/licenses/bync-nd/4.0/. This version of the article has been accepted for publication in Future Generation Computer Systems. The Version of Record is available online at https://doi.org/10.1016/j.future.2019.08.031Versión final aceptada de: J. González-Domínguez, R. R. Expósito, and V. Bolón-Canedo, "CUDA-JMI: Acceleration of feature selection on heterogeneous systemss", Future Generation Computer Systems, Vol. 102, pp. 426-436, Jan. 2020, https://doi.org/10.1016/j.future.2019.08.031[Abstract]: Feature selection is a crucial step nowadays in machine learning and data analytics to remove irrelevant and redundant characteristics and thus to provide fast and reliable analyses. Many research works have focused on developing new methods that increase the global relevance of the subset of selected features while reducing the redundancy of information. However, those methods that select features with high relevance and low redundancy are extremely time-consuming when processing large datasets. In this work we present CUDA-JMI, a tool based on Joint Mutual Information that accelerates feature selection by exploiting the computational capabilities of modern heterogeneous systems that contain several CPU cores and GPU devices. The experimental evaluation has been carried out in three systems with different type and amount of CPUs and GPUs using five publicly available datasets from different fields. These results show that CUDA-JMI is significantly faster than its original sequential counterpart for all systems and input datasets. For instance, the runtime of CUDA-JMI is up to 52 times faster than an existing sequential JMI-based implementation in a machine with 24 CPU cores and two NVIDIA M60 boards (four GPUs). CUDA-JMI is publicly available to download from https://sourceforge.net/projects/cuda-jmiThis research has been partially funded by projects TIN2016-75845-P and TIN-2015-65069-C2-1-R of the Ministry of Economy, Industry and Competitiveness of Spain, as well as by Xunta de Galicia, Spain projects ED431D R2016/045, ED431G/01 and GRC2014/035, all of them partially funded by FEDER, Spain funds of the European Union.Xunta de Galicia; ED431D R2016/045Xunta de Galicia; ED431G/01Xunta de Galicia; GRC2014/03

    MarDRe: efficient MapReduce-based removal of duplicate DNA reads in the cloud

    Get PDF
    This is a pre-copyedited, author-produced version of an article accepted for publication in Bioinformatics following peer review. The version of record Roberto R. Expósito, Jorge Veiga, Jorge González-Domínguez, Juan Touriño; MarDRe: efficient MapReduce-based removal of duplicate DNA reads in the cloud, Bioinformatics, Volume 33, Issue 17, 1 September 2017, Pages 2762–2764 is available online at: https://doi.org/10.1093/bioinformatics/btx307[Abstract] This article presents MarDRe, a de novo cloud-ready duplicate and near-duplicate removal tool that can process single- and paired-end reads from FASTQ/FASTA datasets. MarDRe takes advantage of the widely adopted MapReduce programming model to fully exploit Big Data technologies on cloud-based infrastructures. Written in Java to maximize cross-platform compatibility, MarDRe is built upon the open-source Apache Hadoop project, the most popular distributed computing framework for scalable Big Data processing. On a 16-node cluster deployed on the Amazon EC2 cloud platform, MarDRe is up to 8.52 times faster than a representative state-of-the-art tool.Ministerio de Economia y Competitividad; TIN2016-75845-PMinisterio de Educación; FPU014/0280

    MPI-dot2dot: A Parallel Tool to Find DNA Tandem Repeats on Multicore Clusters

    Get PDF
    Financiado para publicación en acceso aberto: Universidade da Coruña/CISUG[Abstract] Tandem Repeats (TRs) are segments that occur several times in a DNA sequence, and each copy is adjacent to other. In the last few years, TRs have gained significant attention as they are thought to be related with certain human diseases. Therefore, identifying and classifying TRs have become a highly important task in bioinformatics in order to analyze their disorders and relationships with illnesses. Dot2dot, a tool recently developed to find TRs, provides more accurate results than the previous state-of-the-art, but it requires a long execution time even when using multiple threads. This work presents MPI-dot2dot, a novel version of this tool that combines MPI and OpenMP so that it can be executed in a cluster of multicore nodes and thus reduces its execution time. The performance of this new parallel implementation has been tested using different real datasets. Depending on the characteristics of the input genomes, it is able to obtain the same biological results as Dot2dot but more than 100 times faster on a 16-node multicore cluster (384 cores). MPI-dot2dot is publicly available to download from https://sourceforge.net/projects/mpi-dot2dot.This work was supported by the Ministry of Science and Innovation of Spain (PID2019-104184RB-I00 / AEI / 10.13039/501100011033), and by Xunta de Galicia and FEDER funds (Centro de Investigación de Galicia accreditation 2019-2022 and Consolidation Program of Competitive Reference Groups, under Grants ED431G 2019/01 and ED431C 2021/30, respectively). The authors would like to thank the Galician Supercomputing Center (CESGA) for providing access to the Finis Terrae II supercomputer. Open Access funding provided thanks to the CRUE-CSIC agreement with Springer NatureXunta de Galicia; ED431G 2019/01Xunta de Galicia; ED431C 2021/3

    The Servet 3.0 benchmark suite: characterization of network performance degradation

    Get PDF
    This is a post-peer-review, pre-copyedit version of an article published in Computers & Electrical Engineering. The final authenticated version is available online at: https://doi.org/10.1016/j.compeleceng.2013.08.012.[Abstract] Servet is a suite of benchmarks focused on extracting a set of parameters with high influence on the overall performance of multicore clusters. These parameters can be used to optimize the performance of parallel applications by adapting part of their behavior to the characteristics of the machine. Up to now the tool considered network bandwidth as constant and independent of the communication pattern. Nevertheless, the inter-node communication bandwidth decreases on modern large supercomputers depending on the number of cores per node that simultaneously access the network and on the distance between the communicating nodes. This paper describes two new benchmarks that improve Servet by characterizing the network performance degradation depending on these factors. This work also shows the experimental results of these benchmarks on a Cray XE6 supercomputer and some examples of how real parallel codes can be optimized by using the information about network degradation.Ministerio de Ciencia e Innovación; TIN2010-16735Ministerio de Educación; AP2008-01578Ministerio de Educación; AP2010-4348European Commision; HPC-Europa2 Programme; 22839

    Analysis of I/O Performance on an Amazon EC2 Cluster Compute and High I/O Platform

    Get PDF
    “This is a post-peer-review, pre-copyedit version of an article published in Journal of Grid Computing. The final authenticated version is available online at: https://doi.org/10.1007/s10723-013-9250-y[Abstract] Cloud computing is currently being explored by the scientific community to assess its suitability for High Performance Computing (HPC) environments. In this novel paradigm, compute and storage resources, as well as applications, can be dynamically provisioned on a pay-per-use basis. This paper presents a thorough evaluation of the I/O storage subsystem using the Amazon EC2 Cluster Compute platform and the recent High I/O instance type, to determine its suitability for I/O-intensive applications. The evaluation has been carried out at different layers using representative benchmarks in order to evaluate the low-level cloud storage devices available in Amazon EC2, ephemeral disks and Elastic Block Store (EBS) volumes, both on local and distributed file systems. In addition, several I/O interfaces (POSIX, MPI-IO and HDF5) commonly used by scientific workloads have also been assessed. Furthermore, the scalability of a representative parallel I/O code has also been analyzed at the application level, taking into account both performance and cost metrics. The analysis of the experimental results has shown that available cloud storage devices can have different performance characteristics and usage constraints. Our comprehensive evaluation can help scientists to increase significantly (up to several times) the performance of I/O-intensive applications in Amazon EC2 cloud. An example of optimal configuration that can maximize I/O performance in this cloud is the use of a RAID 0 of 2 ephemeral disks, TCP with 9,000 bytes MTU, NFS async and MPI-IO on the High I/O instance type, which provides ephemeral disks backed by Solid State Drive (SSD) technology.Ministerio de Ciencia e Innovación; TIN2010-16735Ministerio de Educación; AP2010-4348Galicia. Consellería de Cultura, Educación e Ordenación Universitaria; ref. 2010/
    corecore