Search CORE

505 research outputs found

Universal Indexes for Highly Repetitive Document Collections

Author: Claude Francisco
Fariña Antonio
Martínez-Prieto Miguel A.
Navarro Gonzalo
Publication venue: 'Elsevier BV'
Publication date: 01/01/2016
Field of study

Indexing highly repetitive collections has become a relevant problem with the emergence of large repositories of versioned documents, among other applications. These collections may reach huge sizes, but are formed mostly of documents that are near-copies of others. Traditional techniques for indexing these collections fail to properly exploit their regularities in order to reduce space. We introduce new techniques for compressing inverted indexes that exploit this near-copy regularity. They are based on run-length, Lempel-Ziv, or grammar compression of the differential inverted lists, instead of the usual practice of gap-encoding them. We show that, in this highly repetitive setting, our compression methods significantly reduce the space obtained with classical techniques, at the price of moderate slowdowns. Moreover, our best methods are universal, that is, they do not need to know the versioning structure of the collection, nor that a clear versioning structure even exists. We also introduce compressed self-indexes in the comparison. These are designed for general strings (not only natural language texts) and represent the text collection plus the index structure (not an inverted index) in integrated form. We show that these techniques can compress much further, using a small fraction of the space required by our new inverted indexes. Yet, they are orders of magnitude slower.Comment: This research has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Sk{\l}odowska-Curie Actions H2020-MSCA-RISE-2015 BIRDS GA No. 69094

arXiv.org e-Print Archive

Repositorio da Universidade da Coruña

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Repositorio Académico de la Universidad de Chile

Performance characterization and acceleration of genome-mapping tools on HPC environments

Author: Matzoros Christos Konstantinos
Publication venue: Universitat Politècnica de Catalunya
Publication date: 19/10/2022
Field of study

Nowadays, the efficient analysis and exploitation of genomic information is paramount to future advancements in the healthcare sector, such as better diagnosis techniques and the development of improved disease treatments. In the past decades, the exponential increase in the biological data production has fostered the development of more efficient genomic pipelines. For that, modern genome analysis requires better and more scalable algorithms, and improved high-performance implementations that can exploit current hardware accelerators. For most genome analysis pipelines, sequence mapping is one of the most computationally intensive and time-consuming processing stages. The ultimate goal of this work is to propose techniques to accelerate read mapping, leveraging novel algorithms and hardware vector extensions. In this thesis, we present a thorough performance characterization of the most widely-used genome-mapping tools and propose acceleration techniques that can effectively improve the performance of these tools. To that end, first, we identify the most time-consuming kernels, their performance bottlenecks, and the underlying causes of inefficiency. Afterwards, we design and implement an accelerated version of one of the most time-consuming steps: pairwise sequence alignment. For that, we propose to replace the classical dynamic-programming algorithm, used within these tools, with the recently proposed wavefront alignment algorithm (WFA). Moreover, we design and implement the first fully-vectorized version of the WFA, leveraging Intel's AVX2 and AVX-512 instructions, to further accelerate sequence-to-sequence alignment. As a result, we demonstrate that our vectorized WFA implementation outperforms the original scalar WFA implementation between 1.1x-2.4x. In turn, this renders speedups from 2.4x up to 826.7x compared to the most widely-used alignment algorithm, KSW2 (used within Minimap2 and Bwa-Mem2). We conclude that these tools can be significantly accelerated by selecting better algorithms (like the WFA) and leveraging fine-tuned implementations that can exploit hardware resources available in current high performance computing (HPC) processors

UPCommons. Portal del coneixement obert de la UPC

Parallel Construction of Wavelet Trees on Multicore Architectures

Author: Elejalde Erick
Ferres Leo
Fuentes-Sepúlveda José
Seco Diego
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2016
Field of study

The wavelet tree has become a very useful data structure to efficiently represent and query large volumes of data in many different domains, from bioinformatics to geographic information systems. One problem with wavelet trees is their construction time. In this paper, we introduce two algorithms that reduce the time complexity of a wavelet tree's construction by taking advantage of nowadays ubiquitous multicore machines. Our first algorithm constructs all the levels of the wavelet in parallel in

O(n)

time and

O(n\lg\sigma + \sigma\lg n)

bits of working space, where

n

is the size of the input sequence and

\sigma

is the size of the alphabet. Our second algorithm constructs the wavelet tree in a domain-decomposition fashion, using our first algorithm in each segment, reaching

O(\lg n)

time and

O(n\lg\sigma + p\sigma\lg n/\lg\sigma)

bits of extra space, where

p

is the number of available cores. Both algorithms are practical and report good speedup for large real datasets.Comment: This research has received funding from the European Union's Horizon 2020 research and innovation programme under the Marie Sk{\l}odowska-Curie Actions H2020-MSCA-RISE-2015 BIRDS GA No. 69094

arXiv.org e-Print Archive

Repositorio da Universidade da Coruña

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Efficient Storage of Genomic Sequences in High Performance Computing Systems

Author: Guerra Soler Aníbal José
Publication venue: Medellín, Colombia
Publication date: 01/01/2019
Field of study

ABSTRACT: In this dissertation, we address the challenges of genomic data storage in high performance computing systems. In particular, we focus on developing a referential compression approach for Next Generation Sequence data stored in FASTQ format files. The amount of genomic data available for researchers to process has increased exponentially, bringing enormous challenges for its efficient storage and transmission. General-purpose compressors can only offer limited performance for genomic data, thus the need for specialized compression solutions. Two trends have emerged as alternatives to harness the particular properties of genomic data: non-referential and referential compression. Non-referential compressors offer higher compression rations than general purpose compressors, but still below of what a referential compressor could theoretically achieve. However, the effectiveness of referential compression depends on selecting a good reference and on having enough computing resources available. This thesis presents one of the first referential compressors for FASTQ files. We first present a comprehensive analytical and experimental evaluation of the most relevant tools for genomic raw data compression, which led us to identify the main needs and opportunities in this field. As a consequence, we propose a novel compression workflow that aims at improving the usability of referential compressors. Subsequently, we discuss the implementation and performance evaluation for the core of the proposed workflow: a referential compressor for reads in FASTQ format that combines local read-to-reference alignments with a specialized binary-encoding strategy. The compression algorithm, named UdeACompress, achieved very competitive compression ratios when compared to the best compressors in the current state of the art, while showing reasonable execution times and memory use. In particular, UdeACompress outperformed all competitors when compressing long reads, typical of the newest sequencing technologies. Finally, we study the main aspects of the data-level parallelism in the Intel AVX-512 architecture, in order to develop a parallel version of the UdeACompress algorithms to reduce the runtime. Through the use of SIMD programming, we managed to significantly accelerate the main bottleneck found in UdeACompress, the Suffix Array Construction

Biblioteca Digital del Sistema de Bibliotecas de la Universidad de Antioquia

Accelerating edit-distance sequence alignment on GPU using the wavefront algorithm

Author: Aguado Puig Quim
Castells Rufas David
Espinosa Morales Antonio
Marco-Sola Santiago
Moreto Planas Miquel
Moure López Juan Carlos
Álvarez Martí Lluc
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 10/06/2022
Field of study

Sequence alignment remains a fundamental problem with practical applications ranging from pattern recognition to computational biology. Traditional algorithms based on dynamic programming are hard to parallelize, require significant amounts of memory, and fail to scale for large inputs. This work presents eWFA-GPU, a GPU (graphics processing unit)-accelerated tool to compute the exact edit-distance sequence alignment based on the wavefront alignment algorithm (WFA). This approach exploits the similarities between the input sequences to accelerate the alignment process while requiring less memory than other algorithms. Our implementation takes full advantage of the massive parallel capabilities of modern GPUs to accelerate the alignment process. In addition, we propose a succinct representation of the alignment data that successfully reduces the overall amount of memory required, allowing the exploitation of the fast shared memory of a GPU. Our results show that our GPU implementation outperforms by 3- 9× the baseline edit-distance WFA implementation running on a 20 core machine. As a result, eWFA-GPU is up to 265 times faster than state-of-the-art CPU implementation, and up to 56 times faster than state-of-the-art GPU implementations.This work was supported in part by the European Unions’s Horizon 2020 Framework Program through the DeepHealth Project under Grant 825111; in part by the European Union Regional Development Fund within the Framework of the European Regional Development Fund (ERDF) Operational Program of Catalonia 2014–2020 with a Grant of 50% of Total Cost Eligible through the Designing RISC-V-based Accelerators for next-generation Computers Project under Grant 001-P-001723; in part by the Ministerio de Ciencia e Innovacion (MCIN) Agencia Estatal de Investigación (AEI)/10.13039/501100011033 under Contract PID2020-113614RB-C21 and Contract TIN2015-65316-P; and in part by the Generalitat de Catalunya (GenCat)-Departament de Recerca i Universitats (DIUiE) (GRR) under Contract 2017-SGR-313, Contract 2017-SGR-1328, and Contract 2017-SGR-1414. The work of Miquel Moreto was supported in part by the Spanish Ministry of Economy, Industry and Competitiveness under Ramon y Cajal Fellowship under Grant RYC-2016-21104.Peer ReviewedPostprint (published version

UPCommons. Portal del coneixement obert de la UPC

GROMACS: High performance molecular simulations through multi-level parallelism from laptops to supercomputers

Author: Abraham Mark James
Hess Berk
Lindahl Erik
Murtola Teemu
Páll Szilárd
Schulz Roland
Smith Jeremy C.
Publication venue: The Authors. Published by Elsevier B.V.
Publication date: 01/01/2015
Field of study

AbstractGROMACS is one of the most widely used open-source and free software codes in chemistry, used primarily for dynamical simulations of biomolecules. It provides a rich set of calculation types, preparation and analysis tools. Several advanced techniques for free-energy calculations are supported. In version 5, it reaches new performance heights, through several new and enhanced parallelization algorithms. These work on every level; SIMD registers inside cores, multithreading, heterogeneous CPU–GPU acceleration, state-of-the-art 3D domain decomposition, and ensemble-level parallelization through built-in replica exchange and the separate Copernicus framework. The latest best-in-class compressed trajectory storage format is supported

Publikationer från KTH

Elsevier - Publisher Connector

Directory of Open Access Journals

Digitala Vetenskapliga Arkivet - Academic Archive On-line

PiCo: A Domain-Specific Language for Data Analytics Pipelines

Author: Misale Claudia
Publication venue
Publication date: 01/01/2017
Field of study

In the world of Big Data analytics, there is a series of tools aiming at simplifying programming applications to be executed on clusters. Although each tool claims to provide better programming, data and execution models—for which only informal (and often confusing) semantics is generally provided—all share a common under- lying model, namely, the Dataflow model. Using this model as a starting point, it is possible to categorize and analyze almost all aspects about Big Data analytics tools from a high level perspective. This analysis can be considered as a first step toward a formal model to be exploited in the design of a (new) framework for Big Data analytics. By putting clear separations between all levels of abstraction (i.e., from the runtime to the user API), it is easier for a programmer or software designer to avoid mixing low level with high level aspects, as we are often used to see in state-of-the-art Big Data analytics frameworks. From the user-level perspective, we think that a clearer and simple semantics is preferable, together with a strong separation of concerns. For this reason, we use the Dataflow model as a starting point to build a programming environment with a simplified programming model implemented as a Domain-Specific Language, that is on top of a stack of layers that build a prototypical framework for Big Data analytics. The contribution of this thesis is twofold: first, we show that the proposed model is (at least) as general as existing batch and streaming frameworks (e.g., Spark, Flink, Storm, Google Dataflow), thus making it easier to understand high-level data-processing applications written in such frameworks. As result of this analysis, we provide a layered model that can represent tools and applications following the Dataflow paradigm and we show how the analyzed tools fit in each level. Second, we propose a programming environment based on such layered model in the form of a Domain-Specific Language (DSL) for processing data collections, called PiCo (Pipeline Composition). The main entity of this programming model is the Pipeline, basically a DAG-composition of processing elements. This model is intended to give the user an unique interface for both stream and batch processing, hiding completely data management and focusing only on operations, which are represented by Pipeline stages. Our DSL will be built on top of the FastFlow library, exploiting both shared and distributed parallelism, and implemented in C++11/14 with the aim of porting C++ into the Big Data world

ZENODO

NEUROSURGERY ENTHUSIASTIC WOMEN SOCIETY

Institutional Research Information System University of Turin