3,938 research outputs found
Parallel Construction of Wavelet Trees on Multicore Architectures
The wavelet tree has become a very useful data structure to efficiently
represent and query large volumes of data in many different domains, from
bioinformatics to geographic information systems. One problem with wavelet
trees is their construction time. In this paper, we introduce two algorithms
that reduce the time complexity of a wavelet tree's construction by taking
advantage of nowadays ubiquitous multicore machines.
Our first algorithm constructs all the levels of the wavelet in parallel in
time and bits of working space, where
is the size of the input sequence and is the size of the alphabet. Our
second algorithm constructs the wavelet tree in a domain-decomposition fashion,
using our first algorithm in each segment, reaching time and
bits of extra space, where is the
number of available cores. Both algorithms are practical and report good
speedup for large real datasets.Comment: This research has received funding from the European Union's Horizon
2020 research and innovation programme under the Marie Sk{\l}odowska-Curie
Actions H2020-MSCA-RISE-2015 BIRDS GA No. 69094
Inference with Constrained Hidden Markov Models in PRISM
A Hidden Markov Model (HMM) is a common statistical model which is widely
used for analysis of biological sequence data and other sequential phenomena.
In the present paper we show how HMMs can be extended with side-constraints and
present constraint solving techniques for efficient inference. Defining HMMs
with side-constraints in Constraint Logic Programming have advantages in terms
of more compact expression and pruning opportunities during inference.
We present a PRISM-based framework for extending HMMs with side-constraints
and show how well-known constraints such as cardinality and all different are
integrated. We experimentally validate our approach on the biologically
motivated problem of global pairwise alignment
Process algebra modelling styles for biomolecular processes
We investigate how biomolecular processes are modelled in process algebras, focussing on chemical reactions. We consider various modelling styles and how design decisions made in the definition of the process algebra have an impact on how a modelling style can be applied. Our goal is to highlight the often implicit choices that modellers make in choosing a formalism, and illustrate, through the use of examples, how this can affect expressability as well as the type and complexity of the analysis that can be performed
Entropy-based parametric estimation of spike train statistics
We consider the evolution of a network of neurons, focusing on the asymptotic
behavior of spikes dynamics instead of membrane potential dynamics. The spike
response is not sought as a deterministic response in this context, but as a
conditional probability : "Reading out the code" consists of inferring such a
probability. This probability is computed from empirical raster plots, by using
the framework of thermodynamic formalism in ergodic theory. This gives us a
parametric statistical model where the probability has the form of a Gibbs
distribution. In this respect, this approach generalizes the seminal and
profound work of Schneidman and collaborators. A minimal presentation of the
formalism is reviewed here, while a general algorithmic estimation method is
proposed yielding fast convergent implementations. It is also made explicit how
several spike observables (entropy, rate, synchronizations, correlations) are
given in closed-form from the parametric estimation. This paradigm does not
only allow us to estimate the spike statistics, given a design choice, but also
to compare different models, thus answering comparative questions about the
neural code such as : "are correlations (or time synchrony or a given set of
spike patterns, ..) significant with respect to rate coding only ?" A numerical
validation of the method is proposed and the perspectives regarding spike-train
code analysis are also discussed.Comment: 37 pages, 8 figures, submitte
Gunrock: GPU Graph Analytics
For large-scale graph analytics on the GPU, the irregularity of data access
and control flow, and the complexity of programming GPUs, have presented two
significant challenges to developing a programmable high-performance graph
library. "Gunrock", our graph-processing system designed specifically for the
GPU, uses a high-level, bulk-synchronous, data-centric abstraction focused on
operations on a vertex or edge frontier. Gunrock achieves a balance between
performance and expressiveness by coupling high performance GPU computing
primitives and optimization strategies with a high-level programming model that
allows programmers to quickly develop new graph primitives with small code size
and minimal GPU programming knowledge. We characterize the performance of
various optimization strategies and evaluate Gunrock's overall performance on
different GPU architectures on a wide range of graph primitives that span from
traversal-based algorithms and ranking algorithms, to triangle counting and
bipartite-graph-based algorithms. The results show that on a single GPU,
Gunrock has on average at least an order of magnitude speedup over Boost and
PowerGraph, comparable performance to the fastest GPU hardwired primitives and
CPU shared-memory graph libraries such as Ligra and Galois, and better
performance than any other GPU high-level graph library.Comment: 52 pages, invited paper to ACM Transactions on Parallel Computing
(TOPC), an extended version of PPoPP'16 paper "Gunrock: A High-Performance
Graph Processing Library on the GPU
- …