Data structures and algorithms for approximate string matching Zvi Galil, Raffaele Giancarlo
This paper surveys techniques for designing efficient sequential and parallel approximate string matching algorithms. Special attention is given to the methods for the construction of data structures that efficiently support primitive operations needed in approximate string matching
A multiprocessor parallel approach to bit-parallel approximate string matching
The purpose of this project is to show, with empirical results, that a multiprocessor parallel design can be successfully combined with bit-parallel approximate string matching algorithms to solve practical bioinformatics problems. It demonstrates that nearly optimal speedup can be achieved with a cluster of two to eight workstations using MPI (Message Passing Interface), directly reducing the total latency of a string matching task.
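As a rough illustration of the bit-parallel building block mentioned in this abstract, the sketch below implements a Shift-And automaton extended to k differences in the style of Wu and Manber. It is an assumption about the kind of algorithm involved, not the project's own code; the function name `approx_find` and its return convention (0-based end positions) are hypothetical, and the MPI distribution of the text across workstations is not shown.

```python
def approx_find(pattern, text, k):
    """Return 0-based end positions in `text` where `pattern` matches
    some substring with at most k differences (edit distance).

    Bit i of state[d] is set when pattern[0..i] matches a suffix of the
    text read so far with at most d differences.
    """
    m = len(pattern)
    if m == 0:
        raise ValueError("pattern must be non-empty")
    full = (1 << m) - 1          # keep only the m low bits
    accept = 1 << (m - 1)        # bit of the last pattern position

    # masks[c] has bit i set when pattern[i] == c
    masks = {}
    for i, ch in enumerate(pattern):
        masks[ch] = masks.get(ch, 0) | (1 << i)

    # Before reading any text, the first d pattern characters can be deleted.
    state = [((1 << d) - 1) & full for d in range(k + 1)]

    hits = []
    for j, ch in enumerate(text):
        b = masks.get(ch, 0)
        prev_old = state[0]                      # old value of row d-1
        state[0] = ((state[0] << 1) | 1) & b     # exact Shift-And step
        for d in range(1, k + 1):
            cur_old = state[d]
            state[d] = (
                (((cur_old << 1) | 1) & b)               # match
                | prev_old                               # extra text character
                | (((prev_old | state[d - 1]) << 1) | 1)  # substitution / missing text character
            ) & full
            prev_old = cur_old
        if state[k] & accept:
            hits.append(j)
    return hits
```

For example, `approx_find("abc", "xabcx", 1)` reports the end positions 2, 3 and 4, since "ab", "abc" and "abcx" are each within one difference of the pattern. Python integers are unbounded, so the word-size limit of a real bit-parallel implementation does not appear here.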
On the Benefit of Merging Suffix Array Intervals for Parallel Pattern Matching
We present parallel algorithms for exact and approximate pattern matching
with suffix arrays, using a CREW-PRAM with processors. Given a static text
of length , we first show how to compute the suffix array interval of a
given pattern of length in
time for . For approximate pattern matching with differences or
mismatches, we show how to compute all occurrences of a given pattern in
time, where is the size of the alphabet
and . The workhorse of our algorithms is a data structure
for merging suffix array intervals quickly: Given the suffix array intervals
for two patterns and , we present a data structure for computing the
interval of in sequential time, or in
parallel time. All our data structures are of size bits (in addition to
the suffix array)
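To make the notion of a suffix array interval concrete, here is a minimal sequential sketch. It is not the data structure of the paper: the suffix array is built naively, the interval is found by plain binary search, and the merge step is a straightforward binary search over the inverse suffix array, which takes logarithmic time rather than the bounds stated in the abstract. All function names are hypothetical.

```python
def build_suffix_array(text):
    """Naive construction: sort suffix start positions lexicographically."""
    return sorted(range(len(text)), key=lambda i: text[i:])


def sa_interval(text, sa, pattern):
    """Half-open interval [lo, hi) of suffix array ranks whose suffixes
    start with `pattern`."""
    m = len(pattern)

    def first_rank(strict):
        lo, hi = 0, len(sa)
        while lo < hi:
            mid = (lo + hi) // 2
            prefix = text[sa[mid]:sa[mid] + m]
            if prefix < pattern or (strict and prefix == pattern):
                lo = mid + 1
            else:
                hi = mid
        return lo

    return first_rank(False), first_rank(True)


def merge_intervals(sa, isa, p_len, p_iv, q_iv):
    """Interval of the concatenation PQ, given the intervals of P and of a
    non-empty Q. Inside P's interval, suffixes are ordered by what follows P,
    so the rank of the suffix starting p_len characters later is increasing
    and can be binary searched."""
    n = len(sa)

    def rank_after(i):
        j = sa[i] + p_len
        return isa[j] if j < n else -1   # nothing follows P: sorts first

    def first_at_least(r):
        lo, hi = p_iv
        while lo < hi:
            mid = (lo + hi) // 2
            if rank_after(mid) < r:
                lo = mid + 1
            else:
                hi = mid
        return lo

    return first_at_least(q_iv[0]), first_at_least(q_iv[1])


def inverse(sa):
    """isa[i] is the rank of the suffix starting at position i."""
    isa = [0] * len(sa)
    for rank, pos in enumerate(sa):
        isa[pos] = rank
    return isa
```

With the text "banana", the interval of "a" is [0, 3) and that of "na" is [4, 6); merging them yields [1, 3), which is the interval of "ana".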
Closed-Form Approximation for Parallel-Plate Waveguide Coefficients
Simple closed-form formulas for calculating coefficients of modes excited in a parallel-plate waveguide illuminated by a planar wave are presented. The mode-matching technique and Green’s formula are used to arrive at a matrix-based expression for waveguide coefficients calculation. Simplified solution to this matrix is proposed to derive approximate mode coefficient formulas in closed-form for both TE and TM polarization. The results are validated by numerical simulations and show good accuracy for all incidence angles and in broad frequency range
Exponentially Faster Massively Parallel Maximal Matching
The study of approximate matching in the Massively Parallel Computations
(MPC) model has recently seen a burst of breakthroughs. Despite this progress,
however, we still have a far more limited understanding of maximal matching
which is one of the central problems of parallel and distributed computing. All
known MPC algorithms for maximal matching either take polylogarithmic time
which is considered inefficient, or require a strictly super-linear space of
per machine.
In this work, we close this gap by providing a novel analysis of an extremely
simple algorithm, a variant of which was conjectured to work by Czumaj et al.
[STOC'18]. The algorithm edge-samples the graph, randomly partitions the
vertices, and finds a random greedy maximal matching within each partition. We
show that this algorithm drastically reduces the vertex degrees. This, among
some other results, leads to an round algorithm for
maximal matching with space (or even mildly sublinear in using
standard techniques).
As an immediate corollary, we get a approximate minimum vertex cover in
essentially the same rounds and space. This is the best possible approximation
factor under standard assumptions, culminating a long line of research. It also
leads to an improved round algorithm for
approximate matching. All these results can also be implemented in the
congested clique model within the same number of rounds.
Comment: A preliminary version of this paper is to appear in the proceedings
of The 60th Annual IEEE Symposium on Foundations of Computer Science (FOCS
2019)
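The three steps named in this abstract (edge sampling, random vertex partitioning, random greedy matching within each part) can be sketched sequentially as follows. This is only an illustration under assumptions: the parameters `parts`, `sample_prob` and `rounds` are hypothetical choices, the MPC machine model and the paper's analysis are not represented, and a final greedy pass over the leftover edges is added here simply to guarantee that the result is maximal.

```python
import random


def greedy_matching(edges, rng):
    """Random greedy maximal matching: scan the edges in random order and
    keep every edge whose endpoints are both still free."""
    order = list(edges)
    rng.shuffle(order)
    matched, matching = set(), []
    for u, v in order:
        if u not in matched and v not in matched:
            matched.update((u, v))
            matching.append((u, v))
    return matching


def one_round(edges, parts, sample_prob, rng):
    """Sample edges, partition vertices at random, match inside each part."""
    sampled = [e for e in edges if rng.random() < sample_prob]
    vertices = sorted({v for e in edges for v in e})
    part_of = {v: rng.randrange(parts) for v in vertices}
    buckets = [[] for _ in range(parts)]
    for u, v in sampled:
        if part_of[u] == part_of[v]:      # keep edges inside one part only
            buckets[part_of[u]].append((u, v))
    matching = []
    for bucket in buckets:                # parts are vertex-disjoint
        matching.extend(greedy_matching(bucket, rng))
    return matching


def maximal_matching(edges, parts=2, sample_prob=0.5, rounds=3, seed=0):
    rng = random.Random(seed)
    remaining = [(u, v) for u, v in edges if u != v]
    matching = []
    for _ in range(rounds):
        found = one_round(remaining, parts, sample_prob, rng)
        matching.extend(found)
        used = {v for e in found for v in e}
        # Drop edges that touch a matched vertex; they can no longer be used.
        remaining = [(u, v) for u, v in remaining
                     if u not in used and v not in used]
    # Assumption of this sketch: finish greedily on whatever is left.
    matching.extend(greedy_matching(remaining, rng))
    return matching
```

Because each part is matched independently, the per-part work in `one_round` is the portion that would be distributed across machines.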
Feature detection using spikes: the greedy approach
A goal of low-level neural processes is to build an efficient code extracting
the relevant information from the sensory input. It is believed that this is
implemented in cortical areas by elementary inferential computations
dynamically extracting the most likely parameters corresponding to the sensory
signal. We explore here a neuro-mimetic feed-forward model of the primary
visual area (V1) solving this problem in the case where the signal may be
described by a robust linear generative model. This model uses an over-complete
dictionary of primitives which provides a distributed probabilistic
representation of input features. Relying on an efficiency criterion, we derive
an algorithm as an approximate solution which uses incremental greedy inference
processes. This algorithm is similar to 'Matching Pursuit' and mimics the
parallel architecture of neural computations. We propose here a simple
implementation using a network of spiking integrate-and-fire neurons which
communicate using lateral interactions. Numerical simulations show that this
Sparse Spike Coding strategy provides an efficient model for representing
visual data from a set of natural images. Even though it is simplistic, this
transformation of spatial data into a spatio-temporal pattern of binary events
provides an accurate description of some complex neural patterns observed in
the spiking activity of biological neural networks.
Comment: This work links Matching Pursuit with Bayesian inference by providing
the underlying hypotheses (linear model, uniform prior, Gaussian noise
model). A parallel with the parallel and event-based nature of neural
computations is explored and we show application to modelling Primary Visual
Cortex / image processing.
http://incm.cnrs-mrs.fr/perrinet/dynn/LaurentPerrinet/Publications/Perrinet04tau
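For readers unfamiliar with Matching Pursuit, to which the abstract compares its algorithm, a minimal sketch of the classical greedy procedure is given below. It is the plain textbook version with an assumed dictionary of unit-norm atoms, not the spiking integrate-and-fire network described by the authors; the function name and the list-based representation are hypothetical.

```python
def matching_pursuit(signal, atoms, n_iter):
    """Greedy sparse decomposition of `signal` over unit-norm `atoms`.

    At each step the atom most correlated with the residual is selected,
    its coefficient is accumulated, and its contribution is subtracted.
    Returns (coefficients, residual).
    """
    residual = list(signal)
    coeffs = [0.0] * len(atoms)
    for _ in range(n_iter):
        # Correlation of every atom with the current residual.
        scores = [sum(r * a for r, a in zip(residual, atom)) for atom in atoms]
        best = max(range(len(atoms)), key=lambda i: abs(scores[i]))
        if scores[best] == 0:
            break                      # nothing left to explain
        coeffs[best] += scores[best]
        residual = [r - scores[best] * a
                    for r, a in zip(residual, atoms[best])]
    return coeffs, residual
```

In the spiking analogy, each selection corresponds to one emitted event, and the subtraction step plays the role of the lateral interaction between units.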
MASSIVELY PARALLEL ALGORITHMS FOR POINT CLOUD BASED OBJECT RECOGNITION ON HETEROGENEOUS ARCHITECTURE
With the advent of new commodity depth sensors, point cloud data processing plays an increasingly important role in object recognition and perception. However, the computational cost of point cloud data processing is extremely high due to the large data size, high dimensionality, and algorithmic complexity. To address the computational challenges of real-time processing, this work investigates the possibilities of using modern heterogeneous computing platforms and their supporting ecosystem, such as massively parallel architecture (MPA), computing clusters, compute unified device architecture (CUDA), and multithreaded programming, to accelerate point cloud based object recognition. These computing platforms do not yield high performance unless their specific features are properly utilized; otherwise, the result can actually be inferior performance. To achieve high-speed performance in image descriptor computing, indexing, and matching in point cloud based object recognition, this work explores both coarse-grained and fine-grained parallelism, identifies the acceptable levels of algorithmic approximation, and analyzes various factors that affect performance. A set of heterogeneous parallel algorithms is designed and implemented in this work. These algorithms include exact and approximate scalable massively parallel image descriptors for descriptor computing, parallel construction of the k-dimensional tree (KD-tree) and the forest of KD-trees for descriptor indexing, and parallel approximate nearest neighbor search (ANNS) and buffered ANNS (BANNS) on the KD-tree and the forest of KD-trees for descriptor matching. The results show that the proposed massively parallel algorithms on heterogeneous computing platforms can significantly improve the execution time of feature computing, indexing, and matching.
Meanwhile, this work demonstrates that heterogeneous computing architectures, with appropriate architecture-specific algorithm design and optimization, have distinct advantages in improving the performance of multimedia applications.
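The KD-tree indexing and approximate nearest neighbor search mentioned in this abstract can be illustrated with a small sequential sketch. It makes several assumptions: a single tree rather than a forest, no buffering, no GPU or multithreaded execution, and a standard (1 + eps) pruning rule chosen here for illustration. The names `build_kdtree` and `nearest` are hypothetical.

```python
def build_kdtree(points, depth=0):
    """Build a KD-tree as nested tuples (point, axis, left, right),
    splitting at the median along a cycling axis."""
    if not points:
        return None
    axis = depth % len(points[0])
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return (pts[mid], axis,
            build_kdtree(pts[:mid], depth + 1),
            build_kdtree(pts[mid + 1:], depth + 1))


def nearest(node, query, eps=0.0, best=None):
    """Return (squared_distance, point) for a neighbor of `query`.

    With eps == 0 the search is exact. With eps > 0 the far subtree is
    skipped more aggressively, so the returned distance is at most
    (1 + eps) times the true nearest distance.
    """
    if node is None:
        return best
    point, axis, left, right = node
    d2 = sum((a - b) ** 2 for a, b in zip(point, query))
    if best is None or d2 < best[0]:
        best = (d2, point)
    diff = query[axis] - point[axis]
    near, far = (left, right) if diff < 0 else (right, left)
    best = nearest(near, query, eps, best)
    # Every point on the far side is at least |diff| away from the query.
    if diff * diff * (1 + eps) ** 2 < best[0]:
        best = nearest(far, query, eps, best)
    return best
```

Independent queries can be answered concurrently against the same read-only tree, which is one natural source of the coarse-grained parallelism the abstract refers to.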