Near Optimal Parallel Algorithms for Dynamic DFS in Undirected Graphs
Depth first search (DFS) tree is a fundamental data structure for solving
graph problems. The classical algorithm [SiComp74] for building a DFS tree
requires $O(m+n)$ time for a given graph having $n$ vertices and $m$ edges.
Recently, Baswana et al. [SODA16] presented a simple algorithm for updating DFS
tree of an undirected graph after an edge/vertex update in time.
However, their algorithm is strictly sequential. We present an algorithm
achieving similar bounds that can be adapted easily to the parallel
environment.
In the parallel model, a DFS tree can be computed from scratch using
processors in expected time [SiComp90] on an EREW PRAM, whereas
the best deterministic algorithm takes time
[SiComp90,JAlg93] on a CRCW PRAM. Our algorithm can be used to develop optimal
(up to polylog n factors) deterministic algorithms for maintaining fully dynamic
DFS and fault tolerant DFS of an undirected graph.
1- Parallel Fully Dynamic DFS:
Given an arbitrary online sequence of vertex/edge updates, we can maintain a
DFS tree of an undirected graph in time per update using
processors on an EREW PRAM.
2- Parallel Fault tolerant DFS:
An undirected graph can be preprocessed to build a data structure of size
O(m) such that for a set of updates (where is constant) in the graph,
the updated DFS tree can be computed in time using
processors on an EREW PRAM.
Moreover, our fully dynamic DFS algorithm provides, in a seamless manner,
nearly optimal (up to polylog n factors) algorithms for maintaining a DFS tree
in the semi-streaming model and a restricted distributed model. These are the first
parallel, semi-streaming and distributed algorithms for maintaining a DFS tree
in the dynamic setting.
Comment: Accepted to appear in SPAA'17, 32 Pages, 5 Figures
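As a point of reference for the classical from-scratch construction mentioned in this abstract, the following is a minimal sketch of building a DFS tree with an explicit stack, which runs in time linear in the number of vertices and edges. It is not the dynamic or parallel algorithm of the paper; the function name `dfs_tree` and the adjacency-list input format are assumptions made for illustration.

```python
def dfs_tree(adj, root):
    """Return a parent map describing a DFS tree of root's component.

    adj is assumed to be a dict mapping each vertex to a list of its
    neighbours in an undirected graph (hypothetical input format).
    """
    parent = {root: None}
    # Each stack entry pairs a vertex with an iterator over its neighbours,
    # so every adjacency list is scanned exactly once.
    stack = [(root, iter(adj[root]))]
    while stack:
        v, neighbours = stack[-1]
        for w in neighbours:
            if w not in parent:
                parent[w] = v                      # tree edge (v, w)
                stack.append((w, iter(adj[w])))
                break
        else:
            stack.pop()                            # all neighbours of v visited
    return parent


adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
tree = dfs_tree(adj, 0)
```

In the example graph, every non-tree edge joins a vertex to one of its ancestors in `tree`, which is the defining property of a DFS tree in an undirected graph.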
Prospects and limitations of full-text index structures in genome analysis
The combination of incessant advances in sequencing technology, which produce large amounts of data, and innovative bioinformatics approaches designed to cope with this data flood has led to interesting new results in the life sciences. Given the magnitude of sequence data to be processed, many bioinformatics tools rely on efficient solutions to a variety of complex string problems. These solutions include fast heuristic algorithms and advanced data structures, generally referred to as index structures. Although the importance of index structures is generally known to the bioinformatics community, the design and potency of these data structures, as well as their properties and limitations, are less understood. Moreover, the last decade has seen a boom in the number of variant index structures featuring complex and diverse memory-time trade-offs. This article provides a comprehensive state-of-the-art overview of the most popular index structures and their recently developed variants. Their features, interrelationships, the trade-offs they impose, and their practical limitations are explained and compared.
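To make the notion of a full-text index concrete, here is a minimal sketch of one well-known example, the suffix array, with a binary-search pattern lookup. The abstract does not list which structures the article covers, so the choice of suffix arrays, the naive construction, and the names `build_suffix_array` and `find_occurrences` are assumptions made for illustration only.

```python
def build_suffix_array(text):
    # Naive construction: sort suffix start positions by the suffix itself.
    # Practical tools use far more efficient construction algorithms.
    return sorted(range(len(text)), key=lambda i: text[i:])


def find_occurrences(text, sa, pattern):
    """Return all start positions of pattern in text, in increasing order."""
    m = len(pattern)
    # Lower bound: first suffix whose length-m prefix is >= pattern.
    lo, hi = 0, len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo
    # Upper bound: first suffix whose length-m prefix is > pattern.
    hi = len(sa)
    while lo < hi:
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return sorted(sa[start:lo])


text = "GATTACAT"
sa = build_suffix_array(text)
hits = find_occurrences(text, sa, "AT")
```

The index is built once per text, after which each query costs a logarithmic number of string comparisons. This is the kind of memory-time trade-off the abstract refers to.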
Fast and Tiny Structural Self-Indexes for XML
XML document markup is highly repetitive and therefore well compressible
using dictionary-based methods such as DAGs or grammars. In the context of
selectivity estimation, grammar-compressed trees have been used before as synopses
for structural XPath queries. Here a fully-fledged index over such grammars is
presented. The index allows arbitrary tree algorithms to be executed with a
slow-down that is comparable to the space improvement. More interestingly,
certain algorithms execute much faster over the index (because no decompression
occurs). E.g., for structural XPath count queries, evaluating over the index is
faster than previous XPath implementations, often by two orders of magnitude.
The index also allows XML results (including texts) to be serialized faster than
previous systems, by a factor of ca. 2-3. This is due to efficient copy
handling of grammar repetitions, and because materialization is totally
avoided. In order to compare with twig join implementations, we implemented a
materializer which writes out pre-order numbers of result nodes, and show its
competitiveness.
Comment: 13 pages
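The repetitiveness of XML markup that this abstract relies on can be illustrated with the simpler of the two dictionary-based methods it names, DAG compression, in which identical subtrees are stored only once. The sketch below is not the grammar-based index of the paper; the function name `dag_compress` is hypothetical, and text and attributes are ignored so that only element structure is shared.

```python
import xml.etree.ElementTree as ET


def dag_compress(xml_text):
    """Share identical element subtrees (tags only) of an XML document.

    Returns (root_id, nodes), where nodes[i] is (tag, tuple of child ids).
    """
    table = {}   # (tag, child ids) -> node id
    nodes = []   # node id -> (tag, child ids)

    def visit(elem):
        # Children are compressed first, so equal subtrees get equal ids.
        key = (elem.tag, tuple(visit(child) for child in elem))
        if key not in table:
            table[key] = len(nodes)
            nodes.append(key)
        return table[key]

    root_id = visit(ET.fromstring(xml_text))
    return root_id, nodes


doc = "<lib><book><title/><author/></book><book><title/><author/></book></lib>"
root_id, nodes = dag_compress(doc)
```

The example document has seven elements but only four distinct subtrees, so the DAG has four nodes. The recursion depth equals the document depth, which would need an iterative rewrite for very deep inputs.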
Time Versus Cost Tradeoffs for Deterministic Rendezvous in Networks
Two mobile agents, starting from different nodes of a network at possibly
different times, have to meet at the same node. This problem is known as
rendezvous. Agents move in synchronous rounds. Each agent has a
distinct integer label from the set . Two main efficiency
measures of rendezvous are its time (the number of rounds until the
meeting) and its cost (the total number of edge traversals). We
investigate tradeoffs between these two measures. A natural benchmark for both
time and cost of rendezvous in a network is the number of edge traversals
needed for visiting all nodes of the network, called the exploration time.
Hence we express the time and cost of rendezvous as functions of an upper bound
on the time of exploration (where and a corresponding exploration
procedure are known to both agents) and of the size of the label space. We
present two natural rendezvous algorithms. Algorithm has cost
(and, in fact, a version of this algorithm for the model where the
agents start simultaneously has cost exactly ) and time . Algorithm
has both time and cost . Our main contributions are
lower bounds showing that, perhaps surprisingly, these two algorithms capture
the tradeoffs between time and cost of rendezvous almost tightly. We show that
any deterministic rendezvous algorithm of cost asymptotically (i.e., of
cost ) must have time . On the other hand, we show that any
deterministic rendezvous algorithm with time complexity must have
cost
Hybrid model for vascular tree structures
This paper proposes a new representation scheme of the cerebral blood
vessels. This model provides information on the semantics of the
vascular structure: the topological relationships between vessels and
the labeling of vascular accidents such as aneurysms and stenoses.
In addition, the model keeps information of the inner surface geometry
as well as of the vascular map volume properties, i.e. the tissue
density, the blood flow velocity and the vessel wall elasticity.
The model can be constructed automatically in a pre-process from a set
of segmented MRA images. Its memory requirements are optimized on the
basis of the sparseness of the vascular structure. It allows fast
queries and efficient traversals and navigations. The visualizations
of the vessel surface can be performed at different levels of
detail. The direct rendering of the volume is fast because the model
provides a natural way to skip over empty data.
The paper analyzes the memory requirements of the model along with the
costs of the most important operations on it.
Postprint (published version)
Geometry-Oblivious FMM for Compressing Dense SPD Matrices
We present GOFMM (geometry-oblivious FMM), a novel method that creates a
hierarchical low-rank approximation, "compression," of an arbitrary dense
symmetric positive definite (SPD) matrix. For many applications, GOFMM enables
an approximate matrix-vector multiplication in or even time,
where is the matrix size. Compression requires storage and work.
In general, our scheme belongs to the family of hierarchical matrix
approximation methods. In particular, it generalizes the fast multipole method
(FMM) to a purely algebraic setting by only requiring the ability to sample
matrix entries. Neither geometric information (i.e., point coordinates) nor
knowledge of how the matrix entries have been generated is required, thus the
term "geometry-oblivious." Also, we introduce a shared-memory parallel scheme
for hierarchical matrix computations that reduces synchronization barriers. We
present results on the Intel Knights Landing and Haswell architectures, and on
the NVIDIA Pascal architecture for a variety of matrices.
Comment: 13 pages, accepted by SC'1
Scaling Limits for Minimal and Random Spanning Trees in Two Dimensions
A general formulation is presented for continuum scaling limits of stochastic
spanning trees. A spanning tree is expressed in this limit through a consistent
collection of subtrees, which includes a tree for every finite set of endpoints
in . Tightness of the distribution, as , is established for
the following two-dimensional examples: the uniformly random spanning tree on
, the minimal spanning tree on (with random edge
lengths), and the Euclidean minimal spanning tree on a Poisson process of
points in with density . In each case, sample trees are
proven to have the following properties, with probability one with respect to
any of the limiting measures: i) there is a single route to infinity (as was
known for ), ii) the tree branches are given by curves which are
regular in the sense of Hölder continuity, iii) the branches are also rough,
in the sense that their Hausdorff dimension exceeds one, iv) there is a random
dense subset of , of dimension strictly between one and two, on the
complement of which (and only there) the spanning subtrees are unique with
continuous dependence on the endpoints, v) branching occurs at countably many
points in , and vi) the branching numbers are uniformly bounded. The
results include tightness for the loop erased random walk (LERW) in two
dimensions. The proofs proceed through the derivation of scale-invariant power
bounds on the probabilities of repeated crossings of annuli.
Comment: Revised; 54 pages, 6 figures (LaTeX)
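For readers who want a concrete finite object behind one of the examples in this abstract, the following sketch samples a minimal spanning tree with random edge lengths on a finite square grid, using Kruskal's algorithm with a union-find structure. The finite `n` by `n` grid, the uniform edge-length distribution, and the name `grid_mst` are assumptions made for illustration; the paper itself concerns the continuum limit, which this code does not address.

```python
import random


def grid_mst(n, seed=0):
    """Minimal spanning tree of an n-by-n grid with i.i.d. uniform edge lengths."""
    rng = random.Random(seed)
    edges = []
    for x in range(n):
        for y in range(n):
            if x + 1 < n:
                edges.append((rng.random(), (x, y), (x + 1, y)))
            if y + 1 < n:
                edges.append((rng.random(), (x, y), (x, y + 1)))
    edges.sort()  # Kruskal: consider edges in order of increasing length

    parent = {(x, y): (x, y) for x in range(n) for y in range(n)}

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    tree = []
    for length, u, v in edges:
        ru, rv = find(u), find(v)
        if ru != rv:          # edge joins two components, so keep it
            parent[ru] = rv
            tree.append((u, v, length))
    return tree


tree = grid_mst(8)
```

A spanning tree of the 8 by 8 grid has 63 edges regardless of the random lengths, which gives a simple sanity check on the output.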