The Lock-free $k$-LSM Relaxed Priority Queue
Priority queues are data structures which store keys in an ordered fashion to
allow efficient access to the minimal (maximal) key. Priority queues are
essential for many applications, e.g., Dijkstra's single-source shortest path
algorithm, branch-and-bound algorithms, and prioritized schedulers.
Efficient multiprocessor computing requires implementations of basic data
structures that can be used concurrently and scale to large numbers of threads
and cores. Lock-free data structures promise superior scalability by avoiding
blocking synchronization primitives, but the \emph{delete-min} operation is an
inherent scalability bottleneck in concurrent priority queues. Recent work has
focused on alleviating this obstacle either by batching operations, or by
relaxing the requirements on the \emph{delete-min} operation.
We present a new, lock-free priority queue that relaxes the \emph{delete-min}
operation so that it is allowed to delete \emph{any} of the $\rho+1$ smallest
keys, where $\rho$ is a runtime configurable parameter. Additionally, the
behavior is identical to a non-relaxed priority queue for items added and
removed by the same thread. The priority queue is built from a logarithmic
number of sorted arrays in a way similar to log-structured merge-trees. We
experimentally compare our priority queue to recent state-of-the-art lock-free
priority queues, both with relaxed and non-relaxed semantics, showing high
performance and good scalability of our approach.
Comment: Short version as ACM PPoPP'15 poster
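The relaxed \emph{delete-min} semantics described in the abstract above can be illustrated with a small sequential sketch. This is not the paper's lock-free, LSM-based structure: it is a plain binary heap wrapped so that deletion may return any key from a bounded window of the smallest keys. The class name `RelaxedPQ` and the parameter name `relax` are assumptions made for this illustration; here `delete_min` may return any of the `relax + 1` smallest keys.

```python
import heapq
import random


class RelaxedPQ:
    """Sequential illustration of relaxed delete-min (not lock-free, not an LSM)."""

    def __init__(self, relax=0, seed=None):
        # relax = 0 gives an ordinary (strict) priority queue.
        self.relax = relax
        self._heap = []
        self._rng = random.Random(seed)

    def insert(self, key):
        heapq.heappush(self._heap, key)

    def delete_min(self):
        if not self._heap:
            raise IndexError("delete_min from an empty queue")
        # Pull out the window of candidates: the relax + 1 smallest keys.
        window = min(self.relax + 1, len(self._heap))
        candidates = [heapq.heappop(self._heap) for _ in range(window)]
        # Return an arbitrary candidate and put the others back.
        chosen = candidates.pop(self._rng.randrange(window))
        for key in candidates:
            heapq.heappush(self._heap, key)
        return chosen


if __name__ == "__main__":
    pq = RelaxedPQ(relax=2, seed=1)
    for key in range(10):
        pq.insert(key)
    print(pq.delete_min())  # one of 0, 1, 2
```

With `relax=0` the sketch behaves like a strict priority queue; larger values trade ordering precision for freedom in which key is removed, which is the freedom a concurrent implementation can use to reduce contention.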
Fast and Accurate Random Walk with Restart on Dynamic Graphs with Guarantees
Given a time-evolving graph, how can we track similarity between nodes in a
fast and accurate way, with theoretical guarantees on the convergence and the
error? Random Walk with Restart (RWR) is a popular measure to estimate the
similarity between nodes and has been exploited in numerous applications. Many
real-world graphs are dynamic with frequent insertion/deletion of edges; thus,
tracking RWR scores on dynamic graphs in an efficient way has aroused much
interest among data mining researchers. Recently, dynamic RWR models based on
the propagation of scores across a given graph have been proposed, and have
succeeded in outperforming previous approaches for computing RWR
dynamically. However, those models fail to guarantee exactness and convergence
time for updating RWR in a generalized form. In this paper, we propose OSP, a
fast and accurate algorithm for computing dynamic RWR with insertion/deletion
of nodes/edges in a directed/undirected graph. When the graph is updated, OSP
first calculates offset scores around the modified edges, propagates the offset
scores across the updated graph, and then merges them with the current RWR
scores to get updated RWR scores. We prove the exactness of OSP and introduce
OSP-T, a version of OSP which regulates a trade-off between accuracy and
computation time by using error tolerance $\epsilon$. Given restart probability
$c$, OSP-T guarantees to return RWR scores with $O(\epsilon/c)$ error in
$O(\log(\epsilon/2)/\log(1-c))$ iterations. Through extensive experiments, we
show that OSP tracks RWR exactly up to 4605x faster than the existing static RWR
method on dynamic graphs, and OSP-T requires up to 15x less time with 730x
lower L1 norm error and 3.3x lower rank error than other state-of-the-art
dynamic RWR methods.
Comment: 10 pages, 8 figures
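To make the idea of computing RWR by propagating scores concrete, here is a minimal static sketch. It is not OSP or OSP-T: it handles no graph updates and only shows score propagation from a seed node with restart probability `c` and a stopping tolerance `eps`. The function name `rwr_by_propagation` and the adjacency-dictionary input are assumptions of this illustration, and every node is assumed to have at least one out-edge.

```python
def rwr_by_propagation(out_neighbors, seed, c=0.15, eps=1e-9):
    """Static RWR scores for `seed` by repeated score propagation.

    out_neighbors maps each node to a non-empty list of successor nodes
    (assumption of this sketch: no dangling nodes).
    """
    nodes = list(out_neighbors)
    scores = {v: 0.0 for v in nodes}
    # The walker starts at the seed; a fraction c of the mass settles at once.
    x = {v: 0.0 for v in nodes}
    x[seed] = c
    # Stop once the mass still in flight drops below the tolerance.
    while sum(x.values()) >= eps:
        for v in nodes:
            scores[v] += x[v]
        # Push the surviving (1 - c) fraction along out-edges, split evenly.
        nxt = {v: 0.0 for v in nodes}
        for u in nodes:
            if x[u] == 0.0:
                continue
            share = (1.0 - c) * x[u] / len(out_neighbors[u])
            for v in out_neighbors[u]:
                nxt[v] += share
        x = nxt
    return scores


if __name__ == "__main__":
    cycle = {0: [1], 1: [2], 2: [0]}
    print(rwr_by_propagation(cycle, seed=0))
```

In this sketch the in-flight mass after $i$ rounds is $c(1-c)^i$, so the total mass discarded at termination is below $\epsilon/c$ and the number of rounds grows like $\log(\epsilon/c)/\log(1-c)$. This only illustrates why bounds of that shape arise for propagation schemes; it is not the paper's analysis.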
DESQ: Frequent Sequence Mining with Subsequence Constraints
Frequent sequence mining methods often make use of constraints to control
which subsequences should be mined. A variety of such subsequence constraints
has been studied in the literature, including length, gap, span,
regular-expression, and hierarchy constraints. In this paper, we show that many
subsequence constraints---including and beyond those considered in the
literature---can be unified in a single framework. A unified treatment allows
researchers to study jointly many types of subsequence constraints (instead of
each one individually) and helps to improve usability of pattern mining systems
for practitioners. In more detail, we propose a set of simple and intuitive
"pattern expressions" to describe subsequence constraints and explore
algorithms for efficiently mining frequent subsequences under such general
constraints. Our algorithms translate pattern expressions to compressed finite
state transducers, which we use as the computational model, and simulate these
transducers in a way suitable for frequent sequence mining. Our experimental
study on real-world datasets indicates that our algorithms---although more
general---are competitive with existing state-of-the-art algorithms.
Comment: Long version of the paper accepted at the IEEE ICDM 2016 conference
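As a concrete instance of one subsequence constraint named in the abstract above, the following sketch counts the support of a candidate pattern under a maximum-gap constraint. It does not use pattern expressions or finite state transducers and is not the DESQ algorithm; the function names `occurs` and `support` and the `max_gap` parameter are assumptions of this illustration.

```python
def occurs(sequence, pattern, max_gap=None):
    """True if `pattern` is a subsequence of `sequence` in which consecutive
    matched items are separated by at most `max_gap` other items
    (None means no gap limit)."""
    if not pattern:
        return True
    # Positions at which the pattern prefix matched so far can end.
    ends = [i for i, item in enumerate(sequence) if item == pattern[0]]
    for wanted in pattern[1:]:
        next_ends = []
        for j, item in enumerate(sequence):
            if item != wanted:
                continue
            # j extends a partial match ending at e if e precedes j and
            # the number of skipped items (j - e - 1) respects the gap limit.
            if any(e < j and (max_gap is None or j - e - 1 <= max_gap)
                   for e in ends):
                next_ends.append(j)
        ends = next_ends
        if not ends:
            return False
    return bool(ends)


def support(database, pattern, max_gap=None):
    """Number of input sequences that contain `pattern` under the constraint."""
    return sum(1 for seq in database if occurs(seq, pattern, max_gap))


if __name__ == "__main__":
    db = [list("abcab"), list("axxb"), list("ba")]
    print(support(db, ["a", "b"], max_gap=0))
    print(support(db, ["a", "b"]))
```

A pattern would be reported as frequent when its support reaches a user-chosen threshold. Tracking every possible end position, rather than only the first match, matters here because a greedy leftmost match can violate the gap limit even when a later match satisfies it.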