Any-k: Anytime Top-k Tree Pattern Retrieval in Labeled Graphs
Many problems in areas as diverse as recommendation systems, social network
analysis, semantic search, and distributed root cause analysis can be modeled
as pattern search on labeled graphs (also called "heterogeneous information
networks" or HINs). Given a large graph and a query pattern with node and edge
label constraints, a fundamental challenge is to find the top-k matches
according to a ranking function over edge and node weights. For users, it is
difficult to select the value of k. We therefore propose the novel notion of an any-k
ranking algorithm: for a given time budget, return as many of the top-ranked
results as possible. Then, given additional time, produce the next lower-ranked
results quickly as well. It can be stopped anytime, but may have to continue
until all results are returned. This paper focuses on acyclic patterns over
arbitrary labeled graphs. We are interested in practical algorithms that
effectively exploit (1) properties of heterogeneous networks, in particular
selective constraints on labels, and (2) that the users often explore only a
fraction of the top-ranked results. Our solution, KARPET, carefully integrates
aggressive pruning that leverages the acyclic nature of the query, and
incremental guided search. It enables us to prove strong non-trivial time and
space guarantees, which is generally considered very hard for this type of
graph search problem. Through experimental studies we show that KARPET achieves
running times in the order of milliseconds for tree patterns on large networks
with millions of nodes and edges. Comment: To appear in WWW 201
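The any-k idea described above can be illustrated with a minimal sketch. This is not KARPET itself: it handles only a path-shaped pattern (the simplest acyclic case), ranks matches by the sum of edge weights, and allows a node to appear at more than one pattern position. The function name `any_k_paths` and the input layout are assumptions made for illustration. A bottom-up pass prunes nodes that cannot complete the pattern and records the cheapest completion, and a best-first search guided by those exact bounds then yields matches lightest first, so the caller can stop at any time.

```python
import heapq
from itertools import islice


def any_k_paths(labels, edges, pattern):
    """Yield (weight, path) matches of a label path pattern, lightest first.

    labels:  dict node -> label
    edges:   dict node -> list of (neighbor, weight)
    pattern: list of labels, e.g. ["A", "P", "V"]
    (Hypothetical input layout, chosen for this sketch.)
    """
    n = len(pattern)
    # Bottom-up pass: best[i][v] is the minimum weight needed to complete the
    # pattern from node v at position i. Nodes with no completion are pruned.
    best = [dict() for _ in range(n)]
    for v, lab in labels.items():
        if lab == pattern[-1]:
            best[n - 1][v] = 0.0
    for i in range(n - 2, -1, -1):
        for v, lab in labels.items():
            if lab != pattern[i]:
                continue
            costs = [w + best[i + 1][u]
                     for u, w in edges.get(v, []) if u in best[i + 1]]
            if costs:
                best[i][v] = min(costs)

    # Guided search: the priority of a partial match is the exact weight of
    # its best completion, so complete matches are popped in ranked order.
    heap = [(bound, 0.0, (v,)) for v, bound in best[0].items()]
    heapq.heapify(heap)
    while heap:
        _, cost, path = heapq.heappop(heap)
        i = len(path) - 1
        if i == n - 1:
            yield cost, path
            continue
        for u, w in edges.get(path[-1], []):
            if u in best[i + 1]:
                new_cost = cost + w
                heapq.heappush(
                    heap, (new_cost + best[i + 1][u], new_cost, path + (u,)))


# Toy graph: two authors (A), two papers (P), one venue (V).
labels = {"a1": "A", "a2": "A", "p1": "P", "p2": "P", "v1": "V"}
edges = {
    "a1": [("p1", 1.0), ("p2", 5.0)],
    "a2": [("p1", 2.0)],
    "p1": [("v1", 1.0)],
    "p2": [("v1", 1.0)],
}
results = any_k_paths(labels, edges, ["A", "P", "V"])
top_two = list(islice(results, 2))   # stop after two results
remaining = list(results)            # or resume later for the rest
```

Because the function is a generator, no value of k has to be chosen up front: the consumer takes results until its time budget runs out and may resume later.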
Distributed memory compiler methods for irregular problems: Data copy reuse and runtime partitioning
Outlined here are two methods that we believe will play an important role in any distributed memory compiler able to handle sparse and unstructured problems. We describe how to link runtime partitioners to distributed memory compilers. In our scheme, programmers can implicitly specify how data and loop iterations are to be distributed between processors. This insulates users from having to deal explicitly with potentially complex algorithms that carry out work and data partitioning. We also describe a viable mechanism for tracking and reusing copies of off-processor data. In many programs, several loops access the same off-processor memory locations. As long as it can be verified that the values assigned to off-processor memory locations remain unmodified, we show that we can effectively reuse stored off-processor data. We present experimental data from a 3-D unstructured Euler solver run on an iPSC/860 to demonstrate the usefulness of our methods.
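The copy-reuse idea can be sketched in a few lines. This is a single-process toy, not the compiler mechanism from the paper: `DistributedArray` and `CopyCache` are hypothetical names, remote reads are simulated with a counter, and "unmodified" is verified with a coarse version number that is bumped on every write.

```python
class DistributedArray:
    """Toy stand-in for an array whose elements live on other processors."""

    def __init__(self, values):
        self._values = list(values)
        self.version = 0        # bumped on every write
        self.fetch_count = 0    # number of simulated off-processor reads

    def write(self, index, value):
        self._values[index] = value
        self.version += 1

    def remote_read(self, index):
        self.fetch_count += 1   # stands in for a message to the owner
        return self._values[index]


class CopyCache:
    """Keeps local copies of off-processor elements between loops."""

    def __init__(self, array):
        self.array = array
        self._copies = {}
        self._version = array.version

    def gather(self, indices):
        # Stored copies are reused only if the array is verifiably unmodified
        # since they were fetched; otherwise they are all discarded.
        if self._version != self.array.version:
            self._copies.clear()
            self._version = self.array.version
        for i in indices:
            if i not in self._copies:
                self._copies[i] = self.array.remote_read(i)
        return [self._copies[i] for i in indices]


arr = DistributedArray([10, 20, 30, 40])
cache = CopyCache(arr)
first = cache.gather([1, 3])       # loop 1: two remote reads
second = cache.gather([3, 1, 2])   # loop 2: only index 2 is fetched
fetches_before_write = arr.fetch_count
arr.write(1, 99)                   # invalidates the stored copies
third = cache.gather([1])
```

A per-element version would avoid discarding every copy after a single write; the coarse counter is used here only to keep the sketch short.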
Distributed interpolatory algorithms for set membership estimation
This work addresses the distributed estimation problem in a set membership
framework. The agents of a network collect measurements which are affected by
bounded errors, thus implying that the unknown parameters to be estimated
belong to a suitable feasible set. Two distributed algorithms are considered,
based on projections of the estimate of each agent onto its local feasible set.
The main contribution of the paper is to show that such algorithms are
asymptotic interpolatory estimators, i.e. they converge to an element of the
global feasible set, under the assumption that the feasible set associated to
each measurement is convex. The proposed techniques are demonstrated on a
distributed linear regression estimation problem.
On Efficiently Detecting Overlapping Communities over Distributed Dynamic Graphs
Modern networks are both very large and highly dynamic, which challenges
the efficiency of community detection algorithms. In this paper, we study the
problem of overlapping community detection on distributed and dynamic graphs.
Given a distributed, undirected and unweighted graph, the goal is to detect
overlapping communities incrementally as the graph is dynamically changing. We
propose an efficient algorithm, called the randomized Speaker-Listener
Label Propagation Algorithm (rSLPA), based on the Speaker-Listener
Label Propagation Algorithm (SLPA), by relaxing the probability distribution of
label propagation. Besides detecting high-quality communities, rSLPA can
incrementally update the detected communities after a batch of edge insertion
and deletion operations. To the best of our knowledge, rSLPA is the first
algorithm that can incrementally capture the same communities as those obtained
by applying the detection algorithm from scratch on the updated graph.
Extensive experiments are conducted on both synthetic and real-world datasets,
and the results show that our algorithm can achieve high accuracy and
efficiency at the same time. Comment: A short version of this paper will be published as ICDE'2018 poste
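For orientation, the baseline SLPA that rSLPA builds on can be sketched as below. This is the static, single-machine speaker-listener procedure only; the randomized relaxation and the incremental updates of rSLPA are not reproduced. The function name `slpa`, its parameters, and the fallback that keeps a node's most frequent label when no label passes the threshold are assumptions made for illustration.

```python
import random
from collections import Counter


def slpa(adjacency, iterations=20, threshold=0.2, seed=0):
    """Baseline SLPA sketch: returns dict node -> set of community labels.

    adjacency: dict node -> list of neighbors (undirected, unweighted).
    """
    rng = random.Random(seed)
    memory = {v: [v] for v in adjacency}   # each node starts with its own label
    nodes = list(adjacency)
    for _ in range(iterations):
        rng.shuffle(nodes)
        for listener in nodes:
            if not adjacency[listener]:
                continue
            # Each neighbor speaks one label, drawn from its memory with
            # probability proportional to how often it occurs there.
            heard = Counter(rng.choice(memory[s]) for s in adjacency[listener])
            top = max(heard.values())
            winners = sorted(l for l, c in heard.items() if c == top)
            # The listener stores the most popular label it heard.
            memory[listener].append(rng.choice(winners))

    communities = {}
    for v, mem in memory.items():
        counts = Counter(mem)
        keep = {l for l, c in counts.items() if c / len(mem) >= threshold}
        # Assumed fallback: never leave a node without any label.
        communities[v] = keep or {counts.most_common(1)[0][0]}
    return communities


# Two disjoint triangles: labels cannot cross between components.
graph = {
    0: [1, 2], 1: [0, 2], 2: [0, 1],
    3: [4, 5], 4: [3, 5], 5: [3, 4],
}
communities = slpa(graph)
```

A node that keeps more than one label after thresholding belongs to several communities, which is how the method produces overlapping communities.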