58,613 research outputs found
A Selectivity based approach to Continuous Pattern Detection in Streaming Graphs
Cyber security is one of the most significant technical challenges in current
times. Detecting adversarial activities, prevention of theft of intellectual
properties and customer data is a high priority for corporations and government
agencies around the world. Cyber defenders need to analyze massive-scale,
high-resolution network flows to identify, categorize, and mitigate attacks
involving networks spanning institutional and national boundaries. Many of the
cyber attacks can be described as subgraph patterns, with prominent examples
being insider infiltrations (path queries), denial of service (parallel paths)
and malicious spreads (tree queries). This motivates us to explore subgraph
matching on streaming graphs in a continuous setting. The novelty of our work
lies in using the subgraph distributional statistics collected from the
streaming graph to determine the query processing strategy. We introduce a
"Lazy Search" algorithm where the search strategy is decided on a
vertex-to-vertex basis depending on the likelihood of a match in the vertex
neighborhood. We also propose a metric named "Relative Selectivity" that is
used to select between different query processing strategies. Our experiments
performed on real online news, network traffic stream and a synthetic social
network benchmark demonstrate 10-100x speedups over selectivity agnostic
approaches.Comment: in 18th International Conference on Extending Database Technology
(EDBT) (2015
Fast Search for Dynamic Multi-Relational Graphs
Acting on time-critical events by processing ever growing social media or
news streams is a major technical challenge. Many of these data sources can be
modeled as multi-relational graphs. Continuous queries or techniques to search
for rare events that typically arise in monitoring applications have been
studied extensively for relational databases. This work is dedicated to answer
the question that emerges naturally: how can we efficiently execute a
continuous query on a dynamic graph? This paper presents an exact subgraph
search algorithm that exploits the temporal characteristics of representative
queries for online news or social media monitoring. The algorithm is based on a
novel data structure called the Subgraph Join Tree (SJ-Tree) that leverages the
structural and semantic characteristics of the underlying multi-relational
graph. The paper concludes with extensive experimentation on several real-world
datasets that demonstrates the validity of this approach.Comment: SIGMOD Workshop on Dynamic Networks Management and Mining (DyNetMM),
201
On the Complexity of Searching in Trees: Average-case Minimization
We focus on the average-case analysis: A function w : V -> Z+ is given which
defines the likelihood for a node to be the one marked, and we want the
strategy that minimizes the expected number of queries. Prior to this paper,
very little was known about this natural question and the complexity of the
problem had remained so far an open question.
We close this question and prove that the above tree search problem is
NP-complete even for the class of trees with diameter at most 4. This results
in a complete characterization of the complexity of the problem with respect to
the diameter size. In fact, for diameter not larger than 3 the problem can be
shown to be polynomially solvable using a dynamic programming approach.
In addition we prove that the problem is NP-complete even for the class of
trees of maximum degree at most 16. To the best of our knowledge, the only
known result in this direction is that the tree search problem is solvable in
O(|V| log|V|) time for trees with degree at most 2 (paths).
We match the above complexity results with a tight algorithmic analysis. We
first show that a natural greedy algorithm attains a 2-approximation.
Furthermore, for the bounded degree instances, we show that any optimal
strategy (i.e., one that minimizes the expected number of queries) performs at
most O(\Delta(T) (log |V| + log w(T))) queries in the worst case, where w(T) is
the sum of the likelihoods of the nodes of T and \Delta(T) is the maximum
degree of T. We combine this result with a non-trivial exponential time
algorithm to provide an FPTAS for trees with bounded degree
Improved Bounds for 3SUM, -SUM, and Linear Degeneracy
Given a set of real numbers, the 3SUM problem is to decide whether there
are three of them that sum to zero. Until a recent breakthrough by Gr{\o}nlund
and Pettie [FOCS'14], a simple -time deterministic algorithm for
this problem was conjectured to be optimal. Over the years many algorithmic
problems have been shown to be reducible from the 3SUM problem or its variants,
including the more generalized forms of the problem, such as -SUM and
-variate linear degeneracy testing (-LDT). The conjectured hardness of
these problems have become extremely popular for basing conditional lower
bounds for numerous algorithmic problems in P.
In this paper, we show that the randomized -linear decision tree
complexity of 3SUM is , and that the randomized -linear
decision tree complexity of -SUM and -LDT is , for any odd
. These bounds improve (albeit randomized) the corresponding
and decision tree bounds
obtained by Gr{\o}nlund and Pettie. Our technique includes a specialized
randomized variant of fractional cascading data structure. Additionally, we
give another deterministic algorithm for 3SUM that runs in time. The latter bound matches a recent independent bound by Freund
[Algorithmica 2017], but our algorithm is somewhat simpler, due to a better use
of word-RAM model
Towards a Scalable Dynamic Spatial Database System
With the rise of GPS-enabled smartphones and other similar mobile devices,
massive amounts of location data are available. However, no scalable solutions
for soft real-time spatial queries on large sets of moving objects have yet
emerged. In this paper we explore and measure the limits of actual algorithms
and implementations regarding different application scenarios. And finally we
propose a novel distributed architecture to solve the scalability issues.Comment: (2012
Symbolic Partial-Order Execution for Testing Multi-Threaded Programs
We describe a technique for systematic testing of multi-threaded programs. We
combine Quasi-Optimal Partial-Order Reduction, a state-of-the-art technique
that tackles path explosion due to interleaving non-determinism, with symbolic
execution to handle data non-determinism. Our technique iteratively and
exhaustively finds all executions of the program. It represents program
executions using partial orders and finds the next execution using an
underlying unfolding semantics. We avoid the exploration of redundant program
traces using cutoff events. We implemented our technique as an extension of
KLEE and evaluated it on a set of large multi-threaded C programs. Our
experiments found several previously undiscovered bugs and undefined behaviors
in memcached and GNU sort, showing that the new method is capable of finding
bugs in industrial-size benchmarks.Comment: Extended version of a paper presented at CAV'2
- …