1,782 research outputs found
A Selectivity based approach to Continuous Pattern Detection in Streaming Graphs
Cyber security is one of the most significant technical challenges in current
times. Detecting adversarial activities, prevention of theft of intellectual
properties and customer data is a high priority for corporations and government
agencies around the world. Cyber defenders need to analyze massive-scale,
high-resolution network flows to identify, categorize, and mitigate attacks
involving networks spanning institutional and national boundaries. Many of the
cyber attacks can be described as subgraph patterns, with prominent examples
being insider infiltrations (path queries), denial of service (parallel paths)
and malicious spreads (tree queries). This motivates us to explore subgraph
matching on streaming graphs in a continuous setting. The novelty of our work
lies in using the subgraph distributional statistics collected from the
streaming graph to determine the query processing strategy. We introduce a
"Lazy Search" algorithm where the search strategy is decided on a
vertex-to-vertex basis depending on the likelihood of a match in the vertex
neighborhood. We also propose a metric named "Relative Selectivity" that is
used to select between different query processing strategies. Our experiments
performed on real online news, network traffic stream and a synthetic social
network benchmark demonstrate 10-100x speedups over selectivity agnostic
approaches.Comment: in 18th International Conference on Extending Database Technology
(EDBT) (2015
NOUS: Construction and Querying of Dynamic Knowledge Graphs
The ability to construct domain specific knowledge graphs (KG) and perform
question-answering or hypothesis generation is a transformative capability.
Despite their value, automated construction of knowledge graphs remains an
expensive technical challenge that is beyond the reach for most enterprises and
academic institutions. We propose an end-to-end framework for developing custom
knowledge graph driven analytics for arbitrary application domains. The
uniqueness of our system lies A) in its combination of curated KGs along with
knowledge extracted from unstructured text, B) support for advanced trending
and explanatory questions on a dynamic KG, and C) the ability to answer queries
where the answer is embedded across multiple data sources.Comment: Codebase: https://github.com/streaming-graphs/NOU
Loom: Query-aware Partitioning of Online Graphs
As with general graph processing systems, partitioning data over a cluster of
machines improves the scalability of graph database management systems.
However, these systems will incur additional network cost during the execution
of a query workload, due to inter-partition traversals. Workload-agnostic
partitioning algorithms typically minimise the likelihood of any edge crossing
partition boundaries. However, these partitioners are sub-optimal with respect
to many workloads, especially queries, which may require more frequent
traversal of specific subsets of inter-partition edges. Furthermore, they
largely unsuited to operating incrementally on dynamic, growing graphs.
We present a new graph partitioning algorithm, Loom, that operates on a
stream of graph updates and continuously allocates the new vertices and edges
to partitions, taking into account a query workload of graph pattern
expressions along with their relative frequencies.
First we capture the most common patterns of edge traversals which occur when
executing queries. We then compare sub-graphs, which present themselves
incrementally in the graph update stream, against these common patterns.
Finally we attempt to allocate each match to single partitions, reducing the
number of inter-partition edges within frequently traversed sub-graphs and
improving average query performance.
Loom is extensively evaluated over several large test graphs with realistic
query workloads and various orderings of the graph updates. We demonstrate
that, given a workload, our prototype produces partitionings of significantly
better quality than existing streaming graph partitioning algorithms Fennel and
LDG
- …