Capturing Topology in Graph Pattern Matching
Graph pattern matching is often defined in terms of subgraph isomorphism, an
NP-complete problem. To lower its complexity, various extensions of graph
simulation have been considered instead. These extensions allow pattern
matching to be conducted in cubic-time. However, they fall short of capturing
the topology of data graphs, i.e., graphs may have a structure drastically
different from pattern graphs they match, and the matches found are often too
large to understand and analyze. To rectify these problems, this paper proposes
a notion of strong simulation, a revision of graph simulation, for graph
pattern matching. (1) We identify a set of criteria for preserving the topology
of graphs matched. We show that strong simulation preserves the topology of
data graphs and finds a bounded number of matches. (2) We show that strong
simulation retains the same complexity as earlier extensions of simulation, by
providing a cubic-time algorithm for computing strong simulation. (3) We
present the locality property of strong simulation, which allows us to
effectively conduct pattern matching on distributed graphs. (4) We
experimentally verify the effectiveness and efficiency of these algorithms,
using real-life data and synthetic data.
Comment: VLDB201
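For orientation, the plain graph simulation that the abstract above takes as its starting point can be sketched as a fixpoint refinement. This is a minimal illustration, not the paper's strong simulation algorithm: it omits the additional conditions that strong simulation imposes, and the function name `graph_simulation` and the dictionary-based graph encoding are assumptions made for this example.

```python
def graph_simulation(pattern_edges, pattern_labels, data_edges, data_labels):
    """Return the maximum simulation relation as {pattern_node: set(data_nodes)}.

    Hypothetical helper for illustration only. Graphs are given as a list of
    directed edges plus a {node: label} dictionary.
    """
    # Successor sets for both graphs.
    p_succ = {u: set() for u in pattern_labels}
    for u, u2 in pattern_edges:
        p_succ[u].add(u2)
    d_succ = {v: set() for v in data_labels}
    for v, v2 in data_edges:
        d_succ[v].add(v2)

    # Start from all label-compatible pairs.
    sim = {
        u: {v for v, lab in data_labels.items() if lab == pattern_labels[u]}
        for u in pattern_labels
    }

    # Repeatedly drop pairs (u, v) where some pattern edge u -> u2 has no
    # counterpart v -> v2 with v2 still a candidate for u2.
    changed = True
    while changed:
        changed = False
        for u in pattern_labels:
            for v in list(sim[u]):
                if any(not (d_succ[v] & sim[u2]) for u2 in p_succ[u]):
                    sim[u].discard(v)
                    changed = True

    # The data graph matches only if every pattern node keeps a candidate.
    if any(not candidates for candidates in sim.values()):
        return {u: set() for u in pattern_labels}
    return sim


pattern_labels = {"pa": "A", "pb": "B"}
pattern_edges = [("pa", "pb")]
data_labels = {"a1": "A", "a2": "A", "b1": "B", "b2": "B"}
data_edges = [("a1", "b1")]
result = graph_simulation(pattern_edges, pattern_labels, data_edges, data_labels)
```

In this toy input, `b2` is kept as a match for `pb` even though nothing points to it, because plain simulation only checks outgoing edges. That is one small instance of the loss of topology the abstract describes.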
Search Trees for Distributed Graph Transformation Systems
Graph transformation systems, like PROGRES and Fujaba, can be used for modeling software systems of various domains, and support the automatic generation of executable code.
A graph transformation rule is executed only if the pattern of the transformation's left-hand side is found in the graph.
The search for the pattern has an exponential worst-case complexity.
In many cases, the average complexity can be reduced using search tree algorithms in the code generation phase.
When modeling distributed graph transformations, the communication overhead between the coupled applications largely affects the pattern matching performance.
Therefore, we present an approach for adapting existing search tree algorithms for the efficient search of distributed graph patterns.
Our algorithm divides the distributed graph pattern into several sub-patterns such that every sub-pattern affects solely the graph of exactly one coupled application.
The results of these sub-patterns are used to determine the match for the entire graph pattern.
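The decomposition step described above can be illustrated with a small sketch. It assumes that each pattern node is owned by exactly one coupled application and that edges spanning two applications are set aside to be checked when the partial results are combined; the abstract does not specify how such edges are handled, so that part, along with the name `split_pattern` and the `owner` mapping, is an assumption made for this example.

```python
from collections import defaultdict


def split_pattern(pattern_edges, owner):
    """Split pattern edges into per-application sub-patterns.

    Hypothetical helper: `owner` maps each pattern node to the application
    whose graph contains it. Returns (local, cross), where `local` maps an
    application to the edges lying entirely inside it and `cross` lists the
    edges that connect two different applications.
    """
    local = defaultdict(list)
    cross = []
    for src, dst in pattern_edges:
        if owner[src] == owner[dst]:
            local[owner[src]].append((src, dst))
        else:
            cross.append((src, dst))
    return dict(local), cross


edges = [("a", "b"), ("b", "c"), ("c", "d")]
owner = {"a": "app1", "b": "app1", "c": "app2", "d": "app2"}
local, cross = split_pattern(edges, owner)
```

Each entry of `local` can then be searched inside a single application without any communication, which is where the overhead mentioned in the abstract would otherwise arise.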
Parallelization of Graph Transformation Based on Incremental Pattern Matching
Graph transformation based on incremental pattern matching explicitly stores all occurrences of patterns (left-hand side of rules) and updates this result cache upon model changes. This allows instantaneous pattern queries at the expense of costlier model manipulation and higher memory consumption.
Up to now, this incremental approach has considered only sequential execution despite the inherently distributed structure of the underlying match caching mechanism. The paper explores various possibilities of parallelizing graph transformation to harness the power of modern multi-core, multi-processor computing environments: (i) incremental pattern matching enables the concurrent execution of model manipulation and pattern matching; moreover, (ii) pattern matching itself can be parallelized along caches
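The trade-off described above, instant queries paid for by costlier updates and extra memory, can be seen in a minimal sketch of a match cache. The example assumes a single fixed pattern (a directed path of two edges) and a sequential setting; the class `TwoHopCache` and its methods are hypothetical and are not taken from any tool named in this listing.

```python
from collections import defaultdict


class TwoHopCache:
    """Maintains all matches of the pattern a -> b -> c under edge updates."""

    def __init__(self):
        self.succ = defaultdict(set)   # node -> direct successors
        self.pred = defaultdict(set)   # node -> direct predecessors
        self.matches = set()           # cached (a, b, c) occurrences

    def add_edge(self, x, y):
        self.succ[x].add(y)
        self.pred[y].add(x)
        # The new edge can act as the second hop: p -> x -> y.
        for p in list(self.pred[x]):
            self.matches.add((p, x, y))
        # Or as the first hop: x -> y -> s.
        for s in list(self.succ[y]):
            self.matches.add((x, y, s))

    def remove_edge(self, x, y):
        self.succ[x].discard(y)
        self.pred[y].discard(x)
        # Drop every cached match that used the removed edge as either hop.
        self.matches = {
            m for m in self.matches
            if (m[0], m[1]) != (x, y) and (m[1], m[2]) != (x, y)
        }

    def query(self):
        # Answering a query is only a cache read; no search is performed.
        return set(self.matches)


cache = TwoHopCache()
cache.add_edge(1, 2)
cache.add_edge(2, 3)
after_two_edges = cache.query()
cache.add_edge(3, 1)
cache.remove_edge(2, 3)
after_removal = cache.query()
```

All matching work happens inside `add_edge` and `remove_edge`, while `query` stays trivial. Running updates and queries concurrently, as the abstract proposes, would additionally require synchronization that this sketch leaves out.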
Formalizing Gremlin pattern matching traversals in an integrated graph Algebra
Graph data management (also called NoSQL) has revealed beneficial characteristics in terms of flexibility and scalability by differently balancing between query expressivity and schema flexibility. This peculiar advantage has resulted in an unforeseen race of developing new task-specific graph systems, query languages and data models, such as property graphs, key-value, wide column, resource description framework (RDF), etc. Present-day graph query languages focus on flexible graph pattern matching (aka sub-graph matching), whereas graph computing frameworks aim to provide fast parallel (distributed) execution of instructions. This rapid growth in the variety of graph-based data management systems has resulted in a lack of standardization. Gremlin, a graph traversal language and machine, provides a common platform for supporting any graph computing system (such as an OLTP graph database or OLAP graph processors). In this extended report, we present a formalization of graph pattern matching for Gremlin queries. We also study, discuss and consolidate various existing graph algebra operators into an integrated graph algebra.
Graph pattern matching on social network analysis
Graph pattern matching is fundamental to social network analysis. Its effectiveness
for identifying social communities and social positions, making recommendations and
so on has been repeatedly demonstrated. However, social network analysis raises
new challenges to graph pattern matching. As real-life social graphs are typically
large, it is often prohibitively expensive to conduct graph pattern matching over such
large graphs, e.g., NP-complete for subgraph isomorphism, cubic time for bounded
simulation, and quadratic time for simulation. These hinder the applicability of graph
pattern matching on social network analysis. In response to these challenges, the thesis
presents a series of effective techniques for querying large, dynamic, and distributively
stored social networks.
First of all, we propose a notion of query preserving graph compression, to compress
large social graphs relative to a class Q of queries. We then develop both batch
and incremental compression strategies for two commonly used pattern queries. Via
both theoretical analysis and experimental studies, we show that (1) using compressed
graphs Gr benefits graph pattern matching dramatically; and (2) the computation of Gr
as well as its maintenance can be processed efficiently.
Secondly, we investigate the distributed graph pattern matching problem, and explore
parallel computation for graph pattern matching. We show that our techniques
possess the following performance guarantees: (1) each site is visited only once; (2) the total
network traffic is independent of the size of G; and (3) the response time is decided
by the size of the largest fragment of G rather than the size of the entire G. Furthermore, we
show how these distributed algorithms can be implemented in the MapReduce framework.
Thirdly, we study the problem of answering graph pattern matching using views,
since view-based techniques have proven effective for speeding up query
evaluation. We propose a notion of pattern containment to characterise graph pattern
matching using views, and introduce efficient algorithms to answer graph pattern
matching using views. Moreover, we identify three problems related to graph pattern
containment, and provide efficient algorithms for containment checking (approximation
when the problem is intractable).
Fourthly, we revise graph pattern matching by supporting a designated output node,
which we treat as “query focus”. We then introduce algorithms for computing the top-k
relevant matches w.r.t. the output node for both acyclic and cyclic pattern graphs, respectively,
with early termination property. Furthermore, we investigate the diversified
top-k matching problem, and develop an approximation algorithm with performance
guarantee and a heuristic algorithm with early termination property.
Finally, we introduce an expert search system, called ExpFinder, for large and dynamic
social networks. ExpFinder identifies top-k experts in social networks by graph
pattern matching, and copes with the sheer size of real-life social networks by integrating
incremental graph pattern matching, query preserving compression and top-k
matching computation. In particular, we also introduce bounded (resp. unbounded)
incremental algorithms to maintain the weighted landmark vectors which are used for
incremental maintenance for cached results
Distributed Runtime Verification of Cyber-Physical Systems Based on Graph Pattern Matching
Cyber-physical systems process a huge amount of data coming from sensors and other information sources, and they often have to provide real-time feedback and reaction. Cyber-physical systems are often critical, which means that their failure can lead to serious injuries or even loss of human lives. Ensuring correctness is an important issue; however, traditional design-time verification approaches cannot be applied due to the complex interaction with the changing environment, the distributed behavior and the intelligent/autonomous solutions. In this paper we present a framework for distributed runtime verification of cyber-physical systems, including a solution for executing queries on a distributed model stored on multiple nodes.
Don't Repeat Yourself: Seamless Execution and Analysis of Extensive Network Experiments
This paper presents MACI, the first bespoke framework for the management, the
scalable execution, and the interactive analysis of a large number of network
experiments. Driven by the desire to avoid repetitive implementation of just a
few scripts for the execution and analysis of experiments, MACI emerged as a
generic framework for network experiments that significantly increases
efficiency and ensures reproducibility. To this end, MACI incorporates and
integrates established simulators and analysis tools to foster rapid but
systematic network experiments.
We found MACI indispensable in all phases of the research and development
process of various communication systems, such as i) an extensive DASH video
streaming study, ii) the systematic development and improvement of Multipath
TCP schedulers, and iii) research on a distributed topology graph pattern
matching algorithm. With this work, we make MACI publicly available to the
research community to advance efficient and reproducible network experiments
- …