1,227 research outputs found
A Survey on Graph Database Management Techniques for Huge Unstructured Data
Over the last decade, data analysis, data management, and big data have played a major role from both social and business perspectives. Graph databases are currently a heavily researched topic. A graph database is preferred for handling the dynamic and complex relationships in connected data, and it can offer better results. Every data element is represented as a node. For example, on a social media site, a person is represented as a node with properties such as name, age, likes, and dislikes, and nodes are connected to one another by relationships represented as edges. Graph databases are expected to benefit businesses and social networking sites that generate huge volumes of unstructured data, since such Big Data requires proper and efficient computational techniques to handle. This paper reviews existing graph data computational techniques and related research in order to outline future research directions in graph database management.
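The node, property, and edge model described in this abstract can be sketched in a few lines of Python. This is only an illustrative in-memory structure, not the API of any graph database product; the class names `Node` and `PropertyGraph`, the relationship names, and the sample people are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A data element with a label and free-form properties (hypothetical type)."""
    node_id: str
    label: str
    properties: dict = field(default_factory=dict)


class PropertyGraph:
    """Minimal in-memory property graph: nodes plus labeled, directed edges."""

    def __init__(self):
        self.nodes = {}   # node_id -> Node
        self.edges = []   # (source_id, relationship, target_id), in insertion order

    def add_node(self, node_id, label, **properties):
        self.nodes[node_id] = Node(node_id, label, properties)

    def add_edge(self, source_id, relationship, target_id):
        if source_id not in self.nodes or target_id not in self.nodes:
            raise KeyError("both endpoints must exist before adding an edge")
        self.edges.append((source_id, relationship, target_id))

    def neighbors(self, node_id, relationship=None):
        """Targets reachable from node_id, optionally filtered by relationship."""
        return [t for s, r, t in self.edges
                if s == node_id and (relationship is None or r == relationship)]


# Sample data (invented for illustration): three people on a social site.
graph = PropertyGraph()
graph.add_node("p1", "Person", name="Asha", age=31, likes=["chess"])
graph.add_node("p2", "Person", name="Ben", age=28, likes=["hiking"])
graph.add_node("p3", "Person", name="Chen", age=40, likes=["jazz"])
graph.add_edge("p1", "FRIEND_OF", "p2")
graph.add_edge("p1", "FOLLOWS", "p3")

friends_of_p1 = graph.neighbors("p1", "FRIEND_OF")
```

A production graph database would index edges by endpoint rather than scan a list; the linear scan here only keeps the sketch short.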
Learning to Count Isomorphisms with Graph Neural Networks
Subgraph isomorphism counting is an important problem on graphs, as many
graph-based tasks exploit recurring subgraph patterns. Classical methods
usually boil down to a backtracking framework that needs to navigate a huge
search space with prohibitive computational costs. Some recent studies resort
to graph neural networks (GNNs) to learn a low-dimensional representation for
both the query and input graphs, in order to predict the number of subgraph
isomorphisms on the input graph. However, typical GNNs employ a node-centric
message passing scheme that receives and aggregates messages on nodes, which is
inadequate in complex structure matching for isomorphism counting. Moreover, on
an input graph, the space of possible query graphs is enormous, and different
parts of the input graph will be triggered to match different queries. Thus,
expecting a fixed representation of the input graph to match diversely
structured query graphs is unrealistic. In this paper, we propose a novel GNN
called Count-GNN for subgraph isomorphism counting, to deal with the above
challenges. At the edge level, given that an edge is an atomic unit of encoding
graph structures, we propose an edge-centric message passing scheme, where
messages on edges are propagated and aggregated based on the edge adjacency to
preserve fine-grained structural information. At the graph level, we modulate
the input graph representation conditioned on the query, so that the input
graph can be adapted to each query individually to improve their matching.
Finally, we conduct extensive experiments on a number of benchmark datasets to
demonstrate the superior performance of Count-GNN. Comment: AAAI-23 main trac
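For context, the classical backtracking framework that this abstract contrasts with GNN-based counting can be sketched as follows. This is a generic textbook-style baseline, not Count-GNN and not any specific published algorithm. It assumes undirected, unlabeled graphs given as adjacency-set dictionaries, and it counts injective, edge-preserving vertex mappings, so symmetric embeddings of the same subgraph are counted separately. The function name is hypothetical.

```python
def count_subgraph_isomorphisms(query_adj, data_adj):
    """Count injective mappings from query vertices to data vertices
    such that every query edge maps onto a data edge."""
    query_vertices = list(query_adj)
    mapping = {}   # query vertex -> data vertex
    used = set()   # data vertices already taken

    def extend(index):
        if index == len(query_vertices):
            return 1  # every query vertex is mapped consistently
        q = query_vertices[index]
        total = 0
        for d in data_adj:
            if d in used:
                continue
            # Each already-mapped neighbor of q must land on a neighbor of d.
            if all(mapping[n] in data_adj[d] for n in query_adj[q] if n in mapping):
                mapping[q] = d
                used.add(d)
                total += extend(index + 1)
                del mapping[q]      # backtrack
                used.discard(d)
        return total

    return extend(0)


# Query: a triangle. Data: the complete graph on four vertices.
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
k4 = {v: {u for u in range(4) if u != v} for v in range(4)}

triangle_count = count_subgraph_isomorphisms(triangle, k4)  # 4 * 3 * 2 = 24 mappings
```

The loop tries every unused data vertex at every level, which is the huge search space the abstract refers to; practical matchers add candidate filtering and vertex ordering on top of this skeleton.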
The Vadalog System: Datalog-based Reasoning for Knowledge Graphs
Over the past years, there has been a resurgence of Datalog-based systems in
the database community as well as in industry. In this context, it has been
recognized that to handle the complex knowledge-based scenarios encountered
today, such as reasoning over large knowledge graphs, Datalog has to be
extended with features such as existential quantification. Yet, Datalog-based
reasoning in the presence of existential quantification is in general
undecidable. Many efforts have been made to define decidable fragments. Warded
Datalog+/- is a very promising one, as it captures PTIME complexity while
allowing ontological reasoning. Yet so far, no implementation of Warded
Datalog+/- was available. In this paper we present the Vadalog system, a
Datalog-based system for performing complex logic reasoning tasks, such as
those required in advanced knowledge graphs. The Vadalog system is Oxford's
contribution to the VADA research programme, a joint effort of the universities
of Oxford, Manchester and Edinburgh and around 20 industrial partners. As the
main contribution of this paper, we illustrate the first implementation of
Warded Datalog+/-, a high-performance Datalog+/- system utilizing an aggressive
termination control strategy. We also provide a comprehensive experimental
evaluation. Comment: Extended version of VLDB paper
<https://doi.org/10.14778/3213880.3213888>
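As background for the abstract above, plain Datalog (without existential quantification) can be evaluated by iterating rules to a fixpoint. The sketch below is a naive evaluation of the two standard reachability rules, `reachable(X, Y) :- edge(X, Y)` and `reachable(X, Z) :- reachable(X, Y), edge(Y, Z)`. It is not the Vadalog system and does not implement Warded Datalog+/-; the function name and sample facts are assumptions for illustration.

```python
def transitive_closure(edges):
    """Naive bottom-up evaluation of:
         reachable(X, Y) :- edge(X, Y).
         reachable(X, Z) :- reachable(X, Y), edge(Y, Z).
    """
    reachable = set(edges)  # first rule: every edge is a reachable pair
    while True:
        # Second rule: join reachable(X, Y) with edge(Y, Z).
        derived = {(x, z)
                   for (x, y) in reachable
                   for (y2, z) in edges
                   if y == y2}
        new_facts = derived - reachable
        if not new_facts:
            return reachable  # fixpoint: no rule produces anything new
        reachable |= new_facts


# Sample facts (invented): a chain a -> b -> c -> d.
facts = {("a", "b"), ("b", "c"), ("c", "d")}
closure = transitive_closure(facts)
```

This loop always terminates because the rules only recombine constants already present in the facts, so the set of derivable pairs is finite. Rules with existential quantification can introduce fresh values at each step, which is why such extensions need restrictions or explicit termination control.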
cuTS: Scaling Subgraph Isomorphism on Distributed Multi-GPU Systems Using Trie Based Data Structure
Subgraph isomorphism is a pattern-matching task widely used in many domains such as cheminformatics, bioinformatics, databases, and social network analysis. It is computationally expensive and is a proven NP-hard problem. The massive parallelism in GPUs is well suited for solving subgraph isomorphism. However, current GPU implementations are far from the achievable performance. Moreover, the enormous memory requirement of current approaches limits the problem size that can be handled. This work analyzes the fundamental challenges associated with processing subgraph isomorphism on GPUs and develops an efficient GPU implementation. We also develop a GPU-friendly trie-based data structure to drastically reduce the intermediate storage space requirement, enabling large benchmarks to be processed. We also develop the first distributed subgraph isomorphism algorithm for GPUs. Our experimental evaluation demonstrates the efficacy of our approach by comparing the execution time and number of cases that can be handled against the state-of-the-art GPU implementations.
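The storage saving that a trie can give for intermediate results may be easier to see with a small sketch. The code below stores partial matches as parent-pointer entries in two flat arrays, so matches that share a prefix store that prefix once. This is a general illustration of the idea on the CPU, assuming a hypothetical `MatchTrie` class; it is not the data layout or GPU implementation used by cuTS.

```python
class MatchTrie:
    """Partial matches stored as parent pointers in flat arrays (illustrative only)."""

    def __init__(self):
        self.parent = []   # index of the parent entry, or -1 at the first level
        self.vertex = []   # data vertex matched at this entry

    def add(self, parent_index, data_vertex):
        """Append one entry that extends the partial match ending at parent_index."""
        self.parent.append(parent_index)
        self.vertex.append(data_vertex)
        return len(self.vertex) - 1

    def path(self, index):
        """Recover the full partial match by walking parent pointers back to the root."""
        result = []
        while index != -1:
            result.append(self.vertex[index])
            index = self.parent[index]
        result.reverse()
        return result


# Three partial matches of length 3 that share prefixes (vertex ids are invented).
trie = MatchTrie()
root = trie.add(-1, 7)
a = trie.add(root, 3)
b = trie.add(root, 5)
leaf1 = trie.add(a, 9)   # match 7, 3, 9
leaf2 = trie.add(a, 4)   # match 7, 3, 4
leaf3 = trie.add(b, 9)   # match 7, 5, 9

trie_entries = len(trie.vertex)   # one entry per trie node
flat_entries = 3 * 3              # a flat table would store 3 matches x 3 vertices
```

Here the trie holds 6 entries where a flat table of full matches would hold 9, and the gap widens as more matches share long prefixes.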
Scaling Subgraph Matching by Improving Ullmann Algorithm
Graphs are widely used to model complicated data semantics in many application domains. Subgraph isomorphism checking (an NP-complete problem) is a routine operation on this kind of data. In this paper, we propose an improvement of the Ullmann algorithm, a well-known subgraph isomorphism checker. Our new algorithm is called Ullmann-ONL. It uses a new search ordering and L levels of vertex neighborhoods (NL) to confine the search space of the Ullmann algorithm. Our performance study shows that Ullmann-ONL outperforms previously proposed algorithms by a wide margin.
- …