Search CORE

30 research outputs found

Experimental Evaluation of Subgraph Isomorphism Solvers

Author: C McCreesh
C Mccreesh
C Solnon
C Solnon
D Conte
G Audemard
G Damiand
J Larrosa
JR Ullmann
L Kotthoff
LP Cordella
M Santo De
P Erdős
R Hoffmann
S Zampelli
S Zampelli
V Bonnici
V Carletti
Publication venue: 'Springer Fachmedien Wiesbaden GmbH'
Publication date: 19/06/2019
Field of study

International audienceSubgraph Isomorphism (SI) is an NP-complete problem which is at the heart of many structural pattern recognition tasks as it involves finding a copy of a pattern graph into a target graph. In the pattern recognition community, the most well-known SI solvers are VF2, VF3, and RI. SI is also widely studied in the constraint programming community, and many constraint-based SI solvers have been proposed since Ullman, such as LAD and Glasgow, for example. All these SI solvers can solve very quickly some large SI instances, that involve graphs with thousands of nodes. However, McCreesh et al. have recently shown how to randomly generate SI instances the hardness of which can be controlled and predicted, and they have built small instances which are computationally challenging for all solvers. They have also shown that some small instances, which are predicted to be easy and are easily solved by constraint-based solvers, appear to be challenging for VF2 and VF3. In this paper, we widen this study by considering a large test suite coming from eight benchmarks. We show that, as expected for an NP-complete problem, the solving time of an instance does not depend on its size, and that some small instances coming from real applications are not solved by any of the considered solvers. We also show that, if RI and VF3 can solve very quickly a large number of easy instances, for which Glasgow or LAD need more time, they fail at solving some other instances that are quickly solved by Glasgow or LAD, and they are clearly outperformed by Glasgow on hard instances. Finally, we show that we can easily combine solvers to take benefit of their complementarity

Crossref

HAL

Hal-Diderot

Simple Pattern-only Heuristics Lead To Fast Subgraph Matching Strategies on Very Large Networks

Author: Alfredo Ferro
Alfredo Pulvirenti
Aparo Antonino
Bonnici Vincenzo
Dennis Shasha
Giovanni Micale
Giugno Rosalba
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2019
Field of study

A wide range of biomedical applications entails solving the subgraph isomorphism problem, i.e. nding all the possible subgraphs of a target graph that are structurally equivalent to an input pattern graph. Targets may be very large and complex structures compared to patterns. Methods that address this NP-complete problem use heuristics. Their performance in both time and quality depends on a few subtleties of those heuristics. This paper compares the performance of state-of-theart algorithms for subgraph isomorphism on small, medium and very large graphs. Results show that heuristics based on pattern graphs alone prove to be the most ecient, an unexpected result

Archivio istituzionale della Ricerca - Università degli Studi di Parma

Catalogo dei prodotti della ricerca

AEDNet: Adaptive Edge-Deleting Network For Subgraph Matching

Author: Lan Zixun
Ma Fei
Ma Ye
Yu Limin
Yuan LingLong
Publication venue: 'Elsevier BV'
Publication date: 08/11/2022
Field of study

Subgraph matching is to find all subgraphs in a data graph that are isomorphic to an existing query graph. Subgraph matching is an NP-hard problem, yet has found its applications in many areas. Many learning-based methods have been proposed for graph matching, whereas few have been designed for subgraph matching. The subgraph matching problem is generally more challenging, mainly due to the different sizes between the two graphs, resulting in considerable large space of solutions. Also the extra edges existing in the data graph connecting to the matched nodes may lead to two matched nodes of two graphs having different adjacency structures and often being identified as distinct objects. Due to the extra edges, the existing learning based methods often fail to generate sufficiently similar node-level embeddings for matched nodes. This study proposes a novel Adaptive Edge-Deleting Network (AEDNet) for subgraph matching. The proposed method is trained in an end-to-end fashion. In AEDNet, a novel sample-wise adaptive edge-deleting mechanism removes extra edges to ensure consistency of adjacency structure of matched nodes, while a unidirectional cross-propagation mechanism ensures consistency of features of matched nodes. We applied the proposed method on six datasets with graph sizes varying from 20 to 2300. Our evaluations on six open datasets demonstrate that the proposed AEDNet outperforms six state-of-the-arts and is much faster than the exact methods on large graphs

arXiv.org e-Print Archive

GraphMineSuite: Enabling High-Performance and Programmable Graph Mining Algorithms with Set Algebra

Author: Balla Adrian
Beranek Jakub
Besta Maciej
Copik Marcin
Gianinazzi Lukas
Hoefler Torsten
Holenstein Tobias
Janda Kacper
Kalvoda Pavel
Konieczny Marek
Kwasniewski Grzegorz
Leisinger Sebastian
Lindenberger Philipp
Mutlu Onur
Ozdemir Esref
Schaffner Yannick
Schwarz Leonardo
Tatkowski Peter
Vonarburg-Shmaria Zur
Publication venue
Publication date: 05/03/2021
Field of study

We propose GraphMineSuite (GMS): the first benchmarking suite for graph mining that facilitates evaluating and constructing high-performance graph mining algorithms. First, GMS comes with a benchmark specification based on extensive literature review, prescribing representative problems, algorithms, and datasets. Second, GMS offers a carefully designed software platform for seamless testing of different fine-grained elements of graph mining algorithms, such as graph representations or algorithm subroutines. The platform includes parallel implementations of more than 40 considered baselines, and it facilitates developing complex and fast mining algorithms. High modularity is possible by harnessing set algebra operations such as set intersection and difference, which enables breaking complex graph mining algorithms into simple building blocks that can be separately experimented with. GMS is supported with a broad concurrency analysis for portability in performance insights, and a novel performance metric to assess the throughput of graph mining algorithms, enabling more insightful evaluation. As use cases, we harness GMS to rapidly redesign and accelerate state-of-the-art baselines of core graph mining problems: degeneracy reordering (by up to >2x), maximal clique listing (by up to >9x), k-clique listing (by 1.1x), and subgraph isomorphism (by up to 2.5x), also obtaining better theoretical performance bounds

arXiv.org e-Print Archive

Repository for Publications and Research Data

GSI: GPU-friendly Subgraph Isomorphism

Author: Hu Lin
Zeng Li
Zhang Fan
Zou Lei
Özsu M. Tamer
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 14/09/2019
Field of study

Subgraph isomorphism is a well-known NP-hard problem that is widely used in many applications, such as social network analysis and query over the knowledge graph. Due to the inherent hardness, its performance is often a bottleneck in various real-world applications. Therefore, we address this by designing an efficient subgraph isomorphism algorithm leveraging features of GPU architecture, such as massive parallelism and memory hierarchy. Existing GPU-based solutions adopt a two-step output scheme, performing the same join process twice in order to write intermediate results concurrently. They also lack GPU architecture-aware optimizations that allow scaling to large graphs. In this paper, we propose a GPU-friendly subgraph isomorphism algorithm, GSI. Different from existing edge join-based GPU solutions, we propose a Prealloc-Combine strategy based on the vertex-oriented framework, which avoids joining-twice in existing solutions. Also, a GPU-friendly data structure (called PCSR) is proposed to represent an edge-labeled graph. Extensive experiments on both synthetic and real graphs show that GSI outperforms the state-of-the-art algorithms by up to several orders of magnitude and has good scalability with graph size scaling to hundreds of millions of edges.Comment: 15 pages, 17 figures, conferenc

arXiv.org e-Print Archive

Crossref

Partitioning algorithms for induced subgraph problems

Author: Trimble James
Publication venue
Publication date: 01/01/2023
Field of study

This dissertation introduces the MCSPLIT family of algorithms for two closely-related NP-hard problems that involve finding a large induced subgraph contained by each of two input graphs: the induced subgraph isomorphism problem and the maximum common induced subgraph problem. The MCSPLIT algorithms resemble forward-checking constrant programming algorithms, but use problem-specific data structures that allow multiple, identical domains to be stored without duplication. These data structures enable fast, simple constraint propagation algorithms and very fast calculation of upper bounds. Versions of these algorithms for both sparse and dense graphs are described and implemented. The resulting algorithms are over an order of magnitude faster than the best existing algorithm for maximum common induced subgraph on unlabelled graphs, and outperform the state of the art on several classes of induced subgraph isomorphism instances. A further advantage of the MCSPLIT data structures is that variables and values are treated identically; this allows us to choose to branch on variables representing vertices of either input graph with no overhead. An extensive set of experiments shows that such two-sided branching can be particularly beneficial if the two input graphs have very different orders or densities. Finally, we turn from subgraphs to supergraphs, tackling the problem of finding a small graph that contains every member of a given family of graphs as an induced subgraph. Exact and heuristic techniques are developed for this problem, in each case using a MCSPLIT algorithm as a subroutine. These algorithms allow us to add new terms to two entries of the On-Line Encyclopedia of Integer Sequences

Glasgow Theses Service

Learning-Based Approaches for Graph Problems: A Survey

Author: Luo Siqiang
Yow Kai Siong
Publication venue
Publication date: 17/04/2022
Field of study

Over the years, many graph problems specifically those in NP-complete are studied by a wide range of researchers. Some famous examples include graph colouring, travelling salesman problem and subgraph isomorphism. Most of these problems are typically addressed by exact algorithms, approximate algorithms and heuristics. There are however some drawback for each of these methods. Recent studies have employed learning-based frameworks such as machine learning techniques in solving these problems, given that they are useful in discovering new patterns in structured data that can be represented using graphs. This research direction has successfully attracted a considerable amount of attention. In this survey, we provide a systematic review mainly on classic graph problems in which learning-based approaches have been proposed in addressing the problems. We discuss the overview of each framework, and provide analyses based on the design and performance of the framework. Some potential research questions are also suggested. Ultimately, this survey gives a clearer insight and can be used as a stepping stone to the research community in studying problems in this field.Comment: v1: 41 pages; v2: 40 page

arXiv.org e-Print Archive

Matched Filters for Noisy Induced Subgraph Detection

Author: Lyzinski Vince
Park Youngser
Priebe Carey E.
Sussman Daniel L.
Publication venue
Publication date: 03/06/2018
Field of study

The problem of finding the vertex correspondence between two noisy graphs with different number of vertices where the smaller graph is still large has many applications in social networks, neuroscience, and computer vision. We propose a solution to this problem via a graph matching matched filter: centering and padding the smaller adjacency matrix and applying graph matching methods to align it to the larger network. The centering and padding schemes can be incorporated into any algorithm that matches using adjacency matrices. Under a statistical model for correlated pairs of graphs, which yields a noisy copy of the small graph within the larger graph, the resulting optimization problem can be guaranteed to recover the true vertex correspondence between the networks. However, there are currently no efficient algorithms for solving this problem. To illustrate the possibilities and challenges of such problems, we use an algorithm that can exploit a partially known correspondence and show via varied simulations and applications to {\it Drosophila} and human connectomes that this approach can achieve good performance.Comment: 41 pages, 7 figure

arXiv.org e-Print Archive

Boston University Institutional Repository (OpenBU)