Maximum common subgraph isomorphism algorithms for the matching of chemical structures
The maximum common subgraph (MCS) problem has become increasingly important in those aspects of chemoinformatics that involve the matching of 2D or 3D chemical structures. This paper provides a classification and a review of the many MCS algorithms, both exact and approximate, that have been described in the literature, and makes recommendations regarding their applicability to typical chemoinformatics tasks.
On Spectral Graph Embedding: A Non-Backtracking Perspective and Graph Approximation
Graph embedding has been proven to be efficient and effective in facilitating
graph analysis. In this paper, we present a novel spectral framework called
NOn-Backtracking Embedding (NOBE), which offers a new perspective that
organizes graph data at a deep level by tracking the flow traversing on the
edges with backtracking prohibited. Further, by analyzing the non-backtracking
process, a technique called graph approximation is devised, which provides a
channel to transform the spectral decomposition on an edge-to-edge matrix to
that on a node-to-node matrix. Theoretical guarantees are provided by bounding
the difference between the corresponding eigenvalues of the original graph and
its graph approximation. Extensive experiments conducted on various real-world
networks demonstrate the efficacy of our methods on both macroscopic and
microscopic levels, including clustering and structural hole spanner detection.
Comment: SDM 2018 (Full version including all proofs)
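As a small illustration of the edge-to-edge matrix mentioned in this abstract, the sketch below builds the standard non-backtracking (Hashimoto) matrix of an undirected graph. It is not the NOBE method or its graph approximation; the function name and the edge-list input format are assumptions made for the example.

```python
def non_backtracking_matrix(edges):
    """Return (directed_edges, B) for an undirected simple graph.

    B[i][j] = 1 when directed edge i = (u -> v) can be followed by
    directed edge j = (v -> w) with w != u, i.e. without backtracking.
    """
    # Each undirected edge {u, v} yields two directed edges.
    directed = []
    for u, v in edges:
        directed.append((u, v))
        directed.append((v, u))
    index = {e: i for i, e in enumerate(directed)}
    size = len(directed)
    B = [[0] * size for _ in range(size)]
    for (u, v) in directed:
        for (x, w) in directed:
            # The walk continues from v, but may not return straight to u.
            if x == v and w != u:
                B[index[(u, v)]][index[(x, w)]] = 1
    return directed, B


if __name__ == "__main__":
    directed, B = non_backtracking_matrix([(0, 1), (1, 2), (2, 0)])
    for e, row in zip(directed, B):
        print(e, row)
```

The matrix has one row per directed edge, so it is twice as large as the edge count in each dimension. That size is what makes a node-to-node reformulation, as the abstract describes, attractive.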
Fast and simple connectivity in graph timelines
In this paper we study the problem of answering connectivity queries about a
graph timeline. A graph timeline is a sequence of undirected graphs on a common
set of vertices such that each graph is obtained from the previous one by an
addition or a deletion of a single edge. We present data structures, which
preprocess the timeline and can answer the following queries:
- forall -- does a path between the two query vertices exist in each of the queried graphs?
- exists -- does a path between the two query vertices exist in any of the queried graphs?
- forall2 -- do there exist two edge-disjoint paths connecting the two query vertices in each of the queried graphs?
We show data structures that answer forall and forall2 queries after a preprocessing phase, with time bounds that involve the number of edges that remain unchanged in each graph of the timeline. For the
case of exists queries, we show how to extend an existing data structure to
obtain a preprocessing/query trade-off, and show a matching conditional lower bound.
Comment: 21 pages, extended abstract to appear in WADS'1
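To make the forall and exists queries concrete, here is a naive baseline that materialises every graph of the timeline and runs a breadth-first search per graph. It is not the paper's data structure and has none of its time bounds. The argument form (u, v, a, b), the "add"/"del" update encoding, and all function names are assumptions made for the example.

```python
from collections import deque


def build_timeline(initial_edges, updates):
    """Materialise every graph; each update adds or deletes one edge."""
    current = {frozenset(e) for e in initial_edges}
    graphs = [set(current)]
    for op, u, v in updates:
        edge = frozenset((u, v))
        if op == "add":
            current.add(edge)
        else:  # "del"
            current.discard(edge)
        graphs.append(set(current))
    return graphs


def connected(edge_set, u, v):
    """Breadth-first search in one undirected graph (no self-loops assumed)."""
    adj = {}
    for e in edge_set:
        x, y = tuple(e)
        adj.setdefault(x, []).append(y)
        adj.setdefault(y, []).append(x)
    seen, queue = {u}, deque([u])
    while queue:
        x = queue.popleft()
        if x == v:
            return True
        for y in adj.get(x, []):
            if y not in seen:
                seen.add(y)
                queue.append(y)
    return False


def forall(graphs, u, v, a, b):
    """Is u connected to v in every graph with index a..b (inclusive)?"""
    return all(connected(g, u, v) for g in graphs[a:b + 1])


def exists(graphs, u, v, a, b):
    """Is u connected to v in at least one graph with index a..b (inclusive)?"""
    return any(connected(g, u, v) for g in graphs[a:b + 1])
```

Each query here costs one full graph search per graph in the range, which is the cost that preprocessing-based data structures aim to avoid.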
Context-Free Path Querying by Matrix Multiplication
Graph data models are widely used in many areas, for example in bioinformatics
and graph databases. In these areas, it is often required to process queries for
large graphs. Some of the most common graph queries are navigational queries.
The result of query evaluation is a set of implicit relations between nodes of
the graph, i.e. paths in the graph. A natural way to specify these relations is
by specifying paths using formal grammars over the alphabet of edge labels. An
answer to a context-free path query in this approach is usually a set of
triples (A, m, n) such that there is a path from the node m to the node n,
whose labeling is derived from a non-terminal A of the given context-free
grammar. This type of query is evaluated using the relational query
semantics. Another example of path query semantics is the single-path query
semantics which requires presenting a single path from the node m to the node
n, whose labeling is derived from a non-terminal A for all triples (A, m, n)
evaluated using the relational query semantics. There are a number of algorithms
for query evaluation which use these semantics, but all of them perform poorly
on large graphs. One of the most common techniques for efficient big data
processing is the use of a graphics processing unit (GPU) to perform
computations, but these algorithms do not allow this technique to be used
efficiently. In this paper, we show how the context-free path query evaluation
using these query semantics can be reduced to the calculation of the matrix
transitive closure. Also, we propose an algorithm for context-free path query
evaluation which uses relational query semantics and is based on matrix
operations that make it possible to speed up computations by using a GPU.
Comment: 9 pages, 11 figures, 2 tables
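The relational semantics described in this abstract, a set of triples (A, m, n), can be sketched as a fixed-point computation over one Boolean matrix per non-terminal. The code below is a plain-Python illustration of that idea, not the authors' GPU implementation. It assumes the grammar is given in Chomsky normal form as two rule lists, which the abstract does not state, and all names are hypothetical.

```python
def cfpq_relational(n, edges, terminal_rules, binary_rules):
    """Return all triples (A, i, j) such that some path i -> j has a
    labeling derivable from non-terminal A.

    n              -- number of graph nodes, numbered 0..n-1
    edges          -- iterable of (i, label, j)
    terminal_rules -- list of (A, x) meaning A -> x
    binary_rules   -- list of (A, B, C) meaning A -> B C
    """
    nonterminals = {a for a, _ in terminal_rules}
    for rule in binary_rules:
        nonterminals.update(rule)
    # One Boolean n x n matrix per non-terminal.
    M = {a: [[False] * n for _ in range(n)] for a in nonterminals}
    for i, label, j in edges:
        for a, x in terminal_rules:
            if x == label:
                M[a][i][j] = True
    # Repeat the "matrix product" step until nothing new is derived.
    changed = True
    while changed:
        changed = False
        for a, b, c in binary_rules:
            for i in range(n):
                for k in range(n):
                    if not M[b][i][k]:
                        continue
                    for j in range(n):
                        if M[c][k][j] and not M[a][i][j]:
                            M[a][i][j] = True
                            changed = True
    return {(a, i, j)
            for a in nonterminals
            for i in range(n)
            for j in range(n)
            if M[a][i][j]}


if __name__ == "__main__":
    # Grammar for a^k b^k (k >= 1): S -> A B | A S1, S1 -> S B, A -> a, B -> b
    terminal_rules = [("A", "a"), ("B", "b")]
    binary_rules = [("S", "A", "B"), ("S", "A", "S1"), ("S1", "S", "B")]
    edges = [(0, "a", 1), (1, "a", 2), (2, "b", 3), (3, "b", 4)]
    result = cfpq_relational(5, edges, terminal_rules, binary_rules)
    print(sorted(t for t in result if t[0] == "S"))
```

The inner triple loop is a Boolean matrix multiplication restricted to one production, which is the part a GPU library could replace.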
Fully polynomial FPT algorithms for some classes of bounded clique-width graphs
Parameterized complexity theory has enabled a refined classification of the
difficulty of NP-hard optimization problems on graphs with respect to key
structural properties, and so to a better understanding of their true
difficulties. More recently, hardness results for problems in P were achieved
using reasonable complexity-theoretic assumptions such as the Strong Exponential
Time Hypothesis (SETH), 3SUM and All-Pairs Shortest-Paths (APSP). According to
these assumptions, many graph theoretic problems do not admit truly
subquadratic algorithms, nor even truly subcubic algorithms (Williams and
Williams, FOCS 2010 and Abboud, Grandoni, Williams, SODA 2015). A central
technique used to tackle the difficulty of the above mentioned problems is
fixed-parameter algorithms for polynomial-time problems with polynomial
dependency in the fixed parameter (P-FPT). This technique was introduced by
Abboud, Williams and Wang in SODA 2016 and continued by Husfeldt (IPEC 2016)
and Fomin et al. (SODA 2017), using the treewidth as a parameter. Applying this
technique to clique-width, another important graph parameter, remained to be
done. In this paper we study several graph theoretic problems for which
hardness results exist such as cycle problems (triangle detection, triangle
counting, girth, diameter), distance problems (diameter, eccentricities, Gromov
hyperbolicity, betweenness centrality) and maximum matching. We provide
hardness results and fully polynomial FPT algorithms, using clique-width and
some of its upper bounds as parameters (split-width, modular-width and
P4-sparseness). We believe that our most important result is an algorithm for
computing a maximum matching whose running time is parameterized by either the
modular-width or the P4-sparseness. The latter generalizes
many algorithms that have been introduced so far for specific subclasses such
as cographs, P4-lite graphs, P4-extendible graphs and P4-tidy
graphs. Our algorithms are based on preprocessing methods using modular
decomposition, split decomposition and primeval decomposition. Thus they can
also be generalized to some graph classes with unbounded clique-width
- …