Modeling Graphs with Vertex Replacement Grammars
One of the principal goals of graph modeling is to capture the building
blocks of network data in order to study various physical and natural
phenomena. Recent work at the intersection of formal language theory and graph
theory has explored the use of graph grammars for graph modeling. However,
existing graph grammar formalisms, like Hyperedge Replacement Grammars, can
only operate on small tree-like graphs. The present work relaxes this
restriction by revising a different graph grammar formalism called Vertex
Replacement Grammars (VRGs). We show that a variant of the VRG called
Clustering-based Node Replacement Grammar (CNRG) can be efficiently extracted
from many hierarchical clusterings of a graph. We show that CNRGs encode a
succinct model of the graph, yet faithfully preserve the structure of the
original graph. In experiments on large real-world datasets, we show that
graphs generated from the CNRG model exhibit a diverse range of properties that
are similar to those found in the original networks.
Comment: Accepted as a regular paper at IEEE ICDM 2019. 15 pages, 9 figures
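The extraction idea in the abstract above, replacing a cluster of vertices with a single nonterminal, can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `contract_cluster`, the rule dictionary layout, and the choice to record the number of boundary edges as the nonterminal's size are assumptions made for illustration only.

```python
def contract_cluster(adj, cluster, nonterminal):
    """Replace `cluster` by one nonterminal node; return (new_adj, rule).

    `adj` maps each node to a set of neighbours (simple undirected graph).
    All names here are hypothetical, chosen for this sketch.
    """
    cluster = set(cluster)
    # Right-hand side of the rule: the subgraph induced by the cluster.
    rhs = {u: adj[u] & cluster for u in cluster}
    # Boundary edges cross from the cluster to the rest of the graph.
    boundary = [(u, v) for u in sorted(cluster) for v in sorted(adj[u] - cluster)]
    # Rebuild the graph without the cluster, then attach the nonterminal.
    new_adj = {u: nbrs - cluster for u, nbrs in adj.items() if u not in cluster}
    new_adj[nonterminal] = set()
    for _, v in boundary:
        # Parallel boundary edges to the same outside node collapse into one.
        new_adj[nonterminal].add(v)
        new_adj[v].add(nonterminal)
    # Assumption: the left-hand side is summarised by the boundary-edge count.
    rule = {"lhs_size": len(boundary), "rhs": rhs}
    return new_adj, rule


# Toy graph: a triangle {0, 1, 2} attached to the path 2-3-4.
graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4}, 4: {3}}
compressed, rule = contract_cluster(graph, {0, 1, 2}, "N1")
print(compressed)
print(rule["lhs_size"])
```

Applying such a contraction repeatedly along a hierarchical clustering would yield one rule per contracted cluster; how clusters are chosen and how rules are merged is left out of this sketch.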
Modeling Graph Languages with Grammars Extracted via Tree Decompositions
Work on probabilistic models of natural language tends to focus on strings and trees, but there is increasing interest in more general graph-shaped structures since they seem to be better suited for representing natural language semantics, ontologies, or other varieties of knowledge structures. However, while there are relatively simple approaches to defining generative models over strings and trees, it has proven more challenging for more general graphs. This paper describes a natural generalization of the n-gram to graphs, making use of Hyperedge Replacement Grammars to define generative models of graph languages.
9 page(s)
On external presentations of infinite graphs
The vertices of a finite state system are usually a subset of the natural
numbers. Most algorithms relative to these systems only use this fact to select
vertices.
For infinite state systems, however, the situation is different: in
particular, for such systems having a finite description, each state of the
system is a configuration of some machine. Then most algorithmic approaches
rely on the structure of these configurations. Such characterisations are said
to be internal. In order to apply algorithms detecting a structural property
(like identifying connected components) one may first have to transform the
system in order to fit the description needed for the algorithm. The problem
of internal characterisation is that it hides structural properties, and each
solution becomes ad hoc relative to the form of the configurations.
On the contrary, external characterisations avoid explicit naming of the
vertices. Such characterisations are mostly defined via graph transformations.
In this paper we present two kinds of external characterisations:
deterministic graph rewriting, which in turn characterise regular graphs,
deterministic context-free languages, and rational graphs. Inverse substitution
from a generator (like the complete binary tree) provides characterisation for
prefix-recognizable graphs, the Caucal Hierarchy and rational graphs. We
illustrate how these characterisations provide an efficient tool for the
representation of infinite state systems.
Growing Graphs with Hyperedge Replacement Graph Grammars
Discovering the underlying structures present in large real world graphs is a
fundamental scientific problem. In this paper we show that a graph's clique
tree can be used to extract a hyperedge replacement grammar. If we store an
ordering from the extraction process, the extracted graph grammar is guaranteed
to generate an isomorphic copy of the original graph. Alternatively, a stochastic
application of the graph grammar rules can be used to quickly create random
graphs. In experiments on large real world networks, we show that random
graphs, generated from extracted graph grammars, exhibit a wide range of
properties that are very similar to the original graphs. In addition to graph
properties like degree or eigenvector centrality, what a graph "looks like"
ultimately depends on small details in local graph substructures that are
difficult to define at a global level. We show that our generative graph model
is able to preserve these local substructures when generating new graphs and
performs well on new and difficult tests of model robustness.
Comment: 18 pages, 19 figures, accepted to CIKM 2016 in Indianapolis, I
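The abstract above starts from a clique tree of the graph. As a rough sketch of that first step only, the bags of a tree decomposition can be obtained with the standard min-degree elimination heuristic. This is a swapped-in, generic technique and not necessarily the procedure used in the paper; the function name `min_degree_bags` is hypothetical, and the grammar extraction itself is not shown.

```python
def min_degree_bags(adj):
    """Return the elimination bags produced by the min-degree heuristic.

    `adj` maps each node to a set of neighbours (simple undirected graph).
    Each bag is the eliminated vertex plus its neighbours at that moment.
    """
    g = {u: set(nbrs) for u, nbrs in adj.items()}  # work on a copy
    bags = []
    while g:
        # Pick the vertex with the fewest remaining neighbours (ties by label).
        v = min(g, key=lambda u: (len(g[u]), u))
        nbrs = g.pop(v)
        bags.append(frozenset(nbrs | {v}))
        # Make the neighbours a clique, then drop v from their adjacency.
        for a in nbrs:
            g[a].discard(v)
            g[a] |= nbrs - {a}
    return bags


# Toy graph: the 4-cycle 0-1-2-3-0.
cycle = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
bags = min_degree_bags(cycle)
width = max(len(b) for b in bags) - 1
print(bags, width)
```

In an extraction pipeline of the kind the abstract describes, each bag would presumably become the right-hand side of one grammar rule, with the vertices shared with the parent bag forming the attachment points; that mapping is an interpretation and is not implemented here.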
Towards Interpretable Graph Modeling with Vertex Replacement Grammars
An enormous amount of real-world data exists in the form of graphs.
Oftentimes, interesting patterns that describe the complex dynamics of these
graphs are captured in the form of frequently reoccurring substructures. Recent
work at the intersection of formal language theory and graph theory has
explored the use of graph grammars for graph modeling and pattern mining.
However, existing formulations do not extract meaningful and easily
interpretable patterns from the data. The present work addresses this
limitation by extracting a special type of vertex replacement grammar, which we
call a KT grammar, according to the Minimum Description Length (MDL) heuristic.
In experiments on synthetic and real-world datasets, we show that KT-grammars
can be efficiently extracted from a graph and that these grammars encode
meaningful patterns that represent the dynamics of the real-world system.
Comment: 10 pages, 9 figures, accepted at IEEE BigData 201
Inferring Chemical Reaction Patterns Using Rule Composition in Graph Grammars
Modeling molecules as undirected graphs and chemical reactions as graph
rewriting operations is a natural and convenient approach to modeling
chemistry. Graph grammar rules are most naturally employed to model elementary
reactions like merging, splitting, and isomerisation of molecules. It is often
convenient, in particular in the analysis of larger systems, to summarize
several subsequent reactions into a single composite chemical reaction. We use
a generic approach for composing graph grammar rules to define chemically
useful rule compositions. We iteratively apply these rule compositions to
elementary transformations in order to automatically infer complex
transformation patterns. This is useful for instance to understand the net
effect of complex catalytic cycles such as the Formose reaction. The
automatically inferred graph grammar rule is a generic representative that also
covers the overall reaction pattern of the Formose cycle, namely two carbonyl
groups that can react with a bound glycolaldehyde to a second glycolaldehyde.
Rule composition also can be used to study polymerization reactions as well as
more complicated iterative reaction schemes. Terpenes and the polyketides, for
instance, form two naturally occurring classes of compounds of utmost
pharmaceutical interest that can be understood as "generalized polymers"
consisting of five-carbon (isoprene) and two-carbon units, respectively.
Generic Strategies for Chemical Space Exploration
Computational approaches to exploring "chemical universes", i.e., very large,
potentially infinite sets of compounds that can be constructed by a
prescribed collection of reaction mechanisms, in practice suffer from a
combinatorial explosion. It quickly becomes impossible to test, for all pairs
of compounds in a rapidly growing network, whether they can react with each
other. More sophisticated and efficient strategies are therefore required to
construct very large chemical reaction networks.
Undirected labeled graphs and graph rewriting are natural models of chemical
compounds and chemical reactions. Borrowing the idea of partial evaluation from
functional programming, we introduce partial applications of rewrite rules.
Binding substrates to rules increases the number of rules but drastically prunes
the substrate sets to which they might match, resulting in dramatically reduced
resource requirements. At the same time, exploration strategies can be guided,
e.g. based on restrictions on the product molecules to avoid the explicit
enumeration of very unlikely compounds. To this end we introduce here a generic
framework for the specification of exploration strategies in graph-rewriting
systems. Using key examples of complex chemical networks from sugar chemistry
and the realm of metabolic networks we demonstrate the feasibility of a
high-level strategy framework.
The ideas presented here can not only be used for a strategy-based chemical
space exploration that has close correspondence to experimental results, but
are much more general. In particular, the framework can be used to emulate
higher-level transformation models such as illustrated in a small puzzle game
- …