A Generic Framework for Engineering Graph Canonization Algorithms
The state-of-the-art tools for practical graph canonization are all based on
the individualization-refinement paradigm, and their difference is primarily in
the choice of heuristics they include and in the actual tool implementation. It
is thus not possible to make a direct comparison of how individual algorithmic
ideas affect the performance on different graph classes.
We present an algorithmic software framework that facilitates implementation
of heuristics as independent extensions to a common core algorithm. It
therefore becomes easy to perform a detailed comparison of the performance and
behaviour of different algorithmic ideas. Implementations are provided of a
range of algorithms for tree traversal, target cell selection, and node
invariant, including choices from the literature and new variations. The
framework readily supports extraction and visualization of detailed data from
separate algorithm executions for subsequent analysis and development of new
heuristics.
Using collections of different graph classes we investigate the effect of
varying the selections of heuristics, often revealing exactly which individual
algorithmic choice is responsible for particularly good or bad performance. On
several benchmark collections, including a newly proposed class of difficult
instances, we additionally find that our implementation performs better than
the current state-of-the-art tools.
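
The refinement half of the individualization-refinement paradigm mentioned in this abstract can be illustrated with plain colour refinement. The following is a minimal sketch, not the framework's own code: the function names `refine` and `individualize` and the adjacency-dictionary representation are assumptions made for illustration only.

```python
def refine(adj, colors):
    """Colour refinement: split colour classes by the multiset of
    neighbour colours until the partition stops changing.

    adj:    dict mapping each vertex to a list of its neighbours
    colors: dict mapping each vertex to an integer colour
    """
    colors = dict(colors)
    while True:
        # A vertex's signature is its own colour plus the sorted
        # colours of its neighbours.
        signatures = {
            v: (colors[v], tuple(sorted(colors[u] for u in adj[v])))
            for v in adj
        }
        # Renumber the distinct signatures 0, 1, 2, ... in sorted order.
        ranking = {s: i for i, s in enumerate(sorted(set(signatures.values())))}
        new_colors = {v: ranking[signatures[v]] for v in adj}
        # Classes can only split, so an unchanged count means stability.
        if len(set(new_colors.values())) == len(set(colors.values())):
            return new_colors
        colors = new_colors


def individualize(colors, v):
    """Give vertex v a fresh colour that no other vertex has."""
    new_colors = dict(colors)
    new_colors[v] = max(colors.values()) + 1
    return new_colors


# Path graph 0 - 1 - 2 - 3: refinement alone separates end points from
# inner vertices; individualizing vertex 0 then yields a discrete partition.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
stable = refine(path, {v: 0 for v in path})
discrete = refine(path, individualize(stable, 0))
```

A full canonization tool would explore a search tree of such individualize-then-refine steps; the choice of which cell to individualize and how to traverse the tree is where the heuristics compared in the abstract come in.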
edge2vec: Representation learning using edge semantics for biomedical knowledge discovery
Representation learning provides new and powerful graph analytical approaches
and tools for the highly valued data science challenge of mining knowledge
graphs. Since previous graph analytical methods have mostly focused on
homogeneous graphs, an important current challenge is extending this
methodology for richly heterogeneous graphs and knowledge domains. The
biomedical sciences are such a domain, reflecting the complexity of biology,
with entities such as genes, proteins, drugs, diseases, and phenotypes, and
relationships such as gene co-expression, biochemical regulation, and
biomolecular inhibition or activation. Therefore, the semantics of edges and
nodes are critical for representation learning and knowledge discovery in
real-world biomedical problems. In this paper, we propose the edge2vec model, which
represents graphs considering edge semantics. An edge-type transition matrix is
trained by an Expectation-Maximization approach, and a stochastic gradient
descent model is employed to learn node embedding on a heterogeneous graph via
the trained transition matrix. edge2vec is validated on three biomedical domain
tasks: biomedical entity classification, compound-gene bioactivity prediction,
and biomedical information retrieval. Results show that by incorporating
edge types into node embedding learning in heterogeneous graphs,
edge2vec significantly outperforms state-of-the-art models on all
three tasks. We propose this method for its added value relative to existing
graph analytical methodology and for its applicability to real-world
biomedical knowledge discovery.
Comment: 10 pages
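
To make the role of the edge-type transition matrix concrete, here is a minimal sketch of a random walk whose next step is weighted by the type of the edge just traversed. It is a simplified illustration, not the edge2vec implementation: the function name `edge_type_walk`, the graph representation, and the hand-written matrix are all assumptions, and the matrix is taken as given here rather than trained by Expectation-Maximization.

```python
import random


def edge_type_walk(graph, start, length, transition, rng=None):
    """Random walk biased by edge types.

    graph:      dict node -> list of (neighbour, edge_type) pairs
    transition: transition[a][b] is the weight for following an edge
                of type a with an edge of type b (assumed given)
    """
    rng = rng or random.Random()
    walk = [start]
    current, prev_type = start, None
    for _ in range(length - 1):
        edges = graph.get(current, [])
        if not edges:
            break
        if prev_type is None:
            # First step: no previous edge type, so choose uniformly.
            weights = [1.0] * len(edges)
        else:
            weights = [transition[prev_type][t] for _, t in edges]
        if sum(weights) <= 0:
            break  # every continuation has zero weight
        current, prev_type = rng.choices(edges, weights=weights, k=1)[0]
        walk.append(current)
    return walk


# Toy graph with two edge types (0 and 1). The matrix below only allows
# alternating types, so the walk from "a" is forced along a-b-c-d.
toy_graph = {
    "a": [("b", 0)],
    "b": [("a", 0), ("c", 1)],
    "c": [("b", 1), ("d", 0)],
    "d": [("c", 0)],
}
alternate = [[0.0, 1.0], [1.0, 0.0]]
example_walk = edge_type_walk(toy_graph, "a", 4, alternate, random.Random(0))
```

Walks generated this way could then be fed to a skip-gram style model trained with stochastic gradient descent to obtain node embeddings, which is the step the abstract describes next.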
Contact handles, duality, and sutured Floer homology
We give an explicit construction of the Honda--Kazez--Matić gluing maps in
terms of contact handles. We use this to prove a duality result for turning a
sutured manifold cobordism around, and to compute the trace in the sutured
Floer TQFT. We also show that the decorated link cobordism maps on the hat
version of link Floer homology defined by the first author via sutured manifold
cobordisms and by the second author via elementary cobordisms agree.
Comment: 86 pages, 54 figures, to appear in Geometry and Topology
Genus expansion for real Wishart matrices
We present an exact formula for moments and cumulants of several real
compound Wishart matrices in terms of an Euler characteristic expansion,
similar to the genus expansion for complex random matrices. We consider their
asymptotic values in the large matrix limit: as in a genus expansion, the terms
which survive in the large matrix limit are those with the greatest Euler
characteristic, that is, either spheres or collections of spheres. This
topological construction motivates an algebraic expression for the moments and
cumulants in terms of the symmetric group. We examine the combinatorial
properties distinguishing the leading order terms. By considering higher
cumulants, we give a central limit-type theorem for the asymptotic distribution
around the expected value.
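
As a small worked case of the kind of expansion this abstract describes, consider a single real Wishart matrix. The normalization is an assumption made for this illustration and is not taken from the paper: $X$ is an $M \times N$ matrix of independent standard real Gaussian entries and $W = X^{T}X$. Expanding the second moment with Wick's formula gives three pairings:

```latex
\begin{align*}
\mathbb{E}\,\operatorname{tr}(W)
  &= \sum_{i=1}^{M}\sum_{j=1}^{N} \mathbb{E}\,x_{ij}^{2} = MN, \\
\mathbb{E}\,\operatorname{tr}(W^{2})
  &= \sum_{i,l=1}^{M}\sum_{j,k=1}^{N}
     \mathbb{E}\,x_{ij}\,x_{ik}\,x_{lk}\,x_{lj} \\
  &= \underbrace{M^{2}N}_{(x_{ij},x_{ik})(x_{lk},x_{lj})}
   + \underbrace{MN^{2}}_{(x_{ij},x_{lj})(x_{ik},x_{lk})}
   + \underbrace{MN}_{(x_{ij},x_{lk})(x_{ik},x_{lj})}
   = MN(M+N+1).
\end{align*}
```

The first two pairings give the terms of highest total degree in $M$ and $N$, which are the ones that survive in the large matrix limit after normalization. The third pairing contributes the lower-order term $MN$, which is absent in the complex case, where the second moment is $MN(M+N)$.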
Joint Vertex Degrees in an Inhomogeneous Random Graph Model
In a random graph, counts for the number of vertices with given degrees will
typically be dependent. We show via a multivariate normal and a Poisson process
approximation that, for graphs which have independent edges, with a possibly
inhomogeneous distribution, only when the degrees are large can we reasonably
approximate the joint counts as independent. The proofs are based on Stein's
method and the Stein-Chen method with a new size-biased coupling for such
inhomogeneous random graphs, and hence bounds on distributional distance are
obtained. Finally, we illustrate that apparent (pseudo-)power-law type
behaviour can arise in such inhomogeneous networks even though they do not
actually follow a power-law degree distribution.
Comment: 30 pages, 9 figures
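
The degree counts studied in this abstract can be explored empirically with a short simulation. This is a minimal sketch under stated assumptions: the names `sample_degrees` and `degree_counts` are hypothetical, and the edge probability function in the example is an arbitrary choice for illustration, not a model taken from the paper.

```python
import random
from collections import Counter


def sample_degrees(n, edge_prob, rng):
    """Sample a graph on n vertices with independent edges, where the
    edge {i, j} is present with probability edge_prob(i, j), and
    return the list of vertex degrees."""
    degrees = [0] * n
    for i in range(n):
        for j in range(i + 1, n):
            if rng.random() < edge_prob(i, j):
                degrees[i] += 1
                degrees[j] += 1
    return degrees


def degree_counts(degrees):
    """Map each degree value to the number of vertices having it."""
    return Counter(degrees)


# Illustrative inhomogeneous choice: edges between low-index vertices
# are more likely than edges between high-index vertices.
n = 200
rng = random.Random(1)
counts = degree_counts(
    sample_degrees(n, lambda i, j: 1.0 / (2 + i + j) ** 0.5, rng)
)
```

Repeating the sampling many times and recording the counts for two fixed degree values would give an empirical view of their dependence, which is the quantity the approximation results in the abstract address.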
Initial Draft of a Possible Declarative Semantics for the Language
This article introduces a preliminary declarative semantics for a subset of the language Xcerpt (so-called
grouping-stratifiable programs) in the form of a classical (Tarski-style) model theory, adapted to the specific
requirements of Xcerpt’s constructs (e.g. the various aspects of incompleteness in query terms, grouping
constructs in rule heads, etc.). Most importantly, the model theory uses term simulation as a replacement
for term equality to handle incomplete term specifications, and an extended notion of substitutions in
order to properly convey the semantics of grouping constructs. Based upon this model theory, a fixpoint
semantics is also described, leading to a first notion of forward chaining evaluation of Xcerpt program
- …