Sparse Learning over Infinite Subgraph Features
We present a supervised-learning algorithm from graph data (a set of graphs)
for arbitrary twice-differentiable loss functions and sparse linear models over
all possible subgraph features. To date, it has been shown that under all
possible subgraph features, several types of sparse learning, such as Adaboost,
LPBoost, LARS/LASSO, and sparse PLS regression, can be performed. Particular
emphasis is placed on simultaneous learning of relevant features from an
infinite set of candidates. We first generalize techniques used in all these
preceding studies to derive a unifying bounding technique for arbitrary
separable functions. We then carefully use this bounding to make block
coordinate gradient descent feasible over infinite subgraph features, resulting
in a fast converging algorithm that can solve a wider class of sparse learning
problems over graph data. We also empirically study the differences from the
existing approaches in convergence property, selected subgraph features, and
search-space sizes. We further discuss several unnoticed issues in sparse
learning over all possible subgraph features.
Comment: 42 pages, 24 figures, 4 tables
Uniform random generation of large acyclic digraphs
Directed acyclic graphs are the basic representation of the structure
underlying Bayesian networks, which represent multivariate probability
distributions. In many practical applications, such as the reverse engineering
of gene regulatory networks, not only the estimation of model parameters but
the reconstruction of the structure itself is of great interest. As well as for
the assessment of different structure learning algorithms in simulation
studies, a uniform sample from the space of directed acyclic graphs is required
to evaluate the prevalence of certain structural features. Here we analyse how
to sample acyclic digraphs uniformly at random through recursive enumeration,
an approach previously thought too computationally involved. Based on
complexity considerations, we discuss in particular how the enumeration
directly provides an exact method, which avoids the convergence issues of the
alternative Markov chain methods and is actually computationally much faster.
The limiting behaviour of the distribution of acyclic digraphs then allows us
to sample arbitrarily large graphs. Building on the ideas of
recursive-enumeration-based sampling, we also introduce a novel hybrid Markov chain with
much faster convergence than current alternatives while still being easy to
adapt to various restrictions. Finally, we discuss how to include such
restrictions in the combinatorial enumeration and the new hybrid Markov chain
method for efficient uniform sampling of the corresponding graphs.
Comment: 15 pages, 2 figures. To appear in Statistics and Computing
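The counting step behind this kind of recursive enumeration can be illustrated with the classical recurrence for the number of labelled acyclic digraphs on n nodes (usually attributed to Robinson). The sketch below only counts the graphs; it is not the sampler described in the abstract, and the function name `count_labelled_dags` is a hypothetical choice made for illustration.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def count_labelled_dags(n: int) -> int:
    """Number of labelled DAGs on n nodes via inclusion-exclusion.

    a(0) = 1
    a(n) = sum_{k=1..n} (-1)^(k+1) * C(n, k) * 2^(k*(n-k)) * a(n-k)

    Here k ranges over the number of nodes chosen to have no incoming
    edges; each of them may point to any of the remaining n-k nodes.
    """
    if n == 0:
        return 1
    return sum(
        (-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k)) * count_labelled_dags(n - k)
        for k in range(1, n + 1)
    )


if __name__ == "__main__":
    for n in range(6):
        print(n, count_labelled_dags(n))
```

Because the counts grow super-exponentially, an exact sampler built on such a recurrence needs arbitrary-precision integers, which Python provides natively.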
An Efficient Algorithm for Enumerating Chordless Cycles and Chordless Paths
A chordless cycle (induced cycle) of a graph is a cycle without any
chord, meaning that there is no edge outside the cycle connecting two vertices
of the cycle. A chordless path is defined similarly. In this paper, we consider
the problems of enumerating chordless cycles/paths of a given graph
and propose algorithms taking time for each chordless cycle/path. These
problems have not been studied in depth in theoretical computer science,
and no output-polynomial-time algorithm has
been proposed. Our experiments show that the computation time of our
algorithms is constant per chordless cycle/path for non-dense random graphs and
real-world graphs. They also show that the number of chordless cycles is much
smaller than the number of cycles. We applied the algorithm to the prediction of
NMR (Nuclear Magnetic Resonance) spectra and increased the accuracy of the
prediction.
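The definition of a chordless cycle given in the abstract can be made concrete with a small checker. This is a minimal sketch of the definition only, not the enumeration algorithm the paper proposes; the function name `is_chordless_cycle` and the adjacency-set representation are assumptions made for illustration.

```python
def is_chordless_cycle(adj, cycle):
    """Return True if `cycle` (a list of vertices in cyclic order) is a
    chordless (induced) cycle of the undirected graph `adj`.

    `adj` maps each vertex to the set of its neighbours.
    """
    n = len(cycle)
    # A simple cycle needs at least three distinct vertices.
    if n < 3 or len(set(cycle)) != n:
        return False
    for i in range(n):
        for j in range(i + 1, n):
            # Positions i and j are neighbours on the cycle if they are
            # adjacent in the list or are the first and last entries.
            consecutive = (j - i == 1) or (i == 0 and j == n - 1)
            has_edge = cycle[j] in adj[cycle[i]]
            # Consecutive vertices must be joined by an edge; any edge
            # between non-consecutive vertices is a chord.
            if consecutive != has_edge:
                return False
    return True


# A 4-cycle 0-1-2-3 with and without the chord {0, 2}.
square = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {2, 0}}
square_with_chord = {0: {1, 2, 3}, 1: {0, 2}, 2: {0, 1, 3}, 3: {0, 2}}
```

With the chord present, the 4-cycle is no longer chordless, but each of the two triangles it creates is.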
Enumeration of Matchings: Problems and Progress
This document is built around a list of thirty-two problems in enumeration of
matchings, the first twenty of which were presented in a lecture at MSRI in the
fall of 1996. I begin with a capsule history of the topic of enumeration of
matchings. The twenty original problems, with commentary, comprise the bulk of
the article. I give an account of the progress that has been made on these
problems as of this writing, and include pointers to both the printed and
on-line literature; roughly half of the original twenty problems were solved by
participants in the MSRI Workshop on Combinatorics, their students, and others,
between 1996 and 1999. The article concludes with a dozen new open problems.
(Note: This article supersedes math.CO/9801060 and math.CO/9801061.)
Comment: 1+37 pages; to appear in "New Perspectives in Geometric
Combinatorics" (ed. by Billera, Bjorner, Green, Simeon, and Stanley),
Mathematical Science Research Institute publication #37, Cambridge University
Press, 199