35,033 research outputs found

    Sparse Learning over Infinite Subgraph Features

    Full text link
    We present a supervised-learning algorithm from graph data (a set of graphs) for arbitrary twice-differentiable loss functions and sparse linear models over all possible subgraph features. To date, it has been shown that under all possible subgraph features, several types of sparse learning, such as Adaboost, LPBoost, LARS/LASSO, and sparse PLS regression, can be performed. Particularly emphasis is placed on simultaneous learning of relevant features from an infinite set of candidates. We first generalize techniques used in all these preceding studies to derive an unifying bounding technique for arbitrary separable functions. We then carefully use this bounding to make block coordinate gradient descent feasible over infinite subgraph features, resulting in a fast converging algorithm that can solve a wider class of sparse learning problems over graph data. We also empirically study the differences from the existing approaches in convergence property, selected subgraph features, and search-space sizes. We further discuss several unnoticed issues in sparse learning over all possible subgraph features.Comment: 42 pages, 24 figures, 4 table

    Uniform random generation of large acyclic digraphs

    Full text link
    Directed acyclic graphs are the basic representation of the structure underlying Bayesian networks, which represent multivariate probability distributions. In many practical applications, such as the reverse engineering of gene regulatory networks, not only the estimation of model parameters but the reconstruction of the structure itself is of great interest. As well as for the assessment of different structure learning algorithms in simulation studies, a uniform sample from the space of directed acyclic graphs is required to evaluate the prevalence of certain structural features. Here we analyse how to sample acyclic digraphs uniformly at random through recursive enumeration, an approach previously thought too computationally involved. Based on complexity considerations, we discuss in particular how the enumeration directly provides an exact method, which avoids the convergence issues of the alternative Markov chain methods and is actually computationally much faster. The limiting behaviour of the distribution of acyclic digraphs then allows us to sample arbitrarily large graphs. Building on the ideas of recursive enumeration based sampling we also introduce a novel hybrid Markov chain with much faster convergence than current alternatives while still being easy to adapt to various restrictions. Finally we discuss how to include such restrictions in the combinatorial enumeration and the new hybrid Markov chain method for efficient uniform sampling of the corresponding graphs.Comment: 15 pages, 2 figures. To appear in Statistics and Computin

    An Efficient Algorithm for Enumerating Chordless Cycles and Chordless Paths

    Full text link
    A chordless cycle (induced cycle) CC of a graph is a cycle without any chord, meaning that there is no edge outside the cycle connecting two vertices of the cycle. A chordless path is defined similarly. In this paper, we consider the problems of enumerating chordless cycles/paths of a given graph G=(V,E),G=(V,E), and propose algorithms taking O(∣E∣)O(|E|) time for each chordless cycle/path. In the existing studies, the problems had not been deeply studied in the theoretical computer science area, and no output polynomial time algorithm has been proposed. Our experiments showed that the computation time of our algorithms is constant per chordless cycle/path for non-dense random graphs and real-world graphs. They also show that the number of chordless cycles is much smaller than the number of cycles. We applied the algorithm to prediction of NMR (Nuclear Magnetic Resonance) spectra, and increased the accuracy of the prediction

    Enumeration of Matchings: Problems and Progress

    Full text link
    This document is built around a list of thirty-two problems in enumeration of matchings, the first twenty of which were presented in a lecture at MSRI in the fall of 1996. I begin with a capsule history of the topic of enumeration of matchings. The twenty original problems, with commentary, comprise the bulk of the article. I give an account of the progress that has been made on these problems as of this writing, and include pointers to both the printed and on-line literature; roughly half of the original twenty problems were solved by participants in the MSRI Workshop on Combinatorics, their students, and others, between 1996 and 1999. The article concludes with a dozen new open problems. (Note: This article supersedes math.CO/9801060 and math.CO/9801061.)Comment: 1+37 pages; to appear in "New Perspectives in Geometric Combinatorics" (ed. by Billera, Bjorner, Green, Simeon, and Stanley), Mathematical Science Research Institute publication #37, Cambridge University Press, 199
    • …
    corecore