On the structure of random unlabelled acyclic graphs
Abstract: One can use Poisson approximation techniques to obtain asymptotic results about graphical properties of random unlabelled acyclic graphs, i.e., random unlabelled free (rootless) trees. We use certain "colored" partitions to give a rough description of the structure of "most" unlabelled acyclic graphs. In particular, we prove that for any fixed rooted tree T, almost every sufficiently large acyclic graph has a "subtree" isomorphic to T. We use this result to obtain a zero-one law for Monadic Second Order queries on random unlabelled acyclic graphs.
Entropy of Some Models of Sparse Random Graphs With Vertex-Names
Consider the setting of sparse graphs on N vertices, where the vertices have
distinct "names", which are strings of length O(log N) from a fixed finite
alphabet. For many natural probability models, the entropy grows as cN log N
for some model-dependent rate constant c. The mathematical content of this
paper is the (often easy) calculation of c for a variety of models, in
particular for various standard random graph models adapted to this setting.
Our broader purpose is to publicize this particular setting as a natural
setting for future theoretical study of data compression for graphs, and (more
speculatively) for discussion of unorganized versus organized complexity.

Comment: 31 pages.
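The cN log N growth is easy to see in the simplest case: for a sparse Erdős–Rényi graph G(N, λ/N), the graph structure alone contributes C(N,2) independent Bernoulli(λ/N) edge indicators, giving entropy ~ (λ/2) · N · 2 log N bits — i.e., c = λ/2 in nats, before any name contribution. A minimal numerical check (illustrative only; the paper's constants also account for the vertex names):

```python
from math import log, comb

def binary_entropy(p):
    """Entropy in nats of a Bernoulli(p) variable."""
    return -p * log(p) - (1 - p) * log(1 - p)

def er_entropy(N, lam):
    """Entropy (nats) of G(N, lam/N): C(N,2) independent
    Bernoulli(lam/N) edge indicators."""
    return comb(N, 2) * binary_entropy(lam / N)

# For lam = 2 the ratio to N log N should approach c = lam/2 = 1.
for N in (10**3, 10**5, 10**7):
    print(N, er_entropy(N, 2.0) / (N * log(N)))
```

The ratio converges slowly (the correction term is of order N, against a main term of order N log N), which is why the rate constant c, not the finite-N entropy, is the natural object of study.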
Uniform random generation of large acyclic digraphs
Directed acyclic graphs are the basic representation of the structure
underlying Bayesian networks, which represent multivariate probability
distributions. In many practical applications, such as the reverse engineering
of gene regulatory networks, not only the estimation of model parameters but
the reconstruction of the structure itself is of great interest. As well as for
the assessment of different structure learning algorithms in simulation
studies, a uniform sample from the space of directed acyclic graphs is required
to evaluate the prevalence of certain structural features. Here we analyse how
to sample acyclic digraphs uniformly at random through recursive enumeration,
an approach previously thought too computationally involved. Based on
complexity considerations, we discuss in particular how the enumeration
directly provides an exact method, which avoids the convergence issues of the
alternative Markov chain methods and is actually computationally much faster.
The limiting behaviour of the distribution of acyclic digraphs then allows us
to sample arbitrarily large graphs. Building on the ideas of recursive
enumeration based sampling we also introduce a novel hybrid Markov chain with
much faster convergence than current alternatives while still being easy to
adapt to various restrictions. Finally we discuss how to include such
restrictions in the combinatorial enumeration and the new hybrid Markov chain
method for efficient uniform sampling of the corresponding graphs.

Comment: 15 pages, 2 figures. To appear in Statistics and Computing.
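The recursive enumeration underlying such exact methods is, for labelled DAGs, classically captured by Robinson's recurrence a(n) = Σ_{k=1}^{n} (−1)^{k+1} C(n,k) 2^{k(n−k)} a(n−k), which counts DAGs by the set of k source vertices peeled off at each step. A minimal counting sketch (illustrative, not the paper's implementation):

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def count_dags(n):
    """Number of labelled DAGs on n vertices, via Robinson's
    inclusion-exclusion recurrence over the k source vertices."""
    if n == 0:
        return 1
    return sum(
        (-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k)) * count_dags(n - k)
        for k in range(1, n + 1)
    )

print([count_dags(n) for n in range(6)])  # 1, 1, 3, 25, 543, 29281
```

An exact uniform sampler inverts this enumeration: it draws the number of sources k with probability proportional to the corresponding term, attaches their out-edges, and recurses on the remaining n − k vertices.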
Asymptotic analysis and efficient random sampling of directed ordered acyclic graphs
Directed acyclic graphs (DAGs) are directed graphs with no directed cycle, i.e., no nontrivial path from a vertex to itself. DAGs are an omnipresent data structure in computer science, and the problems of counting the DAGs on a given number of vertices and of sampling them uniformly at random were solved in the 1970s and the 2000s, respectively. In this paper, we propose to explore a new variation of this model in which each vertex is endowed with an independent ordering of its out-edges, allowing the model to capture a wide range of existing data structures. We provide efficient algorithms for sampling objects of this new class, both with and without control over the number of edges, and obtain an asymptotic equivalent of their number. We also show the applicability of our method by providing an effective algorithm, based on a similar approach, for the random generation of classical labelled DAGs with a prescribed number of vertices and edges. This is the first known algorithm for sampling labelled DAGs with full control over the number of edges, and it meets a need in terms of applications that had already been acknowledged in the literature.

Comment: 32 pages, 12 figures. For the implementation of the algorithms, see
https://github.com/Kerl13/randda
Causal Learning via Manifold Regularization.
This paper frames causal structure estimation as a machine learning task. The idea is to treat indicators of causal relationships between variables as 'labels' and to exploit available data on the variables of interest to provide features for the labelling task. Background scientific knowledge or any available interventional data provide labels on some causal relationships, and the remainder are treated as unlabelled. To illustrate the key ideas, we develop a distance-based approach (based on bivariate histograms) within a manifold regularization framework. We present empirical results on three different biological data sets (including examples where causal effects can be verified by experimental intervention) which together demonstrate the efficacy and general nature of the approach, as well as its simplicity from a user's point of view.
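The semi-supervised machinery described above can be sketched as Laplacian-regularised least squares on a neighbourhood graph: labels on a few points are propagated to the rest by penalising functions that vary quickly along the data manifold. This toy version uses raw coordinates in place of the paper's bivariate-histogram features, and all names and parameters are illustrative:

```python
import numpy as np

def manifold_regularized_labels(X, y, labeled_mask, k=5,
                                gamma_a=1e-3, gamma_i=1.0):
    """Propagate +/-1 labels over a symmetric kNN graph by solving
    (J + gamma_a*I + gamma_i*L) f = J y, i.e. Laplacian-regularised
    least squares on the function values f at the data points."""
    n = len(X)
    # pairwise squared Euclidean distances
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    # symmetric k-nearest-neighbour adjacency (self excluded)
    W = np.zeros((n, n))
    for i in range(n):
        W[i, np.argsort(d2[i])[1:k + 1]] = 1.0
    W = np.maximum(W, W.T)
    L = np.diag(W.sum(1)) - W                  # combinatorial graph Laplacian
    J = np.diag(labeled_mask.astype(float))    # indicator of labelled points
    f = np.linalg.solve(J + gamma_a * np.eye(n) + gamma_i * L,
                        labeled_mask * y)
    return np.sign(f)

# Two well-separated chains, one labelled point each:
X = np.array([[float(i), 0.0] for i in range(10)]
             + [[100.0 + i, 0.0] for i in range(10)])
y = np.zeros(20); y[0] = 1.0; y[10] = -1.0
mask = np.zeros(20, dtype=bool); mask[0] = True; mask[10] = True
print(manifold_regularized_labels(X, y, mask, k=2))
```

With the two clusters far apart, the kNN graph splits into two components and each labelled point determines the sign of f throughout its component, so all 20 points recover the correct label from just two.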