Incomplete graphical model inference via latent tree aggregation
Graphical network inference is used in many fields such as genomics or
ecology to infer the conditional independence structure between variables, from
measurements of gene expression or species abundances for instance. In many
practical cases, not all variables involved in the network have been observed,
and the samples are actually drawn from a distribution where some variables
have been marginalized out. This challenges the sparsity assumption commonly
made in graphical model inference, since marginalization yields locally dense
structures, even when the original network is sparse. We present a procedure
for inferring Gaussian graphical models when some variables are unobserved,
that accounts both for the influence of missing variables and the low density
of the original network. Our model is based on the aggregation of spanning
trees, and the estimation procedure on the Expectation-Maximization algorithm.
We treat the graph structure and the unobserved nodes as missing variables and
compute posterior probabilities of edge appearance. To provide a complete
methodology, we also propose several model selection criteria to estimate the
number of missing nodes. A simulation study and an illustration on flow cytometry
data reveal that our method has favorable edge detection properties compared to
existing graph inference techniques. The methods are implemented in an R
package
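The claim that marginalization yields locally dense structures can be checked on a toy case. The sketch below is not the paper's method or its R package; it only assumes a Gaussian model with a star-shaped precision matrix (one hub, four leaves) and marginalizes the hub via the Schur complement. The function names and the edge weight 0.3 are hypothetical choices made for illustration.

```python
import numpy as np

def marginal_precision(K, observed):
    """Precision matrix of the observed variables once the others are
    marginalized out, computed as the Schur complement of the hidden block."""
    K = np.asarray(K, dtype=float)
    observed = np.asarray(observed)
    hidden = np.setdiff1d(np.arange(K.shape[0]), observed)
    K_oo = K[np.ix_(observed, observed)]
    K_oh = K[np.ix_(observed, hidden)]
    K_hh = K[np.ix_(hidden, hidden)]
    return K_oo - K_oh @ np.linalg.solve(K_hh, K_oh.T)

def star_precision(n_leaves, weight=0.3):
    """Sparse precision matrix of a star graph: node 0 is the hub and is
    linked to every leaf; leaves are not linked to each other.
    The weight value is an illustrative assumption."""
    K = np.eye(n_leaves + 1)
    K[0, 1:] = weight
    K[1:, 0] = weight
    return K

K = star_precision(4)
K_marg = marginal_precision(K, observed=[1, 2, 3, 4])
# Leaves had no direct edges in K, but every off-diagonal entry
# of K_marg is non-zero once the hub is unobserved.
print(np.round(K_marg, 3))
```

In the full model the leaves are conditionally independent given the hub, yet the marginal precision matrix over the leaves is fully dense, which is the situation the abstract describes.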
Modeling heterogeneity in random graphs through latent space models: a selective review
We present a selective review on probabilistic modeling of heterogeneity in
random graphs. We focus on latent space models and more particularly on
stochastic block models and their extensions that have undergone major
developments in the last five years
Cooperation with public research institutions and success in innovation: Evidence from France and Germany
We evaluate the impact of cooperation with public research institutions on firms' innovative activities in France and Germany, using data from the fourth Community Innovation Survey (CIS4). We propose an original econometric methodology, which explicitly takes into account potential estimation biases arising from self-selection and endogeneity, and apply it to both process and product innovation. We find a positive effect of cooperation on both types of innovation. This effect is significant in both countries, but much higher in Germany than in France. Drawing on a comparison of the institutional context of cooperation across both countries, we interpret this difference as a consequence of the more diffusion-oriented German science policy. Finally, our robustness checks confirm the importance of controlling for selection and endogeneity. We show that these problems can be serious, and may lead to inconsistent estimates if neglected. Keywords: Public/private research partnerships, University/industry linkages, Innovativeness, Heckit procedure with endogenous regressors
Uncovering latent structure in valued graphs: A variational approach
As more and more network-structured data sets are available, the statistical
analysis of valued graphs has become commonplace. Looking for a latent
structure is one of the many strategies used to better understand the behavior
of a network. Several methods already exist for the binary case. We present a
model-based strategy to uncover groups of nodes in valued graphs. This
framework can be used for a wide span of parametric random graph models and
allows covariates to be included. Variational tools allow us to achieve approximate
maximum likelihood estimation of the parameters of these models. We provide a
simulation study showing that our estimation method performs well over a broad
range of situations. We apply this method to analyze host-parasite interaction
networks in forest ecosystems.
Comment: Published at http://dx.doi.org/10.1214/10-AOAS361 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org)
Degree-based goodness-of-fit tests for heterogeneous random graph models : independent and exchangeable cases
The degrees are a classical and relevant way to study the topology of a
network. They can be used to assess the goodness-of-fit for a given random
graph model. In this paper we introduce goodness-of-fit tests for two classes
of models. First, we consider the case of independent graph models such as the
heterogeneous Erdős-Rényi model in which the edges have different
connection probabilities. Second, we consider a generic model for exchangeable
random graphs called the W-graph. The stochastic block model and the expected
degree distribution model fall within this framework. We prove the asymptotic
normality of the degree mean square under these independent and exchangeable
models and derive formal tests. We study the power of the proposed tests and we
prove the asymptotic normality under specific sparsity regimes. The tests are
illustrated on real networks from social sciences and ecology, and their
performances are assessed via a simulation study
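As a rough illustration of a degree-based check, the sketch below computes a degree mean square and compares it with simulations from an independent-edge (heterogeneous Erdős-Rényi) model. Two assumptions are made here: the statistic is taken to be the plain average of squared degrees, since the abstract does not give its formula, and a Monte Carlo reference distribution stands in for the asymptotic normality result of the paper. All function names are hypothetical.

```python
import math
import numpy as np

def degree_mean_square(adj):
    """Average of the squared node degrees (assumed form of the statistic)."""
    degrees = np.asarray(adj).sum(axis=1)
    return float(np.mean(degrees ** 2))

def sample_independent_graph(prob, rng):
    """Draw an undirected graph without self-loops whose edges are
    independent, edge (i, j) being present with probability prob[i, j]."""
    n = prob.shape[0]
    upper = np.triu(rng.random((n, n)) < prob, k=1)
    return (upper | upper.T).astype(int)

def monte_carlo_test(adj, prob, n_sim=500, seed=0):
    """Standardize the observed statistic with simulated null draws and
    return (z, two-sided p-value from a normal approximation).
    Assumes the simulated statistics are not all identical."""
    rng = np.random.default_rng(seed)
    observed = degree_mean_square(adj)
    sims = np.array([
        degree_mean_square(sample_independent_graph(prob, rng))
        for _ in range(n_sim)
    ])
    z = (observed - sims.mean()) / sims.std()
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value
```

A call such as `monte_carlo_test(adj, prob)` takes a symmetric 0/1 adjacency matrix and a symmetric matrix of connection probabilities under the null model; a small p-value suggests the degrees are not compatible with that model.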
A closed-form approach to Bayesian inference in tree-structured graphical models
We consider the inference of the structure of an undirected graphical model
in a Bayesian framework. To avoid convergence issues and highly demanding
Monte Carlo sampling, we focus on exact inference. More specifically, we aim
at achieving the inference with closed-form posteriors, avoiding any sampling
step. This task would be intractable without any restriction on the considered
graphs, so we restrict the set of considered graphs to mixtures of spanning
trees. We investigate under which conditions on the priors, on both tree
structures and parameters, exact Bayesian inference can be achieved. Under
these conditions, we derive a fast and exact algorithm to compute the posterior
probability for an edge to belong to the tree model using an algebraic result
called the Matrix-Tree theorem. We show that the assumption we have made does
not prevent our approach from performing well on synthetic and flow cytometry
data
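The role of the Matrix-Tree theorem can be sketched on a generic weighted graph. This is not the paper's algorithm: it assumes a distribution over spanning trees proportional to the product of edge weights, with the weight matrix standing in for whatever prior and likelihood terms the model supplies. The normalizing constant is then a Laplacian cofactor, and the probability that an edge belongs to the tree equals its weight times the effective resistance between its endpoints. The function names are hypothetical.

```python
import numpy as np

def laplacian(weights):
    """Weighted graph Laplacian: degree matrix minus weight matrix."""
    W = np.asarray(weights, dtype=float)
    return np.diag(W.sum(axis=1)) - W

def tree_partition_function(weights):
    """Sum over all spanning trees of the product of their edge weights,
    obtained as a cofactor of the Laplacian (Matrix-Tree theorem)."""
    L = laplacian(weights)
    return float(np.linalg.det(L[1:, 1:]))

def edge_probabilities(weights):
    """P(edge (i, j) is in the tree) when P(T) is proportional to the
    product of the weights of the edges of T. Uses the identity
    probability = weight * effective resistance, with the resistance
    read off the pseudoinverse of the Laplacian."""
    W = np.asarray(weights, dtype=float)
    L_pinv = np.linalg.pinv(laplacian(W))
    d = np.diag(L_pinv)
    resistance = d[:, None] + d[None, :] - 2.0 * L_pinv
    return W * resistance

# Complete graph on 3 nodes with unit weights: 3 spanning trees,
# and each edge belongs to 2 of them.
W = np.ones((3, 3)) - np.eye(3)
print(tree_partition_function(W))
print(edge_probabilities(W))
```

No tree is ever enumerated: one determinant and one pseudoinverse give all edge probabilities at once, which is why a closed-form approach of this kind avoids sampling.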