1,875 research outputs found
Factorial graphical lasso for dynamic networks
Dynamic network models describe a growing number of important scientific
processes, from cell biology and epidemiology to sociology and finance. There
are many aspects of dynamical networks that require statistical considerations.
In this paper we focus on determining network structure. Estimating dynamic
networks is a difficult task since the number of components involved in the
system is very large. As a result, the number of parameters to be estimated is
bigger than the number of observations. However, a characteristic of many
networks is that they are sparse. For example, the molecular structure of genes
makes interactions with other components a highly structured and therefore
sparse process.
Penalized Gaussian graphical models have been used to estimate sparse
networks. However, the literature has focussed on static networks, which lack
specific temporal constraints. We propose a structured Gaussian dynamical
graphical model, where structures can consist of specific time dynamics, known
presence or absence of links and block equality constraints on the parameters.
Thus, the number of parameters to be estimated is reduced and the accuracy of the
estimates, including the identification of the network, can be improved. Here,
we show that the constrained optimization problem can be solved by taking
advantage of an efficient solver, logdetPPA, developed in convex optimization.
Moreover, model selection methods for checking the sensitivity of the inferred
networks are described. Finally, synthetic and real data illustrate the
proposed methodologies. Comment: 30 pp, 5 figures
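To see how much such structure can shrink the estimation problem, the sketch below counts free precision-matrix parameters under one illustrative set of constraints. The specific constraints (a first-order Markov structure in time, with the lag-0 and lag-1 blocks shared across time points) and the function name are assumptions chosen for illustration, not details taken from the paper.

```python
def count_precision_parameters(p, T):
    """Return (unconstrained, structured) free-parameter counts.

    p: number of network nodes observed at each time point
    T: number of time points
    """
    n = p * T
    # A symmetric n x n precision matrix with no constraints.
    unconstrained = n * (n + 1) // 2
    # Assumed structure: one symmetric lag-0 block shared by all time points...
    lag0 = p * (p + 1) // 2
    # ...one general p x p lag-1 block shared by all consecutive time pairs,
    # and zero blocks for every lag greater than one.
    lag1 = p * p
    return unconstrained, lag0 + lag1


unconstrained, structured = count_precision_parameters(p=10, T=5)
print(unconstrained, structured)  # 1275 155
```

With 10 nodes and 5 time points, these assumed constraints cut the count from 1275 to 155 parameters, which is the kind of reduction the abstract describes.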
MaxSkew and MultiSkew: Two R Packages for Detecting, Measuring and Removing Multivariate Skewness
Skewness plays a relevant role in several multivariate statistical
techniques. Sometimes it is used to recover data features, as in cluster
analysis. In other circumstances, skewness impairs the performance of
statistical methods, as in Hotelling's one-sample test. In both cases,
there is the need to check the symmetry of the underlying distribution, either
by visual inspection or by formal testing. The R packages MaxSkew and MultiSkew
address these issues by measuring, testing and removing skewness from
multivariate data. Skewness is assessed by the third multivariate cumulant and
its functions. The hypothesis of symmetry is tested either nonparametrically,
with the bootstrap, or parametrically, under the normality assumption. Skewness
is removed or at least alleviated by projecting the data onto appropriate
linear subspaces. The use of MaxSkew and MultiSkew is illustrated with the Iris
dataset.
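The idea of reducing skewness by projection can be illustrated with a small sketch. It uses the ordinary univariate skewness coefficient of projected data, which is simpler than the third multivariate cumulant the packages rely on, and it does not use the MaxSkew or MultiSkew interfaces. The toy data and the function names are assumptions made for illustration only.

```python
def skewness(values):
    """Sample skewness: third central moment over the 1.5 power of the second."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def project(rows, direction):
    """Project each observation (a tuple) onto a direction vector."""
    return [sum(x * d for x, d in zip(row, direction)) for row in rows]


# Toy bivariate data: the first coordinate is right-skewed,
# the second is symmetric about zero.
rows = [(0, -2), (0, -1), (0, 0), (1, 1), (5, 2)]

skew_first = skewness(project(rows, (1, 0)))   # projection keeping the skewed coordinate
skew_second = skewness(project(rows, (0, 1)))  # projection keeping the symmetric coordinate
print(skew_first > 0, skew_second == 0)
```

In this toy case, projecting onto the second axis gives a symmetric sample, while projecting onto the first axis keeps the skewness.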
An Algebraic Theory of Multi-Product Decisions
The typical firm produces for sale a plural number of distinct product lines. This paper characterizes the composition of a firm's optimal production vector as a function of cost and revenue function attributes. The approach taken applies mathematical group theory and revealed preference arguments to exploit controlled asymmetries in the production environment. Assuming some symmetry on the cost function, our central result shows that all optimal production vectors must satisfy a dominance relation on permutations of the firm's revenue function. When the revenue function is linear in outputs, then the set of admissible output vectors has linear bounds up to transformations. If these transformations are also linear, then convex analysis can be applied to characterize the set of admissible solutions. When the group of symmetries decomposes into a direct product group with index K in N, then the characterization problem separates into K problems of smaller dimension. The central result may be strengthened when the cost function is assumed to be quasiconvex.
High dimensional Sparse Gaussian Graphical Mixture Model
This paper considers the problem of network reconstruction from
heterogeneous data using a Gaussian Graphical Mixture Model (GGMM). It is well
known that parameter estimation in this context is challenging due to large
numbers of variables coupled with the degeneracy of the likelihood. We propose
as a solution a penalized maximum likelihood technique by imposing an l1
penalty on the precision matrix. Our approach shrinks the parameters thereby
resulting in better identifiability and variable selection. We use the
Expectation Maximization (EM) algorithm which involves the graphical LASSO to
estimate the mixing coefficients and the precision matrices. We show that under
certain regularity conditions the Penalized Maximum Likelihood (PML) estimates
are consistent. We demonstrate the performance of the PML estimator through
simulations and we show the utility of our method for high dimensional data
analysis in a genomic application.
Sparse Linear Identifiable Multivariate Modeling
In this paper we consider sparse and identifiable linear latent variable
(factor) and linear Bayesian network models for parsimonious analysis of
multivariate data. We propose a computationally efficient method for joint
parameter and model inference, and model comparison. It consists of a fully
Bayesian hierarchy for sparse models using slab and spike priors (two-component
delta-function and continuous mixtures), non-Gaussian latent factors and a
stochastic search over the ordering of the variables. The framework, which we
call SLIM (Sparse Linear Identifiable Multivariate modeling), is validated and
bench-marked on artificial and real biological data sets. SLIM is closest in
spirit to LiNGAM (Shimizu et al., 2006), but differs substantially in
inference, Bayesian network structure learning and model comparison.
Experimentally, SLIM performs as well as or better than LiNGAM with
comparable computational complexity. We attribute this mainly to the stochastic
search strategy used, and to parsimony (sparsity and identifiability), which is
an explicit part of the model. We propose two extensions to the basic i.i.d.
linear framework: non-linear dependence on observed variables, called SNIM
(Sparse Non-linear Identifiable Multivariate modeling) and allowing for
correlations between latent variables, called CSLIM (Correlated SLIM), for the
temporal and/or spatial data. The source code and scripts are available from
http://cogsys.imm.dtu.dk/slim/. Comment: 45 pages, 17 figures
Inference in Graphical Gaussian Models with Edge and Vertex Symmetries with the gRc Package for R
In this paper we present the R package gRc for statistical inference in graphical Gaussian models in which symmetry restrictions have been imposed on the concentration or partial correlation matrix. The models are represented by coloured graphs where parameters associated with edges or vertices of the same colour are restricted to being identical. We describe algorithms for maximum likelihood estimation and discuss model selection issues. The paper illustrates the practical use of the gRc package.
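The coloured-graph representation can be sketched as a mapping from colour classes to shared parameter values. The example below builds a concentration matrix in which vertices or edges of the same colour share one value. It is a minimal illustration in Python, not the gRc interface; the function name, colour labels and numeric values are all hypothetical.

```python
def build_concentration(n, vertex_colours, edge_colours, params):
    """Build an n x n concentration matrix from colour classes.

    vertex_colours: {vertex index: colour label} for diagonal entries
    edge_colours:   {(i, j): colour label} for off-diagonal entries
    params:         {colour label: shared parameter value}
    Entries for absent edges stay at zero.
    """
    K = [[0.0] * n for _ in range(n)]
    for v, colour in vertex_colours.items():
        K[v][v] = params[colour]
    for (i, j), colour in edge_colours.items():
        # Fill both triangles so the matrix stays symmetric.
        K[i][j] = K[j][i] = params[colour]
    return K


# Hypothetical 3-vertex chain: vertices 0 and 1 share a colour,
# and both edges share a colour, so only three parameters are free.
K = build_concentration(
    n=3,
    vertex_colours={0: "a", 1: "a", 2: "b"},
    edge_colours={(0, 1): "e", (1, 2): "e"},
    params={"a": 2.0, "b": 3.0, "e": -0.5},
)
for row in K:
    print(row)
```

Here six nonzero positions on and above the diagonal are governed by three free parameters, which is the kind of restriction the coloured graphs encode.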