Stability Approach to Regularization Selection (StARS) for High Dimensional Graphical Models
A challenging problem in estimating high-dimensional graphical models is to
choose the regularization parameter in a data-dependent way. The standard
techniques include K-fold cross-validation (K-CV), Akaike information
criterion (AIC), and Bayesian information criterion (BIC). Though these methods
work well for low-dimensional problems, they are not suitable in high
dimensional settings. In this paper, we present StARS: a new stability-based
method for choosing the regularization parameter in high dimensional inference
for undirected graphs. The method has a clear interpretation: we use the least
amount of regularization that simultaneously makes a graph sparse and
replicable under random sampling. This interpretation requires essentially no
conditions. Under mild conditions, we show that StARS is partially sparsistent
in terms of graph estimation: i.e. with high probability, all the true edges
will be included in the selected model even when the graph size diverges with
the sample size. Empirically, the performance of StARS is compared with the
state-of-the-art model selection procedures, including K-CV, AIC, and BIC, on
both synthetic data and a real microarray dataset. StARS outperforms all these
competing procedures.
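The selection rule described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the graphs estimated on each random subsample are already available as sets of edges, and the names `edge_instability` and `select_regularization`, the `2 * theta * (1 - theta)` instability measure, and the threshold `beta=0.05` are assumptions that the abstract itself does not state.

```python
from itertools import combinations


def edge_instability(graphs, p):
    """Average instability over all p*(p-1)/2 node pairs.

    graphs: list of edge sets, one per subsample; each edge is a
    frozenset({i, j}). Assumed input, produced by any graph estimator.
    """
    n_sub = len(graphs)
    total = 0.0
    for i, j in combinations(range(p), 2):
        edge = frozenset((i, j))
        # Fraction of subsamples in which this edge was selected.
        theta = sum(edge in g for g in graphs) / n_sub
        # Zero when the subsamples agree, largest when they split evenly.
        total += 2.0 * theta * (1.0 - theta)
    return total / (p * (p - 1) / 2)


def select_regularization(graphs_by_lambda, p, beta=0.05):
    """Return the smallest lambda whose running-max instability stays <= beta.

    graphs_by_lambda: dict mapping each candidate lambda to its list of
    subsample edge sets. Returns None if even the largest lambda is unstable.
    """
    chosen = None
    running_max = 0.0
    # Walk from the most regularized (sparsest) graphs to the least.
    for lam in sorted(graphs_by_lambda, reverse=True):
        running_max = max(running_max, edge_instability(graphs_by_lambda[lam], p))
        if running_max > beta:
            break
        chosen = lam
    return chosen


if __name__ == "__main__":
    e01, e12 = frozenset((0, 1)), frozenset((1, 2))
    candidates = {
        1.0: [set(), set(), set(), set()],            # empty graphs: stable
        0.5: [{e01}, {e01}, {e01}, {e01}],            # same edge everywhere: stable
        0.1: [{e01}, {e01, e12}, {e01}, {e01, e12}],  # edge (1, 2) flickers: unstable
    }
    print(select_regularization(candidates, p=3))
```

Taking the running maximum keeps the instability curve monotone as regularization decreases, so the rule stops at the first lambda where the subsample graphs begin to disagree.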
The Impact of Personality Traits Towards the Intention to Adopt Mobile Learning
Mobile devices have become increasingly common in the digitally connected world. Mobile learning, as a model of e-learning, refers to the acquisition of knowledge and skills using mobile technologies. The aim of this study is to identify the extrinsic factors that influence the adoption of mobile learning. This study proposes an extended technology acceptance model (TAM) that includes personality-trait variables such as perceived enjoyment and computer self-efficacy. The participants were 351 students at University Technology Malaysia who had experience with e-learning. Through an investigation of technology acceptance toward mobile learning, the study found that perceived usefulness, as an extrinsic factor, has the highest influence on students’ intention to adopt mobile learning. Personality traits such as perceived enjoyment and self-efficacy also have an impact on behavioral intention to adopt mobile learning.
Markov Network Structure Learning via Ensemble-of-Forests Models
Real world systems typically feature a variety of different dependency types
and topologies that complicate model selection for probabilistic graphical
models. We introduce the ensemble-of-forests model, a generalization of the
ensemble-of-trees model. Our model enables structure learning of Markov random
fields (MRF) with multiple connected components and arbitrary potentials. We
present two approximate inference techniques for this model and demonstrate
their performance on synthetic data. Our results suggest that the
ensemble-of-forests approach can accurately recover sparse, possibly
disconnected MRF topologies, even in the presence of non-Gaussian dependencies
and/or low sample size. We applied the ensemble-of-forests model to learn the
structure of perturbed signaling networks of immune cells and found that these
frequently exhibit non-Gaussian dependencies with disconnected MRF topologies.
In summary, we expect that the ensemble-of-forests model will enable MRF
structure learning in other high dimensional real world settings that are
governed by non-trivial dependencies.
Comment: 13 pages, 6 figures
Variational inference for sparse network reconstruction from count data
In multivariate statistics, the question of finding direct interactions can
be formulated as a problem of network inference - or network reconstruction -
for which the Gaussian graphical model (GGM) provides a canonical framework.
Unfortunately, the Gaussian assumption does not apply to count data which are
encountered in domains such as genomics, social sciences or ecology.
To circumvent this limitation, state-of-the-art approaches use two-step
strategies that first transform counts to pseudo Gaussian observations and then
apply a (partial) correlation-based approach from the abundant literature of
GGM inference. We adopt a different stance by relying on a latent model where
we directly model counts by means of Poisson distributions that are conditional
on latent (hidden) correlated Gaussian variables. In this multivariate Poisson
lognormal model, the dependency structure is completely captured by the latent
layer. This parametric model makes it possible to account for the effects of covariates
on the counts.
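For orientation, a latent model of this kind can be written as below. The notation is assumed for illustration and is not taken from the abstract: $Y_{ij}$ is the count of variable $j$ in sample $i$, $Z_i$ the latent Gaussian vector with precision matrix $\Omega$, $x_i$ the covariates with coefficients $\beta_j$, and $o_{ij}$ an offset such as the log sampling effort.

```latex
\begin{aligned}
Z_i &\sim \mathcal{N}\!\left(0,\ \Omega^{-1}\right), \\
Y_{ij} \mid Z_{ij} &\sim \mathcal{P}\!\left(\exp\!\left(o_{ij} + x_i^\top \beta_j + Z_{ij}\right)\right),
\qquad j = 1, \dots, p .
\end{aligned}
```

Under this formulation the counts are conditionally independent given $Z_i$, so a zero entry $\Omega_{jk} = 0$ corresponds to a missing edge between variables $j$ and $k$ in the latent layer.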
To perform network inference, we add a sparsity inducing constraint on the
inverse covariance matrix of the latent Gaussian vector. Unlike the usual
Gaussian setting, the penalized likelihood is generally not tractable, and we
resort instead to a variational approach for approximate likelihood
maximization. The corresponding optimization problem is solved by alternating a
gradient ascent on the variational parameters and a graphical-Lasso step on the
covariance matrix.
We show that our approach is highly competitive with the existing methods on
simulations inspired by microbiological data. We then illustrate on three
different data sets how accounting for sampling efforts via offsets and
integrating external covariates (which is rarely done in the existing
literature) drastically changes the topology of the inferred network.
- …