22,654 research outputs found
High-dimensional Ising model selection using -regularized logistic regression
We consider the problem of estimating the graph associated with a binary
Ising Markov random field. We describe a method based on -regularized
logistic regression, in which the neighborhood of any given node is estimated
by performing logistic regression subject to an -constraint. The method
is analyzed under high-dimensional scaling in which both the number of nodes
and maximum neighborhood size are allowed to grow as a function of the
number of observations . Our main results provide sufficient conditions on
the triple and the model parameters for the method to succeed in
consistently estimating the neighborhood of every node in the graph
simultaneously. With coherence conditions imposed on the population Fisher
information matrix, we prove that consistent neighborhood selection can be
obtained for sample sizes with exponentially decaying
error. When these same conditions are imposed directly on the sample matrices,
we show that a reduced sample size of suffices for the
method to estimate neighborhoods consistently. Although this paper focuses on
the binary graphical models, we indicate how a generalization of the method of
the paper would apply to general discrete Markov random fields.Comment: Published in at http://dx.doi.org/10.1214/09-AOS691 the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org
Learning Bayesian Networks with the bnlearn R Package
bnlearn is an R package which includes several algorithms for learning the
structure of Bayesian networks with either discrete or continuous variables.
Both constraint-based and score-based algorithms are implemented, and can use
the functionality provided by the snow package to improve their performance via
parallel computing. Several network scores and conditional independence
algorithms are available for both the learning algorithms and independent use.
Advanced plotting options are provided by the Rgraphviz package.Comment: 22 pages, 4 picture
Linear and Parallel Learning of Markov Random Fields
We introduce a new embarrassingly parallel parameter learning algorithm for
Markov random fields with untied parameters which is efficient for a large
class of practical models. Our algorithm parallelizes naturally over cliques
and, for graphs of bounded degree, its complexity is linear in the number of
cliques. Unlike its competitors, our algorithm is fully parallel and for
log-linear models it is also data efficient, requiring only the local
sufficient statistics of the data to estimate parameters
A survey of statistical network models
Networks are ubiquitous in science and have become a focal point for
discussion in everyday life. Formal statistical models for the analysis of
network data have emerged as a major topic of interest in diverse areas of
study, and most of these involve a form of graphical representation.
Probability models on graphs date back to 1959. Along with empirical studies in
social psychology and sociology from the 1960s, these early works generated an
active network community and a substantial literature in the 1970s. This effort
moved into the statistical literature in the late 1970s and 1980s, and the past
decade has seen a burgeoning network literature in statistical physics and
computer science. The growth of the World Wide Web and the emergence of online
networking communities such as Facebook, MySpace, and LinkedIn, and a host of
more specialized professional network communities has intensified interest in
the study of networks and network data. Our goal in this review is to provide
the reader with an entry point to this burgeoning literature. We begin with an
overview of the historical development of statistical network modeling and then
we introduce a number of examples that have been studied in the network
literature. Our subsequent discussion focuses on a number of prominent static
and dynamic network models and their interconnections. We emphasize formal
model descriptions, and pay special attention to the interpretation of
parameters and their estimation. We end with a description of some open
problems and challenges for machine learning and statistics.Comment: 96 pages, 14 figures, 333 reference
Graph Signal Processing: Overview, Challenges and Applications
Research in Graph Signal Processing (GSP) aims to develop tools for
processing data defined on irregular graph domains. In this paper we first
provide an overview of core ideas in GSP and their connection to conventional
digital signal processing. We then summarize recent developments in developing
basic GSP tools, including methods for sampling, filtering or graph learning.
Next, we review progress in several application areas using GSP, including
processing and analysis of sensor network data, biological data, and
applications to image processing and machine learning. We finish by providing a
brief historical perspective to highlight how concepts recently developed in
GSP build on top of prior research in other areas.Comment: To appear, Proceedings of the IEE
- …