A nonparametric empirical Bayes approach to covariance matrix estimation
We propose an empirical Bayes method to estimate high-dimensional covariance
matrices. Our procedure centers on vectorizing the covariance matrix and
treating matrix estimation as a vector estimation problem. Drawing from the
compound decision theory literature, we introduce a new class of decision rules
that generalizes several existing procedures. We then use a nonparametric
empirical Bayes g-modeling approach to estimate the oracle optimal rule in that
class. This allows us to let the data itself determine how best to shrink the
estimator, rather than shrinking in a pre-determined direction such as toward a
diagonal matrix. Simulation results and a gene expression network analysis
show that our approach can outperform a number of state-of-the-art proposals
in a wide range of settings, sometimes substantially.
Comment: 20 pages, 4 figures
Small area estimation of general parameters with application to poverty indicators: A hierarchical Bayes approach
Poverty maps are used to aid important political decisions such as allocation
of development funds by governments and international organizations. Those
decisions should be based on the most accurate poverty figures. However,
reliable poverty figures are often not available at fine geographical levels or for
particular risk population subgroups due to the sample size limitation of
current national surveys. These surveys cannot cover adequately all the desired
areas or population subgroups and, therefore, models relating the different
areas are needed to "borrow strength" from area to area. In particular, the
Spanish Survey on Income and Living Conditions (SILC) produces national poverty
estimates but cannot provide poverty estimates by Spanish provinces due to the
poor precision of direct estimates, which use only the province specific data.
It also raises the ethical question of whether poverty is more severe for women
than for men in a given province. We develop a hierarchical Bayes (HB) approach
for poverty mapping in Spanish provinces by gender that overcomes the small
province sample size problem of the SILC. The proposed approach has a wide
scope of application because it can be used to estimate general nonlinear
parameters. We use a Bayesian version of the nested error regression model in
which Markov chain Monte Carlo procedures and the convergence monitoring
therein are avoided. A simulation study reveals good frequentist properties of
the HB approach. The resulting poverty maps indicate that poverty, both in
frequency and intensity, is localized mostly in the southern and western
provinces and it is more acute for women than for men in most of the provinces.
Comment: Published at http://dx.doi.org/10.1214/13-AOAS702 in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org
Entropy inference and the James-Stein estimator, with application to nonlinear gene association networks
We present a procedure for effective estimation of entropy and mutual
information from small-sample data, and apply it to the problem of inferring
high-dimensional gene association networks. Specifically, we develop a
James-Stein-type shrinkage estimator, resulting in a procedure that is highly
efficient statistically as well as computationally. Despite its simplicity, we
show that it outperforms eight other entropy estimation procedures across a
diverse range of sampling scenarios and data-generating models, even in cases
of severe undersampling. We illustrate the approach by computing an
entropy-based gene-association network from E. coli gene expression data.
A computer program is available that implements the proposed shrinkage
estimator.
Comment: 18 pages, 3 figures, 1 table
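The shrinkage idea in this abstract can be sketched in a few lines. The abstract does not state the formulas, so the following is only a minimal sketch under stated assumptions: the shrinkage target is taken to be the uniform distribution, the shrinkage intensity uses the usual James-Stein-type closed form, and the function name `js_shrinkage_entropy` is hypothetical rather than the name used in the authors' program.

```python
import math


def js_shrinkage_entropy(counts):
    """Plug-in entropy (in nats) from shrunken cell frequencies.

    Assumptions (not taken from the abstract): the shrinkage target is
    uniform, and the intensity is
        lam = (1 - sum(ml_k^2)) / ((n - 1) * sum((t_k - ml_k)^2)),
    clipped to [0, 1].
    """
    n = sum(counts)
    p = len(counts)
    if p < 1 or n < 2:
        raise ValueError("need at least one cell and two observations")

    ml = [c / n for c in counts]   # maximum-likelihood frequencies
    target = 1.0 / p               # uniform target (assumed)

    denom = (n - 1) * sum((target - f) ** 2 for f in ml)
    if denom == 0.0:
        lam = 1.0                  # ML estimate already equals the target
    else:
        lam = (1.0 - sum(f * f for f in ml)) / denom
        lam = min(1.0, max(0.0, lam))

    shrunk = [lam * target + (1.0 - lam) * f for f in ml]
    entropy = -sum(q * math.log(q) for q in shrunk if q > 0.0)
    return entropy, lam


def ml_entropy(counts):
    """Plain maximum-likelihood plug-in entropy, for comparison."""
    n = sum(counts)
    return -sum((c / n) * math.log(c / n) for c in counts if c > 0)


# Sparse counts: few observations spread over many cells.
sparse_counts = [4, 2, 3, 0, 2, 4, 0, 0, 2, 1, 1]
h_js, lam = js_shrinkage_entropy(sparse_counts)
h_ml = ml_entropy(sparse_counts)
```

Because the shrunken frequencies are a mixture of the observed frequencies and the uniform distribution, the resulting entropy is never below the maximum-likelihood value, which counteracts the downward bias of the plug-in estimate in undersampled data.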
Feature Augmentation via Nonparametrics and Selection (FANS) in High Dimensional Classification
We propose a high dimensional classification method that involves
nonparametric feature augmentation. Knowing that marginal density ratios are
the most powerful univariate classifiers, we use the ratio estimates to
transform the original feature measurements. Subsequently, penalized logistic
regression is invoked, taking as input the newly transformed or augmented
features. This procedure trains models equipped with local complexity and
global simplicity, thereby avoiding the curse of dimensionality while creating
a flexible nonlinear decision boundary. The resulting method is called Feature
Augmentation via Nonparametrics and Selection (FANS). We motivate FANS by
generalizing the Naive Bayes model, writing the log ratio of joint densities as
a linear combination of those of marginal densities. It is related to
generalized additive models, but has better interpretability and computability.
Risk bounds are developed for FANS. In numerical analysis, FANS is compared
with competing methods, so as to provide a guideline on its best application
domain. Real data analysis demonstrates that FANS performs very competitively
on benchmark email spam and gene expression data sets. Moreover, FANS is
implemented by an extremely fast algorithm through parallel computing.
Comment: 30 pages, 2 figures
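The feature-augmentation step described in this abstract, replacing each feature by an estimate of its log marginal density ratio, can be illustrated as follows. This is a hedged sketch, not the authors' implementation: the Gaussian kernel density estimator, the rule-of-thumb bandwidth, the `eps` guard, and all function names are assumptions made for illustration, and the penalized logistic regression that would be fitted on the transformed features is left out.

```python
import math
import random


def silverman_bandwidth(sample):
    """Rule-of-thumb bandwidth for a 1-D Gaussian KDE (an assumed choice)."""
    n = len(sample)
    mean = sum(sample) / n
    sd = math.sqrt(sum((s - mean) ** 2 for s in sample) / (n - 1))
    return 1.06 * sd * n ** (-0.2)


def gaussian_kde(sample, bandwidth):
    """Return a function x -> estimated density at x."""
    norm = 1.0 / (len(sample) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in sample
        )

    return density


def fit_log_ratio_transforms(X, y, eps=1e-12):
    """One transform per feature j: x -> log f1_j(x) - log f0_j(x)."""
    transforms = []
    for j in range(len(X[0])):
        col1 = [row[j] for row, label in zip(X, y) if label == 1]
        col0 = [row[j] for row, label in zip(X, y) if label == 0]
        f1 = gaussian_kde(col1, silverman_bandwidth(col1))
        f0 = gaussian_kde(col0, silverman_bandwidth(col0))
        # eps guards against log(0) far from the training data.
        transforms.append(
            lambda x, f1=f1, f0=f0: math.log(f1(x) + eps) - math.log(f0(x) + eps)
        )
    return transforms


def augment(X, transforms):
    """Apply the per-feature transforms to every row of X."""
    return [[t(v) for t, v in zip(transforms, row)] for row in X]


# Toy data: feature 0 separates the classes, feature 1 is pure noise.
rng = random.Random(0)
X = [[rng.gauss(2.0 * label, 1.0), rng.gauss(0.0, 1.0)]
     for label in (0, 1) for _ in range(100)]
y = [label for label in (0, 1) for _ in range(100)]

Z = augment(X, fit_log_ratio_transforms(X, y))
```

On the informative feature the transformed values tend to be positive for class 1 and negative for class 0, while the noise feature maps to values near zero, which is what lets a sparse linear classifier on `Z` ignore it.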
Structure Learning in Coupled Dynamical Systems and Dynamic Causal Modelling
Identifying a coupled dynamical system out of many plausible candidates, each
of which could serve as the underlying generator of some observed measurements,
is a profoundly ill-posed problem that commonly arises when modelling
real-world phenomena. In this review, we detail a set of statistical procedures for
inferring the structure of nonlinear coupled dynamical systems (structure
learning), which has proved useful in neuroscience research. A key focus here
is the comparison of competing models of (i.e., hypotheses about) network
architectures and implicit coupling functions in terms of their Bayesian model
evidence. These methods are collectively referred to as dynamic causal
modelling (DCM). We focus on a relatively new approach that is proving
remarkably useful; namely, Bayesian model reduction (BMR), which enables rapid
evaluation and comparison of models that differ in their network architecture.
We illustrate the usefulness of these techniques through modelling
neurovascular coupling (cellular pathways linking neuronal and vascular
systems), whose function is an active focus of research in neurobiology and the
imaging of coupled neuronal systems.
Kernel Bayes' rule
A nonparametric kernel-based method for realizing Bayes' rule is proposed,
based on representations of probabilities in reproducing kernel Hilbert spaces.
Probabilities are uniquely characterized by the mean of the canonical map to
the RKHS. The prior and conditional probabilities are expressed in terms of
RKHS functions of an empirical sample: no explicit parametric model is needed
for these quantities. The posterior is likewise an RKHS mean of a weighted
sample. The estimator for the expectation of a function of the posterior is
derived, and rates of consistency are shown. Some representative applications
of the kernel Bayes' rule are presented, including Bayesian computation without
likelihood and filtering with a nonparametric state-space model.
Comment: 27 pages, 5 figures
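The building block mentioned in this abstract, representing a distribution by the RKHS mean of a (possibly weighted) sample, can be sketched as below. This is not the kernel Bayes' rule itself, only an illustration of the embedding it relies on; the Gaussian kernel, the bandwidth `sigma`, and the function names are assumptions.

```python
import math


def gaussian_kernel(x, y, sigma=1.0):
    """Gaussian (RBF) kernel on the real line (an assumed kernel choice)."""
    return math.exp(-((x - y) ** 2) / (2.0 * sigma ** 2))


def mean_embedding(sample, weights=None, sigma=1.0):
    """Return the RKHS function z -> sum_i w_i k(x_i, z).

    With uniform weights this is the empirical kernel mean; other
    weights give the mean of a weighted sample.
    """
    if weights is None:
        weights = [1.0 / len(sample)] * len(sample)

    def mu(z):
        return sum(w * gaussian_kernel(x, z, sigma)
                   for w, x in zip(weights, sample))

    return mu


def embedding_distance_sq(xs, ys, sigma=1.0):
    """Squared RKHS distance between two empirical kernel means."""
    def avg(a, b):
        return sum(gaussian_kernel(u, v, sigma)
                   for u in a for v in b) / (len(a) * len(b))

    return avg(xs, xs) - 2.0 * avg(xs, ys) + avg(ys, ys)


near_zero = [0.0, 0.1, -0.1]
near_three = [3.0, 3.1, 2.9]

same = embedding_distance_sq(near_zero, near_zero)
different = embedding_distance_sq(near_zero, near_three)
```

The distance between the embeddings of a sample and itself is zero, while samples drawn from clearly different locations are far apart in the RKHS, which is the sense in which the mean element identifies the underlying distribution.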