Fast, Exact Bootstrap Principal Component Analysis for p>1 million
Many have suggested a bootstrap procedure for estimating the sampling
variability of principal component analysis (PCA) results. However, when the
number of measurements per subject (p) is much larger than the number of
subjects (n), calculating and storing the leading principal components from
each bootstrap sample can be computationally infeasible. To
address this, we outline methods for fast, exact calculation of bootstrap
principal components, eigenvalues, and scores. Our methods leverage the fact
that all bootstrap samples occupy the same n-dimensional subspace as the
original sample. As a result, all bootstrap principal components are confined
to the same n-dimensional subspace and can be efficiently represented by their
low-dimensional coordinates in that subspace. Several uncertainty metrics can
be computed solely from the bootstrap distribution of these low-dimensional
coordinates, without calculating or storing the p-dimensional bootstrap
components. Fast bootstrap PCA is applied to a dataset of sleep
electroencephalogram (EEG) recordings, and to a dataset of
brain magnetic resonance images (MRIs) with p ≈ 3 million. For the
brain MRI dataset, our method allows for standard errors for the first 3
principal components based on 1000 bootstrap samples to be calculated on a
standard laptop in 47 minutes, as opposed to approximately 4 days with standard
methods.
Comment: 25 pages, including 9 figures and a link to an R package. 2014-05-14 update: final formatting edits for journal submission, condensed figures.
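The core trick lends itself to a compact implementation. Below is a minimal NumPy sketch of the idea, assuming a data matrix X of shape (p, n) with one subject per column; the function name, arguments, and sign-alignment rule are illustrative assumptions, not the API of the paper's R package.

```python
import numpy as np

def fast_bootstrap_pca_se(X, n_boot=1000, k=3, seed=0):
    """Bootstrap standard errors for the first k PCs of X (p x n), p >> n.
    Illustrative sketch of the low-dimensional bootstrap PCA idea."""
    rng = np.random.default_rng(seed)
    p, n = X.shape
    Xc = X - X.mean(axis=1, keepdims=True)
    # One thin SVD of the full data: Xc = U @ diag(d) @ Vt, with U (p x n).
    U, d, Vt = np.linalg.svd(Xc, full_matrices=False)
    # Low-dimensional subject coordinates: column i of S is subject i
    # expressed in the n-dimensional subspace spanned by the columns of U.
    S = d[:, None] * Vt                              # (n, n)
    A = np.empty((n_boot, n, k))                     # bootstrap PCs, low-dim coords
    for b in range(n_boot):
        Sb = S[:, rng.integers(0, n, size=n)]        # resample subjects w/ replacement
        Sb = Sb - Sb.mean(axis=1, keepdims=True)
        Ub, _, _ = np.linalg.svd(Sb, full_matrices=False)  # n x n problem only
        Ab = Ub[:, :k]
        # Resolve the sign ambiguity: align with the original components,
        # whose low-dim coordinates are the standard basis vectors e_1..e_k.
        signs = np.sign(np.diagonal(Ab))
        signs[signs == 0] = 1.0
        A[b] = Ab * signs
    # Element-wise SEs of the p-dimensional components, computed from the
    # bootstrap covariance of the n-dimensional coordinates alone.
    se = np.empty((p, k))
    for c in range(k):
        C = np.cov(A[:, :, c], rowvar=False)         # (n, n) bootstrap covariance
        se[:, c] = np.sqrt(np.einsum('ij,ij->i', U @ C, U))  # diag(U C U^T)
    return se
```

Because every SVD after the first is of an n x n matrix, the cost per bootstrap sample is independent of p; only the final mapping of the covariance summaries through U touches the p-dimensional space.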
No Spare Parts: Sharing Part Detectors for Image Categorization
This work aims for image categorization using a representation of distinctive
parts. Different from existing part-based work, we argue that parts are
naturally shared between image categories and should be modeled as such. We
motivate our approach with a quantitative and qualitative analysis by
backtracking where selected parts come from. Our analysis shows that, in
addition to the parts that define the category itself, parts drawn from the
background context and from other image categories also improve categorization
performance. Part selection should therefore not be done separately for each category,
but instead be shared and optimized over all categories. To incorporate part
sharing between categories, we present an algorithm based on AdaBoost to
jointly optimize part sharing and selection, as well as fusion with the global
image representation. We achieve results competitive with the state of the art on
object, scene, and action categories, further improving over deep convolutional
neural networks.
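To make the sharing idea concrete, here is a toy sketch of AdaBoost-style part sharing across categories: each boosting round selects one decision stump over part-detector scores, scored by its total weighted error summed over all one-vs-rest tasks, so a single part can serve many categories. The stump pool, thresholding rule, and greedy selection are simplifying assumptions, not the authors' exact algorithm.

```python
import numpy as np

def shared_adaboost(F, y, n_rounds=10):
    """F: (n_samples, n_parts) part-detector scores; y: integer labels 0..C-1."""
    n, m = F.shape
    C = y.max() + 1
    Y = np.where(y[:, None] == np.arange(C), 1.0, -1.0)  # one-vs-rest targets
    W = np.full((n, C), 1.0 / n)                         # per-task sample weights
    ensemble = []
    for _ in range(n_rounds):
        best = None
        for j in range(m):                               # every part is a candidate
            thr = np.median(F[:, j])
            h = np.where(F[:, j] > thr, 1.0, -1.0)       # stump shared by all tasks
            err = ((h[:, None] != Y) * W).sum(axis=0)    # weighted error per task
            if best is None or err.sum() < best[0]:      # pick the best SHARED part
                best = (err.sum(), j, thr, h, err)
        _, j, thr, h, err = best
        err = np.clip(err, 1e-10, 1 - 1e-10)
        alphas = 0.5 * np.log((1 - err) / err)           # one vote weight per task
        W *= np.exp(-alphas * Y * h[:, None])            # AdaBoost reweighting
        W /= W.sum(axis=0, keepdims=True)
        ensemble.append((j, thr, alphas))
    return ensemble

def predict(ensemble, F):
    """Sum the shared stumps' votes per category; highest score wins."""
    score = np.zeros((F.shape[0], len(ensemble[0][2])))
    for j, thr, alphas in ensemble:
        h = np.where(F[:, j] > thr, 1.0, -1.0)
        score += h[:, None] * alphas
    return score.argmax(axis=1)
```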
Crowd Counting with Decomposed Uncertainty
Research on neural networks in computer vision has achieved
remarkable accuracy in point estimation. However, the uncertainty of those
estimates is rarely addressed. Uncertainty quantification accompanied by point
estimation can lead to a more informed decision, and even improve the
prediction quality. In this work, we focus on uncertainty estimation in the
domain of crowd counting. With the growing frequency of heavily crowded events
such as political rallies, protests, and concerts, automated crowd analysis
is becoming an increasingly crucial task. The stakes can be very high in many
of these real-world applications. We propose a scalable neural network
framework with quantification of decomposed uncertainty using a bootstrap
ensemble. We demonstrate that the proposed uncertainty quantification method
provides additional insight to the crowd counting problem and is simple to
implement. We also show that our proposed method achieves state-of-the-art
performance on many benchmark crowd counting datasets.
Comment: Accepted in AAAI 2020 (Main Technical Track).
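The decomposition itself is simple to state: train each ensemble member on a bootstrap resample, read epistemic (model) uncertainty off the spread of the members' predictions, and aleatoric (data) uncertainty off the average of their predicted noise variances. The sketch below illustrates this with a toy polynomial regressor standing in for the crowd counting network; all names are illustrative.

```python
import numpy as np

def bootstrap_ensemble(x, y, x_test, n_models=20, deg=3, seed=0):
    """Toy bootstrap ensemble with decomposed predictive uncertainty."""
    rng = np.random.default_rng(seed)
    means, noise_vars = [], []
    n = len(x)
    for _ in range(n_models):
        idx = rng.integers(0, n, size=n)            # bootstrap resample
        coefs = np.polyfit(x[idx], y[idx], deg)     # one ensemble member
        means.append(np.polyval(coefs, x_test))
        resid = y[idx] - np.polyval(coefs, x[idx])  # member's noise estimate
        noise_vars.append(np.full(x_test.shape, resid.var()))
    means, noise_vars = np.array(means), np.array(noise_vars)
    pred = means.mean(axis=0)            # ensemble point estimate
    epistemic = means.var(axis=0)        # model uncertainty: ensemble spread
    aleatoric = noise_vars.mean(axis=0)  # data noise: average member variance
    return pred, epistemic, aleatoric
```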
Bayesian uncertainty quantification in linear models for diffusion MRI
Diffusion MRI (dMRI) is a valuable tool in the assessment of tissue
microstructure. By fitting a model to the dMRI signal it is possible to derive
various quantitative features. Several of the most popular dMRI signal models
are expansions in an appropriately chosen basis, where the coefficients are
determined using some variation of least-squares. However, such approaches lack
any notion of uncertainty, which could be valuable in e.g. group analyses. In
this work, we use a probabilistic interpretation of linear least-squares
methods to recast popular dMRI models as Bayesian ones. This makes it possible
to quantify the uncertainty of any derived quantity. In particular, for
quantities that are affine functions of the coefficients, the posterior
distribution can be expressed in closed-form. We simulated measurements from
single- and double-tensor models where the correct values of several quantities
are known, to validate that the theoretically derived quantiles agree with
those observed empirically. We included results from residual bootstrap for
comparison and found good agreement. The validation employed several different
models: Diffusion Tensor Imaging (DTI), Mean Apparent Propagator MRI (MAP-MRI)
and Constrained Spherical Deconvolution (CSD). We also used in vivo data to
visualize maps of quantitative features and corresponding uncertainties, and to
show how our approach can be used in a group analysis to downweight subjects
with high uncertainty. In summary, we convert successful linear models for dMRI
signal estimation to probabilistic models, capable of accurate uncertainty
quantification.
Comment: Added results from a group analysis and a comparison with residual bootstrap.
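The closed-form machinery is standard Bayesian linear regression. The sketch below assumes a design matrix Phi built from some chosen basis (the paper's DTI, MAP-MRI, and CSD models would each supply their own), an isotropic Gaussian prior, and a known noise level; these are illustrative assumptions.

```python
import numpy as np

def posterior(Phi, y, sigma=1.0, tau=10.0):
    """Posterior N(mu, Sigma) of w for y = Phi @ w + noise,
    with prior w ~ N(0, tau^2 I) and noise ~ N(0, sigma^2 I)."""
    m = Phi.shape[1]
    Sigma_inv = Phi.T @ Phi / sigma**2 + np.eye(m) / tau**2
    Sigma = np.linalg.inv(Sigma_inv)
    mu = Sigma @ Phi.T @ y / sigma**2
    return mu, Sigma

def affine_quantity(mu, Sigma, a, b=0.0):
    """Exact posterior mean and std of q = a @ w + b (no sampling needed)."""
    return a @ mu + b, np.sqrt(a @ Sigma @ a)
```

For example, any quantitative feature that is a weighted sum of the basis coefficients gets an exact posterior standard deviation from affine_quantity, which is what makes downweighting high-uncertainty subjects in a group analysis cheap.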
Bayesian astrostatistics: a backward look to the future
This perspective chapter briefly surveys: (1) past growth in the use of
Bayesian methods in astrophysics; (2) current misconceptions about both
frequentist and Bayesian statistical inference that hinder wider adoption of
Bayesian methods by astronomers; and (3) multilevel (hierarchical) Bayesian
modeling as a major future direction for research in Bayesian astrostatistics,
exemplified in part by presentations at the first ISI invited session on
astrostatistics, commemorated in this volume. It closes with an intentionally
provocative recommendation for astronomical survey data reporting, motivated by
the multilevel Bayesian perspective on modeling cosmic populations: that
astronomers cease producing catalogs of estimated fluxes and other source
properties from surveys. Instead, summaries of likelihood functions (or
marginal likelihood functions) for source properties should be reported (not
posterior probability density functions), including nontrivial summaries (not
simply upper limits) for candidate objects that do not pass traditional
detection thresholds.
Comment: 27 pp, 4 figures. A lightly revised version of a chapter in "Astrostatistical Challenges for the New Astronomy" (Joseph M. Hilbe, ed., Springer, New York, forthcoming in 2012), the inaugural volume for the Springer Series in Astrostatistics. Version 2 has minor clarifications and an additional reference.