Uncertainty quantification in graph-based classification of high dimensional data
Classification of high dimensional data finds wide-ranging applications. In
many of these applications equipping the resulting classification with a
measure of uncertainty may be as important as the classification itself. In
this paper we introduce, develop algorithms for, and investigate the properties
of, a variety of Bayesian models for the task of binary classification; via the
posterior distribution on the classification labels, these methods
automatically give measures of uncertainty. The methods are all based around
the graph formulation of semi-supervised learning.
We provide a unified framework which brings together a variety of methods
which have been introduced in different communities within the mathematical
sciences. We study probit classification in the graph-based setting, generalize
the level-set method for Bayesian inverse problems to the classification
setting, and generalize the Ginzburg-Landau optimization-based classifier to a
Bayesian setting; we also show that the probit and level set approaches are
natural relaxations of the harmonic function approach introduced in [Zhu et al.
2003].
We introduce efficient numerical methods, suited to large data-sets, for both
MCMC-based sampling as well as gradient-based MAP estimation. Through numerical
experiments we study classification accuracy and uncertainty quantification for
our models; these experiments showcase a suite of datasets commonly used to
evaluate graph-based semi-supervised learning algorithms.
Comment: 33 pages, 14 figures
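As a rough illustration of the harmonic function approach that the abstract cites from [Zhu et al. 2003], the sketch below propagates two known labels across a small graph by repeatedly setting each unlabeled node to the weighted average of its neighbours. It is not the paper's probit, level-set, or Ginzburg-Landau method. The function name `harmonic_labels`, the five-node path graph, and the iteration count are all assumptions chosen for the example.

```python
def harmonic_labels(weights, labels, n_iter=500):
    """Approximate the harmonic function on a weighted graph.

    weights: symmetric n x n list of lists with a zero diagonal (hypothetical input format).
    labels:  dict mapping labeled node index -> +1 or -1.
    Returns a list of real values; the sign of each value is the predicted class.
    """
    n = len(weights)
    # Labeled nodes are clamped to their labels; unlabeled nodes start at 0.
    f = [float(labels.get(i, 0.0)) for i in range(n)]
    for _ in range(n_iter):
        for i in range(n):
            if i in labels:
                continue  # keep known labels fixed
            degree = sum(weights[i])
            if degree > 0:
                # Harmonic property: value equals the weighted mean of the neighbours.
                f[i] = sum(weights[i][j] * f[j] for j in range(n)) / degree
    return f


# Toy example (assumed): path graph 0-1-2-3-4 with unit edge weights.
n_nodes = 5
path_weights = [[0.0] * n_nodes for _ in range(n_nodes)]
for i in range(n_nodes - 1):
    path_weights[i][i + 1] = 1.0
    path_weights[i + 1][i] = 1.0

# Node 0 is labeled +1 and node 4 is labeled -1; nodes 1-3 are unlabeled.
f = harmonic_labels(path_weights, {0: 1, 4: -1})
predicted = [1 if value >= 0 else -1 for value in f]
```

On this path graph the values interpolate linearly between the two labeled endpoints, so the middle node sits on the decision boundary. The Bayesian models described in the abstract replace this single point estimate with a posterior distribution over labels.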
Semi-Supervised First-Person Activity Recognition in Body-Worn Video
Body-worn cameras are now commonly used for logging daily life, sports, and
law enforcement activities, creating a large volume of archived footage. This
paper studies the problem of classifying frames of footage according to the
activity of the camera-wearer with an emphasis on application to real-world
police body-worn video. Real-world datasets pose a different set of challenges
from existing egocentric vision datasets: the amount of footage of different
activities is unbalanced, the data contains personally identifiable
information, and in practice it is difficult to provide substantial training
footage for a supervised approach. We address these challenges by extracting
features based exclusively on motion information and then segmenting the video
footage using a semi-supervised classification algorithm. On publicly available
datasets, our method achieves results comparable to, if not better than,
supervised and/or deep learning methods using a fraction of the training data.
It also shows promising results on real-world police body-worn video.
Data granulation by the principles of uncertainty
Research in granular modeling has produced a variety of mathematical models,
such as intervals, (higher-order) fuzzy sets, rough sets, and shadowed sets,
which are all suitable to characterize the so-called information granules.
Modeling of the input data uncertainty is recognized as a crucial aspect in
information granulation. Moreover, the uncertainty is a well-studied concept in
many mathematical settings, such as those of probability theory, fuzzy set
theory, and possibility theory. This fact suggests that an appropriate
quantification of the uncertainty expressed by the information granule model
could be used to define an invariant property, to be exploited in practical
situations of information granulation. In this perspective, a procedure of
information granulation is effective if the uncertainty conveyed by the
synthesized information granule is in a monotonically increasing relation with
the uncertainty of the input data. In this paper, we present a data granulation
framework that builds on the principles of uncertainty introduced by
Klir. Because uncertainty is a mesoscopic descriptor of systems and data, it is
possible to apply such principles regardless of the input data type and the
specific mathematical setting adopted for the information granules. The
proposed framework is conceived (i) to offer a guideline for the synthesis of
information granules and (ii) to lay the groundwork for comparing and
quantitatively judging different data granulation procedures. To provide a
suitable case study, we introduce a new data granulation technique based on the
minimum sum of distances, which is designed to generate type-2 fuzzy sets. We
analyze the procedure by performing different experiments on two distinct data
types: feature vectors and labeled graphs. Results show that the uncertainty of
the input data is suitably conveyed by the generated type-2 fuzzy set models.
Comment: 16 pages, 9 figures, 52 references
Bayesian astrostatistics: a backward look to the future
This perspective chapter briefly surveys: (1) past growth in the use of
Bayesian methods in astrophysics; (2) current misconceptions about both
frequentist and Bayesian statistical inference that hinder wider adoption of
Bayesian methods by astronomers; and (3) multilevel (hierarchical) Bayesian
modeling as a major future direction for research in Bayesian astrostatistics,
exemplified in part by presentations at the first ISI invited session on
astrostatistics, commemorated in this volume. It closes with an intentionally
provocative recommendation for astronomical survey data reporting, motivated by
the multilevel Bayesian perspective on modeling cosmic populations: that
astronomers cease producing catalogs of estimated fluxes and other source
properties from surveys. Instead, summaries of likelihood functions (or
marginal likelihood functions) for source properties should be reported (not
posterior probability density functions), including nontrivial summaries (not
simply upper limits) for candidate objects that do not pass traditional
detection thresholds.
Comment: 27 pp, 4 figures. A lightly revised version of a chapter in
"Astrostatistical Challenges for the New Astronomy" (Joseph M. Hilbe, ed.,
Springer, New York, forthcoming in 2012), the inaugural volume for the
Springer Series in Astrostatistics. Version 2 has minor clarifications and an
additional reference
Uncertainty quantification for semi-supervised multi-class classification in image processing and ego-motion analysis of body-worn videos
Semi-supervised learning uses underlying relationships in data with a scarcity of ground-truth labels. In this paper, we introduce an uncertainty quantification (UQ) method for graph-based semi-supervised multi-class classification problems. We not only predict the class label for each data point, but also provide a confidence score for the prediction. We adopt a Bayesian approach and propose a graphical multi-class probit model together with an effective Gibbs sampling procedure. Furthermore, we propose a confidence measure for each data point that correlates with the classification performance. We use the empirical properties of the proposed confidence measure to guide the design of a human-in-the-loop system. The uncertainty quantification algorithm and the human-in-the-loop system are successfully applied to classification problems in image processing and ego-motion analysis of body-worn videos.