31,769 research outputs found

    Uncertainty quantification in graph-based classification of high dimensional data

    Get PDF
    Classification of high dimensional data finds wide-ranging applications. In many of these applications equipping the resulting classification with a measure of uncertainty may be as important as the classification itself. In this paper we introduce, develop algorithms for, and investigate the properties of, a variety of Bayesian models for the task of binary classification; via the posterior distribution on the classification labels, these methods automatically give measures of uncertainty. The methods are all based around the graph formulation of semi-supervised learning. We provide a unified framework which brings together a variety of methods which have been introduced in different communities within the mathematical sciences. We study probit classification in the graph-based setting, generalize the level-set method for Bayesian inverse problems to the classification setting, and generalize the Ginzburg-Landau optimization-based classifier to a Bayesian setting; we also show that the probit and level set approaches are natural relaxations of the harmonic function approach introduced in [Zhu et al 2003]. We introduce efficient numerical methods, suited to large data-sets, for both MCMC-based sampling as well as gradient-based MAP estimation. Through numerical experiments we study classification accuracy and uncertainty quantification for our models; these experiments showcase a suite of datasets commonly used to evaluate graph-based semi-supervised learning algorithms.Comment: 33 pages, 14 figure

    Semi-Supervised First-Person Activity Recognition in Body-Worn Video

    Get PDF
    Body-worn cameras are now commonly used for logging daily life, sports, and law enforcement activities, creating a large volume of archived footage. This paper studies the problem of classifying frames of footage according to the activity of the camera-wearer with an emphasis on application to real-world police body-worn video. Real-world datasets pose a different set of challenges from existing egocentric vision datasets: the amount of footage of different activities is unbalanced, the data contains personally identifiable information, and in practice it is difficult to provide substantial training footage for a supervised approach. We address these challenges by extracting features based exclusively on motion information then segmenting the video footage using a semi-supervised classification algorithm. On publicly available datasets, our method achieves results comparable to, if not better than, supervised and/or deep learning methods using a fraction of the training data. It also shows promising results on real-world police body-worn video

    Data granulation by the principles of uncertainty

    Full text link
    Researches in granular modeling produced a variety of mathematical models, such as intervals, (higher-order) fuzzy sets, rough sets, and shadowed sets, which are all suitable to characterize the so-called information granules. Modeling of the input data uncertainty is recognized as a crucial aspect in information granulation. Moreover, the uncertainty is a well-studied concept in many mathematical settings, such as those of probability theory, fuzzy set theory, and possibility theory. This fact suggests that an appropriate quantification of the uncertainty expressed by the information granule model could be used to define an invariant property, to be exploited in practical situations of information granulation. In this perspective, a procedure of information granulation is effective if the uncertainty conveyed by the synthesized information granule is in a monotonically increasing relation with the uncertainty of the input data. In this paper, we present a data granulation framework that elaborates over the principles of uncertainty introduced by Klir. Being the uncertainty a mesoscopic descriptor of systems and data, it is possible to apply such principles regardless of the input data type and the specific mathematical setting adopted for the information granules. The proposed framework is conceived (i) to offer a guideline for the synthesis of information granules and (ii) to build a groundwork to compare and quantitatively judge over different data granulation procedures. To provide a suitable case study, we introduce a new data granulation technique based on the minimum sum of distances, which is designed to generate type-2 fuzzy sets. We analyze the procedure by performing different experiments on two distinct data types: feature vectors and labeled graphs. Results show that the uncertainty of the input data is suitably conveyed by the generated type-2 fuzzy set models.Comment: 16 pages, 9 figures, 52 reference

    Bayesian astrostatistics: a backward look to the future

    Full text link
    This perspective chapter briefly surveys: (1) past growth in the use of Bayesian methods in astrophysics; (2) current misconceptions about both frequentist and Bayesian statistical inference that hinder wider adoption of Bayesian methods by astronomers; and (3) multilevel (hierarchical) Bayesian modeling as a major future direction for research in Bayesian astrostatistics, exemplified in part by presentations at the first ISI invited session on astrostatistics, commemorated in this volume. It closes with an intentionally provocative recommendation for astronomical survey data reporting, motivated by the multilevel Bayesian perspective on modeling cosmic populations: that astronomers cease producing catalogs of estimated fluxes and other source properties from surveys. Instead, summaries of likelihood functions (or marginal likelihood functions) for source properties should be reported (not posterior probability density functions), including nontrivial summaries (not simply upper limits) for candidate objects that do not pass traditional detection thresholds.Comment: 27 pp, 4 figures. A lightly revised version of a chapter in "Astrostatistical Challenges for the New Astronomy" (Joseph M. Hilbe, ed., Springer, New York, forthcoming in 2012), the inaugural volume for the Springer Series in Astrostatistics. Version 2 has minor clarifications and an additional referenc
    corecore