132,199 research outputs found
Evaluating Merging Strategies for Sampling-based Uncertainty Techniques in Object Detection
There has been a recent emergence of sampling-based techniques for estimating
epistemic uncertainty in deep neural networks. While these methods can be
applied to classification or semantic segmentation tasks by simply averaging
samples, this is not the case for object detection, where detection sample
bounding boxes must be accurately associated and merged. A weak merging
strategy can significantly degrade the performance of the detector and yield an
unreliable uncertainty measure. This paper provides the first in-depth
investigation of the effect of different association and merging strategies. We
compare different combinations of three spatial and two semantic affinity
measures with four clustering methods for MC Dropout with a Single Shot
Multi-Box Detector. Our results show that the correct choice of
affinity-clustering combination can greatly improve the effectiveness of the
classification and spatial uncertainty estimation and the resulting object
detection performance. We base our evaluation on a new mix of datasets that
emulate near open-set conditions (semantically similar unknown classes),
distant open-set conditions (semantically dissimilar unknown classes) and the
common closed-set conditions (only known classes).Comment: to appear in IEEE International Conference on Robotics and Automation
2019 (ICRA 2019
Unsupervised cryo-EM data clustering through adaptively constrained K-means algorithm
In single-particle cryo-electron microscopy (cryo-EM), K-means clustering
algorithm is widely used in unsupervised 2D classification of projection images
of biological macromolecules. 3D ab initio reconstruction requires accurate
unsupervised classification in order to separate molecular projections of
distinct orientations. Due to background noise in single-particle images and
uncertainty of molecular orientations, traditional K-means clustering algorithm
may classify images into wrong classes and produce classes with a large
variation in membership. Overcoming these limitations requires further
development on clustering algorithms for cryo-EM data analysis. We propose a
novel unsupervised data clustering method building upon the traditional K-means
algorithm. By introducing an adaptive constraint term in the objective
function, our algorithm not only avoids a large variation in class sizes but
also produces more accurate data clustering. Applications of this approach to
both simulated and experimental cryo-EM data demonstrate that our algorithm is
a significantly improved alterative to the traditional K-means algorithm in
single-particle cryo-EM analysis.Comment: 35 pages, 14 figure
Model-based Methods of Classification: Using the mclust Software in Chemometrics
Due to recent advances in methods and software for model-based clustering, and to the interpretability of the results, clustering procedures based on probability models are increasingly preferred over heuristic methods. The clustering process estimates a model for the data that allows for overlapping clusters, producing a probabilistic clustering that quantifies the uncertainty of observations belonging to components of the mixture. The resulting clustering model can also be used for some other important problems in multivariate analysis, including density estimation and discriminant analysis. Examples of the use of model-based clustering and classification techniques in chemometric studies include multivariate image analysis, magnetic resonance imaging, microarray image segmentation, statistical process control, and food authenticity. We review model-based clustering and related methods for density estimation and discriminant analysis, and show how the R package mclust can be applied in each instance.
Data mining: a tool for detecting cyclical disturbances in supply networks.
Disturbances in supply chains may be either exogenous or endogenous. The ability automatically to detect, diagnose, and distinguish between the causes of disturbances is of prime importance to decision makers in order to avoid uncertainty. The spectral principal component analysis (SPCA) technique has been utilized to distinguish between real and rogue disturbances in a steel supply network. The data set used was collected from four different business units in the network and consists of 43 variables; each is described by 72 data points. The present paper will utilize the same data set to test an alternative approach to SPCA in detecting the disturbances. The new approach employs statistical data pre-processing, clustering, and classification learning techniques to analyse the supply network data. In particular, the incremental k-means
clustering and the RULES-6 classification rule-learning algorithms, developed by the present authorsâ team, have been applied to identify important patterns in the data set. Results show that the proposed approach has the capability automatically to detect and characterize network-wide cyclical disturbances and generate hypotheses about their root cause
Optimal Clustering under Uncertainty
Classical clustering algorithms typically either lack an underlying
probability framework to make them predictive or focus on parameter estimation
rather than defining and minimizing a notion of error. Recent work addresses
these issues by developing a probabilistic framework based on the theory of
random labeled point processes and characterizing a Bayes clusterer that
minimizes the number of misclustered points. The Bayes clusterer is analogous
to the Bayes classifier. Whereas determining a Bayes classifier requires full
knowledge of the feature-label distribution, deriving a Bayes clusterer
requires full knowledge of the point process. When uncertain of the point
process, one would like to find a robust clusterer that is optimal over the
uncertainty, just as one may find optimal robust classifiers with uncertain
feature-label distributions. Herein, we derive an optimal robust clusterer by
first finding an effective random point process that incorporates all
randomness within its own probabilistic structure and from which a Bayes
clusterer can be derived that provides an optimal robust clusterer relative to
the uncertainty. This is analogous to the use of effective class-conditional
distributions in robust classification. After evaluating the performance of
robust clusterers in synthetic mixtures of Gaussians models, we apply the
framework to granular imaging, where we make use of the asymptotic
granulometric moment theory for granular images to relate robust clustering
theory to the application.Comment: 19 pages, 5 eps figures, 1 tabl
Model-based Methods of Classification: Using the mclust Software in Chemometrics
Due to recent advances in methods and software for model-based clustering, and to the interpretability of the results, clustering procedures based on probability models are increasingly preferred over heuristic methods. The clustering process estimates a model for the data that allows for overlapping clusters, producing a probabilistic clustering that quantifies the uncertainty of observations belonging to components of the mixture. The resulting clustering model can also be used for some other important problems in multivariate analysis, including density estimation and discriminant analysis. Examples of the use of model-based clustering and classification techniques in chemometric studies include multivariate image analysis, magnetic resonance imaging, microarray image segmentation, statistical process control, and food authenticity. We review model-based clustering and related methods for density estimation and discriminant analysis, and show how the R package mclust can be applied in each instance
Credal Fusion of Classifications for Noisy and Uncertain Data
This paper reports on an investigation in classification technique employed to classify noised and uncertain data. However, classification is not an easy task. It is a significant challenge to discover knowledge from uncertain data. In fact, we can find many problems. More time we don't have a good or a big learning database for supervised classification. Also, when training data contains noise or missing values, classification accuracy will be affected dramatically. So to extract groups from data is not easy to do. They are overlapped and not very separated from each other. Another problem which can be cited here is the uncertainty due to measuring devices. Consequentially classification model is not so robust and strong to classify new objects. In this work, we present a novel classification algorithm to cover these problems. We materialize our main idea by using belief function theory to do combination between classification and clustering. This theory treats very well imprecision and uncertainty linked to classification. Experimental results show that our approach has ability to significantly improve the quality of classification of generic database
- âŠ