Robust Classification for Imprecise Environments
In real-world environments it usually is difficult to specify target
operating conditions precisely, for example, target misclassification costs.
This uncertainty makes building robust classification systems problematic. We
show that it is possible to build a hybrid classifier that will perform at
least as well as the best available classifier for any target conditions. In
some cases, the performance of the hybrid actually can surpass that of the best
known classifier. This robust performance extends across a wide variety of
comparison frameworks, including the optimization of metrics such as accuracy,
expected cost, lift, precision, recall, and workforce utilization. The hybrid
also is efficient to build, to store, and to update. The hybrid is based on a
method for the comparison of classifier performance that is robust to imprecise
class distributions and misclassification costs. The ROC convex hull (ROCCH)
method combines techniques from ROC analysis, decision analysis and
computational geometry, and adapts them to the particulars of analyzing learned
classifiers. The method is efficient and incremental, minimizes the management
of classifier performance data, and allows for clear visual comparisons and
sensitivity analyses. Finally, we point to empirical evidence that a robust
hybrid classifier indeed is needed for many real-world problems. Comment: 24 pages, 12 figures. To be published in Machine Learning Journal.
For related papers, see http://www.hpl.hp.com/personal/Tom_Fawcett/ROCCH
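As an illustration of the ROC convex hull idea described in this abstract, the following is a minimal Python sketch, not the authors' implementation. It assumes each classifier is summarized by a single (false positive rate, true positive rate) point, and the function names `roc_convex_hull` and `best_operating_point`, as well as the expected-cost formula used to pick a point, are assumptions made for this example.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (false positive rate, true positive rate)


def _cross(o: Point, a: Point, b: Point) -> float:
    """Z-component of the cross product of vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def roc_convex_hull(points: List[Point]) -> List[Point]:
    """Return the upper convex hull of ROC points, from (0, 0) to (1, 1).

    The trivial classifiers (0, 0) and (1, 1) are always included.
    """
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    hull: List[Point] = []
    for p in pts:
        # Drop the last hull point while it lies on or below the
        # segment joining its predecessor to the new point.
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull


def best_operating_point(hull: List[Point], p_pos: float,
                         cost_fn: float, cost_fp: float) -> Point:
    """Pick the hull vertex with the lowest expected cost.

    p_pos is the assumed prior of the positive class; cost_fn and
    cost_fp are the assumed costs of a false negative and a false positive.
    """
    def expected_cost(pt: Point) -> float:
        fpr, tpr = pt
        return p_pos * (1.0 - tpr) * cost_fn + (1.0 - p_pos) * fpr * cost_fp

    return min(hull, key=expected_cost)


classifiers = [(0.1, 0.5), (0.3, 0.4), (0.4, 0.9)]
hull = roc_convex_hull(classifiers)
print(hull)
print(best_operating_point(hull, p_pos=0.5, cost_fn=1.0, cost_fp=1.0))
```

In this sketch the point (0.3, 0.4) falls below the hull, so it is never the minimum-cost choice under any class distribution or cost ratio; changing `p_pos`, `cost_fn`, or `cost_fp` only moves the selection among hull vertices.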
Loss-size and Reliability Trade-offs Amongst Diverse Redundant Binary Classifiers
Many applications involve the use of binary classifiers, including applications where safety and security are critical. The quantitative assessment of such classifiers typically involves receiver operating characteristic (ROC) methods and the estimation of sensitivity/specificity. But such techniques have their limitations. For safety/security critical applications, more relevant measures of reliability and risk should be estimated. Moreover, ROC techniques do not explicitly account for: 1) inherent uncertainties one faces during assessments, 2) reliability evidence other than the observed failure behaviour of the classifier, and 3) how this observed failure behaviour alters one's uncertainty about classifier reliability. We address these limitations using conservative Bayesian inference (CBI) methods, producing statistically principled, conservative values for risk/reliability measures of interest. Our analyses reveal trade-offs amongst all binary classifiers with the same expected loss: the most reliable classifiers are those most likely to experience high impact failures. This trade-off is harnessed by using diverse redundant binary classifiers
Neural-Augmented Static Analysis of Android Communication
We address the problem of discovering communication links between
applications in the popular Android mobile operating system, an important
problem for security and privacy in Android. Any scalable static analysis in
this complex setting is bound to produce an excessive amount of
false-positives, rendering it impractical. To improve precision, we propose to
augment static analysis with a trained neural-network model that estimates the
probability that a communication link truly exists. We describe a
neural-network architecture that encodes abstractions of communicating objects
in two applications and estimates the probability with which a link indeed
exists. At the heart of our architecture are type-directed encoders (TDE), a
general framework for elegantly constructing encoders of a compound data type
by recursively composing encoders for its constituent types. We evaluate our
approach on a large corpus of Android applications, and demonstrate that it
achieves very high accuracy. Further, we conduct thorough interpretability
studies to understand the internals of the learned neural networks. Comment: Appears in Proceedings of the 2018 ACM Joint European Software
Engineering Conference and Symposium on the Foundations of Software
Engineering (ESEC/FSE
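The recursive composition of encoders described in this abstract can be sketched in Python as follows. This is a toy illustration only: the encoders in the paper are learned neural components, whereas the ones here are fixed hand-written functions, and all names (`encode_bool`, `encode_int`, `product_encoder`, `list_encoder`) are hypothetical and chosen for this example.

```python
from typing import Any, Callable, List, Sequence

Encoder = Callable[[Any], List[float]]


def encode_bool(value: bool) -> List[float]:
    """Base encoder for booleans."""
    return [1.0 if value else 0.0]


def encode_int(value: int) -> List[float]:
    """Base encoder for integers."""
    return [float(value)]


def product_encoder(*field_encoders: Encoder) -> Encoder:
    """Encoder for a tuple type: concatenate the encodings of its fields."""
    def encode(record: Sequence[Any]) -> List[float]:
        out: List[float] = []
        for enc, field in zip(field_encoders, record):
            out.extend(enc(field))
        return out
    return encode


def list_encoder(element_encoder: Encoder, width: int) -> Encoder:
    """Encoder for a list type: sum the element encodings component-wise.

    width is the length of the vectors produced by element_encoder.
    """
    def encode(items: Sequence[Any]) -> List[float]:
        total = [0.0] * width
        for item in items:
            for i, v in enumerate(element_encoder(item)):
                total[i] += v
        return total
    return encode


# An encoder for the compound type List[Tuple[bool, int]] is built
# by following the structure of the type itself.
pair_encoder = product_encoder(encode_bool, encode_int)
pairs_encoder = list_encoder(pair_encoder, width=2)
print(pairs_encoder([(True, 3), (False, 4)]))
```

The point of the sketch is the design choice, not the arithmetic: once an encoder exists for each base type and each type constructor, an encoder for any compound type follows mechanically from its type.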
Epistemic irrelevance in credal nets: the case of imprecise Markov trees
We focus on credal nets, which are graphical models that generalise Bayesian
nets to imprecise probability. We replace the notion of strong independence
commonly used in credal nets with the weaker notion of epistemic irrelevance,
which is arguably more suited for a behavioural theory of probability. Focusing
on directed trees, we show how to combine the given local uncertainty models in
the nodes of the graph into a global model, and we use this to construct and
justify an exact message-passing algorithm that computes updated beliefs for a
variable in the tree. The algorithm, which is linear in the number of nodes, is
formulated entirely in terms of coherent lower previsions, and is shown to
satisfy a number of rationality requirements. We supply examples of the
algorithm's operation, and report an application to on-line character
recognition that illustrates the advantages of our approach for prediction. We
comment on the perspectives, opened by the availability, for the first time, of
a truly efficient algorithm based on epistemic irrelevance. Comment: 29 pages, 5 figures, 1 tabl
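To make the notion of a lower prevision concrete, here is a minimal Python sketch. It is not the message-passing algorithm from the paper. It assumes a credal set given as a finite list of probability mass functions (its extreme points) and computes the lower and upper expectation of a gamble as the minimum and maximum over those points; the function names are assumptions for this example.

```python
from typing import Dict, List

Pmf = Dict[str, float]      # outcome -> probability
Gamble = Dict[str, float]   # outcome -> reward


def expectation(pmf: Pmf, gamble: Gamble) -> float:
    """Expected value of the gamble under one precise probability model."""
    return sum(pmf[x] * gamble[x] for x in pmf)


def lower_prevision(credal_set: List[Pmf], gamble: Gamble) -> float:
    """Lower envelope: the smallest expectation over the credal set."""
    return min(expectation(p, gamble) for p in credal_set)


def upper_prevision(credal_set: List[Pmf], gamble: Gamble) -> float:
    """Conjugate upper prevision: upper(f) = -lower(-f)."""
    negated = {x: -v for x, v in gamble.items()}
    return -lower_prevision(credal_set, negated)


# Two extreme points of an assumed credal set over outcomes "a" and "b".
credal_set = [{"a": 0.2, "b": 0.8}, {"a": 0.6, "b": 0.4}]
indicator_a = {"a": 1.0, "b": 0.0}

print(lower_prevision(credal_set, indicator_a))
print(upper_prevision(credal_set, indicator_a))
```

For the indicator of outcome "a", the two values are the lower and upper probability of "a" under the assumed credal set, which shows how an imprecise model yields an interval in place of a single number.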
Automated reliability assessment for spectroscopic redshift measurements
We present a new approach to automate the spectroscopic redshift reliability
assessment based on machine learning (ML) and characteristics of the redshift
probability density function (PDF).
We propose to rephrase the spectroscopic redshift estimation into a Bayesian
framework, in order to incorporate all sources of information and uncertainties
related to the redshift estimation process, and produce a redshift posterior
PDF that will be the starting-point for ML algorithms to provide an automated
assessment of a redshift reliability.
As a use case, public data from the VIMOS VLT Deep Survey is exploited to
present and test this new methodology. We first tried to reproduce the existing
reliability flags using supervised classification to describe different types
of redshift PDFs, but due to the subjective definition of these flags, soon
opted for a new homogeneous partitioning of the data into distinct clusters via
unsupervised classification. After assessing the accuracy of the new clusters
via resubstitution and test predictions, unlabelled data from preliminary mock
simulations for the Euclid space mission are projected into this mapping to
predict their redshift reliability labels. Comment: Submitted on 02 June 2017 (v1). Revised on 08 September 2017 (v2).
Latest version 28 September 2017 (this version v3
A robust dynamic classifier selection approach for hyperspectral images with imprecise label information
Supervised hyperspectral image (HSI) classification relies on accurate label information. However, it is not always possible to collect perfectly accurate labels for training samples. This motivates the development of classifiers that are sufficiently robust to some reasonable amounts of errors in data labels. Despite the growing importance of this aspect, it has not been sufficiently studied in the literature yet. In this paper, we analyze the effect of erroneous sample labels on probability distributions of the principal components of HSIs, and provide in this way a statistical analysis of the resulting uncertainty in classifiers. Building on the theory of imprecise probabilities, we develop a novel robust dynamic classifier selection (R-DCS) model for data classification with erroneous labels. Particularly, spectral and spatial features are extracted from HSIs to construct two individual classifiers for the dynamic selection, respectively. The proposed R-DCS model is based on the robustness of the classifiers’ predictions: the extent to which a classifier can be altered without changing its prediction. We provide three possible selection strategies for the proposed model with different computational complexities and apply them on three benchmark data sets. Experimental results demonstrate that the proposed model outperforms the individual classifiers it selects from and is more robust to errors in labels compared to widely adopted approaches