
    Binary Classifier Calibration using an Ensemble of Near Isotonic Regression Models

    Learning accurate probabilistic models from data is crucial in many practical tasks in data mining. In this paper we present a new non-parametric calibration method called ensemble of near isotonic regression (ENIR). The method can be considered an extension of BBQ, a recently proposed calibration method, as well as of the commonly used calibration method based on isotonic regression. ENIR is designed to address the key limitation of isotonic regression, which is the monotonicity assumption on the predictions. Like BBQ, the method post-processes the output of a binary classifier to obtain calibrated probabilities, so it can be combined with many existing classification models. We demonstrate the performance of ENIR on synthetic and real datasets for commonly used binary classification models. Experimental results show that the method outperforms several common binary classifier calibration methods. In particular, on the real data ENIR commonly performs statistically significantly better than the other methods, and never worse. It is able to improve the calibration power of classifiers while retaining their discrimination power. The method is also computationally tractable for large-scale datasets, as it runs in O(N log N) time, where N is the number of samples.
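
For orientation, the isotonic-regression baseline that the abstract refers to can be sketched with the pool-adjacent-violators algorithm. This is not ENIR itself: it is a minimal illustration of plain isotonic calibration, the function names `pav_calibrate` and `predict` are hypothetical, and the sketch assumes the classifier scores are distinct.

```python
def pav_calibrate(scores, labels):
    """Fit a non-decreasing step function from scores to probabilities
    using pool-adjacent-violators (assumes distinct scores)."""
    blocks = []  # each block: [sum of labels, count, largest score in block]
    for score, label in sorted(zip(scores, labels)):
        blocks.append([float(label), 1, score])
        # Merge backwards while the previous block's mean exceeds the last one's.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count, top = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = top
    return [(top, total / count) for total, count, top in blocks]


def predict(model, score):
    """Return the calibrated probability of the first block covering `score`."""
    for top, prob in model:
        if score <= top:
            return prob
    return model[-1][1]  # scores above the training range use the last block


scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
labels = [0, 1, 0, 0, 1, 1]
model = pav_calibrate(scores, labels)
```

The fitted map is monotone by construction, which is exactly the assumption the abstract says ENIR is designed to relax.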

    Hedging predictions in machine learning

    Recent advances in machine learning make it possible to design efficient prediction algorithms for data sets with huge numbers of parameters. This paper describes a new technique for "hedging" the predictions output by many such algorithms, including support vector machines, kernel ridge regression, kernel nearest neighbours, and by many other state-of-the-art methods. The hedged predictions for the labels of new objects include quantitative measures of their own accuracy and reliability. These measures are provably valid under the assumption of randomness, traditional in machine learning: the objects and their labels are assumed to be generated independently from the same probability distribution. In particular, it becomes possible to control (up to statistical fluctuations) the number of erroneous predictions by selecting a suitable confidence level. Validity being achieved automatically, the remaining goal of hedged prediction is efficiency: taking full account of the new objects' features and other available information to produce as accurate predictions as possible. This can be done successfully using the powerful machinery of modern machine learning.
    Comment: 24 pages; 9 figures; 2 tables; a version of this paper (with discussion and rejoinder) is to appear in "The Computer Journal".
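
The abstract does not spell out the construction, so the following is only an illustrative sketch of one well-known way to obtain validity at a chosen confidence level: split (inductive) conformal prediction. The nonconformity scores, the function names, and the example labels are all assumptions made for illustration; any underlying model could supply the scores.

```python
def conformal_p_value(calibration_scores, test_score):
    """Fraction of calibration examples at least as 'strange' as the
    test example, with the usual +1 correction for the test example."""
    n = len(calibration_scores)
    at_least = sum(1 for a in calibration_scores if a >= test_score)
    return (at_least + 1) / (n + 1)


def prediction_set(calibration_scores, candidate_scores, epsilon):
    """Keep every candidate label whose p-value exceeds the
    significance level epsilon (confidence level 1 - epsilon)."""
    return {
        label
        for label, score in candidate_scores.items()
        if conformal_p_value(calibration_scores, score) > epsilon
    }


# Hypothetical nonconformity scores from a held-out calibration set.
calibration = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
# Hypothetical nonconformity of each candidate label for one new object.
candidates = {"cat": 0.25, "dog": 0.95}

confident = prediction_set(calibration, candidates, epsilon=0.2)
cautious = prediction_set(calibration, candidates, epsilon=0.05)
```

Lowering `epsilon` raises the confidence level and can only enlarge the prediction set, which is the trade-off between validity and efficiency that the abstract describes.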

    Characteristics, accuracy and reverification of robotised articulated arm CMMs

    VDI article 2617 specifies characteristics to describe the accuracy of articulated arm coordinate measuring machines (AACMMs) and outlines procedures for checking them. However, the VDI prescription was written with a former generation of machines in mind: manual arms exploiting traditional touch probe technologies. Recent advances in metrology have given rise to noncontact laser scanning tools and robotic automation of articulated arms – technologies which are not adequately characterised using the VDI specification. In this paper we examine the “guidelines” presented in VDI 2617, finding many of them to be ambiguous and open to interpretation, with some tests appearing even to be optional. The engineer is left significant flexibility in the execution of the test procedures, and the manufacturer is free to specify many of the test parameters. Such flexibility renders the VDI tests of limited value, and the results can be misleading. We illustrate, with examples using the Nikon RCA, how a liberal interpretation of the VDI guidelines can significantly improve accuracy characterisation, and we suggest ways in which to mitigate this problem. We propose a series of stringent tests and revised definitions, in the same vein as VDI 2617 and similar US standards, to clarify the accuracy characterisation process. The revised methodology includes modified acceptance and reverification tests which aim to accommodate emerging technologies, laser scanning devices in particular, while maintaining the spirit of the existing and established standards. We seek to supply robust re-definitions for the accepted terms “zero point” and “useful arm length”, pre-supposing nothing about the geometry of the measuring device. We also identify a source of error unique to robotised AACMMs employing laser scanners – the forward-reverse pass error. We show how eliminating this error significantly improves the repeatability of a device and propose a novel approach to the testing of probing error based on statistical uncertainty.

    Estimating Uncertainty Online Against an Adversary

    Assessing uncertainty is an important step towards ensuring the safety and reliability of machine learning systems. Existing uncertainty estimation techniques may fail when their modeling assumptions are not met, e.g. when the data distribution differs from the one seen at training time. Here, we propose techniques that assess a classification algorithm's uncertainty via calibrated probabilities (i.e. probabilities that match empirical outcome frequencies in the long run) and which are guaranteed to be reliable (i.e. accurate and calibrated) on out-of-distribution input, including input generated by an adversary. This represents an extension of classical online learning that handles uncertainty in addition to guaranteeing accuracy under adversarial assumptions. We establish formal guarantees for our methods, and we validate them on two real-world problems: question answering and medical diagnosis from genomic data.
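
The notion of calibration used in the abstract (predicted probabilities matching empirical outcome frequencies) can be checked with a simple binned comparison. The sketch below is an offline diagnostic only, not the online adversarial procedure the paper proposes; the function name `calibration_gaps` and the example data are hypothetical.

```python
def calibration_gaps(probs, outcomes, n_bins=4):
    """Group predictions into equal-width probability bins and return,
    per non-empty bin, |mean predicted probability - observed frequency|."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))
    gaps = {}
    for idx, items in enumerate(bins):
        if items:
            mean_p = sum(p for p, _ in items) / len(items)
            freq = sum(y for _, y in items) / len(items)
            gaps[idx] = abs(mean_p - freq)
    return gaps


# Hypothetical forecasts: events predicted at 0.25 occur once in four
# trials, and events predicted at 0.75 occur three times in four.
probs = [0.25] * 4 + [0.75] * 4
outcomes = [1, 0, 0, 0, 1, 1, 1, 0]
gaps = calibration_gaps(probs, outcomes)
```

A gap of zero in every bin means the forecasts are perfectly calibrated on this sample; it says nothing on its own about accuracy, which is why the abstract treats the two properties separately.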