Hedging predictions in machine learning
Recent advances in machine learning make it possible to design efficient
prediction algorithms for data sets with huge numbers of parameters. This paper
describes a new technique for "hedging" the predictions output by many such
algorithms, including support vector machines, kernel ridge regression, kernel
nearest neighbours, and by many other state-of-the-art methods. The hedged
predictions for the labels of new objects include quantitative measures of
their own accuracy and reliability. These measures are provably valid under the
assumption of randomness, traditional in machine learning: the objects and
their labels are assumed to be generated independently from the same
probability distribution. In particular, it becomes possible to control (up to
statistical fluctuations) the number of erroneous predictions by selecting a
suitable confidence level. Validity being achieved automatically, the remaining
goal of hedged prediction is efficiency: taking full account of the new
objects' features and other available information to produce as accurate
predictions as possible. This can be done successfully using the powerful
machinery of modern machine learning.
Comment: 24 pages; 9 figures; 2 tables; a version of this paper (with
discussion and rejoinder) is to appear in "The Computer Journal"
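The idea of controlling the error rate through a confidence level can be illustrated with a minimal sketch. It assumes a simplified split (inductive) variant of conformal prediction rather than the exact procedure of the paper, and it assumes the nonconformity scores have already been computed by some underlying algorithm. The function names and the toy scores are hypothetical.

```python
def conformal_p_value(calibration_scores, test_score):
    """Share of scores (calibration scores plus the test score itself)
    that are at least as large as the test score."""
    n_ge = sum(1 for s in calibration_scores if s >= test_score)
    return (n_ge + 1) / (len(calibration_scores) + 1)


def prediction_set(calibration_scores, candidate_scores, significance):
    """Keep every candidate label whose p-value exceeds the significance level.
    Under the randomness assumption, the true label is excluded with
    probability at most `significance`."""
    return {
        label
        for label, score in candidate_scores.items()
        if conformal_p_value(calibration_scores, score) > significance
    }


# Hypothetical nonconformity scores of held-out calibration examples.
calibration_scores = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9]
# Hypothetical scores of a new object paired with each candidate label.
candidate_scores = {"cat": 0.25, "dog": 0.95}

result = prediction_set(calibration_scores, candidate_scores, significance=0.2)
print(result)
```

Lowering `significance` (that is, raising the confidence level) can only enlarge the prediction set, which reflects the trade-off between validity and efficiency described in the abstract.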
Conformal anomaly detection for visual reconstruction using gestalt principles
In this paper, we combine a modern machine learning technique called conformal predictors (CP) with elements of gestalt detection and apply them to the problem of visual perception in digital images. Our main task is to quantify several gestalt principles of visual reconstruction. We interpret an image or shape as perceivable (meaningful) if it deviates sufficiently from randomness; in other words, the image could hardly have arisen by chance. These deviations from randomness are measured using the conformal prediction technique, which guarantees validity under certain assumptions. The technique detects perceivable images in a way that bounds the number of false alarms, i.e. the proportion of non-perceivable images wrongly detected as perceivable.
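The false-alarm bound can be sketched as a conformal anomaly detector. The sketch assumes a nearest-neighbour distance on one-dimensional points as the nonconformity measure, which is a stand-in chosen for brevity and not the gestalt-based measure of the paper. All names and data are hypothetical.

```python
def nn_distance(point, reference):
    """Nonconformity score: distance to the nearest reference point."""
    return min(abs(point - r) for r in reference)


def is_anomaly(point, reference, calibration, epsilon):
    """Flag `point` when its conformal p-value is at most `epsilon`.
    For data exchangeable with the calibration sample, the probability
    of a false alarm is then at most `epsilon`."""
    cal_scores = [nn_distance(c, reference) for c in calibration]
    score = nn_distance(point, reference)
    n_ge = sum(1 for s in cal_scores if s >= score)
    p_value = (n_ge + 1) / (len(cal_scores) + 1)
    return p_value <= epsilon


# Hypothetical "random" reference sample and held-out calibration sample.
reference = [0.0, 1.0, 2.0, 3.0]
calibration = [0.5, 1.2, 2.9, 1.6, 0.1, 2.4, 3.3, 1.9, 0.7]

flags = [is_anomaly(x, reference, calibration, epsilon=0.1) for x in (1.5, 10.0)]
print(flags)
```

With nine calibration points the smallest attainable p-value is 0.1, so a stricter `epsilon` would need a larger calibration sample before anything could be flagged.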
Inductive Conformal Martingales for Change-Point Detection
We consider the problem of quickest change-point detection in data streams.
Classical change-point detection procedures, such as CUSUM, Shiryaev-Roberts
and Posterior Probability statistics, are optimal only if the change-point
model is known, which is an unrealistic assumption in typical applied problems.
Instead we propose a new method for change-point detection based on Inductive
Conformal Martingales, which requires only the independence and identical
distribution of observations. We compare the proposed approach to standard
methods, as well as to change-point detection oracles, which model a typical
practical situation when we have only imprecise (albeit parametric) information
about pre- and post-change data distributions. The results of the comparison provide
evidence that change-point detection based on Inductive Conformal Martingales
is an efficient tool, capable of working under quite general conditions, unlike
traditional approaches.
Comment: 22 pages, 9 figures, 5 tables
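One way such a detector might look is sketched below. It assumes the conformal p-values of the stream are already available and uses the power betting function, a common textbook choice that is not necessarily the one used in the paper. The function name, parameters, and toy p-values are hypothetical.

```python
import math


def power_martingale_alarm(p_values, epsilon=0.5, threshold=20.0):
    """Return the index of the first observation at which the power martingale
    M_n = prod_i epsilon * p_i ** (epsilon - 1) reaches `threshold`,
    or None if it never does. The product is tracked in log space."""
    log_m = 0.0
    log_threshold = math.log(threshold)
    for i, p in enumerate(p_values):
        log_m += math.log(epsilon) + (epsilon - 1.0) * math.log(p)
        if log_m >= log_threshold:
            return i
    return None


# Hypothetical p-values: roughly uniform before the change, very small after it.
stream = [0.5, 0.9, 0.3, 0.7, 0.01, 0.02, 0.01, 0.01]
alarm = power_martingale_alarm(stream)
print(alarm)
```

When the p-values are independent and uniform (which, in the standard theory, requires smoothed p-values), the product is a nonnegative martingale starting at 1, so by Ville's inequality it reaches `threshold` with probability at most `1 / threshold`. This is what lets the false-alarm rate be controlled without a parametric model of the data.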