Hedging predictions in machine learning
Recent advances in machine learning make it possible to design efficient
prediction algorithms for data sets with huge numbers of parameters. This paper
describes a new technique for "hedging" the predictions output by many such
algorithms, including support vector machines, kernel ridge regression, kernel
nearest neighbours, and by many other state-of-the-art methods. The hedged
predictions for the labels of new objects include quantitative measures of
their own accuracy and reliability. These measures are provably valid under the
assumption of randomness, traditional in machine learning: the objects and
their labels are assumed to be generated independently from the same
probability distribution. In particular, it becomes possible to control (up to
statistical fluctuations) the number of erroneous predictions by selecting a
suitable confidence level. Validity being achieved automatically, the remaining
goal of hedged prediction is efficiency: taking full account of the new
objects' features and other available information to produce as accurate
predictions as possible. This can be done successfully using the powerful
machinery of modern machine learning.
Comment: 24 pages; 9 figures; 2 tables; a version of this paper (with
discussion and rejoinder) is to appear in "The Computer Journal"
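The hedging mechanism described in the abstract can be sketched as a split (inductive) conformal predictor. The toy data, the 1-nearest-neighbour point predictor, and the absolute-residual nonconformity score below are illustrative assumptions, not the paper's exact construction:

```python
import numpy as np

rng = np.random.default_rng(0)

# toy regression data: y = 2x + noise, generated i.i.d. as the
# randomness assumption requires
X = rng.uniform(-1, 1, size=(200, 1))
y = 2 * X[:, 0] + rng.normal(0, 0.1, size=200)

# split into a proper training set and a calibration set
X_train, y_train = X[:100], y[:100]
X_cal, y_cal = X[100:], y[100:]

# underlying point predictor: 1-nearest-neighbour regression
def predict(x):
    i = np.argmin(np.abs(X_train[:, 0] - x))
    return y_train[i]

# nonconformity scores (absolute residuals) on the calibration set
scores = np.array([abs(y_cal[k] - predict(X_cal[k, 0]))
                   for k in range(len(y_cal))])

# at confidence level 1 - eps, take the corrected quantile of the
# calibration scores
eps = 0.1
n = len(scores)
q = np.quantile(scores, np.ceil((1 - eps) * (n + 1)) / n)

# hedged prediction for a new object: an interval that covers the
# true label with probability >= 1 - eps under exchangeability
x_new = 0.3
interval = (predict(x_new) - q, predict(x_new) + q)
```

Raising the confidence level 1 - eps widens the interval, which is exactly the trade-off the abstract describes: validity is automatic, and the predictor's quality shows up as the efficiency (narrowness) of the hedged predictions.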
Conformal Prediction: a Unified Review of Theory and New Challenges
In this work we provide a review of basic ideas and novel developments about
Conformal Prediction -- an innovative distribution-free, non-parametric
forecasting method, based on minimal assumptions -- that yields, in a very
straightforward way, prediction sets that are valid in a statistical sense
even in the finite-sample case. The in-depth discussion provided in the
paper covers the theoretical underpinnings of Conformal Prediction, and then
proceeds to list the more advanced developments and adaptations of the original
idea.
Comment: arXiv admin note: text overlap with arXiv:0706.3188,
arXiv:1604.04173, arXiv:1709.06233, arXiv:1203.5422 by other authors
Estimating labels from label proportions
Consider the following problem: given sets of unlabeled observations, each set with known label proportions, predict the labels of another set of observations, also with known label proportions. This problem appears in areas like e-commerce, spam filtering and improper content detection. We present consistent estimators which can reconstruct the correct labels with high probability in a uniform convergence sense. Experiments show that our method works well in practice.
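The core idea can be illustrated on a toy one-dimensional problem with two bags of known label proportions. The Gaussian classes and the closed-form mean-matching step below are simplifying assumptions for illustration, not the paper's full estimator:

```python
import numpy as np

rng = np.random.default_rng(1)

# two 1-D Gaussian classes with unknown means
mu_pos, mu_neg = 2.0, -2.0

def sample_bag(n, p):
    # draw an unlabeled bag whose positive-class proportion is p
    labels = rng.random(n) < p
    return np.where(labels,
                    rng.normal(mu_pos, 1.0, n),
                    rng.normal(mu_neg, 1.0, n))

# two unlabeled bags with known positive-class proportions
p1, p2 = 0.8, 0.3
bag1, bag2 = sample_bag(500, p1), sample_bag(500, p2)

# each bag mean mixes the class means:
#   mean(bag_i) = p_i * m_pos + (1 - p_i) * m_neg
# so the two bags give a 2x2 linear system for the class means
A = np.array([[p1, 1 - p1],
              [p2, 1 - p2]])
b = np.array([bag1.mean(), bag2.mean()])
m_pos, m_neg = np.linalg.solve(A, b)

# classify new observations by the nearer estimated class mean
def classify(x):
    return 1 if abs(x - m_pos) < abs(x - m_neg) else 0
```

As the bags grow, the bag means concentrate around their expectations, so the recovered class means converge to the truth; this is the uniform-convergence flavour of consistency the abstract refers to.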
Semi-Supervised Kernel PCA
We present three generalisations of Kernel Principal Components Analysis
(KPCA) which incorporate knowledge of the class labels of a subset of the data
points. The first, MV-KPCA, penalises within-class variances, similarly to
Fisher discriminant analysis. The second, LSKPCA, is a hybrid of least
squares regression and kernel PCA. The final, LR-KPCA, is an iteratively
reweighted version of the previous one, which achieves a sigmoid loss
function on the labelled points. We provide a theoretical risk bound as well
as illustrative experiments on real and toy data sets.
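All three variants build on standard unsupervised kernel PCA, which can be sketched as an eigendecomposition of the centred kernel matrix. The RBF kernel and toy data below are illustrative assumptions; the semi-supervised penalty terms of MV-KPCA, LSKPCA, and LR-KPCA are not included:

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(50, 3))  # toy data: 50 points in 3 dimensions

# RBF kernel matrix K[i, j] = exp(-||x_i - x_j||^2 / 2)
sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
K = np.exp(-0.5 * sq)

# centre the kernel matrix in feature space: Kc = H K H
n = len(X)
H = np.eye(n) - np.ones((n, n)) / n
Kc = H @ K @ H

# eigendecomposition; the leading eigenvectors give the
# kernel principal components
vals, vecs = np.linalg.eigh(Kc)
order = np.argsort(vals)[::-1]
vals, vecs = vals[order], vecs[:, order]

# projections of the training points onto the first two components
Z = vecs[:, :2] * np.sqrt(np.maximum(vals[:2], 0))
```

The semi-supervised variants modify this objective: MV-KPCA adds a penalty on the within-class variance of the projections of the labelled points, while LSKPCA and LR-KPCA couple the projection with a regression loss on the labels.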