
    Conformal Prediction: a Unified Review of Theory and New Challenges

    In this work we provide a review of basic ideas and novel developments in Conformal Prediction, an innovative distribution-free, non-parametric forecasting method based on minimal assumptions that yields, in a very straightforward way, prediction sets that are statistically valid even in the finite-sample case. The in-depth discussion covers the theoretical underpinnings of Conformal Prediction and then proceeds to the more advanced developments and adaptations of the original idea.
    Comment: arXiv admin note: text overlap with arXiv:0706.3188, arXiv:1604.04173, arXiv:1709.06233, arXiv:1203.5422 by other authors.
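
    As a concrete illustration of the finite-sample guarantee the review discusses, here is a minimal sketch of split conformal prediction for regression. It assumes numpy and scikit-learn; the model, data, and names are illustrative, not taken from the paper.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

# Split conformal prediction for regression: the interval
# [y_hat - q, y_hat + q] covers the truth with probability >= 1 - alpha
# under exchangeability of the data.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + rng.normal(scale=0.3, size=500)

# One half fits the model, the other calibrates the interval width.
X_fit, X_cal, y_fit, y_cal = train_test_split(X, y, test_size=0.5, random_state=0)
model = Ridge().fit(X_fit, y_fit)

# Nonconformity scores: absolute residuals on the calibration set.
scores = np.abs(y_cal - model.predict(X_cal))

# Finite-sample-valid empirical quantile of the calibration scores.
alpha = 0.1
n = len(scores)
q = np.quantile(scores, np.ceil((n + 1) * (1 - alpha)) / n, method="higher")

x_new = rng.normal(size=(1, 3))
y_hat = model.predict(x_new)[0]
print(f"90% prediction interval: [{y_hat - q:.3f}, {y_hat + q:.3f}]")
```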

    Hedging predictions in machine learning

    Recent advances in machine learning make it possible to design efficient prediction algorithms for data sets with huge numbers of parameters. This paper describes a new technique for "hedging" the predictions output by many such algorithms, including support vector machines, kernel ridge regression, kernel nearest neighbours, and many other state-of-the-art methods. The hedged predictions for the labels of new objects include quantitative measures of their own accuracy and reliability. These measures are provably valid under the assumption of randomness, traditional in machine learning: the objects and their labels are assumed to be generated independently from the same probability distribution. In particular, it becomes possible to control (up to statistical fluctuations) the number of erroneous predictions by selecting a suitable confidence level. Validity being achieved automatically, the remaining goal of hedged prediction is efficiency: taking full account of the new objects' features and other available information to produce as accurate predictions as possible. This can be done successfully using the powerful machinery of modern machine learning.
    Comment: 24 pages; 9 figures; 2 tables; a version of this paper (with discussion and rejoinder) is to appear in "The Computer Journal".
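
    To make the idea concrete, here is a minimal sketch of how hedged (conformal) predictions can attach a confidence level to an underlying classifier, valid under the randomness assumption described above. The inductive (split) variant and the probability-based nonconformity score used here are one common choice, not necessarily the paper's; all names are illustrative.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Hedged prediction via inductive conformal classification: output every
# label whose conformal p-value exceeds the significance level epsilon,
# so the long-run error rate is at most epsilon under i.i.d. sampling.
X, y = load_iris(return_X_y=True)
X_fit, X_cal, y_fit, y_cal = train_test_split(X, y, test_size=0.3, random_state=1)
clf = LogisticRegression(max_iter=1000).fit(X_fit, y_fit)

# Nonconformity score: one minus the predicted probability of the true label.
cal_scores = 1.0 - clf.predict_proba(X_cal)[np.arange(len(y_cal)), y_cal]

def prediction_set(x, epsilon=0.05):
    """Return all labels whose conformal p-value exceeds epsilon."""
    probs = clf.predict_proba(x.reshape(1, -1))[0]
    labels = []
    for label, p in enumerate(probs):
        score = 1.0 - p
        # p-value: fraction of calibration scores at least as nonconforming.
        pval = (np.sum(cal_scores >= score) + 1) / (len(cal_scores) + 1)
        if pval > epsilon:
            labels.append(label)
    return labels

print(prediction_set(X_cal[0], epsilon=0.05))
```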

    Automated Active Learning with a Robot

    In the field of automated processes in industry, a major goal is for robots to solve new tasks without costly adaptations. It is therefore advantageous if the robot can perform new tasks independently while the learning process remains intuitively understandable for humans. In this article, we present a highly automated and intuitive active learning algorithm for robots. It learns new classification tasks by asking questions of a human teacher and automatically decides when to stop the learning process by self-assessing its confidence. This so-called stopping criterion is required to guarantee a fully automated procedure. Our approach is highly interactive, as we use speech for communication and a graphical visualization tool. The latter provides information about the learning progress and the stopping criterion, which helps the human teacher understand the training process better. The applicability of our approach is shown and evaluated on a real Baxter robot.
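
    The paper's exact stopping rule is not reproduced here; the following is a hypothetical sketch of a confidence-based stopping criterion inside an uncertainty-sampling active learning loop, using scikit-learn. The thresholds, the oracle interface, and all names are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def teach_robot(X_pool, oracle, confidence_target=0.95, max_queries=50):
    """Query a human teacher until the classifier trusts itself enough."""
    clf = LogisticRegression(max_iter=1000)
    pool = list(range(len(X_pool)))
    labeled_X, labeled_y = [], []

    # Bootstrap: keep asking until at least two classes have been seen.
    while len(set(labeled_y)) < 2 and pool:
        i = pool.pop(0)
        labeled_X.append(X_pool[i])
        labeled_y.append(oracle(i))

    for _ in range(max_queries):
        if not pool:
            break
        clf.fit(np.array(labeled_X), np.array(labeled_y))
        confidence = clf.predict_proba(X_pool[pool]).max(axis=1)
        # Stopping criterion: the model self-assesses its confidence on the
        # remaining unlabelled pool and stops asking once it is high enough.
        if confidence.mean() >= confidence_target:
            break
        # Otherwise ask the teacher about the least confident sample.
        j = pool.pop(int(np.argmin(confidence)))
        labeled_X.append(X_pool[j])
        labeled_y.append(oracle(j))
    return clf
```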

    A survey on online active learning

    Online active learning is a paradigm in machine learning that aims to select the most informative data points to label from a data stream. The problem of minimizing the cost associated with collecting labeled observations has gained a lot of attention in recent years, particularly in real-world applications where data are only available in unlabeled form. Annotating each observation can be time-consuming and costly, making it difficult to obtain large amounts of labeled data. To overcome this issue, many active learning strategies have been proposed over the last decades, aiming to select the most informative observations for labeling in order to improve the performance of machine learning models. These approaches can be broadly divided into two categories: static pool-based and stream-based active learning. Pool-based active learning involves selecting a subset of observations from a closed pool of unlabeled data, and it has been the focus of many surveys and literature reviews. However, the growing availability of data streams has led to an increase in the number of approaches that focus on online active learning, which involves continuously selecting and labeling observations as they arrive in a stream. This work provides an overview of the most recently proposed approaches for selecting the most informative observations from data streams in the context of online active learning. We review the various techniques that have been proposed and discuss their strengths and limitations, as well as the challenges and opportunities that exist in this area of research, aiming to give a comprehensive and up-to-date picture of the field and to highlight directions for future work.
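
    To make the stream-based setting concrete, here is a minimal sketch of online active learning with uncertainty sampling under a label budget. The margin threshold, the budget, and the oracle interface are illustrative assumptions, not drawn from any specific surveyed method.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

def stream_active_learning(stream, oracle, classes, budget=50, margin=0.2):
    """Label an arriving observation only if the model is uncertain about it."""
    clf = SGDClassifier(loss="log_loss", random_state=0)
    initialized, spent = False, 0
    for t, x in enumerate(stream):
        x = np.asarray(x).reshape(1, -1)
        if not initialized:
            # Always label the first observation to bootstrap the model.
            clf.partial_fit(x, [oracle(t)], classes=classes)
            initialized, spent = True, 1
            continue
        probs = np.sort(clf.predict_proba(x)[0])[::-1]
        # Uncertainty = small margin between the two most probable labels.
        if spent < budget and probs[0] - probs[1] < margin:
            clf.partial_fit(x, [oracle(t)])  # query the label, update online
            spent += 1
    return clf
```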

    Large-scale inference in the focally damaged human brain

    Clinical outcomes in focal brain injury reflect the interactions between two distinct anatomically distributed patterns: the functional organisation of the brain and the structural distribution of injury. The challenge of understanding the functional architecture of the brain is familiar; that of understanding the lesion architecture is barely acknowledged. Yet models of the functional consequences of focal injury are critically dependent on our knowledge of both. The studies described in this thesis seek to show how machine learning-enabled, high-dimensional multivariate analysis powered by large-scale data can enhance our ability to model the relation between focal brain injury and clinical outcomes across an array of modelling applications. All studies are conducted on the largest internationally available set of MR imaging data of focal brain injury in the context of acute stroke (N=1333) and employ kernel machines as the principal modelling architecture. First, I examine lesion-deficit prediction, quantifying the ceiling on achievable predictive fidelity for high-dimensional and low-dimensional models and demonstrating the former to be substantially higher than the latter. Second, I determine the marginal value of adding unlabelled imaging data to predictive models within a semi-supervised framework, quantifying the benefit of assembling unlabelled collections of clinical imaging. Third, I compare high- and low-dimensional approaches to modelling response to therapy in two contexts: quantifying the effect of treatment at the population level (therapeutic inference) and predicting the optimal treatment for an individual patient (prescriptive inference). I demonstrate the superiority of the high-dimensional approach in both settings.
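
    Purely as an illustration of the kernel-machine modelling architecture mentioned above, the following sketch fits kernel ridge regression to synthetic stand-ins for high-dimensional lesion images and a clinical score; it does not use, and makes no claims about, the thesis's stroke cohort.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.model_selection import cross_val_score

# Synthetic stand-in data: binary voxel-wise lesion maps and a deficit
# score driven by a small "eloquent" region plus noise.
rng = np.random.default_rng(0)
n_patients, n_voxels = 200, 5000
lesions = (rng.random((n_patients, n_voxels)) < 0.02).astype(float)
weights = np.zeros(n_voxels)
weights[:50] = 1.0  # only the first 50 voxels matter in this toy example
deficit = lesions @ weights + rng.normal(scale=0.5, size=n_patients)

# A kernel machine handles the high-dimensional input via the kernel trick:
# the model works with a patients-by-patients similarity matrix.
model = KernelRidge(kernel="rbf", alpha=1.0, gamma=1e-3)
scores = cross_val_score(model, lesions, deficit, cv=5, scoring="r2")
print(f"cross-validated R^2: {scores.mean():.2f}")
```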

    Probabilistic Load Forecasting with Deep Conformalized Quantile Regression

    The establishment of smart grids and the introduction of distributed generation have posed new challenges in energy analytics that can be tackled with machine learning algorithms. These algorithms can handle a combination of weather and consumption data, grid measurements, and their historical records to compute inferences and make predictions. Accurate energy load forecasting is essential to ensure reliable grid operation and power provision at peak times, when power consumption is high. However, most existing load forecasting algorithms provide only point estimates, or probabilistic forecasts whose prediction intervals carry no coverage guarantee. Nevertheless, information about uncertainty and prediction intervals is very useful to grid operators for evaluating the reliability of operations in the power network and for enabling a risk-based strategy for configuring the grid over a conservative one. Two popular statistical methods are used to generate prediction intervals in regression tasks. Quantile regression is a non-parametric probabilistic forecasting technique that produces prediction intervals adaptive to local variability within the data by estimating quantile functions directly from the data; however, the actual coverage of the resulting intervals is not guaranteed to satisfy the designed coverage level for finite samples. Conformal prediction is an on-top probabilistic forecasting framework that produces symmetric prediction intervals, most often of fixed length, guaranteed to marginally satisfy the designed coverage level for finite samples. This thesis proposes a probabilistic load forecasting method for constructing marginally valid prediction intervals that are adaptive to local variability and suitable for data characterized by temporal dependencies. The method is applied in conjunction with recurrent neural networks, deep learning architectures for sequential data that are mostly used to compute point forecasts rather than probabilistic forecasts. Specifically, an ensemble of pinball-loss-guided deep neural networks performing quantile regression is combined with conformal prediction to address the individual shortcomings of both techniques.
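
    A minimal sketch of the conformalized quantile regression idea described above: quantile models give locally adaptive intervals, and a conformal calibration step restores finite-sample marginal coverage. Gradient-boosted quantile regressors stand in here for the thesis's pinball-loss deep networks, and the load data are synthetic.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

# Synthetic "load" data: a daily cycle with noise that grows with the hour,
# so a good interval must be wider at some hours than others.
rng = np.random.default_rng(0)
X = rng.uniform(0, 24, size=(2000, 1))  # hour of day
y = 50 + 20 * np.sin(X[:, 0] / 24 * 2 * np.pi) + rng.normal(scale=1 + X[:, 0] / 8)

X_fit, X_cal, y_fit, y_cal = train_test_split(X, y, test_size=0.5, random_state=0)
alpha = 0.1  # target 90% coverage

# Step 1: quantile regression for the lower and upper conditional quantiles.
lo = GradientBoostingRegressor(loss="quantile", alpha=alpha / 2).fit(X_fit, y_fit)
hi = GradientBoostingRegressor(loss="quantile", alpha=1 - alpha / 2).fit(X_fit, y_fit)

# Step 2: CQR nonconformity score = how far calibration points fall
# outside (positive) or inside (negative) the quantile band.
scores = np.maximum(lo.predict(X_cal) - y_cal, y_cal - hi.predict(X_cal))
n = len(scores)
q = np.quantile(scores, np.ceil((n + 1) * (1 - alpha)) / n, method="higher")

# Step 3: widen (or tighten) the band by q to get marginal validity.
X_new = np.array([[18.0]])
print(f"90% interval at 18:00: [{lo.predict(X_new)[0] - q:.1f}, "
      f"{hi.predict(X_new)[0] + q:.1f}]")
```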