2,427 research outputs found

    Hedging predictions in machine learning

    Get PDF
    Recent advances in machine learning make it possible to design efficient prediction algorithms for data sets with huge numbers of parameters. This paper describes a new technique for "hedging" the predictions output by many such algorithms, including support vector machines, kernel ridge regression, kernel nearest neighbours, and by many other state-of-the-art methods. The hedged predictions for the labels of new objects include quantitative measures of their own accuracy and reliability. These measures are provably valid under the assumption of randomness, traditional in machine learning: the objects and their labels are assumed to be generated independently from the same probability distribution. In particular, it becomes possible to control (up to statistical fluctuations) the number of erroneous predictions by selecting a suitable confidence level. Validity being achieved automatically, the remaining goal of hedged prediction is efficiency: taking full account of the new objects' features and other available information to produce as accurate predictions as possible. This can be done successfully using the powerful machinery of modern machine learning.Comment: 24 pages; 9 figures; 2 tables; a version of this paper (with discussion and rejoinder) is to appear in "The Computer Journal

    ELM regime classification by conformal prediction on an information manifold

    Get PDF
    Characterization and control of plasma instabilities known as edge-localized modes (ELMs) is crucial for the operation of fusion reactors. Recently, machine learning methods have demonstrated good potential in making useful inferences from stochastic fusion data sets. However, traditional classification methods do not offer an inherent estimate of the goodness of their prediction. In this paper, a distance-based conformal predictor classifier integrated with a geometric-probabilistic framework is presented. The first benefit of the approach lies in its comprehensive treatment of highly stochastic fusion data sets, by modeling the measurements with probability distributions in a metric space. This enables calculation of a natural distance measure between probability distributions: the Rao geodesic distance. Second, the predictions are accompanied by estimates of their accuracy and reliability. The method is applied to the classification of regimes characterized by different types of ELMs based on the measurements of global parameters and their error bars. This yields promising success rates and outperforms state-of-the-art automatic techniques for recognizing ELM signatures. The estimates of goodness of the predictions increase the confidence of classification by ELM experts, while allowing more reliable decisions regarding plasma control and at the same time increasing the robustness of the control system

    Conformal Prediction: a Unified Review of Theory and New Challenges

    Full text link
    In this work we provide a review of basic ideas and novel developments about Conformal Prediction -- an innovative distribution-free, non-parametric forecasting method, based on minimal assumptions -- that is able to yield in a very straightforward way predictions sets that are valid in a statistical sense also in in the finite sample case. The in-depth discussion provided in the paper covers the theoretical underpinnings of Conformal Prediction, and then proceeds to list the more advanced developments and adaptations of the original idea.Comment: arXiv admin note: text overlap with arXiv:0706.3188, arXiv:1604.04173, arXiv:1709.06233, arXiv:1203.5422 by other author

    Toward autonomous spacecraft

    Get PDF
    Ways in which autonomous behavior of spacecraft can be extended to treat situations wherein a closed loop control by a human may not be appropriate or even possible are explored. Predictive models that minimize mean least squared error and arbitrary cost functions are discussed. A methodology for extracting cyclic components for an arbitrary environment with respect to usual and arbitrary criteria is developed. An approach to prediction and control based on evolutionary programming is outlined. A computer program capable of predicting time series is presented. A design of a control system for a robotic dense with partially unknown physical properties is presented

    Engineering simulations for cancer systems biology

    Get PDF
    Computer simulation can be used to inform in vivo and in vitro experimentation, enabling rapid, low-cost hypothesis generation and directing experimental design in order to test those hypotheses. In this way, in silico models become a scientific instrument for investigation, and so should be developed to high standards, be carefully calibrated and their findings presented in such that they may be reproduced. Here, we outline a framework that supports developing simulations as scientific instruments, and we select cancer systems biology as an exemplar domain, with a particular focus on cellular signalling models. We consider the challenges of lack of data, incomplete knowledge and modelling in the context of a rapidly changing knowledge base. Our framework comprises a process to clearly separate scientific and engineering concerns in model and simulation development, and an argumentation approach to documenting models for rigorous way of recording assumptions and knowledge gaps. We propose interactive, dynamic visualisation tools to enable the biological community to interact with cellular signalling models directly for experimental design. There is a mismatch in scale between these cellular models and tissue structures that are affected by tumours, and bridging this gap requires substantial computational resource. We present concurrent programming as a technology to link scales without losing important details through model simplification. We discuss the value of combining this technology, interactive visualisation, argumentation and model separation to support development of multi-scale models that represent biologically plausible cells arranged in biologically plausible structures that model cell behaviour, interactions and response to therapeutic interventions

    Criteria of efficiency for conformal prediction

    Get PDF
    We study optimal conformity measures for various criteria of efficiency of classification in an idealised setting. This leads to an important class of criteria of efficiency that we call probabilistic; it turns out that the most standard criteria of efficiency used in literature on conformal prediction are not probabilistic unless the problem of classification is binary. We consider both unconditional and label-conditional conformal prediction.Comment: 31 page

    Conformal Prediction with Orange

    Get PDF
    Conformal predictors estimate the reliability of outcomes made by supervised machine learning models. Instead of a point value, conformal prediction defines an outcome region that meets a user-specified reliability threshold. Provided that the data are independently and identically distributed, the user can control the level of the prediction errors and adjust it following the requirements of a given application. The quality of conformal predictions often depends on the choice of nonconformity estimate for a given machine learning method. To promote the selection of a successful approach, we have developed Orange3-Conformal, a Python library that provides a range of conformal prediction methods for classification and regression. The library also implements several nonconformity scores. It has a modular design and can be extended to add new conformal prediction methods and nonconformities
    corecore