2,427 research outputs found
Hedging predictions in machine learning
Recent advances in machine learning make it possible to design efficient
prediction algorithms for data sets with huge numbers of parameters. This paper
describes a new technique for "hedging" the predictions output by many such
algorithms, including support vector machines, kernel ridge regression, kernel
nearest neighbours, and by many other state-of-the-art methods. The hedged
predictions for the labels of new objects include quantitative measures of
their own accuracy and reliability. These measures are provably valid under the
assumption of randomness, traditional in machine learning: the objects and
their labels are assumed to be generated independently from the same
probability distribution. In particular, it becomes possible to control (up to
statistical fluctuations) the number of erroneous predictions by selecting a
suitable confidence level. Validity being achieved automatically, the remaining
goal of hedged prediction is efficiency: taking full account of the new
objects' features and other available information to produce as accurate
predictions as possible. This can be done successfully using the powerful
machinery of modern machine learning.Comment: 24 pages; 9 figures; 2 tables; a version of this paper (with
discussion and rejoinder) is to appear in "The Computer Journal
ELM regime classification by conformal prediction on an information manifold
Characterization and control of plasma instabilities known as edge-localized modes (ELMs) is crucial for the operation of fusion reactors. Recently, machine learning methods have demonstrated good potential in making useful inferences from stochastic fusion data sets. However, traditional classification methods do not offer an inherent estimate of the goodness of their prediction. In this paper, a distance-based conformal predictor classifier integrated with a geometric-probabilistic framework is presented. The first benefit of the approach lies in its comprehensive treatment of highly stochastic fusion data sets, by modeling the measurements with probability distributions in a metric space. This enables calculation of a natural distance measure between probability distributions: the Rao geodesic distance. Second, the predictions are accompanied by estimates of their accuracy and reliability. The method is applied to the classification of regimes characterized by different types of ELMs based on the measurements of global parameters and their error bars. This yields promising success rates and outperforms state-of-the-art automatic techniques for recognizing ELM signatures. The estimates of goodness of the predictions increase the confidence of classification by ELM experts, while allowing more reliable decisions regarding plasma control and at the same time increasing the robustness of the control system
Conformal Prediction: a Unified Review of Theory and New Challenges
In this work we provide a review of basic ideas and novel developments about
Conformal Prediction -- an innovative distribution-free, non-parametric
forecasting method, based on minimal assumptions -- that is able to yield in a
very straightforward way predictions sets that are valid in a statistical sense
also in in the finite sample case. The in-depth discussion provided in the
paper covers the theoretical underpinnings of Conformal Prediction, and then
proceeds to list the more advanced developments and adaptations of the original
idea.Comment: arXiv admin note: text overlap with arXiv:0706.3188,
arXiv:1604.04173, arXiv:1709.06233, arXiv:1203.5422 by other author
Toward autonomous spacecraft
Ways in which autonomous behavior of spacecraft can be extended to treat situations wherein a closed loop control by a human may not be appropriate or even possible are explored. Predictive models that minimize mean least squared error and arbitrary cost functions are discussed. A methodology for extracting cyclic components for an arbitrary environment with respect to usual and arbitrary criteria is developed. An approach to prediction and control based on evolutionary programming is outlined. A computer program capable of predicting time series is presented. A design of a control system for a robotic dense with partially unknown physical properties is presented
Engineering simulations for cancer systems biology
Computer simulation can be used to inform in vivo and in vitro experimentation, enabling rapid, low-cost hypothesis generation and directing experimental design in order to test those hypotheses. In this way, in silico models become a scientific instrument for investigation, and so should be developed to high standards, be carefully calibrated and their findings presented in such that they may be reproduced. Here, we outline a framework that supports developing simulations as scientific instruments, and we select cancer systems biology as an exemplar domain, with a particular focus on cellular signalling models. We consider the challenges of lack of data, incomplete knowledge and modelling in the context of a rapidly changing knowledge base. Our framework comprises a process to clearly separate scientific and engineering concerns in model and simulation development, and an argumentation approach to documenting models for rigorous way of recording assumptions and knowledge gaps. We propose interactive, dynamic visualisation tools to enable the biological community to interact with cellular signalling models directly for experimental design. There is a mismatch in scale between these cellular models and tissue structures that are affected by tumours, and bridging this gap requires substantial computational resource. We present concurrent programming as a technology to link scales without losing important details through model simplification. We discuss the value of combining this technology, interactive visualisation, argumentation and model separation to support development of multi-scale models that represent biologically plausible cells arranged in biologically plausible structures that model cell behaviour, interactions and response to therapeutic interventions
Criteria of efficiency for conformal prediction
We study optimal conformity measures for various criteria of efficiency of
classification in an idealised setting. This leads to an important class of
criteria of efficiency that we call probabilistic; it turns out that the most
standard criteria of efficiency used in literature on conformal prediction are
not probabilistic unless the problem of classification is binary. We consider
both unconditional and label-conditional conformal prediction.Comment: 31 page
Conformal Prediction with Orange
Conformal predictors estimate the reliability of outcomes made by supervised machine learning models. Instead of a point value, conformal prediction defines an outcome region that meets a user-specified reliability threshold. Provided that the data are independently and identically distributed, the user can control the level of the prediction errors and adjust it following the requirements of a given application. The quality of conformal predictions often depends on the choice of nonconformity estimate for a given machine learning method. To promote the selection of a successful approach, we have developed Orange3-Conformal, a Python library that provides a range of conformal prediction methods for classification and regression. The library also implements several nonconformity scores. It has a modular design and can be extended to add new conformal prediction methods and nonconformities
Characterization of the definitive classical calpain family of vertebrates using phylogenetic, evolutionary and expression analyses
Peer reviewedPublisher PD
- …