446 research outputs found

    Conformal Prediction: a Unified Review of Theory and New Challenges

    In this work we provide a review of the basic ideas and novel developments of Conformal Prediction -- an innovative distribution-free, non-parametric forecasting method based on minimal assumptions -- that yields, in a very straightforward way, prediction sets that are statistically valid even in the finite-sample case. The in-depth discussion provided in the paper covers the theoretical underpinnings of Conformal Prediction and then proceeds to list the more advanced developments and adaptations of the original idea. Comment: arXiv admin note: text overlap with arXiv:0706.3188, arXiv:1604.04173, arXiv:1709.06233, arXiv:1203.5422 by other authors.
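
    As a rough illustration of the core idea, the sketch below implements split (inductive) conformal prediction for regression with scikit-learn; the data, model, and coverage level are illustrative choices and not part of the reviewed work.

```python
# A minimal sketch of split (inductive) conformal prediction for regression,
# assuming a generic scikit-learn regressor; names and data are illustrative.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

def split_conformal_interval(model, X_train, y_train, X_cal, y_cal, X_test, alpha=0.1):
    """Return prediction intervals with finite-sample coverage of roughly 1 - alpha."""
    model.fit(X_train, y_train)
    # Nonconformity scores on the calibration set: absolute residuals.
    scores = np.abs(y_cal - model.predict(X_cal))
    n = len(scores)
    # Finite-sample corrected quantile of the calibration scores.
    q = np.quantile(scores, min(1.0, np.ceil((n + 1) * (1 - alpha)) / n))
    preds = model.predict(X_test)
    return preds - q, preds + q

# Illustrative usage on synthetic data.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = X[:, 0] + rng.normal(scale=0.5, size=1000)
X_train, X_rest, y_train, y_rest = train_test_split(X, y, test_size=0.5, random_state=0)
X_cal, X_test, y_cal, y_test = train_test_split(X_rest, y_rest, test_size=0.5, random_state=0)
lo, hi = split_conformal_interval(RandomForestRegressor(random_state=0),
                                  X_train, y_train, X_cal, y_cal, X_test)
print("empirical coverage:", np.mean((y_test >= lo) & (y_test <= hi)))
```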

    Calibrated Explanations: with Uncertainty Information and Counterfactuals

    While local explanations for AI models can offer insights into individual predictions, such as feature importance, they are plagued by issues like instability. The unreliability of feature weights, often skewed due to poorly calibrated ML models, deepens these challenges. Moreover, the critical aspect of feature importance uncertainty remains mostly unaddressed in Explainable AI (XAI). The novel feature importance explanation method presented in this paper, called Calibrated Explanations (CE), is designed to tackle these issues head-on. Built on the foundation of Venn-Abers, CE not only calibrates the underlying model but also delivers reliable feature importance explanations with an exact definition of the feature weights. CE goes beyond conventional solutions by addressing output uncertainty. It accomplishes this by providing uncertainty quantification for both feature weights and the model's probability estimates. Additionally, CE is model-agnostic, featuring easily comprehensible conditional rules and the ability to generate counterfactual explanations with embedded uncertainty quantification. Results from an evaluation on 25 benchmark datasets underscore the efficacy of CE, making it a fast, reliable, stable, and robust solution. Comment: 19 pages, 6 figures, 3 tables, submitted to journal.
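
    To make the Venn-Abers foundation mentioned above concrete, the following is a minimal sketch of an inductive Venn-Abers predictor built on scikit-learn's isotonic regression; it illustrates the calibration idea only and is not the authors' CE implementation.

```python
# A minimal sketch of an inductive Venn-Abers predictor, the calibration idea
# CE builds on; all data and names are illustrative.
import numpy as np
from sklearn.isotonic import IsotonicRegression

def venn_abers_interval(cal_scores, cal_labels, test_score):
    """Return (p0, p1): a probability interval for the positive class."""
    p = []
    for assumed_label in (0, 1):
        iso = IsotonicRegression(y_min=0.0, y_max=1.0, out_of_bounds="clip")
        # Fit isotonic calibration with the test object tentatively labelled
        # 0 and then 1; the two fits bound the calibrated probability.
        iso.fit(np.append(cal_scores, test_score),
                np.append(cal_labels, assumed_label))
        p.append(iso.predict([test_score])[0])
    return p[0], p[1]

# Illustrative usage: scores from some underlying model on a calibration set.
rng = np.random.default_rng(0)
cal_scores = rng.uniform(size=200)
cal_labels = (rng.uniform(size=200) < cal_scores).astype(int)
p0, p1 = venn_abers_interval(cal_scores, cal_labels, test_score=0.7)
print(f"calibrated probability interval: [{p0:.3f}, {p1:.3f}]")
# A single-valued estimate often used in practice:
print("regularised estimate:", p1 / (1 - p0 + p1))
```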

    Calibrated Explanations for Regression

    Artificial Intelligence (AI) is often an integral part of modern decision support systems (DSSs). The best-performing predictive models used in AI-based DSSs lack transparency. Explainable Artificial Intelligence (XAI) aims to create AI systems that can explain their rationale to human users. Local explanations in XAI can provide information about the causes of individual predictions in terms of feature importance. However, a critical drawback of existing local explanation methods is their inability to quantify the uncertainty associated with a feature's importance. This paper introduces an extension of a feature importance explanation method, Calibrated Explanations (CE), previously only supporting classification, with support for standard regression and probabilistic regression, i.e., the probability that the target is above an arbitrary threshold. The extension for regression retains all the benefits of CE, such as calibration of the underlying model's predictions with confidence intervals and uncertainty quantification of feature importance, and allows both factual and counterfactual explanations. CE for standard regression provides fast, reliable, stable, and robust explanations. CE for probabilistic regression provides an entirely new way of creating probabilistic explanations from any ordinary regression model, with a dynamic selection of thresholds. The performance of CE for probabilistic regression regarding stability and speed is comparable to LIME. The method is model-agnostic with easily understood conditional rules. An implementation in Python is freely available on GitHub and for installation using pip, making the results in this paper easily replicable. Comment: 30 pages, 11 figures (replaced due to omitted author, which is the only change made).
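
    As a hedged illustration of the probabilistic-regression idea, the sketch below turns an ordinary regressor into estimates of P(y > threshold) using a calibration set of residuals, in the spirit of conformal predictive systems; the data, model, and function names are synthetic stand-ins, not the paper's implementation.

```python
# A sketch of probabilistic regression from an ordinary regressor:
# estimate P(y > threshold) from a calibration set of residuals.
import numpy as np
from sklearn.linear_model import LinearRegression

def prob_above_threshold(model, X_cal, y_cal, X_test, threshold):
    """Estimate P(y > threshold) for each test object from calibration residuals."""
    residuals = y_cal - model.predict(X_cal)   # empirical error distribution
    preds = model.predict(X_test)
    # For each test prediction, count how often prediction + residual exceeds
    # the threshold; this gives a distribution-free probability estimate.
    return np.array([(preds[i] + residuals > threshold).mean()
                     for i in range(len(preds))])

# Illustrative usage on synthetic data.
rng = np.random.default_rng(0)
X = rng.normal(size=(600, 3))
y = 2.0 * X[:, 0] + rng.normal(scale=1.0, size=600)
model = LinearRegression().fit(X[:300], y[:300])
p = prob_above_threshold(model, X[300:500], y[300:500], X[500:], threshold=0.0)
print("P(y > 0) for first five test objects:", np.round(p[:5], 3))
```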

    Machine learning framework to extract the biomarker potential of plasma IgG N-glycans towards disease risk stratification

    Effective management of chronic diseases and cancer can greatly benefit from disease-specific biomarkers that enable informative screening and timely diagnosis. IgG N-glycans found in human plasma have the potential to be minimally invasive disease-specific biomarkers for all stages of disease development due to their plasticity in response to various genetic and environmental stimuli. Data analysis and machine learning (ML) approaches can assist in harnessing the potential of IgG glycomics towards biomarker discovery and the development of reliable predictive tools for disease screening. This study proposes an ML-based N-glycomic analysis framework that can be employed to build, optimise, and evaluate multiple ML pipelines to stratify patients based on disease risk in an interpretable manner. To design and test this framework, a published colorectal cancer (CRC) dataset from the Study of Colorectal Cancer in Scotland (SOCCS) cohort (1999-2006) was used. In particular, among the different pipelines tested, an XGBoost-based ML pipeline, which was tuned using multi-objective optimisation, calibrated using an inductive Venn-Abers predictor (IVAP), and evaluated via a nested cross-validation (NCV) scheme, achieved a mean area under the Receiver Operating Characteristic curve (AUC-ROC) of 0.771 when classifying between age- and sex-matched healthy controls and CRC patients. This performance suggests the potential of using the relative abundance of IgG N-glycans to define populations at elevated CRC risk who merit investigation or surveillance. Finally, the IgG N-glycans that most strongly influence CRC classification decisions were identified using a global model-agnostic interpretability technique, namely Accumulated Local Effects (ALE). We envision that open-source computational frameworks, such as the one presented herein, will be useful in supporting the translation of glycan-based biomarkers into clinical applications.
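
    The nested cross-validation scheme described above can be sketched as follows; scikit-learn's gradient boosting stands in for XGBoost, and a plain grid search stands in for the multi-objective tuning and Venn-Abers calibration, so this is only a structural illustration on synthetic data.

```python
# A minimal sketch of nested cross-validation for an unbiased AUC-ROC estimate:
# the inner loop tunes hyperparameters, the outer loop evaluates the tuned pipeline.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_score

X, y = make_classification(n_samples=400, n_features=20, random_state=0)

inner_cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
outer_cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

param_grid = {"n_estimators": [100, 300], "max_depth": [2, 3]}
tuned = GridSearchCV(GradientBoostingClassifier(random_state=0),
                     param_grid, scoring="roc_auc", cv=inner_cv)

# Each outer fold refits the whole tuning procedure, so the reported AUC-ROC
# is not biased by hyperparameter selection on the test folds.
scores = cross_val_score(tuned, X, y, scoring="roc_auc", cv=outer_cv)
print("nested-CV AUC-ROC: %.3f +/- %.3f" % (scores.mean(), scores.std()))
```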

    Classifier Calibration: A survey on how to assess and improve predicted class probabilities

    This paper provides both an introduction to and a detailed overview of the principles and practice of classifier calibration. A well-calibrated classifier correctly quantifies the level of uncertainty or confidence associated with its instance-wise predictions. This is essential for critical applications, optimal decision making, cost-sensitive classification, and some types of context change. Calibration research has a rich history that predates the birth of machine learning as an academic field by decades. However, a recent increase in interest in calibration has led to new methods and to the extension from the binary to the multiclass setting. The space of options and issues to consider is large, and navigating it requires the right set of concepts and tools. We provide both introductory material and up-to-date technical details of the main concepts and methods, including proper scoring rules and other evaluation metrics, visualisation approaches, a comprehensive account of post-hoc calibration methods for binary and multiclass classification, and several advanced topics.
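
    As a small worked example of post-hoc calibration and its evaluation, the sketch below compares an uncalibrated naïve Bayes classifier with an isotonically calibrated one using a binned expected calibration error (ECE); the data, model, and binning choices are illustrative.

```python
# Post-hoc calibration (isotonic) and a simple binned ECE on synthetic data.
import numpy as np
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

def expected_calibration_error(y_true, y_prob, n_bins=10):
    # Bin predictions by confidence and average the |confidence - frequency| gaps.
    bin_ids = np.minimum((y_prob * n_bins).astype(int), n_bins - 1)
    ece = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            ece += mask.mean() * abs(y_prob[mask].mean() - y_true[mask].mean())
    return ece

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

raw = GaussianNB().fit(X_train, y_train)
calibrated = CalibratedClassifierCV(GaussianNB(), method="isotonic", cv=5).fit(X_train, y_train)

for name, clf in [("uncalibrated", raw), ("isotonic", calibrated)]:
    probs = clf.predict_proba(X_test)[:, 1]
    print(name, "ECE:", round(expected_calibration_error(y_test, probs), 4))
```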

    Machine-Learning-based Prediction of Sepsis Events from Vertical Clinical Trial Data: a Naïve Approach

    Sepsis is a potentially life-threatening condition characterized by a dysregulated, disproportionate immune response to infection in which the afflicted body attacks its own tissues, sometimes to the point of organ failure and, in the worst cases, death. According to the Centers for Disease Control and Prevention (CDC), sepsis is reported to kill upwards of 270,000 Americans annually, though this figure may be greater given certain ambiguities in the currently accepted diagnostic framework of the disease. This study first attempted to establish an understanding of past definitions of sepsis, and then to recommend machine learning as an integral part of an eventual amended disease definition. Longitudinal clinical trial data (n = 30,915 trials) were vectorized into a machine-readable format compatible with predictive modeling, selected and reduced in dimension, and used to predict incidences of sepsis via several machine learning models: logistic regression, support vector machines (SVM), the naïve Bayes classifier, decision trees, and random forests. The intent of the study was to identify possible predictive features for sepsis via comparative analysis of the different machine learning models, and to recommend subsequent study of sepsis prediction by applying the trained models to new (non-clinical-trial-derived) data in the same format. If the models generalize to new data, it stands to reason that they could eventually become clinically useful. Judged by F1 and recall scores, the random forest classifier was the best performer among this cohort of models.
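
    The model-comparison step can be sketched roughly as below, with synthetic data standing in for the vectorized clinical-trial features and cross-validated F1 and recall as the comparison metrics; this is an illustration of the workflow, not the study's code.

```python
# Comparing several standard classifiers by cross-validated F1 and recall.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_validate
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic, mildly imbalanced stand-in for the sepsis prediction task.
X, y = make_classification(n_samples=1000, n_features=30, weights=[0.8, 0.2],
                           random_state=0)

models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "SVM": SVC(),
    "naive Bayes": GaussianNB(),
    "decision tree": DecisionTreeClassifier(random_state=0),
    "random forest": RandomForestClassifier(random_state=0),
}

for name, model in models.items():
    res = cross_validate(model, X, y, cv=5, scoring=["f1", "recall"])
    print(f"{name:20s} F1={res['test_f1'].mean():.3f} "
          f"recall={res['test_recall'].mean():.3f}")
```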