Evaluation of Predictive Reliability to Foster Trust in Artificial Intelligence. A case study in Multiple Sclerosis
Applying Artificial Intelligence (AI) and Machine Learning (ML) in critical
contexts, such as medicine, requires the implementation of safety measures to
reduce risks of harm in case of prediction errors. Spotting ML failures is of
paramount importance when ML predictions are used to drive clinical decisions.
ML predictive reliability measures the degree of trust in an ML prediction on a
new instance, thus allowing decision-makers to accept or reject it based on its
reliability. To assess reliability, we propose a method that implements two
principles. First, our approach evaluates whether an instance to be classified
comes from the same distribution as the training set. To do this, we
leverage the ability of Autoencoders (AEs) to reconstruct the training set with low
error. An instance is considered Out-of-Distribution (OOD) if the AE
reconstructs it with a high error. Second, it evaluates whether the ML
classifier performs well on samples similar to the newly classified
instance by using a proxy model. We show that this approach is able to assess
reliability both in a simulated scenario and on a model trained to predict
disease progression of Multiple Sclerosis patients. We also developed a Python
package, named relAI, to embed reliability measures into ML pipelines. We
propose a simple approach that can be used in the deployment phase of any ML
model to suggest whether to trust predictions or not. Our method holds the
promise to provide effective support to clinicians by spotting potential ML
failures during deployment. Comment: 20 pages, 7 figures
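The first principle in the abstract above (flag an instance as OOD when the autoencoder reconstructs it with high error) can be sketched as follows. This is a minimal illustration and not the relAI API: `toy_reconstruct` is a hypothetical stand-in for a trained autoencoder, and the 95% training-error quantile used as the cut-off is an assumption, since the abstract does not state how the threshold is chosen. The second principle (the proxy model for local performance) is not covered here.

```python
import math


def reconstruction_error(x, reconstruct):
    """Mean squared error between an instance and its reconstruction."""
    x_hat = reconstruct(x)
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def error_threshold(train_errors, q=0.95):
    """Empirical q-quantile of the training reconstruction errors.

    Assumption: the cut-off rule is illustrative, not taken from the paper.
    """
    ordered = sorted(train_errors)
    idx = math.ceil(q * len(ordered)) - 1
    idx = max(0, min(idx, len(ordered) - 1))
    return ordered[idx]


def is_ood(x, reconstruct, threshold):
    """Flag an instance whose reconstruction error exceeds the threshold."""
    return reconstruction_error(x, reconstruct) > threshold


def toy_reconstruct(x):
    """Hypothetical stand-in for a trained autoencoder.

    A one-dimensional linear bottleneck: 2-D points are projected onto
    the line x0 == x1, so points near that line are reconstructed well.
    """
    m = (x[0] + x[1]) / 2.0
    return [m, m]


# Toy training set lying close to the line x0 == x1.
train = [[i / 10.0, i / 10.0 + 0.01 * (-1) ** i] for i in range(20)]
train_errors = [reconstruction_error(x, toy_reconstruct) for x in train]
tau = error_threshold(train_errors)

in_dist_flag = is_ood([0.5, 0.5], toy_reconstruct, tau)   # on the line
far_flag = is_ood([1.0, -1.0], toy_reconstruct, tau)      # far from the line
```

In a real pipeline, `toy_reconstruct` would be replaced by the forward pass of an autoencoder fitted on the training set, and the threshold would be tuned on held-out data.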
Conformalized Credal Set Predictors
Credal sets are sets of probability distributions that are considered as
candidates for an imprecisely known ground-truth distribution. In machine
learning, they have recently attracted attention as an appealing formalism for
uncertainty representation, in particular due to their ability to represent
both the aleatoric and epistemic uncertainty in a prediction. However, the
design of methods for learning credal set predictors remains a challenging
problem. In this paper, we make use of conformal prediction for this purpose.
More specifically, we propose a method for predicting credal sets in the
classification task, given training data labeled by probability distributions.
Since our method inherits the coverage guarantees of conformal prediction, our
conformal credal sets are guaranteed to be valid with high probability (without
any assumptions on model or distribution). We demonstrate the applicability of
our method to natural language inference, a highly ambiguous natural language
task where it is common to obtain multiple annotations per example.
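One way to picture a conformal credal set is as a ball of distributions around the model's predicted distribution, with a radius calibrated by split conformal prediction. The sketch below is written under stated assumptions: the nonconformity score (total variation distance between the predicted and the annotated distribution) and all function names are illustrative choices, not necessarily those of the paper.

```python
import math


def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    return 0.5 * sum(abs(a - b) for a, b in zip(p, q))


def conformal_quantile(scores, alpha):
    """Split-conformal quantile: the ceil((n + 1)(1 - alpha))-th smallest score."""
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        return float("inf")  # too few calibration points for this alpha
    return sorted(scores)[k - 1]


def in_credal_set(candidate, predicted, radius):
    """A candidate distribution belongs to the set if it is within the radius."""
    return total_variation(candidate, predicted) <= radius


# Toy calibration data: (predicted distribution, annotated distribution).
calibration = [([0.5, 0.5], [0.5 + 0.01 * i, 0.5 - 0.01 * i]) for i in range(1, 10)]
scores = [total_variation(pred, label) for pred, label in calibration]
radius = conformal_quantile(scores, alpha=0.25)

near = in_credal_set([0.55, 0.45], [0.5, 0.5], radius)
far = in_credal_set([0.9, 0.1], [0.5, 0.5], radius)
```

Under exchangeability of calibration and test data, the annotated distribution of a new example falls inside such a set with probability at least 1 - alpha, which is the kind of coverage guarantee the abstract refers to.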
Uncertainty-Aware Mixed-Variable Machine Learning for Materials Design
Data-driven design shows the promise of accelerating materials discovery but
is challenging due to the prohibitive cost of searching the vast design space
of chemistry, structure, and synthesis methods. Bayesian Optimization (BO)
employs uncertainty-aware machine learning models to select promising designs
to evaluate, hence reducing the cost. However, BO with mixed numerical and
categorical variables, which is of particular interest in materials design, has
not been well studied. In this work, we survey frequentist and Bayesian
approaches to uncertainty quantification of machine learning with mixed
variables. We then conduct a systematic comparative study of their performances
in BO using a popular representative model from each group, the random
forest-based Lolo model (frequentist) and the latent variable Gaussian process
model (Bayesian). We examine the efficacy of the two models in the optimization
of mathematical functions, as well as properties of structural and functional
materials, where we observe performance differences as related to problem
dimensionality and complexity. By investigating the machine learning models'
predictive and uncertainty estimation capabilities, we provide interpretations
of the observed performance differences. Our results provide practical guidance
on choosing between frequentist and Bayesian uncertainty-aware machine learning
models for mixed-variable BO in materials design.
Uncertainty-Based Rejection in Machine Learning: Implications for Model Development and Interpretability
POCI-01-0247-FEDER-033479. Uncertainty is present in every single prediction of Machine Learning (ML) models. Uncertainty Quantification (UQ) is arguably relevant, in particular for safety-critical applications. Prior research focused on the development of methods to quantify uncertainty; however, less attention has been given to how to leverage the knowledge of uncertainty in the process of model development. This work focused on applying UQ in practice, closing the gap of its utility in the ML pipeline and giving insights into how UQ is used to improve model development and its interpretability. We identified three main research questions: (1) How can UQ contribute to choosing the most suitable model for a given classification task? (2) Can UQ be used to combine different models in a principled manner? (3) Can visualization techniques improve UQ’s interpretability? These questions are answered by applying several methods to quantify uncertainty in both a simulated dataset and a real-world dataset of Human Activity Recognition (HAR). Our results showed that uncertainty quantification can increase model robustness and interpretability.
Supporting High-Uncertainty Decisions through AI and Logic-Style Explanations
A common criterion for Explainable AI (XAI) is to support users in establishing appropriate trust in the AI - rejecting advice when it is incorrect, and accepting advice when it is correct. Previous findings suggest that explanations can cause an over-reliance on AI (overly accepting advice). Explanations that evoke appropriate trust are even more challenging for decision-making tasks that are difficult for humans and AI. For this reason, we study decision-making by non-experts in the high-uncertainty domain of stock trading. We compare the effectiveness of three different explanation styles (influenced by inductive, abductive, and deductive reasoning) and the role of AI confidence in terms of a) the users' reliance on the XAI interface elements (charts with indicators, AI prediction, explanation), b) the correctness of the decision (task performance), and c) the agreement with the AI's prediction. In contrast to previous work, we look at interactions between different aspects of decision-making, including AI correctness, and the combined effects of AI confidence and explanation styles. Our results show that specific explanation styles (abductive and deductive) improve the user's task performance in the case of high AI confidence compared to inductive explanations. In other words, these styles of explanations were able to evoke correct decisions (for both positive and negative decisions) when the system was certain. In such a condition, the agreement between the user's decision and the AI prediction confirms this finding, highlighting a significant agreement increase when the AI is correct. This suggests that both explanation styles are suitable for evoking appropriate trust in a confident AI. Our findings further indicate a need to consider AI confidence as a criterion for including or excluding explanations from AI interfaces.
In addition, this paper highlights the importance of carefully selecting an explanation style according to the characteristics of the task and data.
Towards Knowledge Uncertainty Estimation for Open Set Recognition
POCI-01-0247-FEDER-033479. Uncertainty is ubiquitous and arises in every single prediction of Machine Learning models. The ability to estimate and quantify the uncertainty of individual predictions is arguably relevant, all the more in safety-critical applications. Real-world recognition poses multiple challenges since a model's knowledge about physical phenomena is not complete, and observations are incomplete by definition. However, Machine Learning algorithms often assume that train and test data distributions are the same and that all testing classes are present during training. A more realistic scenario is Open Set Recognition, where unknown classes can be submitted to an algorithm during testing. In this paper, we propose a Knowledge Uncertainty Estimation (KUE) method to quantify knowledge uncertainty and reject out-of-distribution inputs. Additionally, we quantify and distinguish aleatoric and epistemic uncertainty with the classical information-theoretical measures of entropy by means of ensemble techniques. We performed experiments on four datasets with different data modalities and compared our results with distance-based classifiers, SVM-based approaches and ensemble techniques using entropy measures. Overall, the effectiveness of KUE in distinguishing in- and out-of-distribution inputs obtained better results in most cases and was at least comparable in others. Furthermore, a classification with rejection option based on a proposed combination strategy between different measures of uncertainty is an application of uncertainty with proven results.
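The entropy-based split into aleatoric and epistemic uncertainty mentioned in the abstract above is commonly computed from an ensemble as follows: total uncertainty is the entropy of the averaged prediction, aleatoric uncertainty is the average entropy of the individual members, and epistemic uncertainty is the difference between the two. The sketch below illustrates that standard decomposition only. It is not the KUE method, and the function names are hypothetical.

```python
import math


def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


def decompose_uncertainty(member_probs):
    """Split ensemble uncertainty into total, aleatoric and epistemic parts.

    member_probs holds one class-probability vector per ensemble member,
    all for the same input.
    """
    n_members = len(member_probs)
    n_classes = len(member_probs[0])
    mean_p = [
        sum(member[c] for member in member_probs) / n_members
        for c in range(n_classes)
    ]
    total = entropy(mean_p)                                          # entropy of the mean
    aleatoric = sum(entropy(m) for m in member_probs) / n_members    # mean of entropies
    epistemic = total - aleatoric                                    # disagreement term
    return total, aleatoric, epistemic


# Members agree that the input is ambiguous: purely aleatoric uncertainty.
agree = decompose_uncertainty([[0.5, 0.5], [0.5, 0.5]])

# Members are individually confident but contradict each other: purely epistemic.
disagree = decompose_uncertainty([[1.0, 0.0], [0.0, 1.0]])
```

A rejection option can then be built by refusing to classify inputs whose total or epistemic uncertainty exceeds a threshold chosen on validation data.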
Uncertainty quantification for probabilistic machine learning in earth observation using conformal prediction
Unreliable predictions can occur when using artificial intelligence (AI)
systems with negative consequences for downstream applications, particularly
when employed for decision-making. Conformal prediction provides a
model-agnostic framework for uncertainty quantification that can be applied to
any dataset, irrespective of its distribution, post hoc. In contrast to other
pixel-level uncertainty quantification methods, conformal prediction operates
without requiring access to the underlying model and training dataset,
concurrently offering statistically valid and informative prediction regions,
all while maintaining computational efficiency. In response to the increased
need to report uncertainty alongside point predictions, we bring attention to
the promise of conformal prediction within the domain of Earth Observation (EO)
applications. To accomplish this, we assess the current state of uncertainty
quantification in the EO domain and find that only 20% of the reviewed Google
Earth Engine (GEE) datasets incorporated a degree of uncertainty information,
with unreliable methods prevalent. Next, we introduce modules that seamlessly
integrate into existing GEE predictive modelling workflows and demonstrate the
application of these tools for datasets spanning local to global scales,
including the Dynamic World and Global Ecosystem Dynamics Investigation (GEDI)
datasets. These case studies encompass regression and classification tasks,
featuring both traditional and deep learning-based workflows. Subsequently, we
discuss the opportunities arising from the use of conformal prediction in EO.
We anticipate that the increased availability of easy-to-use implementations of
conformal predictors, such as those provided here, will drive wider adoption of
rigorous uncertainty quantification in EO, thereby enhancing the reliability of
uses such as operational monitoring and decision making.
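The post hoc, model-agnostic character of conformal prediction described above can be illustrated with split conformal regression: only held-out predictions and reference values are needed, not the model itself. The following is a minimal sketch under stated assumptions. The function names are hypothetical, they are not the GEE modules introduced in the paper, and the coverage guarantee assumes calibration and test data are exchangeable.

```python
import math


def split_conformal_halfwidth(cal_true, cal_pred, alpha):
    """Half-width of a (1 - alpha) prediction interval from calibration residuals."""
    residuals = sorted(abs(y - p) for y, p in zip(cal_true, cal_pred))
    n = len(residuals)
    k = math.ceil((n + 1) * (1 - alpha))  # finite-sample corrected rank
    if k > n:
        return float("inf")  # too few calibration points for this alpha
    return residuals[k - 1]


def prediction_interval(pred, halfwidth):
    """Symmetric interval around a point prediction."""
    return (pred - halfwidth, pred + halfwidth)


# Toy calibration set: the model always predicts 10.0, and the reference
# values deviate from it by 0, 1, ..., 19 units.
cal_pred = [10.0] * 20
cal_true = [10.0 + i for i in range(20)]

halfwidth = split_conformal_halfwidth(cal_true, cal_pred, alpha=0.1)
low, high = prediction_interval(50.0, halfwidth)
```

For classification tasks the same recipe applies with a score such as one minus the predicted probability of the true class, yielding prediction sets instead of intervals.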