Calibrated Explanations for Regression
Artificial Intelligence (AI) is often an integral part of modern decision
support systems (DSSs). The best-performing predictive models used in AI-based
DSSs lack transparency. Explainable Artificial Intelligence (XAI) aims to
create AI systems that can explain their rationale to human users. Local
explanations in XAI can provide information about the causes of individual
predictions in terms of feature importance. However, a critical drawback of
existing local explanation methods is their inability to quantify the
uncertainty associated with a feature's importance. This paper introduces an
extension of a feature importance explanation method, Calibrated Explanations
(CE), previously only supporting classification, with support for standard
regression and probabilistic regression, i.e., the probability that the target
is above an arbitrary threshold. The extension for regression keeps all the
benefits of CE, such as calibration of the prediction from the underlying model
with confidence intervals, uncertainty quantification of feature importance,
and support for both factual and counterfactual explanations. CE for standard
regression provides fast, reliable, stable, and robust explanations. CE for
probabilistic regression provides an entirely new way of creating probabilistic
explanations from any ordinary regression model and with a dynamic selection of
thresholds. The performance of CE for probabilistic regression regarding
stability and speed is comparable to LIME. The method is model-agnostic with
easily understood conditional rules. An implementation in Python is freely
available on GitHub and for installation using pip, making the results in this paper easily replicable.
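To make the probabilistic-regression idea above concrete, here is a minimal sketch, not the API of the CE package described in the abstract: it estimates the probability that the target exceeds a threshold by shifting calibration residuals onto a point prediction, the basic construction behind conformal predictive distributions. The use of scikit-learn, a random forest, and the function name are illustrative assumptions.

```python
# Minimal sketch (not the Calibrated Explanations API): estimate
# P(target > threshold) from any point regressor using calibration residuals.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

def prob_above_threshold(model, X_cal, y_cal, x_new, threshold):
    """Empirical, conformal-style estimate of P(y_new > threshold)."""
    residuals = y_cal - model.predict(X_cal)                    # calibration residuals
    samples = model.predict(x_new.reshape(1, -1)) + residuals   # predictive distribution support
    return float(np.mean(samples > threshold))

# Toy usage with synthetic data.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 5))
y = 2.0 * X[:, 0] + rng.normal(scale=0.5, size=500)
X_train, X_cal, y_train, y_cal = train_test_split(X, y, test_size=0.25, random_state=0)
model = RandomForestRegressor(random_state=0).fit(X_train, y_train)
print(prob_above_threshold(model, X_cal, y_cal, X[0], threshold=1.0))
```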
Federated Conformal Predictors for Distributed Uncertainty Quantification
Conformal prediction is emerging as a popular paradigm for providing rigorous
uncertainty quantification in machine learning since it can be easily applied
as a post-processing step to already trained models.
In this paper, we extend conformal prediction to the federated learning
setting.
The main challenge we face is data heterogeneity across the clients -- this
violates the fundamental tenet of \emph{exchangeability} required for conformal
prediction.
We propose a weaker notion of \emph{partial exchangeability}, better suited
to the FL setting, and use it to develop the Federated Conformal Prediction
(FCP) framework.
We show FCP enjoys rigorous theoretical guarantees and excellent empirical
performance on several computer vision and medical imaging datasets.
Our results demonstrate a practical approach to incorporating meaningful
uncertainty quantification in distributed and heterogeneous environments.
We provide the code used in our experiments at https://github.com/clu5/federated-conformal.
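As a rough illustration of how split conformal prediction can be run without pooling raw data, the sketch below has each client compute a conformal quantile of its own nonconformity scores and the server combine them. The plain averaging of local quantiles, the regression-style absolute-residual scores, and the function names are assumptions made for illustration, not necessarily the aggregation rule of the FCP framework.

```python
# Illustrative sketch only: split conformal prediction across clients without
# sharing raw data. Averaging local quantiles is an assumption for illustration.
import numpy as np

def local_quantile(y_cal, y_pred_cal, alpha):
    """Each client computes a conformal quantile of its own residual scores."""
    scores = np.abs(y_cal - y_pred_cal)
    n = len(scores)
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    return np.quantile(scores, level, method="higher")

def federated_interval(y_pred_new, client_quantiles):
    """Server combines client quantiles (here: a plain average) into one interval."""
    q = float(np.mean(client_quantiles))
    return y_pred_new - q, y_pred_new + q

# Toy usage: three clients with differently scaled noise (data heterogeneity).
rng = np.random.default_rng(1)
qs = [local_quantile(rng.normal(scale=s, size=200), np.zeros(200), alpha=0.1)
      for s in (0.5, 1.0, 2.0)]
print(federated_interval(y_pred_new=3.0, client_quantiles=qs))
```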
A review of probabilistic forecasting and prediction with machine learning
Predictions and forecasts of machine learning models should take the form of
probability distributions, aiming to increase the quantity of information
communicated to end users. Although applications of probabilistic prediction
and forecasting with machine learning models in academia and industry are
becoming more frequent, related concepts and methods have not been formalized
and structured under a holistic view of the entire field. Here, we review the
topic of predictive uncertainty estimation with machine learning algorithms, as
well as the related metrics (consistent scoring functions and proper scoring
rules) for assessing probabilistic predictions. The review covers a time period
spanning from the introduction of early statistical (linear regression and time
series models, based on Bayesian statistics or quantile regression) to recent
machine learning algorithms (including generalized additive models for
location, scale and shape, random forests, boosting and deep learning
algorithms) that are more flexible by nature. The review of the progress in the
field expedites our understanding of how to develop new algorithms tailored to
users' needs, since the latest advancements are based on some fundamental
concepts applied to more complex algorithms. We conclude by classifying the
material and discussing challenges that are becoming a hot topic of research.
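Two of the evaluation tools mentioned above have simple closed forms. The sketch below computes the pinball (quantile) loss, a consistent scoring function for a predicted quantile, and a sample-based CRPS estimate, a proper scoring rule; these are generic textbook formulas rather than code from any study covered by the review.

```python
# Generic illustration of a consistent scoring function (pinball loss for a
# quantile forecast) and a proper scoring rule (sample-based CRPS estimate).
import numpy as np

def pinball_loss(y, q_pred, tau):
    """Pinball/quantile loss: consistent scoring function for the tau-quantile."""
    diff = y - q_pred
    return np.mean(np.maximum(tau * diff, (tau - 1) * diff))

def crps_from_samples(y, samples):
    """Empirical CRPS for one observation given an ensemble of predictive samples."""
    samples = np.asarray(samples, dtype=float)
    term1 = np.mean(np.abs(samples - y))
    term2 = 0.5 * np.mean(np.abs(samples[:, None] - samples[None, :]))
    return term1 - term2

y_obs = np.array([2.3, 1.1, 0.7])
q90 = np.array([3.0, 1.5, 1.0])
print(pinball_loss(y_obs, q90, tau=0.9))
print(crps_from_samples(1.1, np.random.default_rng(2).normal(1.0, 0.5, size=500)))
```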
Forecasting hospital discharges for respiratory conditions in Costa Rica using climate and pollution data
Respiratory diseases represent one of the most significant economic burdens
on healthcare systems worldwide. The variation in the increasing number of
cases depends greatly on climatic seasonal effects, socioeconomic factors, and
pollution. Therefore, understanding these variations and obtaining precise
forecasts allows health authorities to make correct decisions regarding the
allocation of limited economic and human resources. This study aims to model
and forecast weekly hospitalizations due to respiratory conditions in seven
regional hospitals in Costa Rica using four statistical learning techniques
(Random Forest, XGBoost, Facebook's Prophet forecasting model, and an ensemble
method combining the above methods), along with 22 climate change indices and
aerosol optical depth as an indicator of pollution. Models are trained using
data from 2000 to 2018 and are evaluated using data from 2019 as testing data.
Reliable predictions are obtained for each of the seven regional hospitals.
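The evaluation protocol described above can be sketched as follows. The column names, synthetic data, and equal-weight averaging are illustrative assumptions, and scikit-learn's gradient boosting stands in for XGBoost and Prophet to keep the example dependency-free; the point is only the time-based train/test split (2000-2018 vs. 2019) and the ensemble averaging.

```python
# Hedged sketch of the evaluation setup: train on 2000-2018, test on 2019,
# and average the forecasts of several models. Column names are hypothetical.
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor

def ensemble_forecast(df, feature_cols, target_col="weekly_admissions"):
    train = df[df["date"] < "2019-01-01"]
    test = df[df["date"] >= "2019-01-01"]
    models = [RandomForestRegressor(random_state=0),
              GradientBoostingRegressor(random_state=0)]
    preds = []
    for m in models:
        m.fit(train[feature_cols], train[target_col])
        preds.append(m.predict(test[feature_cols]))
    return np.mean(preds, axis=0)   # simple equal-weight ensemble

# Toy usage with synthetic weekly data (placeholder climate/pollution features).
dates = pd.date_range("2000-01-03", "2019-12-30", freq="W-MON")
rng = np.random.default_rng(3)
df = pd.DataFrame({"date": dates,
                   "aod": rng.uniform(0.1, 0.6, len(dates)),
                   "temp_index": rng.normal(size=len(dates))})
df["weekly_admissions"] = 50 + 30 * df["aod"] + rng.normal(scale=5, size=len(dates))
print(ensemble_forecast(df, ["aod", "temp_index"])[:5])
```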
Bayesian Optimization with Conformal Prediction Sets
Bayesian optimization is a coherent, ubiquitous approach to decision-making
under uncertainty, with applications including multi-arm bandits, active
learning, and black-box optimization. Bayesian optimization selects decisions
(i.e. objective function queries) with maximal expected utility with respect to
the posterior distribution of a Bayesian model, which quantifies reducible,
epistemic uncertainty about query outcomes. In practice, subjectively
implausible outcomes can occur regularly for two reasons: 1) model
misspecification and 2) covariate shift. Conformal prediction is an uncertainty
quantification method with coverage guarantees even for misspecified models and
a simple mechanism to correct for covariate shift. We propose conformal
Bayesian optimization, which directs queries towards regions of search space
where the model predictions have guaranteed validity, and investigate its
behavior on a suite of black-box optimization tasks and tabular ranking tasks.
In many cases we find that query coverage can be significantly improved without
harming sample-efficiency. For code, see https://www.github.com/samuelstanton/conformal-bayesopt.gi
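For readers unfamiliar with the baseline being corrected, the sketch below shows the standard Bayesian optimization step the abstract refers to: fit a Gaussian process posterior and query the candidate with maximal expected improvement. The conformal correction the paper proposes on top of this is not shown, and all names and data are illustrative assumptions.

```python
# Sketch of a standard Bayesian optimization step (minimization): maximize
# expected improvement under a GP posterior. The conformal correction is omitted.
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

def expected_improvement(gp, X_cand, y_best):
    mu, sigma = gp.predict(X_cand, return_std=True)
    sigma = np.maximum(sigma, 1e-9)
    z = (y_best - mu) / sigma                     # standardized improvement
    return (y_best - mu) * norm.cdf(z) + sigma * norm.pdf(z)

rng = np.random.default_rng(4)
f = lambda x: np.sin(3 * x[:, 0]) + 0.1 * rng.normal(size=len(x))   # toy objective
X_obs = rng.uniform(-2, 2, size=(8, 1))
y_obs = f(X_obs)
gp = GaussianProcessRegressor(normalize_y=True).fit(X_obs, y_obs)
X_cand = np.linspace(-2, 2, 200).reshape(-1, 1)
x_next = X_cand[np.argmax(expected_improvement(gp, X_cand, y_obs.min()))]
print("next query:", x_next)
```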
Conformal prediction beyond exchangeability
Conformal prediction is a popular, modern technique for providing valid
predictive inference for arbitrary machine learning models. Its validity relies
on the assumptions of exchangeability of the data, and symmetry of the given
model fitting algorithm as a function of the data. However, exchangeability is
often violated when predictive models are deployed in practice. For example, if
the data distribution drifts over time, then the data points are no longer
exchangeable; moreover, in such settings, we might want to use a nonsymmetric
algorithm that treats recent observations as more relevant. This paper
generalizes conformal prediction to deal with both aspects: we employ weighted
quantiles to introduce robustness against distribution drift, and design a new
randomization technique to allow for algorithms that do not treat data points
symmetrically. Our new methods are provably robust, with substantially less
loss of coverage when exchangeability is violated due to distribution drift or
other challenging features of real data, while also achieving the same coverage
guarantees as existing conformal prediction methods if the data points are in
fact exchangeable. We demonstrate the practical utility of these new tools with
simulations and real-data experiments on electricity and election forecasting.
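A minimal sketch of the weighted-quantile idea follows: recent calibration points receive more weight when forming the conformal quantile, and the test point's own weight is attached to an infinite score, as in the nonexchangeable construction. The geometric decay and absolute-residual scores are illustrative assumptions, not the paper's experimental setup.

```python
# Minimal sketch: weighted conformal quantile with recency-decaying weights.
import numpy as np

def weighted_conformal_quantile(scores, alpha, decay=0.99):
    """Weighted (1 - alpha) quantile of nonconformity scores, newest last."""
    n = len(scores)
    w = decay ** np.arange(n - 1, -1, -1)   # older points get smaller weight
    w = np.append(w, 1.0)                   # weight reserved for the test point
    w = w / w.sum()
    s = np.append(scores, np.inf)           # test point's score treated as +inf
    order = np.argsort(s)
    cum = np.cumsum(w[order])
    return s[order][np.searchsorted(cum, 1 - alpha)]

scores = np.abs(np.random.default_rng(5).normal(size=300))   # e.g. |y - y_hat|
print("interval half-width:", weighted_conformal_quantile(scores, alpha=0.1))
```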
Big-Data Science in Porous Materials: Materials Genomics and Machine Learning
By combining metal nodes with organic linkers we can potentially synthesize
millions of possible metal organic frameworks (MOFs). At present, we have
libraries of over ten thousand synthesized materials and millions of in-silico
predicted materials. The fact that we have so many materials opens many
exciting avenues to tailor make a material that is optimal for a given
application. However, from an experimental and computational point of view we
simply have too many materials to screen using brute-force techniques. In this
review, we show that having so many materials allows us to use big-data methods
as a powerful technique to study these materials and to discover complex
correlations. The first part of the review gives an introduction to the
principles of big-data science. We emphasize the importance of data collection,
methods to augment small data sets, and how to select appropriate training sets. An important part of this review covers the different approaches used to represent these materials in feature space. The review also includes a general
overview of the different ML techniques, but as most applications in porous
materials use supervised ML, our review focuses on the different approaches
for supervised ML. In particular, we review the different methods to optimize
the ML process and how to quantify the performance of the different methods. In
the second part, we review how the different approaches of ML have been applied
to porous materials. In particular, we discuss applications in the field of gas
storage and separation, the stability of these materials, their electronic
properties, and their synthesis. The range of applications illustrates the large
variety of topics that can be studied with big-data science. Given the
increasing interest of the scientific community in ML, we expect this list to
rapidly expand in the coming years.
How can humans leverage machine learning? From Medical Data Wrangling to Learning to Defer to Multiple Experts
The irruption of the smartphone into everyone's life, and the ease with which we can digitise or record any data, has led to an explosion in the quantity of data. Smartphones, equipped with advanced
cameras and sensors, have empowered individuals to capture moments and contribute to the
growing pool of data. This data-rich landscape holds great promise for research, decision-making,
and personalized applications. By carefully analyzing and interpreting this wealth of information,
valuable insights, patterns, and trends can be uncovered.
However, big data is worthless in a vacuum. Its potential value is unlocked only when leveraged
to drive decision-making. In recent times we have witnessed the outburst of artificial intelligence: the development of computer systems and algorithms capable of perceiving, reasoning, learning, and problem-solving, emulating certain aspects of human cognitive abilities. Nevertheless, our focus tends to be limited, merely skimming the surface of the problem, while in reality the application of machine learning models to data is usually fraught with difficulties. More
specifically, there are two crucial pitfalls frequently neglected in the field of machine learning:
the quality of the data and the erroneous assumption that machine learning models operate
autonomously. These two issues have established the foundation for the motivation driving this
thesis, which strives to offer solutions to two major associated challenges: 1) dealing with irregular
observations and 2) learning when and whom we should trust.
The first challenge originates from our observation that the majority of machine learning
research primarily concentrates on handling regular observations, neglecting a crucial technological
obstacle encountered in practical big-data scenarios: the aggregation and curation of heterogeneous
streams of information. Before applying machine learning algorithms, it is crucial to establish
robust techniques for handling big data, as this specific aspect presents a notable bottleneck in
the creation of robust algorithms. Data wrangling, which encompasses the extraction, integration,
and cleaning processes necessary for data analysis, plays a crucial role in this regard. Therefore,
the first objective of this thesis is to tackle the frequently disregarded challenge of addressing
irregularities within the context of medical data. We will focus on three specific aspects. Firstly,
we will tackle the issue of missing data by developing a framework that facilitates the imputation
of missing data points using relevant information derived from alternative data sources or past
observations. Secondly, we will move beyond the assumption of homogeneous observations,
where only one statistical data type (such as Gaussian) is considered, and instead, work with
heterogeneous observations. This means that different data sources can be represented by various
statistical likelihoods, such as Gaussian, Bernoulli, categorical, etc. Lastly, considering the
temporal enrichment of today's collected data and our focus on medical data, we will develop a novel algorithm capable of capturing and propagating correlations among different data streams
over time. All these three problems are addressed in our first contribution which involves the
development of a novel method based on Deep Generative Models (DGM) using Variational
Autoencoders (VAE). The proposed model, the Sequential Heterogeneous Incomplete VAE (Shi-VAE), enables the aggregation of multiple heterogeneous data streams in a modular manner,
taking into consideration the presence of potential missing data. To demonstrate the feasibility
of our approach, we present proof-of-concept results obtained from a real database generated
through continuous passive monitoring of psychiatric patients.
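A hedged sketch of one ingredient described above, handling heterogeneous likelihoods under missingness, is given below. It is not the Shi-VAE implementation; it only shows a masked reconstruction loss in which real-valued features use a Gaussian likelihood, binary features use a Bernoulli likelihood, and missing entries are excluded through a binary mask. All tensor names are assumptions for illustration.

```python
# Hedged sketch (not the Shi-VAE code): masked, heterogeneous reconstruction loss
# where each feature uses its own likelihood and missing entries are skipped.
import torch
import torch.nn.functional as F

def masked_heterogeneous_nll(x, mask, gauss_mean, gauss_logvar, bern_logits,
                             real_idx, bin_idx):
    """Sum Gaussian NLL over real features and Bernoulli NLL over binary ones,
    counting only observed entries (mask == 1). Constant terms are omitted."""
    m_real, m_bin = mask[:, real_idx], mask[:, bin_idx]
    var = gauss_logvar.exp()
    nll_real = 0.5 * ((x[:, real_idx] - gauss_mean) ** 2 / var + gauss_logvar)
    nll_bin = F.binary_cross_entropy_with_logits(
        bern_logits, x[:, bin_idx], reduction="none")
    return (nll_real * m_real).sum() + (nll_bin * m_bin).sum()

# Toy usage: 4 samples, 3 features (2 real-valued, 1 binary), some missing.
x = torch.tensor([[0.1, -1.2, 1.0],
                  [0.0,  0.3, 0.0],
                  [2.1,  0.0, 1.0],
                  [0.5,  1.7, 0.0]])
mask = torch.tensor([[1., 1., 1.], [1., 0., 1.], [0., 1., 1.], [1., 1., 0.]])
real_idx, bin_idx = [0, 1], [2]
decoded_mean, decoded_logvar = torch.zeros(4, 2), torch.zeros(4, 2)
decoded_logits = torch.zeros(4, 1)
print(masked_heterogeneous_nll(x, mask, decoded_mean, decoded_logvar,
                               decoded_logits, real_idx, bin_idx))
```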
Our second challenge relates to the misconception that machine learning algorithms can perform independently. However, the notion that AI systems can solely account for automated decision-making, especially in critical domains such as healthcare, is far from reality. Our focus now shifts
towards a specific scenario where the algorithm has the ability to make predictions independently
or alternatively defer the responsibility to a human expert. The purpose of including the human
is not just to obtain better performance, but also to produce more reliable and trustworthy predictions. In reality, however, important decisions are not made by one person but are usually made collectively by an ensemble of human experts. With this in mind, two important questions arise: 1) when should the human or the machine bear responsibility, and 2) among the experts, whom
should we trust? To answer the first question, we will employ a recent theory known as Learning
to defer (L2D). In L2D we are not only interested in abstaining from prediction but also in
understanding the human's confidence in making such a prediction, thus deferring only when the human is more likely to be correct. The second question, about whom to defer to among a pool of experts, has not yet been answered in the L2D literature, and this is what our contributions aim to provide. First, we extend the two consistent surrogate losses proposed so far in the L2D literature to the multiple-expert setting. Second, we study the framework's ability to estimate the probability that a given expert predicts correctly and assess whether the two surrogate losses
are confidence calibrated. Finally, we propose a conformal inference technique that chooses a
subset of experts to query when the system defers. Ensembling experts based on confidence
levels is vital to optimize human-machine collaboration.
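The sketch below illustrates one way a conformal rule could select a subset of experts to query, assuming the system outputs an estimated probability that each expert would answer correctly: it calibrates a probability threshold so that, with roughly 1 - alpha coverage, the queried subset contains at least one correct expert. This construction and all names are assumptions for illustration, not necessarily the exact procedure developed in the thesis.

```python
# Illustrative sketch only: a conformal-style rule for choosing which experts
# to query when deferring, given estimated per-expert correctness probabilities.
import numpy as np

def calibrate_expert_threshold(p_cal, correct_cal, alpha=0.1):
    """Probability threshold such that, with ~(1 - alpha) coverage, the queried
    expert subset contains at least one correct expert."""
    # Per calibration example: smallest "cost" needed to include a correct expert.
    masked = np.where(correct_cal.astype(bool), p_cal, -np.inf)
    scores = 1.0 - masked.max(axis=1)          # +inf if no expert was correct
    n = len(scores)
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    return 1.0 - np.quantile(scores, level, method="higher")

def experts_to_query(p_test, threshold):
    return np.flatnonzero(p_test >= threshold)

# Toy usage with 4 experts and simulated correctness on a calibration set.
rng = np.random.default_rng(6)
p_cal = rng.uniform(size=(500, 4))                    # estimated P(expert j correct)
correct_cal = rng.uniform(size=(500, 4)) < p_cal      # simulated actual correctness
thr = calibrate_expert_threshold(p_cal, correct_cal, alpha=0.1)
print(experts_to_query(np.array([0.2, 0.7, 0.9, 0.4]), thr))
```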
In conclusion, this doctoral thesis has investigated two cases where humans can leverage the
power of machine learning: first, as a tool to assist in data wrangling and data understanding
problems and second, as a collaborative tool where decision-making can be automated by the
machine or delegated to human experts, fostering more transparent and trustworthy solutions.
Modeling contributions of major sources to local and regional air pollutant exposures and health impacts
Elevated concentrations of ambient fine particulate matter (PM2.5) and ozone (O3) contribute to adverse health outcomes in exposed populations. Anthropogenic source sectors, including aviation, residential combustion (RC), and electricity generating units (EGUs), lead to increased concentrations of these combustion-related pollutants. Quantification of the influence of emissions from specific source sectors on ambient pollutant concentrations can be very useful in better informing public health policy decision making on air quality improvements. Due to complex emissions dynamics, background concentrations, and meteorology, determining contributions of these sources to related health risks is challenging.
To assess local impacts of aviation activity, concentrations of nitrogen oxides (NOx) and the PM2.5 constituent black carbon (BC) were monitored near airports. Moreover, aviation-attributable fractions were derived from monitored concentrations using regression modeling, and values were compared with predicted aviation-attributable concentrations from a near-field dispersion model. Regional impacts of aviation, RC, and EGUs were assessed using the Community Multiscale Air Quality (CMAQ) atmospheric chemistry and transport model with the Direct Decoupled Method (DDM) to determine sensitivity of ambient PM2.5 and O3 concentrations to emissions from individual sources. Health damage functions, quantified as mortality per thousand tons of emitted precursor species, were created by individual airport for 66 of the highest fuel-burning airports in the United States and by state for RC and EGUs. Physically-interpretable regression models were built to predict aviation-related health damage functions.
For local aviation impacts, comparisons of regression-predicted and dispersion-predicted BC and NOx concentrations are similar when aggregated, though diurnal patterns show potential weaknesses in near-field dispersion and emissions inventory accuracy. For regional aviation impacts, health damage function values varied by more than an order of magnitude across airports for each precursor-ambient pollutant pair, with seasonal effects present in secondary pollutant formation. Health damage functions were predicted by combinations of upwind and downwind population, meteorology, and atmospheric chemistry regime. State-resolution contributions of RC and EGUs varied both within and between source sectors, based on local characteristics including population density and EGU location. These findings reinforce the importance of quantifying source-specific air quality and health impacts in the design of health-maximizing emissions control policies.