(Un)reasonable Allure of Ante-hoc Interpretability for High-stakes Domains: Transparency Is Necessary but Insufficient for Explainability
Ante-hoc interpretability has become the holy grail of explainable machine
learning for high-stakes domains such as healthcare; however, this notion is
elusive, lacks a widely-accepted definition and depends on the deployment
context. It can refer to predictive models whose structure adheres to
domain-specific constraints, or ones that are inherently transparent. The
latter notion assumes observers who judge this quality, whereas the former
presupposes them to have technical and domain expertise, in certain cases
rendering such models unintelligible. Additionally, its distinction from the
less desirable post-hoc explainability, which refers to methods that construct
a separate explanatory model, is vague given that transparent predictors may
still require (post-)processing to yield satisfactory explanatory insights.
Ante-hoc interpretability is thus an overloaded concept that comprises a range
of implicit properties, which we unpack in this paper to better understand what
is needed for its safe deployment across high-stakes domains. To this end, we
outline model- and explainer-specific desiderata that allow us to navigate its
distinct realisations in view of the envisaged application and audience.
Interpretability and Explainability: A Machine Learning Zoo Mini-tour
In this review, we examine the problem of designing interpretable and
explainable machine learning models. Interpretability and explainability lie at
the core of many machine learning and statistical applications in medicine,
economics, law, and natural sciences. Although interpretability and
explainability have escaped a clear universal definition, many techniques
motivated by these properties have been developed over the past 30 years with
the focus currently shifting towards deep learning methods. In this review, we
emphasise the divide between interpretability and explainability and illustrate
these two different research directions with concrete examples of the
state-of-the-art. The review is intended for a general machine learning
audience with interest in exploring the problems of interpretation and
explanation beyond logistic regression or random forest variable importance.
This work is not an exhaustive literature survey, but rather a primer focusing
selectively on certain lines of research which the authors found interesting or
informative.
Generalized Multimodal ELBO
Multiple data types naturally co-occur when describing real-world phenomena
and learning from them is a long-standing goal in machine learning research.
However, existing self-supervised generative models approximating an ELBO are
not able to fulfill all desired requirements of multimodal models: their
posterior approximation functions lead to a trade-off between the semantic
coherence and the ability to learn the joint data distribution. We propose a
new, generalized ELBO formulation for multimodal data that overcomes these
limitations. The new objective encompasses two previous methods as special
cases and combines their benefits without compromises. In extensive
experiments, we demonstrate the advantage of the proposed method compared to
state-of-the-art models in self-supervised, generative learning tasks.
Comment: 2021 ICLR
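For orientation only (this is the textbook form, not the paper's generalised objective): a multimodal ELBO with a single joint posterior over M modalities, assuming the modalities are conditionally independent given a shared latent code z, can be written as

```latex
% Standard multimodal ELBO, shown as a sketch for orientation; the paper
% generalises how the joint posterior q_\phi(z | x_1, ..., x_M) is composed.
\[
\log p_\theta(x_1, \dots, x_M)
  \;\geq\;
  \mathbb{E}_{q_\phi(z \mid x_1, \dots, x_M)}
    \Big[ \sum_{m=1}^{M} \log p_\theta(x_m \mid z) \Big]
  \;-\; D_{\mathrm{KL}}\big( q_\phi(z \mid x_1, \dots, x_M) \,\|\, p(z) \big)
\]
```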
Multimodal Generative Learning Utilizing Jensen-Shannon-Divergence
Learning from different data types is a long-standing goal in machine
learning research, as multiple information sources co-occur when describing
natural phenomena. However, existing generative models that approximate a
multimodal ELBO rely on difficult or inefficient training schemes to learn a
joint distribution and the dependencies between modalities. In this work, we
propose a novel, efficient objective function that utilizes the Jensen-Shannon
divergence for multiple distributions. It simultaneously approximates the
unimodal and joint multimodal posteriors directly via a dynamic prior. In
addition, we theoretically prove that the new multimodal JS-divergence (mmJSD)
objective optimizes an ELBO. In extensive experiments, we demonstrate the
advantage of the proposed mmJSD model compared to previous work in
unsupervised, generative learning tasks.
Comment: Accepted at NeurIPS 2020, camera-ready version
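For reference, the generalised Jensen-Shannon divergence between several distributions that the abstract alludes to is conventionally defined with respect to a weighted mixture; the sketch below uses standard notation, which may differ from the paper's.

```latex
% Generalised Jensen-Shannon divergence between K distributions q_1, ..., q_K
% with mixture weights \pi_k \ge 0, \sum_k \pi_k = 1; M_\pi is the mixture.
\[
\mathrm{JS}_{\pi}(q_1, \dots, q_K)
  = \sum_{k=1}^{K} \pi_k \, D_{\mathrm{KL}}\big( q_k \,\|\, M_\pi \big),
  \qquad
  M_\pi = \sum_{k=1}^{K} \pi_k \, q_k .
\]
```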
Decoupling State Representation Methods from Reinforcement Learning in Car Racing
In the quest for efficient and robust learning methods, combining unsupervised state representation learning and reinforcement learning (RL) could offer advantages for scaling RL algorithms by providing the models with a useful inductive bias. To achieve this, an encoder is trained in an unsupervised manner with two state representation methods, a variational autoencoder and a contrastive estimator. The learned features are then fed to the actor-critic RL algorithm Proximal Policy Optimization (PPO) to learn a policy for playing OpenAI's car racing environment. This procedure thus decouples state representations from the RL controller. For the integration of RL with unsupervised learning, we explore various designs for variational autoencoders and contrastive learning. The proposed method is compared to a deep network trained directly on pixel inputs with PPO. The results show that the proposed method performs slightly worse than directly learning from pixel inputs; however, it has a more stable learning curve, substantially reduces the required buffer size, and requires optimizing 88% fewer parameters. These results indicate that the use of pre-trained state representations has several benefits for solving RL tasks.
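As a rough illustration of the decoupling described above (our sketch, not the paper's code; the architecture, the latent size of 32, and the 96x96 RGB frame shape are assumptions), an encoder can be pre-trained unsupervised and its frozen latent features then handed to PPO in place of raw pixels:

```python
# Minimal sketch of a convolutional VAE encoder; after unsupervised
# pre-training, its frozen latent mean would replace raw pixels as the
# observation given to a PPO agent. Shapes assume CarRacing-style frames.
import torch
import torch.nn as nn


class ConvVAEEncoder(nn.Module):
    def __init__(self, latent_dim: int = 32):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=4, stride=2), nn.ReLU(),    # 96 -> 47
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),   # 47 -> 22
            nn.Conv2d(64, 128, kernel_size=4, stride=2), nn.ReLU(),  # 22 -> 10
            nn.Conv2d(128, 256, kernel_size=4, stride=2), nn.ReLU(), # 10 -> 4
        )
        self.fc_mu = nn.Linear(256 * 4 * 4, latent_dim)
        self.fc_logvar = nn.Linear(256 * 4 * 4, latent_dim)

    def forward(self, x: torch.Tensor):
        h = self.conv(x).flatten(start_dim=1)
        mu, logvar = self.fc_mu(h), self.fc_logvar(h)
        # Reparameterisation trick: sample z while keeping gradients.
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        return z, mu, logvar


if __name__ == "__main__":
    encoder = ConvVAEEncoder(latent_dim=32)
    frames = torch.rand(8, 3, 96, 96)   # batch of RGB frames in [0, 1]
    z, mu, logvar = encoder(frames)
    print(mu.shape)                      # torch.Size([8, 32])
    # With the encoder frozen, `mu` would serve as the PPO observation.
```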
scTree: Discovering Cellular Hierarchies in the Presence of Batch Effects in scRNA-seq Data
We propose a novel method, scTree, for single-cell Tree Variational
Autoencoders, extending a hierarchical clustering approach to single-cell RNA
sequencing data. scTree corrects for batch effects while simultaneously
learning a tree-structured data representation. This VAE-based method allows
for a more in-depth understanding of complex cellular landscapes independently
of the biasing effects of batches. We show empirically on seven datasets that
scTree discovers the underlying clusters of the data and the hierarchical
relations between them, and that it outperforms established baseline methods
across these datasets. Additionally, we analyze the learned hierarchy to
understand its biological relevance, thus underpinning the importance of
integrating batch correction directly into the clustering procedure.
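One common way to fold batch correction directly into a VAE-style model, shown here as a generic conditional-decoder sketch rather than scTree's actual tree-structured architecture (all names and sizes below are illustrative), is to feed a one-hot batch label to the decoder so that batch-specific variation need not be encoded in the latent representation:

```python
# Generic batch-aware decoding sketch (conditional-VAE style), illustrative only.
import torch
import torch.nn as nn


class BatchConditionedDecoder(nn.Module):
    def __init__(self, latent_dim: int, n_batches: int, n_genes: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim + n_batches, 128), nn.ReLU(),
            nn.Linear(128, n_genes), nn.Softplus(),  # non-negative expression means
        )

    def forward(self, z: torch.Tensor, batch_onehot: torch.Tensor) -> torch.Tensor:
        # The decoder sees the batch label, so the latent z can stay batch-free.
        return self.net(torch.cat([z, batch_onehot], dim=-1))


if __name__ == "__main__":
    decoder = BatchConditionedDecoder(latent_dim=10, n_batches=3, n_genes=2000)
    z = torch.randn(4, 10)
    batch = torch.eye(3)[torch.tensor([0, 2, 1, 0])]  # one-hot batch labels
    print(decoder(z, batch).shape)                     # torch.Size([4, 2000])
```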
Beyond Normal: On the Evaluation of Mutual Information Estimators
Mutual information is a general statistical dependency measure which has
found applications in representation learning, causality, domain generalization
and computational biology. However, mutual information estimators are typically
evaluated on simple families of probability distributions, namely multivariate
normal distribution and selected distributions with one-dimensional random
variables. In this paper, we show how to construct a diverse family of
distributions with known ground-truth mutual information and propose a
language-independent benchmarking platform for mutual information estimators.
We discuss the general applicability and limitations of classical and neural
estimators in settings involving high dimensions, sparse interactions,
long-tailed distributions, and high mutual information. Finally, we provide
guidelines for practitioners on how to select an estimator appropriate to the
difficulty of the problem at hand, and on the issues to consider when applying
an estimator to a new data set.
Comment: Accepted at NeurIPS 2023. Code available at
https://github.com/cbg-ethz/bm
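To illustrate the kind of analytic ground truth such evaluations typically start from (our example, unrelated to the linked benchmark's API): for a bivariate normal with correlation rho, the mutual information is known in closed form, which makes it easy to compare an estimate against the exact value.

```python
# Closed-form MI of a bivariate Gaussian vs. a naive plug-in estimate.
import numpy as np


def bivariate_normal_mi(rho: float) -> float:
    """Exact mutual information (in nats) of a bivariate Gaussian with correlation rho."""
    return -0.5 * np.log(1.0 - rho ** 2)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    rho = 0.8
    cov = np.array([[1.0, rho], [rho, 1.0]])
    xy = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=100_000)
    # Plug-in estimate via the sample correlation, for comparison with the
    # analytic value; general-purpose estimators are far more flexible.
    rho_hat = np.corrcoef(xy[:, 0], xy[:, 1])[0, 1]
    print(f"true MI : {bivariate_normal_mi(rho):.4f} nats")
    print(f"plug-in : {bivariate_normal_mi(rho_hat):.4f} nats")
```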
Benchmarking the Fairness of Image Upsampling Methods
Recent years have witnessed a rapid development of deep generative models for
creating synthetic media, such as images and videos. While the practical
applications of these models in everyday tasks are enticing, it is crucial to
assess the inherent risks regarding their fairness. In this work, we introduce
a comprehensive framework for benchmarking the performance and fairness of
conditional generative models. We develop a set of
metrics, inspired by their supervised fairness
counterparts, to evaluate the models on their fairness and
diversity. Focusing on the specific application of image upsampling, we create
a benchmark covering a wide variety of modern upsampling methods. As part of
the benchmark, we introduce UnfairFace, a subset of FairFace that replicates
the racial distribution of common large-scale face datasets. Our empirical
study highlights the importance of using an unbiased training set and reveals
variations in how the algorithms respond to dataset imbalances. Alarmingly, we
find that none of the considered methods produces statistically fair and
diverse results. All experiments can be reproduced using our provided
repository.
Comment: This is the author's version of the work. It is posted here for your
personal use. Not for redistribution. The definitive Version of Record was
published at the 2024 ACM Conference on Fairness, Accountability, and
Transparency (FAccT '24).
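As a flavour of what a supervised-fairness-inspired metric can look like (an illustrative sketch with made-up names, not the metrics defined in the paper): one can compare the group proportions among upsampled outputs against a reference distribution and report the largest gap.

```python
# Demographic-parity-style gap between generated outputs and a reference set.
# Group labels are assumed to come from an external attribute classifier.
from collections import Counter


def group_proportions(labels, groups):
    counts = Counter(labels)
    total = len(labels)
    return {g: counts.get(g, 0) / total for g in groups}


def max_parity_gap(generated_labels, reference_labels, groups):
    """Largest absolute difference in group proportion between outputs and reference."""
    gen = group_proportions(generated_labels, groups)
    ref = group_proportions(reference_labels, groups)
    return max(abs(gen[g] - ref[g]) for g in groups)


if __name__ == "__main__":
    groups = ["A", "B", "C"]
    reference = ["A"] * 40 + ["B"] * 30 + ["C"] * 30
    generated = ["A"] * 70 + ["B"] * 20 + ["C"] * 10   # skewed output
    print(f"max parity gap: {max_parity_gap(generated, reference, groups):.2f}")  # 0.30
```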
- …