Explaining Black-Box Models through Counterfactuals
We present CounterfactualExplanations.jl: a package for generating
Counterfactual Explanations (CE) and Algorithmic Recourse (AR) for black-box
models in Julia. CE explain how inputs into a model need to change to yield
specific model predictions. Explanations that involve realistic and actionable
changes can be used to provide AR: a set of proposed actions for individuals to
change an undesirable outcome for the better. In this article, we discuss the
usefulness of CE for Explainable Artificial Intelligence and demonstrate the
functionality of our package. The package is straightforward to use and
designed with a focus on customization and extensibility. We envision it to one
day be the go-to place for explaining arbitrary predictive models in Julia
through a diverse suite of counterfactual generators.
Comment: 13 pages, 9 figures, originally published in The Proceedings of the JuliaCon Conferences (JCON).
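The core idea behind counterfactual generation can be sketched independently of the package: given a fitted model, search for a minimal perturbation of the input that flips the prediction to the desired class. The snippet below is a generic Python illustration of that optimization (prediction loss plus a proximity penalty), not the package's Julia API; the toy logistic model and all names are assumptions for this sketch.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def generate_counterfactual(x, w, b, target=1.0, lam=0.1, lr=0.1, steps=500):
    """Gradient search for x' near x whose prediction moves to `target`.

    Minimizes (f(x') - target)^2 + lam * ||x' - x||^2, the generic
    counterfactual objective: hit the target class while staying close
    to the original instance.
    """
    x_cf = x.copy()
    for _ in range(steps):
        p = sigmoid(w @ x_cf + b)
        # gradient of the squared prediction loss plus the proximity term
        grad = 2.0 * (p - target) * p * (1.0 - p) * w + 2.0 * lam * (x_cf - x)
        x_cf -= lr * grad
    return x_cf

# Toy "black-box": a logistic model that predicts 1 when x1 + x2 > 1
w = np.array([1.0, 1.0]); b = -1.0
x = np.array([0.2, 0.2])                 # currently predicted class 0
x_cf = generate_counterfactual(x, w, b)  # nearby point predicted class 1
```

The proximity weight `lam` controls the trade-off the abstract alludes to: smaller values give counterfactuals that flip the prediction more decisively, larger values keep the proposed changes small and hence more actionable.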
One Deep Music Representation to Rule Them All? : A comparative analysis of different representation learning strategies
Inspired by the success of deploying deep learning in the fields of Computer
Vision and Natural Language Processing, this learning paradigm has also found
its way into the field of Music Information Retrieval. In order to benefit from
deep learning in an effective, but also efficient manner, deep transfer
learning has become a common approach. In this approach, it is possible to
reuse the output of a pre-trained neural network as the basis for a new
learning task. The underlying hypothesis is that if the initial and new
learning tasks show commonalities and are applied to the same type of input
data (e.g. music audio), the generated deep representation of the data is also
informative for the new task. Since, however, most of the networks used to
generate deep representations are trained using a single initial learning
source, their representation is unlikely to be informative for all possible
future tasks. In this paper, we present the results of our investigation into which factors are most important for generating deep representations for the data and learning tasks in the music domain. We conducted this investigation
via an extensive empirical study that involves multiple learning sources, as
well as multiple deep learning architectures with varying levels of information
sharing between sources, in order to learn music representations. We then
validate these representations considering multiple target datasets for
evaluation. The results of our experiments yield several insights on how to
approach the design of methods for learning widely deployable deep data
representations in the music domain.
Comment: This work has been accepted to "Neural Computing and Applications: Special Issue on Deep Learning for Music and Audio".
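The transfer setup the abstract describes, freezing a pre-trained network and fitting only a lightweight model on its output, can be sketched in a few lines. The following toy Python example uses a random non-linear projection as a stand-in for a real pre-trained audio network (a real pipeline would take the penultimate layer of a trained model); all shapes and names are assumptions for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for a pre-trained network: a frozen non-linear feature extractor.
W_pre = rng.normal(size=(16, 4)) * 0.5

def pretrained_features(X):
    """The frozen 'deep representation' reused for the new task."""
    return np.tanh(X @ W_pre.T)

def train_head(X, y, lr=0.5, steps=2000):
    """Fit only a small logistic head on top of the frozen features."""
    Z = pretrained_features(X)
    w = np.zeros(Z.shape[1]); b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(Z @ w + b)))
        g = p - y                        # gradient of the logistic loss
        w -= lr * Z.T @ g / len(y)
        b -= lr * g.mean()
    return w, b

# Toy data for the new target task
X = rng.normal(size=(200, 4))
y = (X[:, 0] + X[:, 1] > 0).astype(float)
w, b = train_head(X, y)
```

Only the small head is trained; the extractor stays fixed. Whether this works in practice depends on exactly the question the paper studies: how well the learning source(s) behind the frozen representation match the new task.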
The Biased Journey of MSD_AUDIO.ZIP
The equitable distribution of academic data is crucial for ensuring equal
research opportunities, and ultimately further progress. Yet, due to the
complexity of using the API for audio data that corresponds to the Million Song
Dataset along with its misreporting (before 2016) and the discontinuation of
this API (after 2016), access to this data has become restricted to those
within certain affiliations that are connected peer-to-peer. In this paper, we
delve into this issue, drawing insights from the experiences of 22 individuals
who either attempted to access the data or played a role in its creation. With
this, we hope to initiate more critical dialogue and more thoughtful
consideration with regard to access privilege in the MIR community.
Endogenous Macrodynamics in Algorithmic Recourse
Existing work on Counterfactual Explanations (CE) and Algorithmic Recourse
(AR) has largely focused on single individuals in a static environment: given
some estimated model, the goal is to find valid counterfactuals for an
individual instance that fulfill various desiderata. The ability of such
counterfactuals to handle dynamics like data and model drift remains a largely
unexplored research challenge. There has also been surprisingly little work on
the related question of how the actual implementation of recourse by one
individual may affect other individuals. Through this work, we aim to close
that gap. We first show that many of the existing methodologies can be
collectively described by a generalized framework. We then argue that the
existing framework does not account for a hidden external cost of recourse,
that only reveals itself when studying the endogenous dynamics of recourse at
the group level. Through simulation experiments involving various state-of-the-art counterfactual generators and several benchmark datasets, we generate
large numbers of counterfactuals and study the resulting domain and model
shifts. We find that the induced shifts are substantial enough to likely impede
the applicability of Algorithmic Recourse in some situations. Fortunately, we
find various strategies to mitigate these concerns. Our simulation framework
for studying recourse dynamics is fast and open-sourced.
Comment: 12 pages, 11 figures. Originally published at the 2023 IEEE Conference on Secure and Trustworthy Machine Learning (SaTML). IEEE holds the copyright.
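The simulation loop the abstract describes, generate recourse, let a subset of individuals implement it, retrain, and measure the induced shifts, can be sketched as follows. This is a toy Python illustration with a crude boundary-crossing generator on synthetic Gaussian data, not the authors' framework; every name and parameter here is an assumption for the sketch.

```python
import numpy as np

rng = np.random.default_rng(1)

def fit_logistic(X, y, lr=0.1, steps=500):
    """Plain gradient-descent logistic regression."""
    w = np.zeros(X.shape[1]); b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        g = p - y
        w -= lr * X.T @ g / len(y)
        b -= lr * g.mean()
    return w, b

# Two Gaussian classes; class 1 is the favourable outcome.
X = np.vstack([rng.normal(-1.0, 1.0, (100, 2)), rng.normal(1.0, 1.0, (100, 2))])
y = np.r_[np.zeros(100), np.ones(100)]

w0, b0 = fit_logistic(X, y)   # initial model
mu0 = X.mean(axis=0)          # initial data centroid

for _ in range(3):            # a few rounds of recourse
    w, b = fit_logistic(X, y)
    scores = X @ w + b
    cand = np.flatnonzero(scores < 0)
    if len(cand) == 0:
        break
    # a third of the negatively classified individuals implement recourse
    chosen = rng.choice(cand, size=max(1, len(cand) // 3), replace=False)
    # crude generator: move each chosen point just across the boundary
    X[chosen] -= np.outer((scores[chosen] - 0.5) / (w @ w), w)
    y[chosen] = 1.0           # their outcome flips to the favourable class

w1, b1 = fit_logistic(X, y)
domain_shift = np.linalg.norm(X.mean(axis=0) - mu0)  # data drifted
model_shift = np.linalg.norm(w1 - w0)                # boundary moved
```

Even this minimal loop exhibits the effect the paper measures at scale: each round of implemented recourse moves the data distribution, which moves the retrained model, which changes the validity of recourse for everyone who has not yet acted.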
Hidden Author Bias in Book Recommendation
Collaborative filtering algorithms have the advantage of not requiring sensitive user or item information to provide recommendations. However, they still suffer from fairness-related issues, such as popularity bias. In this work, we argue that popularity bias often leads to other biases that are not obvious when additional user or item information is not provided to the researcher. We examine our hypothesis in the book recommendation case on a commonly used dataset with book ratings. We enrich it with author information using publicly available external sources. We find that popular books in the dataset are mainly written by US citizens, and that these books tend to be recommended disproportionately by popular collaborative filtering algorithms compared to the users' profiles. We conclude that the societal implications of popularity bias should be further examined by the scholarly community.
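A minimal version of the popularity-bias measurement is easy to sketch: compare the average popularity of recommended items against the catalogue average. The toy Python example below uses a synthetic Zipf-like popularity curve and a pure popularity recommender as the simplest biased baseline; the data, distributions, and names are assumptions for this sketch, not the paper's dataset or algorithms.

```python
import numpy as np

rng = np.random.default_rng(2)

n_users, n_items, k = 200, 50, 5
# Zipf-like popularity curve: item 0 is the most popular
popularity = 1.0 / np.arange(1, n_items + 1) ** 0.8
p = popularity / popularity.sum()

# user profiles: 10 distinct items each, drawn proportionally to popularity
profiles = [rng.choice(n_items, size=10, replace=False, p=p)
            for _ in range(n_users)]

def recommend(profile):
    """Popularity baseline: the k most popular items the user has not seen."""
    unseen = np.setdiff1d(np.arange(n_items), profile)
    return unseen[np.argsort(popularity[unseen])[::-1][:k]]

# average popularity of recommended items vs. the catalogue average
rec_pop = np.mean([popularity[recommend(u)].mean() for u in profiles])
lift = rec_pop / popularity.mean()  # > 1 indicates popularity bias
```

Replacing the `popularity` array with author metadata (e.g. author nationality, as in the paper) turns the same lift computation into a measurement of the hidden biases that popularity bias drags along.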
Some Advice for Psychologists Who Want to Work With Computer Scientists on Big Data
This article is based on conversations from the project “Big Data in Psychological Assessment” (BDPA), funded by the European Union, which was initiated because advances in data science and artificial intelligence offer tremendous opportunities for personnel assessment practice in handling and interpreting this kind of data. We argue that psychologists and computer scientists can benefit from interdisciplinary collaboration. This article aims to inform psychologists who are interested in working with computer scientists about the potential of interdisciplinary collaboration, as well as its challenges, such as differing terminologies, foci of interest, data quality standards, approaches to data analyses, and diverging publication practices. Finally, we provide recommendations for preparing psychologists who want to engage in collaborations with computer scientists. We argue that psychologists should proactively approach computer scientists, learn computer science fundamentals, appreciate that research interests are likely to converge, and prepare novice psychologists for a data-oriented scientific future.