17 research outputs found
Teaching Categories to Human Learners with Visual Explanations
We study the problem of computer-assisted teaching with explanations.
Conventional approaches for machine teaching typically only provide feedback at
the instance level e.g., the category or label of the instance. However, it is
intuitive that clear explanations from a knowledgeable teacher can
significantly improve a student's ability to learn a new concept. To address
these existing limitations, we propose a teaching framework that provides
interpretable explanations as feedback and models how the learner incorporates
this additional information. In the case of images, we show that we can
automatically generate explanations that highlight the parts of the image that
are responsible for the class label. Experiments on human learners illustrate
that, on average, participants achieve better test set performance on
challenging categorization tasks when taught with our interpretable approach
compared to existing methods
Teaching categories to human learners with visual explanations
We study the problem of computer-assisted teaching with explanations. Conventional approaches for machine teaching typically only provide feedback at the instance level e.g., the category or label of the instance. However, it is intuitive that clear explanations from a knowledgeable teacher can significantly improve a student's ability to learn a new concept. To address these existing limitations, we propose a teaching framework that provides interpretable explanations as feedback and models how the learner incorporates this additional information. In the case of images, we show that we can automatically generate explanations that highlight the parts of the image that are responsible for the class label. Experiments on human learners illustrate that, on average, participants achieve better test set performance on challenging categorization tasks when taught with our interpretable approach compared to existing methods
Teaching categories to human learners with visual explanations
We study the problem of computer-assisted teaching with explanations. Conventional approaches for machine teaching typically only provide feedback at the instance level e.g., the category or label of the instance. However, it is intuitive that clear explanations from a knowledgeable teacher can significantly improve a student's ability to learn a new concept. To address these existing limitations, we propose a teaching framework that provides interpretable explanations as feedback and models how the learner incorporates this additional information. In the case of images, we show that we can automatically generate explanations that highlight the parts of the image that are responsible for the class label. Experiments on human learners illustrate that, on average, participants achieve better test set performance on challenging categorization tasks when taught with our interpretable approach compared to existing methods
Teaching Inverse Reinforcement Learners via Features and Demonstrations
Learning near-optimal behaviour from an expert's demonstrations typically
relies on the assumption that the learner knows the features that the true
reward function depends on. In this paper, we study the problem of learning
from demonstrations in the setting where this is not the case, i.e., where
there is a mismatch between the worldviews of the learner and the expert. We
introduce a natural quantity, the teaching risk, which measures the potential
suboptimality of policies that look optimal to the learner in this setting. We
show that bounds on the teaching risk guarantee that the learner is able to
find a near-optimal policy using standard algorithms based on inverse
reinforcement learning. Based on these findings, we suggest a teaching scheme
in which the expert can decrease the teaching risk by updating the learner's
worldview, and thus ultimately enable her to find a near-optimal policy.Comment: NeurIPS'2018 (extended version
On Human Predictions with Explanations and Predictions of Machine Learning Models: A Case Study on Deception Detection
Humans are the final decision makers in critical tasks that involve ethical
and legal concerns, ranging from recidivism prediction, to medical diagnosis,
to fighting against fake news. Although machine learning models can sometimes
achieve impressive performance in these tasks, these tasks are not amenable to
full automation. To realize the potential of machine learning for improving
human decisions, it is important to understand how assistance from machine
learning models affects human performance and human agency.
In this paper, we use deception detection as a testbed and investigate how we
can harness explanations and predictions of machine learning models to improve
human performance while retaining human agency. We propose a spectrum between
full human agency and full automation, and develop varying levels of machine
assistance along the spectrum that gradually increase the influence of machine
predictions. We find that without showing predicted labels, explanations alone
slightly improve human performance in the end task. In comparison, human
performance is greatly improved by showing predicted labels (>20% relative
improvement) and can be further improved by explicitly suggesting strong
machine performance. Interestingly, when predicted labels are shown,
explanations of machine predictions induce a similar level of accuracy as an
explicit statement of strong machine performance. Our results demonstrate a
tradeoff between human performance and human agency and show that explanations
of machine predictions can moderate this tradeoff.Comment: 17 pages, 19 figures, in Proceedings of ACM FAT* 2019, dataset & demo
available at https://deception.machineintheloop.co
Near-Optimal Machine Teaching via Explanatory Teaching Sets
Modern applications of machine teaching for humans often involve domain-specific, non- trivial target hypothesis classes. To facilitate understanding of the target hypothesis, it is crucial for the teaching algorithm to use examples which are interpretable to the human learner. In this paper, we propose NOTES, a principled framework for constructing interpretable teaching sets, utilizing explanations to accelerate the teaching process. Our algorithm is built upon a natural stochastic model of learners and a novel submodular surrogate objective function which greedily selects interpretable teaching examples. We prove that NOTES is competitive with the optimal explanation-based teaching strategy. We further instantiate NOTES with a specific hypothesis class, which can be viewed as an interpretable approximation of any hypothesis class, allowing us to handle complex hypothesis in practice. We demonstrate the effectiveness of NOTES on several image classification tasks, for both simulated and real human learners. Our experimental results suggest that by leveraging explanations, one can significantly speed up teaching