753 research outputs found
On Cognitive Preferences and the Plausibility of Rule-based Models
It is conventional wisdom in machine learning and data mining that logical
models such as rule sets are more interpretable than other models, and that
among such rule-based models, simpler models are more interpretable than more
complex ones. In this position paper, we question this latter assumption by
focusing on one particular aspect of interpretability, namely the plausibility
of models. Roughly speaking, we equate the plausibility of a model with the
likelihood that a user accepts it as an explanation for a prediction. In
particular, we argue that, all other things being equal, longer explanations
may be more convincing than shorter ones, and that the predominant bias for
shorter models, which is typically necessary for learning powerful
discriminative models, may not be suitable when it comes to user acceptance of
the learned models. To that end, we first recapitulate evidence for and against
this postulate, and then report the results of an evaluation in a
crowd-sourcing study based on about 3,000 judgments. The results do not reveal
a strong preference for simple rules, whereas we can observe a weak preference
for longer rules in some domains. We then relate these results to well-known
cognitive biases such as the conjunction fallacy, the representativeness heuristic,
or the recognition heuristic, and investigate their relation to rule length and
plausibility.
Comment: V4: Another rewrite of section on interpretability to clarify focus
on plausibility and relation to interpretability, comprehensibility, and
justifiability
Non-distributional Word Vector Representations
Data-driven representation learning for words is a technique of central
importance in NLP. While indisputably useful as a source of features in
downstream tasks, such vectors tend to consist of uninterpretable components
whose relationship to the categories of traditional lexical semantic theories
is tenuous at best. We present a method for constructing interpretable word
vectors from hand-crafted linguistic resources like WordNet, FrameNet etc.
These vectors are binary (i.e., contain only 0 and 1) and are 99.9% sparse. We
analyze their performance on state-of-the-art evaluation methods for
distributional models of word vectors and find they are competitive with standard
distributional approaches.
Comment: Proceedings of ACL 201
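As a rough illustration of what binary, sparse, interpretable word vectors look like, the sketch below builds them from a tiny hand-written lexicon. The lexicon entries, feature names, and the helpers `build_vectors` and `sparsity` are hypothetical stand-ins chosen for this example; they are not the resources or the code used in the paper.

```python
# Toy lexicon: each word maps to the set of hand-crafted features it has.
# These entries are invented for illustration only.
TOY_LEXICON = {
    "film": {"pos.noun", "supersense.noun.communication", "synonym.movie"},
    "movie": {"pos.noun", "supersense.noun.communication", "synonym.film"},
    "run": {"pos.verb", "supersense.verb.motion"},
}


def build_vectors(lexicon):
    """Return (feature_names, {word: binary vector}) for a feature lexicon."""
    # One named dimension per distinct feature, in a fixed sorted order.
    feature_names = sorted(set().union(*lexicon.values()))
    index = {name: i for i, name in enumerate(feature_names)}
    vectors = {}
    for word, features in lexicon.items():
        vec = [0] * len(feature_names)
        for feature in features:
            vec[index[feature]] = 1  # 1 means "the word has this feature"
        vectors[word] = vec
    return feature_names, vectors


def sparsity(vectors):
    """Fraction of zero entries across all vectors."""
    total = sum(len(v) for v in vectors.values())
    zeros = sum(v.count(0) for v in vectors.values())
    return zeros / total


feature_names, vectors = build_vectors(TOY_LEXICON)
```

Because every dimension carries a readable name, a nonzero entry can be traced directly back to a linguistic property. With only six features the toy vectors are far denser than the figure quoted in the abstract; sparsity rises as the feature inventory grows while each word keeps only a handful of features.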
The Cure: Making a game of gene selection for breast cancer survival prediction
Motivation: Molecular signatures for predicting breast cancer prognosis could
greatly improve care through personalization of treatment. Computational
analyses of genome-wide expression datasets have identified such signatures,
but these signatures leave much to be desired in terms of accuracy,
reproducibility and biological interpretability. Methods that take advantage of
structured prior knowledge (e.g. protein interaction networks) show promise in
helping to define better signatures but most knowledge remains unstructured.
Crowdsourcing via scientific discovery games is an emerging methodology that
has the potential to tap into human intelligence at scales and in modes
previously unheard of. Here, we developed and evaluated a game called The Cure
on the task of gene selection for breast cancer survival prediction. Our
central hypothesis was that knowledge linking expression patterns of specific
genes to breast cancer outcomes could be captured from game players. We
envisioned capturing knowledge both from the players' prior experience and from
their ability to interpret text related to candidate genes presented to them in
the context of the game.
Results: Between its launch in Sept. 2012 and Sept. 2013, The Cure attracted
more than 1,000 registered players who collectively played nearly 10,000 games.
Gene sets assembled through aggregation of the collected data clearly
demonstrated the accumulation of relevant expert knowledge. In terms of
predictive accuracy, these gene sets provided comparable performance to gene
sets generated using other methods including those used in commercial tests.
The Cure is available at http://genegames.org/cure
Explaining computer predictions with augmented appraisal degrees
An augmented appraisal degree (AAD) has been conceived as a mathematical representation of the connotative meaning in an experience-based evaluation, which depends on a particular experience or knowledge. Aiming to improve the interpretability of computer predictions, we explore the use of AADs to represent evaluations that are performed by a machine to predict the class of a particular object. Hence, we propose a novel method whereby predictions made using a support vector machine classification process are augmented through AADs. An illustrative example, in which the classes of handwritten digits are predicted, shows how the augmentation of such predictions can favor their interpretability.