Explaining Machine Learning Classifiers through Diverse Counterfactual Explanations
Post-hoc explanations of machine learning models are crucial for people to
understand and act on algorithmic predictions. An intriguing class of
explanations is through counterfactuals, hypothetical examples that show people
how to obtain a different prediction. We posit that effective counterfactual
explanations should satisfy two properties: feasibility of the counterfactual
actions given user context and constraints, and diversity among the
counterfactuals presented. To this end, we propose a framework for generating
and evaluating a diverse set of counterfactual explanations based on
determinantal point processes. To evaluate the actionability of
counterfactuals, we provide metrics that enable comparison of
counterfactual-based methods to other local explanation methods. We further
address necessary tradeoffs and point to causal implications in optimizing for
counterfactuals. Our experiments on four real-world datasets show that our
framework can generate a set of counterfactuals that are diverse and well
approximate local decision boundaries, outperforming prior approaches to
generating diverse counterfactuals. We provide an implementation of the
framework at https://github.com/microsoft/DiCE.
Comment: 13 page
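The abstract above says only that diversity is based on determinantal point processes. As a rough sketch of that idea (not the authors' implementation), the diversity of a candidate set can be scored as the determinant of a similarity kernel over its members. The kernel form 1 / (1 + distance), the L1 distance, and the function names below are all assumptions made for illustration.

```python
from typing import List, Sequence


def l1_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum of absolute feature differences (an assumed distance choice)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def determinant(matrix: List[List[float]]) -> float:
    """Determinant via Gaussian elimination with partial pivoting."""
    m = [list(row) for row in matrix]  # work on a copy
    n = len(m)
    det = 1.0
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            return 0.0  # singular matrix
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row swap flips the sign
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det


def dpp_diversity(points: List[Sequence[float]]) -> float:
    """DPP-style score: determinant of the kernel K[i][j] = 1 / (1 + dist)."""
    kernel = [[1.0 / (1.0 + l1_distance(p, q)) for q in points] for p in points]
    return determinant(kernel)


# Hypothetical counterfactual candidates in a 2-feature space.
near_duplicates = [[0.0, 0.0], [0.0, 0.0]]
spread_out = [[0.0, 0.0], [3.0, 0.0]]
print(dpp_diversity(near_duplicates), dpp_diversity(spread_out))
```

Under these assumptions, a set of identical candidates scores zero, and the score grows as the candidates move apart, which is the behavior a diversity term is meant to reward.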
Recurrent Latent Variable Networks for Session-Based Recommendation
In this work, we attempt to ameliorate the impact of data sparsity in the
context of session-based recommendation. Specifically, we seek to devise a
machine learning mechanism capable of extracting subtle and complex underlying
temporal dynamics in the observed session data, so as to inform the
recommendation algorithm. To this end, we improve upon systems that utilize
deep learning techniques with recurrently connected units; we do so by adopting
concepts from the field of Bayesian statistics, namely variational inference.
Our proposed approach consists in treating the network recurrent units as
stochastic latent variables with a prior distribution imposed over them. On
this basis, we proceed to infer corresponding posteriors; these can be used for
prediction and recommendation generation, in a way that accounts for the
uncertainty in the available sparse training data. To allow for our approach to
easily scale to large real-world datasets, we perform inference under an
approximate amortized variational inference (AVI) setup, whereby the learned
posteriors are parameterized via (conventional) neural networks. We perform an
extensive experimental evaluation of our approach using challenging benchmark
datasets, and illustrate its superiority over existing state-of-the-art
techniques.
First principles calculation of vibrational Raman spectra in large systems: signature of small rings in crystalline SiO2
We present an approach for the efficient calculation of vibrational Raman
intensities in periodic systems within density functional theory. The Raman
intensities are computed from the second order derivative of the electronic
density matrix with respect to a uniform electric field. In contrast to
previous approaches, the computational effort required by our method for the
evaluation of the intensities is negligible compared to that required for the
calculation of vibrational frequencies. As a first application, we study the
signature of 3- and 4-membered rings in the Raman spectra of several
polymorphs of SiO2, including a zeolite having 102 atoms per unit cell.
Comment: 4 pages, 2 figures, revtex4. Minor corrections; accepted in Phys. Rev. Let
Where are we now? A large benchmark study of recent symbolic regression methods
In this paper we provide a broad benchmarking of recent genetic programming
approaches to symbolic regression in the context of state of the art machine
learning approaches. We use a set of nearly 100 regression benchmark problems
culled from open source repositories across the web. We conduct a rigorous
benchmarking of four recent symbolic regression approaches as well as nine
machine learning approaches from scikit-learn. The results suggest that
symbolic regression performs strongly compared to state-of-the-art gradient
boosting algorithms, although in terms of running time it is among the slowest of
the available methodologies. We discuss the results in detail and point to
future research directions that may allow symbolic regression to gain wider
adoption in the machine learning community.
Comment: 8 pages, 4 figures. GECCO 201
Neural Attentive Session-based Recommendation
In e-commerce scenarios where user profiles are invisible, session-based
recommendation is proposed to generate recommendation results from short
sessions. Previous work only considers the user's sequential behavior in the
current session, whereas the user's main purpose in the current session is not
emphasized. In this paper, we propose a novel neural network framework, i.e.,
Neural Attentive Recommendation Machine (NARM), to tackle this problem.
Specifically, we explore a hybrid encoder with an attention mechanism to model
the user's sequential behavior and capture the user's main purpose in the
current session, which are combined as a unified session representation later.
We then compute the recommendation scores for each candidate item with a
bi-linear matching scheme based on this unified session representation. We
train NARM by jointly learning the item and session representations as well as
their matchings. We carried out extensive experiments on two benchmark
datasets. Our experimental results show that NARM outperforms state-of-the-art
baselines on both datasets. Furthermore, we also find that NARM achieves a
significant improvement on long sessions, which demonstrates its advantages in
modeling the user's sequential behavior and main purpose simultaneously.
Comment: Proceedings of the 2017 ACM on Conference on Information and
Knowledge Management. arXiv admin note: text overlap with arXiv:1511.06939,
arXiv:1606.08117 by other author
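The bi-linear matching step mentioned in the abstract above can be illustrated with a small sketch: each candidate item embedding is scored against the session representation through a matrix B, and the scores are normalized with a softmax. The shapes, the softmax normalization, and every name below are assumptions for illustration, not details taken from the paper.

```python
import math
from typing import List, Sequence


def bilinear_scores(
    item_embeddings: List[Sequence[float]],
    B: List[Sequence[float]],
    session: Sequence[float],
) -> List[float]:
    """Score each item as emb^T B c, where c is the session representation.

    B has one row per item-embedding dimension and one column per
    session-representation dimension (assumed shapes).
    """
    # Project the session vector into the item-embedding space: B c.
    projected = [sum(b * s for b, s in zip(row, session)) for row in B]
    # Dot each item embedding with the projected session vector.
    return [sum(e * p for e, p in zip(emb, projected)) for emb in item_embeddings]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over the candidate scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical toy values: 2 items with 2-d embeddings, 3-d session vector.
items = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
session = [2.0, 1.0, 5.0]

scores = bilinear_scores(items, B, session)
probabilities = softmax(scores)
print(scores, probabilities)
```

The matrix B lets the item embeddings and the session representation have different dimensions, which is one common reason to prefer a bi-linear form over a plain dot product.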
Effects of a passive back exoskeleton on the mechanical loading of the low-back during symmetric lifting
Low-back pain is the number one cause of disability in the world, with mechanical loading as one of the major risk factors. Exoskeletons have been introduced in the workplace to reduce low back loading. During static forward bending, exoskeletons have been shown to reduce back muscle activity by 10% to 40%. However, effects during dynamic lifting are not well documented. Relative support of the exoskeleton might be smaller in lifting compared to static bending due to higher peak loads. In addition, exoskeletons might also result in changes in lifting behavior, which in turn could affect low back loading. The present study investigated the effect of a passive exoskeleton on peak compression forces, moments, muscle activity and kinematics during symmetric lifting. Two types (LOW and HIGH) of the device, which generate peak support moments at large and moderate flexion angles, respectively, were tested during lifts from knee and ankle height from a near and far horizontal position, with a load of 10 kg. Both types of the trunk exoskeleton tested here reduced the peak L5S1 compression force by around 5-10% for lifts from the FAR position from both KNEE and ANKLE height. Subjects did adjust their lifting style when wearing the device with a 17% reduced peak trunk angular velocity and 5 degrees increased lumbar flexion, especially during ANKLE height lifts. In conclusion, the exoskeleton had a minor and varying effect on the peak L5S1 compression force with only significant differences in the FAR lifts.
Biomechanical Evaluation of the Effect of Three Trunk Support Exoskeletons on Spine Loading During Lifting
Isolated Character Forms from Dated Syriac Manuscripts
This paper describes a set of hand-isolated character samples selected from securely dated manuscripts written in Syriac between 300 and 1300 C.E., which are being made available for research purposes. The collection can be used for a number of applications, including ground truth for character segmentation and form analysis for paleographical dating. Several applications based upon convolutional neural networks demonstrate the possibilities of the data set.
User Intent Prediction in Information-seeking Conversations
Conversational assistants are being progressively adopted by the general
population. However, they are not capable of handling complicated
information-seeking tasks that involve multiple turns of information exchange.
Due to the limited communication bandwidth in conversational search, it is
important for conversational assistants to accurately detect and predict user
intent in information-seeking conversations. In this paper, we investigate two
aspects of user intent prediction in an information-seeking setting. First, we
extract features based on the content, structural, and sentiment
characteristics of a given utterance, and use classic machine learning methods
to perform user intent prediction. We then conduct an in-depth feature
importance analysis to identify key features in this prediction task. We find
that structural features contribute most to the prediction performance. Given
this finding, we construct neural classifiers to incorporate context
information and achieve better performance without feature engineering. Our
findings can provide insights into the important factors and effective methods
of user intent prediction in information-seeking conversations.
Comment: Accepted to CHIIR 201
Dual Averaging Method for Online Graph-structured Sparsity
Online learning algorithms update models using one sample per iteration, and are
thus efficient for processing large-scale datasets and useful for detecting
malicious events on the fly for social benefit, such as disease outbreaks and
traffic congestion. However, existing algorithms for graph-structured models
focus on the offline setting and the least-squares loss and cannot handle the
online setting, while methods designed for the online setting cannot be directly
applied to complex (usually non-convex) graph-structured sparsity models. To
address these limitations, in this paper we propose a new algorithm for
graph-structured sparsity constraint problems in the online setting, which we
call GraphDA. The key part of GraphDA is to project both the
averaged gradient (in dual space) and the primal variables (in primal space) onto
lower dimensional subspaces, thus capturing the graph-structured sparsity
effectively. Furthermore, the objective functions assumed here are generally
convex so as to handle different losses for online learning settings. To the
best of our knowledge, GraphDA is the first online learning algorithm
for graph-structure constrained optimization problems. To validate our method,
we conduct extensive experiments on both benchmark graph and real-world graph
datasets. Our experimental results show that, compared to other baseline methods,
GraphDA not only improves classification performance, but also
captures graph-structured features more effectively, and hence offers
stronger interpretability.
Comment: 11 pages, 14 figure