Interpreting Recurrent and Attention-Based Neural Models: a Case Study on Natural Language Inference
Deep learning models have achieved remarkable success in natural language
inference (NLI) tasks. While these models are widely explored, they are hard to
interpret and it is often unclear how and why they actually work. In this
paper, we take a step toward explaining such deep learning based models through
a case study on a popular neural model for NLI. In particular, we propose to
interpret the intermediate layers of NLI models by visualizing the saliency of
attention and LSTM gating signals. We present several examples for which our
methods are able to reveal interesting insights and identify the critical
information contributing to the model decisions.Comment: 11 pages, 11 figures, accepted as a short paper at EMNLP 201
Improving and Understanding Deep Models for Natural Language Comprehension
Natural Language Comprehension is a challenging domain of Natural Language Processing. To improve a model’s language comprehension, one approach is to enrich the structure of the model to enhance its capability to learn the latent rules of the language.
In this dissertation, we first introduce several deep models for a variety of natural language comprehension tasks, including natural language inference and question answering. Previous approaches employ reading mechanisms that do not fully exploit the interdependencies between the input sources, such as “premise and hypothesis” or “document and query”. In contrast, we explore more sophisticated reading mechanisms to efficiently model the relationships between the input sources. These mechanisms and models yield better empirical performance; however, due to the black-box nature of deep learning, it is difficult to assess whether the improved models indeed acquire a better understanding of language. Meanwhile, data is often plagued by meaningless or even harmful statistical biases, and deep models might achieve high performance merely by exploiting these biases.
This motivates us to study methods for “peeking inside” black-box deep models to provide explanations and an understanding of their behavior. The proposed method (saliency) takes a step toward explaining deep learning-based models based on the gradient of the model output with respect to different components, such as the input layer and intermediate layers. Saliency reveals interesting insights and identifies critical information contributing to the model decisions. Besides this model-agnostic interpretation method (saliency), we study model-dependent interpretation solutions and propose two interpretable designs and structures. Finally, we introduce a novel mechanism (saliency learning), which learns from ground-truth explanation signals so that the learned model not only makes the right prediction but also makes it for the right reason.
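The saliency described above is gradient-based: the sensitivity of the model output to each component of the input. As a minimal illustration of the idea (using a hand-coded logistic regression with an analytic gradient rather than a deep NLI model, which would compute the same quantity via backpropagation), the model, weights, and inputs below are hypothetical:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def saliency(x, w, b):
    """Gradient-based saliency |dy/dx_i| for y = sigmoid(w.x + b):
    how sensitive the output is to each input feature."""
    y = sigmoid(w @ x + b)
    grad = y * (1.0 - y) * w  # chain rule: sigmoid'(z) * dz/dx
    return np.abs(grad)

w = np.array([2.0, -1.0, 0.1])  # hypothetical model weights
x = np.array([0.5, 0.3, 0.9])   # hypothetical input
s = saliency(x, w, 0.0)
print(s.argmax())  # index of the most influential feature
```

In a deep model the gradient is obtained with automatic differentiation, and the resulting per-token or per-unit magnitudes are what get visualized over attention and LSTM gating signals.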
Our experimental results on multiple tasks and datasets demonstrate the effectiveness of the proposed methods, which produce predictions that are more faithful to the right reasons and evidence while delivering better results than traditionally trained models.
Smooth Transition of Vehicles' Maximum Speed for Lane Detection based on Computer Vision
This paper presents a prototype electric scooter designed to detect the driving lane via computer vision and automatically set the vehicle's configuration. The electric scooter can drive on pedestrian, bicycle, or car lanes, and the government enforces a different maximum speed for the electric scooter on each lane.
Our prototype scooter applies those regulations automatically with the help of a computer vision component. However, the safety of such a system remains a concern, and research is ongoing into the security and safety aspects of such vehicular systems. Changing the maximum speed while the driver is riding at the fastest allowed speed could cause a safety hazard. To prevent this, we propose logarithmic speed reduction and acceleration. The results show that such an algorithm smooths the transition between the vehicle's maximum speeds.
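The abstract does not give the exact transition formula, but a smooth, gradually leveling approach to a new speed limit can be sketched as follows. This is one common smooth-transition choice (an exponentially shrinking gap, whose time-to-target is logarithmic in the gap); the function, time constant, and lane speeds here are assumptions for illustration:

```python
import math

def smooth_limit(current, target, t, tau=2.0):
    """Smoothed approach from the current speed cap to a new one:
    the gap to the target shrinks by a factor of e every tau
    seconds, so the change is gradual rather than an instant step."""
    return target + (current - target) * math.exp(-t / tau)

# hypothetical transition: car lane (25 km/h) -> pedestrian lane (6 km/h)
speeds = [smooth_limit(25.0, 6.0, t) for t in range(0, 11, 2)]
```

The key property is that the cap never jumps: it starts at the old limit, decreases steeply at first, then levels off as it nears the new limit, avoiding a sudden deceleration while the rider is at full speed.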