3,038 research outputs found
Understanding Hidden Memories of Recurrent Neural Networks
Recurrent neural networks (RNNs) have been successfully applied to various
natural language processing (NLP) tasks and achieved better results than
conventional methods. However, the lack of understanding of the mechanisms
behind their effectiveness limits further improvements on their architectures.
In this paper, we present a visual analytics method for understanding and
comparing RNN models for NLP tasks. We propose a technique to explain the
function of individual hidden state units based on their expected response to
input texts. We then co-cluster hidden state units and words based on the
expected response and visualize co-clustering results as memory chips and word
clouds to provide more structured knowledge on RNNs' hidden states. We also
propose a glyph-based sequence visualization based on aggregate information to
analyze the behavior of an RNN's hidden state at the sentence-level. The
usability and effectiveness of our method are demonstrated through case studies
and reviews from domain experts.Comment: Published at IEEE Conference on Visual Analytics Science and
Technology (IEEE VAST 2017
Explaining Aviation Safety Incidents Using Deep Temporal Multiple Instance Learning
Although aviation accidents are rare, safety incidents occur more frequently
and require a careful analysis to detect and mitigate risks in a timely manner.
Analyzing safety incidents using operational data and producing event-based
explanations is invaluable to airline companies as well as to governing
organizations such as the Federal Aviation Administration (FAA) in the United
States. However, this task is challenging because of the complexity involved in
mining multi-dimensional heterogeneous time series data, the lack of
time-step-wise annotation of events in a flight, and the lack of scalable tools
to perform analysis over a large number of events. In this work, we propose a
precursor mining algorithm that identifies events in the multidimensional time
series that are correlated with the safety incident. Precursors are valuable to
systems health and safety monitoring and in explaining and forecasting safety
incidents. Current methods suffer from poor scalability to high dimensional
time series data and are inefficient in capturing temporal behavior. We propose
an approach by combining multiple-instance learning (MIL) and deep recurrent
neural networks (DRNN) to take advantage of MIL's ability to learn using weakly
supervised data and DRNN's ability to model temporal behavior. We describe the
algorithm, the data, the intuition behind taking a MIL approach, and a
comparative analysis of the proposed algorithm with baseline models. We also
discuss the application to a real-world aviation safety problem using data from
a commercial airline company and discuss the model's abilities and
shortcomings, with some final remarks about possible deployment directions
Personalized Dialogue Generation with Diversified Traits
Endowing a dialogue system with particular personality traits is essential to
deliver more human-like conversations. However, due to the challenge of
embodying personality via language expression and the lack of large-scale
persona-labeled dialogue data, this research problem is still far from
well-studied. In this paper, we investigate the problem of incorporating
explicit personality traits in dialogue generation to deliver personalized
dialogues.
To this end, firstly, we construct PersonalDialog, a large-scale multi-turn
dialogue dataset containing various traits from a large number of speakers. The
dataset consists of 20.83M sessions and 56.25M utterances from 8.47M speakers.
Each utterance is associated with a speaker who is marked with traits like Age,
Gender, Location, Interest Tags, etc. Several anonymization schemes are
designed to protect the privacy of each speaker. This large-scale dataset will
facilitate not only the study of personalized dialogue generation, but also
other researches on sociolinguistics or social science.
Secondly, to study how personality traits can be captured and addressed in
dialogue generation, we propose persona-aware dialogue generation models within
the sequence to sequence learning framework. Explicit personality traits
(structured by key-value pairs) are embedded using a trait fusion module.
During the decoding process, two techniques, namely persona-aware attention and
persona-aware bias, are devised to capture and address trait-related
information. Experiments demonstrate that our model is able to address proper
traits in different contexts. Case studies also show interesting results for
this challenging research problem.Comment: Please contact [zhengyinhe1 at 163 dot com] for the PersonalDialog
datase
- …