End-to-End Knowledge-Routed Relational Dialogue System for Automatic Diagnosis
Beyond current conversational chatbots or task-oriented dialogue systems that
have attracted increasing attention, we move forward to develop a dialogue
system for automatic medical diagnosis that converses with patients to collect
additional symptoms beyond their self-reports and automatically makes a
diagnosis. Besides the challenges for conversational dialogue systems (e.g.
topic transition coherency and question understanding), automatic medical
diagnosis further poses more critical requirements for the dialogue rationality
in the context of medical knowledge and symptom-disease relations. Existing
dialogue systems (Madotto, Wu, and Fung 2018; Wei et al. 2018; Li et al. 2017)
mostly rely on data-driven learning and are unable to encode an external expert
knowledge graph. In this work, we propose an End-to-End Knowledge-routed
Relational Dialogue System (KR-DS) that seamlessly incorporates rich medical
knowledge graph into the topic transition in dialogue management, and makes it
cooperative with natural language understanding and natural language
generation. A novel Knowledge-routed Deep Q-network (KR-DQN) is introduced to
manage topic transitions, which integrates a relational refinement branch for
encoding relations among different symptoms and symptom-disease pairs, and a
knowledge-routed graph branch for topic decision-making. Extensive experiments
on a public medical dialogue dataset show our KR-DS significantly beats
state-of-the-art methods (by more than 8% in diagnosis accuracy). We further
show the superiority of our KR-DS on a newly collected medical dialogue
dataset, which is more challenging because it retains the original self-reports
and conversational data between patients and doctors.
Comment: 8 pages, 5 figures, AAA
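The KR-DQN described above combines a basic Q-value branch with a relational refinement branch and a knowledge-routed graph prior. A minimal numerical sketch of that combination follows; the topic count, relation matrix, and prior values are all illustrative placeholders, not the paper's learned parameters:

```python
import numpy as np

rng = np.random.default_rng(0)
n_topics = 6  # hypothetical number of candidate topics (symptoms/diseases)

# Basic DQN branch: raw Q-values over candidate topics for the next turn
q_basic = rng.normal(size=n_topics)

# Relational refinement branch: a placeholder symptom-symptom /
# symptom-disease relation matrix propagates evidence between related topics
relation = rng.normal(scale=0.1, size=(n_topics, n_topics))
q_relational = relation @ q_basic

# Knowledge-routed graph branch: conditional probabilities from a medical
# knowledge graph act as a prior over the next topic (placeholder values)
knowledge_prior = rng.dirichlet(np.ones(n_topics))

# Combine the three signals; argmax selects the next dialogue action
q_final = q_basic + q_relational + np.log(knowledge_prior + 1e-8)
next_topic = int(np.argmax(q_final))
```

The key design point is that the knowledge prior biases topic transitions toward medically plausible questions rather than leaving them purely data-driven.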
A Controllable Model of Grounded Response Generation
Current end-to-end neural conversation models inherently lack the flexibility
to impose semantic control in the response generation process, often resulting
in uninteresting responses. Attempts to boost informativeness alone come at the
expense of factual accuracy, as attested by pretrained language models'
propensity to "hallucinate" facts. While this may be mitigated by access to
background knowledge, there is scant guarantee of relevance and informativeness
in generated responses. We propose a framework that we call controllable
grounded response generation (CGRG), in which lexical control phrases are
either provided by a user or automatically extracted by a control phrase
predictor from dialogue context and grounding knowledge. Quantitative and
qualitative results show that, using this framework, a transformer-based model
with a novel inductive attention mechanism, trained on a conversation-like
Reddit dataset, outperforms strong generation baselines.
Comment: AAAI 202
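One way to read the inductive attention idea above is as masking grounding positions that are not covered by a control phrase before the softmax. The sketch below shows that masking step only; the token lists, scores, and single-token control phrases are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

# Hypothetical grounding passage and user-provided control phrases
grounding = ["the", "film", "was", "directed", "by", "nolan", "in", "2010"]
control_phrases = {"nolan", "2010"}

# Raw attention scores from the decoder (placeholder values)
scores = np.random.default_rng(0).normal(size=len(grounding))

# Simplified inductive attention: grounding positions not covered by a
# control phrase are masked out, so generation can only attend to the
# user-specified lexical controls
mask = np.array([tok in control_phrases for tok in grounding])
masked = np.where(mask, scores, -np.inf)
weights = np.exp(masked - masked[mask].max())  # exp(-inf) -> 0
weights /= weights.sum()
```

After the mask, attention weight is concentrated entirely on the control-phrase positions, which is what ties the generated response to the intended grounding facts.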
Copy mechanism and tailored training for character-based data-to-text generation
In the last few years, many different methods have been focusing on using
deep recurrent neural networks for natural language generation. The most widely
used sequence-to-sequence neural methods are word-based: as such, they need a
pre-processing step called delexicalization (and its inverse, relexicalization) to
deal with uncommon or unknown words. These forms of processing, however, give
rise to models that depend on the vocabulary used and are not completely
neural.
In this work, we present an end-to-end sequence-to-sequence model with
attention mechanism which reads and generates at a character level, no longer
requiring delexicalization, tokenization, or even lowercasing. Moreover, since
characters constitute the common "building blocks" of every text, it also
allows a more general approach to text generation, enabling the possibility to
exploit transfer learning for training. These skills are obtained thanks to two
major features: (i) the ability to alternate between the standard generation
mechanism and a copy mechanism, which directly copies input facts into the
output, and (ii) the use of an original training pipeline that
further improves the quality of the generated texts.
We also introduce a new dataset called E2E+, a modified version of the
well-known E2E dataset used in the E2E Challenge, designed to highlight the
copying capabilities of character-based models. We tested our model
according to five broadly accepted metrics (including the widely used BLEU),
showing that it yields competitive performance with respect to both
character-based and word-based approaches.
Comment: ECML-PKDD 2019 (camera-ready version
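The alternation between generating and copying resembles a soft pointer/gating scheme: a gate mixes a vocabulary distribution with an attention-derived copy distribution over input characters. A toy character-level sketch under that assumption (the gate value, vocabulary, and input are made up):

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = list("abcdefgh ")    # toy character vocabulary
input_chars = list("cafe")   # input facts, as a character sequence

# Generation mechanism: a distribution over the character vocabulary
p_vocab = rng.dirichlet(np.ones(len(vocab)))

# Copy mechanism: attention over input characters, scattered back onto the
# vocabulary so rare or unknown names can be reproduced verbatim
attention = rng.dirichlet(np.ones(len(input_chars)))
p_copy = np.zeros(len(vocab))
for ch, a in zip(input_chars, attention):
    p_copy[vocab.index(ch)] += a

# A soft gate (placeholder value) alternates between the two mechanisms
p_gen = 0.3
p_final = p_gen * p_vocab + (1 - p_gen) * p_copy
next_char = vocab[int(np.argmax(p_final))]
```

Because copying operates on characters rather than words, no delexicalization step is needed to handle out-of-vocabulary names.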
Incorporating Structured Commonsense Knowledge in Story Completion
The ability to select an appropriate story ending is the first step towards
perfect narrative comprehension. Story ending prediction requires not only the
explicit clues within the context, but also the implicit knowledge (such as
commonsense) to construct a reasonable and consistent story. However, most
previous approaches do not explicitly use background commonsense knowledge. We
present a neural story ending selection model that integrates three types of
information: narrative sequence, sentiment evolution and commonsense knowledge.
Experiments show that our model outperforms state-of-the-art approaches on a
public dataset, the ROCStory Cloze Task, and the performance gain from the
added commonsense knowledge is significant.
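At its simplest, integrating the three information types amounts to combining per-ending scores from each channel into one ranking. A toy sketch with hypothetical endings, scores, and weights (the paper's model learns these jointly; this only illustrates the combination):

```python
# Two candidate endings, each scored by three hypothetical channels:
# narrative-sequence fit, sentiment consistency, commonsense relatedness
scores = {
    "ending_a": {"narrative": 0.7, "sentiment": 0.6, "commonsense": 0.9},
    "ending_b": {"narrative": 0.8, "sentiment": 0.3, "commonsense": 0.2},
}
weights = {"narrative": 0.4, "sentiment": 0.3, "commonsense": 0.3}

def combined(name: str) -> float:
    """Weighted sum of the three channel scores for one candidate ending."""
    return sum(weights[k] * scores[name][k] for k in weights)

# Select the ending with the highest combined score
best = max(scores, key=combined)
```

Here the commonsense channel is what lets "ending_a" win despite a weaker narrative-sequence score, mirroring the reported gain from adding commonsense knowledge.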