1,014 research outputs found
Dialogue Coherence Assessment Without Explicit Dialogue Act Labels
Recent dialogue coherence models use the coherence features designed for
monologue texts, e.g. nominal entities, to represent utterances and then
explicitly augment them with dialogue-relevant features, e.g., dialogue act
labels. It indicates two drawbacks, (a) semantics of utterances is limited to
entity mentions, and (b) the performance of coherence models strongly relies on
the quality of the input dialogue act labels. We address these issues by
introducing a novel approach to dialogue coherence assessment. We use dialogue
act prediction as an auxiliary task in a multi-task learning scenario to obtain
informative utterance representations for coherence assessment. Our approach
alleviates the need for explicit dialogue act labels during evaluation. The
results of our experiments show that our model substantially (more than 20
accuracy points) outperforms its strong competitors on the DailyDialogue
corpus, and performs on par with them on the SwitchBoard corpus for ranking
dialogues concerning their coherence.Comment: Accepted at ACL 202
Foundation Metrics: Quantifying Effectiveness of Healthcare Conversations powered by Generative AI
Generative Artificial Intelligence is set to revolutionize healthcare
delivery by transforming traditional patient care into a more personalized,
efficient, and proactive process. Chatbots, serving as interactive
conversational models, will probably drive this patient-centered transformation
in healthcare. Through the provision of various services, including diagnosis,
personalized lifestyle recommendations, and mental health support, the
objective is to substantially augment patient health outcomes, all the while
mitigating the workload burden on healthcare providers. The life-critical
nature of healthcare applications necessitates establishing a unified and
comprehensive set of evaluation metrics for conversational models. Existing
evaluation metrics proposed for various generic large language models (LLMs)
demonstrate a lack of comprehension regarding medical and health concepts and
their significance in promoting patients' well-being. Moreover, these metrics
neglect pivotal user-centered aspects, including trust-building, ethics,
personalization, empathy, user comprehension, and emotional support. The
purpose of this paper is to explore state-of-the-art LLM-based evaluation
metrics that are specifically applicable to the assessment of interactive
conversational models in healthcare. Subsequently, we present an comprehensive
set of evaluation metrics designed to thoroughly assess the performance of
healthcare chatbots from an end-user perspective. These metrics encompass an
evaluation of language processing abilities, impact on real-world clinical
tasks, and effectiveness in user-interactive conversations. Finally, we engage
in a discussion concerning the challenges associated with defining and
implementing these metrics, with particular emphasis on confounding factors
such as the target audience, evaluation methods, and prompt techniques involved
in the evaluation process.Comment: 13 pages, 4 figures, 2 tables, journal pape
Open-world Story Generation with Structured Knowledge Enhancement: A Comprehensive Survey
Storytelling and narrative are fundamental to human experience, intertwined
with our social and cultural engagement. As such, researchers have long
attempted to create systems that can generate stories automatically. In recent
years, powered by deep learning and massive data resources, automatic story
generation has shown significant advances. However, considerable challenges,
like the need for global coherence in generated stories, still hamper
generative models from reaching the same storytelling ability as human
narrators. To tackle these challenges, many studies seek to inject structured
knowledge into the generation process, which is referred to as structure
knowledge-enhanced story generation. Incorporating external knowledge can
enhance the logical coherence among story events, achieve better knowledge
grounding, and alleviate over-generalization and repetition problems in
stories. This survey provides the latest and comprehensive review of this
research field: (i) we present a systematical taxonomy regarding how existing
methods integrate structured knowledge into story generation; (ii) we summarize
involved story corpora, structured knowledge datasets, and evaluation metrics;
(iii) we give multidimensional insights into the challenges of
knowledge-enhanced story generation and cast light on promising directions for
future study
Survey on reinforcement learning for language processing
In recent years some researchers have explored the use of reinforcement
learning (RL) algorithms as key components in the solution of various natural
language processing tasks. For instance, some of these algorithms leveraging
deep neural learning have found their way into conversational systems. This
paper reviews the state of the art of RL methods for their possible use for
different problems of natural language processing, focusing primarily on
conversational systems, mainly due to their growing relevance. We provide
detailed descriptions of the problems as well as discussions of why RL is
well-suited to solve them. Also, we analyze the advantages and limitations of
these methods. Finally, we elaborate on promising research directions in
natural language processing that might benefit from reinforcement learning
- …