Deep Dialog Act Recognition using Multiple Token, Segment, and Context Information Representations
Dialog act (DA) recognition is a task that has been widely explored over the
years. Recently, most approaches to the task have explored different DNN
architectures that combine the representations of the words in a segment
into a segment representation that provides cues for intention. In this
study, we explore means to generate more informative segment representations,
not only through different network architectures, but also through token
representations at the character and functional levels in addition to the
word level. At the word level, in addition to the commonly
used uncontextualized embeddings, we explore the use of contextualized
representations, which provide information concerning word sense and segment
structure. Character-level tokenization is important to capture
intention-related morphological aspects that cannot be captured at the word
level. Finally, the functional level provides an abstraction from words, which
shifts the focus to the structure of the segment. We also explore approaches to
enrich the segment representation with context information from the history of
the dialog, both in terms of the classifications of the surrounding segments
and the turn-taking history. This kind of information has already been shown
to be important for the disambiguation of DAs in previous studies. Nevertheless, we
are able to capture additional information by considering a summary of the
dialog history and a wider turn-taking context. By combining the best
approaches at each step, we achieve results that surpass the previous
state-of-the-art on generic DA recognition on both SwDA and MRDA, two of the
most widely explored corpora for the task. Furthermore, by considering both
past and future context, simulating an annotation scenario, our approach
achieves performance similar to that of a human annotator on SwDA and
surpasses it on MRDA.
Comment: 38 pages, 7 figures, 9 tables, submitted to JAI
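The combination of token representations described above can be sketched as follows. This is an illustrative assumption, not the authors' implementation: names, dimensions, and the mean-pooling strategy are stand-ins for whatever the paper's networks actually learn.

```python
import numpy as np

def pool(embeddings):
    """Mean-pool a (tokens x dim) matrix into a single vector."""
    return np.asarray(embeddings).mean(axis=0)

def segment_representation(word_emb, char_emb, func_emb, history_summary):
    """Concatenate per-level pooled representations with a summary of
    the dialog history, yielding one vector per segment."""
    return np.concatenate([pool(word_emb), pool(char_emb),
                           pool(func_emb), history_summary])

# Toy example: 3 word tokens (dim 4), 10 character tokens (dim 2),
# 3 functional tokens (dim 2), and a 3-dim history summary.
rng = np.random.default_rng(0)
seg = segment_representation(rng.normal(size=(3, 4)),
                             rng.normal(size=(10, 2)),
                             rng.normal(size=(3, 2)),
                             np.zeros(3))
print(seg.shape)  # (11,)
```

The point of the sketch is only the structure: each tokenization level contributes its own pooled vector, and dialog-history context is appended before classification.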
A Study on Dialog Act Recognition using Character-Level Tokenization
Dialog act recognition is an important step for dialog systems since it
reveals the intention behind the uttered words. Most approaches to the task use
word-level tokenization. In contrast, this paper explores the use of
character-level tokenization. This is relevant since there is information at
the sub-word level that is related to the function of the words and, thus,
their intention. We also explore the use of different context windows around
each token, which are able to capture important elements, such as affixes.
Furthermore, we assess the importance of punctuation and capitalization. We
performed experiments on both the Switchboard Dialog Act Corpus and the DIHANA
Corpus. In both cases, the experiments not only show that character-level
tokenization leads to better performance than the typical word-level
approaches, but also that both approaches are able to capture complementary
information. Thus, the best results are achieved by combining tokenization at
both levels.
Comment: 11 pages, 2 figures, 4 tables, AIMSA 201
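The context windows around each character can be sketched as below. This is a hedged illustration of the idea, not the paper's code; the window radius and padding symbol are assumptions.

```python
def char_windows(text, radius=2, pad="#"):
    """Return, for each character, the window of 2*radius+1 characters
    centred on it, padding at the segment boundaries. Such windows let a
    model see sub-word units related to word function, e.g. affixes."""
    padded = pad * radius + text.lower() + pad * radius
    return [padded[i:i + 2 * radius + 1] for i in range(len(text))]

print(char_windows("rerun", radius=2))
# ['##rer', '#reru', 'rerun', 'erun#', 'run##']
```

Note how the prefix "re-" appears intact inside the early windows, which is the kind of intention-related morphological cue that word-level tokenization discards.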
Hierarchical Multi-Label Dialog Act Recognition on Spanish Data
Dialog acts reveal the intention behind the uttered words. Thus, their
automatic recognition is important for a dialog system trying to understand its
conversational partner. The study presented in this article approaches that
task on the DIHANA corpus, whose three-level dialog act annotation scheme poses
problems which have not been explored in recent studies. In addition to the
hierarchical problem, the two lower levels pose multi-label classification
problems. Furthermore, each level in the hierarchy refers to a different aspect
concerning the intention of the speaker both in terms of the structure of the
dialog and the task. Also, since its dialogs are in Spanish, it allows us to
assess whether the state-of-the-art approaches on English data generalize to a
different language. More specifically, we compare the performance of different
segment representation approaches focusing on both sequences and patterns of
words and assess the importance of the dialog history and the relations between
the multiple levels of the hierarchy. Concerning the single-label
classification problem posed by the top level, we show that the conclusions
drawn on English data also hold on Spanish data. Furthermore, we show that the
approaches can be adapted to multi-label scenarios. Finally, by hierarchically
combining the best classifiers for each level, we achieve the best results
reported for this corpus.
Comment: 21 pages, 4 figures, 17 tables, translated version of the article
published in Linguamática 11(1
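Multi-label prediction at the lower levels of the hierarchy is commonly realized with independent per-label sigmoid scores and a decision threshold. The sketch below shows that decoding step only; the threshold and label set are illustrative assumptions, not values from the paper.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def multi_label_decode(logits, labels, threshold=0.5):
    """Return every label whose sigmoid score clears the threshold.
    An empty list is a valid outcome: a segment may carry no label."""
    return [lab for lab, z in zip(labels, logits)
            if sigmoid(z) >= threshold]

print(multi_label_decode([2.0, -1.0, 0.3], ["A", "B", "C"]))  # ['A', 'C']
```

In contrast to single-label softmax decoding at the top level, this scheme lets a segment receive zero, one, or several labels, which is exactly what the two lower DIHANA levels require.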
End-to-end multi-level dialog act recognition
The three-level dialog act annotation scheme of the DIHANA corpus poses a multi-level classification problem in which the bottom levels allow multiple or no labels for a single segment. We approach automatic dialog act recognition on the three levels using an end-to-end approach, in order to implicitly capture relations between them. Our deep neural network classifier uses a combination of word- and character-based segment representation approaches, together with a summary of the dialog history and information concerning speaker changes. We show that it is important to specialize the generic segment representation in order to capture the most relevant information for each level. On the other hand, the summary of the dialog history should combine information from the three levels to capture dependencies between them. Furthermore, the labels generated for each level help in the prediction of those of the lower levels. Overall, we achieve results which surpass those of our previous approach using the hierarchical combination of three independent per-level classifiers. Furthermore, the results even surpass the results achieved on the simplified version of the problem approached by previous studies, which neglected the multi-label nature of the bottom levels and only considered the label combinations present in the corpus.
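The idea that labels generated at each level help predict those of the lower levels can be sketched as a top-down cascade. The structure below is an assumption for illustration; the toy classifiers stand in for the network's per-level output layers.

```python
def cascade_predict(features, classifiers):
    """Run per-level classifiers top-down, appending each level's
    prediction to the features seen by the levels below, so that lower
    levels can condition on upper-level decisions."""
    labels = []
    for clf in classifiers:
        label = clf(features)
        labels.append(label)
        features = features + [label]  # expose the decision downstream
    return labels

# Toy stand-in classifiers (purely illustrative).
top = lambda f: "question" if "?" in f else "statement"
mid = lambda f: "task" if f[-1] == "question" else "other"
print(cascade_predict(["?"], [top, mid]))  # ['question', 'task']
```

Unlike three independent per-level classifiers, the cascade makes inter-level dependencies explicit at prediction time, which matches the end-to-end motivation stated in the abstract.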
Stochastic Language Generation in Dialogue using Recurrent Neural Networks with Convolutional Sentence Reranking
The natural language generation (NLG) component of a spoken dialogue system
(SDS) usually needs a substantial amount of handcrafting or a well-labeled
dataset to be trained on. These limitations add significantly to development
costs and make cross-domain, multi-lingual dialogue systems intractable.
Moreover, human languages are context-aware. The most natural response should
be directly learned from data rather than depending on predefined syntaxes or
rules. This paper presents a statistical language generator based on a joint
recurrent and convolutional neural network structure which can be trained on
dialogue act-utterance pairs without any semantic alignments or predefined
grammar trees. Objective metrics suggest that this new model outperforms
previous methods under the same experimental conditions. Results of an
evaluation by human judges indicate that it produces not only high-quality but
also linguistically varied utterances, which are preferred over n-gram and
rule-based systems.
Comment: To appear in SigDial 201
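The generate-then-rerank idea behind the model can be sketched as follows. The generator and scorer here are deliberately trivial stand-ins for the recurrent generator and convolutional reranker described above; the slot-coverage score is an assumption made for the example.

```python
def rerank(candidates, scorer):
    """Overgenerate-and-rerank: keep the candidate utterance that a
    separate scoring model ranks highest."""
    return max(candidates, key=scorer)

# Toy scorer: prefer the candidate realizing the most dialog-act slot values.
slots = {"name": "Taro", "food": "sushi"}
score = lambda s: sum(v in s for v in slots.values())
cands = ["Taro is a restaurant.",
         "Taro serves sushi.",
         "It is a nice place."]
print(rerank(cands, score))  # Taro serves sushi.
```

Splitting generation from reranking lets the generator stay simple and stochastic while a discriminative model enforces adequacy over the sampled candidates.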