A Study on Dialog Act Recognition using Character-Level Tokenization
Dialog act recognition is an important step for dialog systems since it
reveals the intention behind the uttered words. Most approaches to the task use
word-level tokenization. In contrast, this paper explores the use of
character-level tokenization. This is relevant since there is information at
the sub-word level that is related to the function of the words and, thus,
their intention. We also explore the use of different context windows around
each token, which are able to capture important elements, such as affixes.
Furthermore, we assess the importance of punctuation and capitalization. We
performed experiments on both the Switchboard Dialog Act Corpus and the DIHANA
Corpus. In both cases, the experiments not only show that character-level
tokenization leads to better performance than the typical word-level
approaches, but also that both approaches are able to capture complementary
information. Thus, the best results are achieved by combining tokenization at
both levels.
Comment: 11 pages, 2 figures, 4 tables, AIMSA 201
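The idea of character-level tokenization with a context window around each token can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `normalize` and `char_windows`, the padding symbol, and the window width are assumptions chosen for illustration. The `normalize` helper only shows how capitalization and punctuation could be switched off to assess their importance.

```python
import string


def normalize(segment, keep_case=True, keep_punctuation=True):
    """Optionally drop capitalization and/or punctuation (hypothetical helper)."""
    if not keep_case:
        segment = segment.lower()
    if not keep_punctuation:
        segment = segment.translate(str.maketrans("", "", string.punctuation))
    return segment


def char_windows(segment, width=3, pad="_"):
    """Return one window of `width` characters centred on each character.

    The segment is padded on both sides so that every character, including
    the first and last, sits in the middle of its own window.
    """
    if width % 2 == 0:
        raise ValueError("width must be odd so the window has a centre")
    half = width // 2
    padded = pad * half + segment + pad * half
    return [padded[i:i + width] for i in range(len(segment))]


# Each character of the segment becomes a token with its neighbours as context.
windows = char_windows("Is it?", width=3)
plain = normalize("Is it?", keep_case=False, keep_punctuation=False)
```

Wider windows cover longer sub-word units such as affixes, at the cost of a larger set of distinct windows.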
Deep Dialog Act Recognition using Multiple Token, Segment, and Context Information Representations
Dialog act (DA) recognition is a task that has been widely explored over the
years. Recently, most approaches to the task explored different DNN
architectures to combine the representations of the words in a segment and
generate a segment representation that provides cues for intention. In this
study, we explore means to generate more informative segment representations,
not only by exploring different network architectures, but also by considering
different token representations, not only at the word level, but also at the
character and functional levels. At the word level, in addition to the commonly
used uncontextualized embeddings, we explore the use of contextualized
representations, which provide information concerning word sense and segment
structure. Character-level tokenization is important to capture
intention-related morphological aspects that cannot be captured at the word
level. Finally, the functional level provides an abstraction from words, which
shifts the focus to the structure of the segment. We also explore approaches to
enrich the segment representation with context information from the history of
the dialog, both in terms of the classifications of the surrounding segments
and the turn-taking history. This kind of information has already been proved
important for the disambiguation of DAs in previous studies. Nevertheless, we
are able to capture additional information by considering a summary of the
dialog history and a wider turn-taking context. By combining the best
approaches at each step, we achieve results that surpass the previous
state-of-the-art on generic DA recognition on both SwDA and MRDA, two of the
most widely explored corpora for the task. Furthermore, by considering both
past and future context, simulating an annotation scenario, our approach achieves
a performance similar to that of a human annotator on SwDA and surpasses it on
MRDA.
Comment: 38 pages, 7 figures, 9 tables, submitted to JAI
Automatic recognition of the general-purpose communicative functions defined by the ISO 24617-2 standard for dialog act annotation
From the perspective of a dialog system, it is important to identify the intention behind the segments in a dialog, since it provides an important cue regarding the information that is present in the segments and how they should be interpreted. ISO 24617-2, the standard for dialog act annotation, defines a hierarchically organized set of general-purpose communicative functions which correspond to different intentions that are relevant in the context of a dialog. We explore the automatic recognition of these communicative functions in the DialogBank, which is a reference set of dialogs annotated according to this standard. To do so, we propose adaptations of existing approaches to flat dialog act recognition that allow them to deal with the hierarchical classification problem. More specifically, we propose the use of an end-to-end hierarchical network with cascading outputs and maximum a posteriori path estimation to predict the communicative function at each level of the hierarchy, preserve the dependencies between the functions in the path, and decide at which level to stop. Furthermore, since the number of dialogs in the DialogBank is small, we rely on transfer learning processes to reduce overfitting and improve performance. The results of our experiments show that our approach outperforms both a flat one and hierarchical approaches based on multiple classifiers and that each of its components plays an important role in the recognition of general-purpose communicative functions.
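Maximum a posteriori path estimation over a label hierarchy can be sketched as follows. This is a toy illustration under stated assumptions, not the network described in the abstract: the hierarchy is a small hand-picked fragment, the conditional probabilities are invented, and the `STOP` option and the function `map_path` are hypothetical names. The sketch scores every path from the root by the product of its per-level conditional probabilities, including the probability of stopping at an inner node, and returns the best one.

```python
import math

STOP = "<stop>"  # hypothetical marker for "stop at this level"


def map_path(children, cond_probs, root="ROOT"):
    """Return the most probable path and its probability.

    children:   node -> list of child labels (empty or missing for leaves)
    cond_probs: node -> {child or STOP: probability given that node}
    """
    best_path, best_logp = [], -math.inf
    stack = [(root, [], 0.0)]
    while stack:
        node, path, logp = stack.pop()
        kids = children.get(node, [])
        options = cond_probs.get(node, {})
        if not kids:
            # Leaf: the path necessarily ends here.
            if logp > best_logp:
                best_path, best_logp = path, logp
            continue
        p_stop = options.get(STOP, 0.0)
        if path and p_stop > 0.0:
            # Ending the path at an inner node is one of the candidates.
            stop_logp = logp + math.log(p_stop)
            if stop_logp > best_logp:
                best_path, best_logp = path, stop_logp
        for child in kids:
            p = options.get(child, 0.0)
            if p > 0.0:
                stack.append((child, path + [child], logp + math.log(p)))
    return best_path, math.exp(best_logp)


# Toy fragment of a hierarchy with invented probabilities (assumptions).
children = {
    "ROOT": ["Question", "Inform"],
    "Question": ["Propositional Question", "Set Question"],
    "Inform": ["Answer"],
}
cond_probs = {
    "ROOT": {"Question": 0.6, "Inform": 0.4},
    "Question": {STOP: 0.2, "Propositional Question": 0.45, "Set Question": 0.35},
    "Inform": {STOP: 0.9, "Answer": 0.1},
}
path, prob = map_path(children, cond_probs)
```

In this toy example a greedy, level-by-level decision would first pick Question (0.6) and end with a path of probability 0.27, whereas the path that stops at Inform has probability 0.36. Scoring whole paths is what lets the decision about where to stop interact with the choices made at the upper levels.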
Hierarchical Multi-Label Dialog Act Recognition on Spanish Data
Dialog acts reveal the intention behind the uttered words. Thus, their
automatic recognition is important for a dialog system trying to understand its
conversational partner. The study presented in this article approaches that
task on the DIHANA corpus, whose three-level dialog act annotation scheme poses
problems which have not been explored in recent studies. In addition to the
hierarchical problem, the two lower levels pose multi-label classification
problems. Furthermore, each level in the hierarchy refers to a different aspect
concerning the intention of the speaker both in terms of the structure of the
dialog and the task. Also, since its dialogs are in Spanish, it allows us to
assess whether the state-of-the-art approaches on English data generalize to a
different language. More specifically, we compare the performance of different
segment representation approaches focusing on both sequences and patterns of
words and assess the importance of the dialog history and the relations between
the multiple levels of the hierarchy. Concerning the single-label
classification problem posed by the top level, we show that the conclusions
drawn on English data also hold on Spanish data. Furthermore, we show that the
approaches can be adapted to multi-label scenarios. Finally, by hierarchically
combining the best classifiers for each level, we achieve the best results
reported for this corpus.
Comment: 21 pages, 4 figures, 17 tables, translated version of the article
published in Linguamática 11(1
Mapping the dialog act annotations of the LEGO corpus into ISO 24617-2 communicative functions
ISO 24617-2, the ISO standard for dialog act annotation, sets the ground for more comparable research in the area. However, the amount of data annotated according to it is still small, which impairs the development of approaches for automatic recognition. In this paper, we describe a mapping of the original dialog act labels of the LEGO corpus, which have been neglected, into the communicative functions of the standard. Although this does not lead to a complete annotation according to the standard, the 347 dialogs provide a relevant amount of data that can be used in the development of automatic communicative function recognition approaches, which may lead to a wider adoption of the standard. Using the 17 English dialogs of the DialogBank as gold standard, our preliminary experiments have shown that including the mapped dialogs during the training phase leads to improved performance while recognizing communicative functions in the Task dimension.
End-to-end multi-level dialog act recognition
The three-level dialog act annotation scheme of the DIHANA corpus poses a multi-level classification problem in which the bottom levels allow multiple or no labels for a single segment. We approach automatic dialog act recognition on the three levels using an end-to-end approach, in order to implicitly capture relations between them. Our deep neural network classifier uses a combination of word- and character-based segment representation approaches, together with a summary of the dialog history and information concerning speaker changes. We show that it is important to specialize the generic segment representation in order to capture the most relevant information for each level. On the other hand, the summary of the dialog history should combine information from the three levels to capture dependencies between them. Furthermore, the labels generated for each level help in the prediction of those of the lower levels. Overall, we achieve results which surpass those of our previous approach using the hierarchical combination of three independent per-level classifiers. Furthermore, they even surpass the results achieved on the simplified version of the problem approached by previous studies, which neglected the multi-label nature of the bottom levels and only considered the label combinations present in the corpus.
Learning to Create and Reuse Words in Open-Vocabulary Neural Language Modeling
Fixed-vocabulary language models fail to account for one of the most
characteristic statistical facts of natural language: the frequent creation and
reuse of new word types. Although character-level language models offer a
partial solution in that they can create word types not attested in the
training corpus, they do not capture the "bursty" distribution of such words.
In this paper, we augment a hierarchical LSTM language model that generates
sequences of word tokens character by character with a caching mechanism that
learns to reuse previously generated words. To validate our model we construct
a new open-vocabulary language modeling corpus (the Multilingual Wikipedia
Corpus, MWC) from comparable Wikipedia articles in 7 typologically diverse
languages and demonstrate the effectiveness of our model across this range of
languages.
Comment: ACL 201
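The interplay between generating a word character by character and reusing a previously generated word can be pictured as a mixture of two distributions. The sketch below is a deliberate simplification and not the model from the paper: it assumes a fixed mixing weight and a count-based cache, and the names `mixture_prob`, `reuse_weight`, and `char_model_prob` are hypothetical.

```python
from collections import Counter


def mixture_prob(word, history, char_model_prob, reuse_weight):
    """Probability of `word` under a simple cache + character-model mixture.

    history:         list of previously generated words (the cache contents)
    char_model_prob: probability the character-level model gives to `word`
    reuse_weight:    probability mass assigned to copying from the cache
    """
    counts = Counter(history)
    cache_prob = counts[word] / len(history) if history else 0.0
    return reuse_weight * cache_prob + (1.0 - reuse_weight) * char_model_prob


# A made-up rare name that the character model finds very unlikely
# becomes much more probable once it has already appeared in the text.
history = ["Zorblax", "the", "Zorblax", "city"]
p_seen = mixture_prob("Zorblax", history, char_model_prob=1e-6, reuse_weight=0.3)
p_unseen = mixture_prob("Zorblax", [], char_model_prob=1e-6, reuse_weight=0.3)
```

The gap between `p_seen` and `p_unseen` is the "bursty" effect the abstract refers to: a word type that has just been created is far more likely to occur again than the character model alone would suggest.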
Automatic recognition of the general-purpose communicative functions defined by the ISO 24617-2 standard for dialog act annotation (Extended abstract)
From the perspective of a dialog system, the identification of the intention behind the segments in a dialog is important, as it provides cues regarding the information present in the segments and how they should be interpreted. The ISO 24617-2 standard for dialog act annotation defines a hierarchically organized set of general-purpose communicative functions that correspond to different intentions that are relevant in the context of a dialog. In this paper, we explore the automatic recognition of these functions. To do so, we propose to adapt existing approaches to dialog act recognition, so that they can deal with the hierarchical classification problem. More specifically, we propose the use of an end-to-end hierarchical network with cascading outputs and maximum a posteriori path estimation to predict the communicative function at each level of the hierarchy, preserve the dependencies between the functions in the path, and decide at which level to stop. Additionally, we rely on transfer learning processes to address the data scarcity problem. Our experiments on the DialogBank show that this approach outperforms both flat and hierarchical approaches based on multiple classifiers and that each of its components plays an important role in the recognition of general-purpose communicative functions.