Active Learning for Dialogue Act Classification
Active learning techniques were employed for classification of dialogue acts over two dialogue corpora, the English human-human Switchboard corpus and the Spanish human-machine Dihana corpus. It is shown clearly that active learning improves on a baseline obtained through a passive learning approach to tagging the same data sets. An error reduction of 7% was obtained on Switchboard, while a fivefold reduction in the amount of labeled data needed for classification was achieved on Dihana. The passive Support Vector Machine learner used as a baseline in itself significantly improves the state of the art in dialogue act classification on both corpora. On Switchboard it gives a 31% error reduction compared to the previously best reported result.
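As a rough illustration of the active learning idea described in this abstract, the sketch below shows one common query strategy, least-confidence uncertainty sampling. The abstract does not say which selection strategy the authors used, so the strategy, the function names (`least_confidence`, `select_queries`), and the toy probabilities are all assumptions made for illustration only.

```python
from typing import Dict, List, Sequence


def least_confidence(probs: Sequence[float]) -> float:
    """Uncertainty score: 1 minus the probability of the most likely label."""
    return 1.0 - max(probs)


def select_queries(pool: Dict[str, Sequence[float]], k: int) -> List[str]:
    """Return the ids of the k unlabeled items the model is least sure about."""
    ranked = sorted(pool, key=lambda uid: least_confidence(pool[uid]), reverse=True)
    return ranked[:k]


# Hypothetical class probabilities that a current model might assign to
# four unlabeled utterances (three dialogue act classes each).
pool = {
    "utt-1": [0.90, 0.05, 0.05],
    "utt-2": [0.40, 0.35, 0.25],
    "utt-3": [0.55, 0.40, 0.05],
    "utt-4": [0.98, 0.01, 0.01],
}

# The two most uncertain utterances would be sent to a human annotator next.
queries = select_queries(pool, k=2)
print(queries)
```

In a full loop, the newly labeled utterances would be added to the training set, the classifier retrained, and the selection repeated, which is how labeling effort can drop relative to passive (random) sampling.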
Survey on Evaluation Methods for Dialogue Systems
In this paper we survey the methods and concepts developed for the evaluation
of dialogue systems. Evaluation is a crucial part of the development
process. Often, dialogue systems are evaluated by means of human evaluations
and questionnaires. However, this tends to be very cost- and time-intensive.
Thus, much work has been put into finding methods that reduce the
involvement of human labour. In this survey, we present the main concepts and
methods. For this, we differentiate between the various classes of dialogue
systems (task-oriented dialogue systems, conversational dialogue systems, and
question-answering dialogue systems). We cover each class by introducing the
main technologies developed for the dialogue systems and then by presenting the
evaluation methods for this class.
Towards Understanding Egyptian Arabic Dialogues
Labelling a user's utterances to understand their intents, known as
Dialogue Act (DA) classification, is considered a key component of the dialogue
language understanding layer in automatic dialogue systems. In this paper, we
propose a novel approach to labelling user utterances for Egyptian
spontaneous dialogues and instant messages using a Machine Learning (ML) approach
without relying on any special lexicons, cues, or rules. Due to the lack of an
Egyptian dialect dialogue corpus, the system is evaluated on a multi-genre corpus
of 4725 utterances across three domains, collected and annotated
manually from Egyptian call centers. The system achieves an F1 score of 70.36%
over all domains.
Comment: arXiv admin note: substantial text overlap with arXiv:1505.0308
Analyzing collaborative learning processes automatically
In this article we describe the emerging area of text classification research focused on the problem of collaborative learning process analysis both from a broad perspective and more specifically in terms of a publicly available tool set called TagHelper tools. Analyzing the variety of pedagogically valuable facets of learners’ interactions is a time-consuming and effortful process. Improving automated analyses of such highly valued processes of collaborative learning by adapting and applying recent text classification technologies would make it a less arduous task to obtain insights from corpus data. This endeavor also holds the potential for enabling substantially improved on-line instruction both by providing teachers and facilitators with reports about the groups they are moderating and by triggering context-sensitive collaborative learning support on an as-needed basis. In this article, we report on an interdisciplinary research project, which has been investigating the effectiveness of applying text classification technology to a large CSCL corpus that has been analyzed by human coders using a theory-based multidimensional coding scheme. We report promising results and include an in-depth discussion of important issues such as reliability, validity, and efficiency that should be considered when deciding on the appropriateness of adopting a new technology such as TagHelper tools. One major technical contribution of this work is a demonstration that an important piece of the work towards making text classification technology effective for this purpose is designing and building linguistic pattern detectors, otherwise known as features, that can be extracted reliably from texts and that have high predictive power for the categories of discourse actions that the CSCL community is interested in.
Conceptual spatial representations for indoor mobile robots
We present an approach for creating conceptual representations of human-made indoor environments using mobile
robots. The concepts refer to spatial and functional properties of typical indoor environments. Following findings
in cognitive psychology, our model is composed of layers representing maps at different levels of abstraction. The
complete system is integrated in a mobile robot endowed with laser and vision sensors for place and object recognition.
The system also incorporates a linguistic framework that actively supports the map acquisition process, and which
is used for situated dialogue. Finally, we discuss the capabilities of the integrated system.
Learning About Meetings
Most people participate in meetings almost every day, multiple times a day.
The study of meetings is important, but also challenging, as it requires an
understanding of social signals and complex interpersonal dynamics. Our aim
in this work is to use a data-driven approach to the science of meetings. We
provide tentative evidence that: i) it is possible to automatically detect when
during the meeting a key decision is taking place, from analyzing only the
local dialogue acts, ii) there are common patterns in the way social dialogue
acts are interspersed throughout a meeting, iii) at the time key decisions are
made, the amount of time left in the meeting can be predicted from the amount
of time that has passed, iv) it is often possible to predict whether a proposal
during a meeting will be accepted or rejected based entirely on the language
(the set of persuasive words) used by the speaker.
Proceedings of QG2010: The Third Workshop on Question Generation
These are the peer-reviewed proceedings of "QG2010, The Third Workshop on Question Generation". The workshop included a special track for "QGSTEC2010: The First Question Generation Shared Task and Evaluation Challenge".
QG2010 was held as part of The Tenth International Conference on Intelligent Tutoring Systems (ITS2010).
Conversational Analysis using Utterance-level Attention-based Bidirectional Recurrent Neural Networks
Recent approaches for dialogue act recognition have shown that context from
preceding utterances is important to classify the subsequent one. It was shown
that the performance improves rapidly when the context is taken into account.
We propose an utterance-level attention-based bidirectional recurrent neural
network (Utt-Att-BiRNN) model to analyze the importance of preceding utterances
to classify the current one. In our setup, the BiRNN is given the input set of
current and preceding utterances. Our model outperforms previous models that
use only preceding utterances as context on the corpus used. Another
contribution of the article is to discover the amount of information in each
utterance to classify the subsequent one and to show that context-based
learning not only improves the performance but also achieves higher confidence
in the classification. We use character- and word-level features to represent
the utterances. The results are presented for character and word feature
representations and as an ensemble model of both representations. We found that
when classifying short utterances, the closest preceding utterances contribute
to a higher degree.
Comment: Proceedings of INTERSPEECH 201
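To make the idea of weighting preceding utterances concrete, here is a minimal sketch of dot-product attention over utterance vectors. It is not the authors' Utt-Att-BiRNN: the recurrent encoder is omitted, the scoring function is assumed to be a plain dot product, and the vectors and names (`softmax`, `attend`) are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(
    current: Sequence[float], history: Sequence[Sequence[float]]
) -> Tuple[List[float], List[float]]:
    """Weight each preceding utterance by its similarity to the current one.

    Returns the attention weights and the weighted sum of history vectors.
    """
    scores = [sum(c * h for c, h in zip(current, vec)) for vec in history]
    weights = softmax(scores)
    dim = len(current)
    context = [sum(w * vec[i] for w, vec in zip(weights, history)) for i in range(dim)]
    return weights, context


# Toy 2-dimensional utterance vectors, ordered from oldest to most recent.
current = [1.0, 0.0]
history = [[0.0, 1.0], [0.5, 0.5], [1.0, 0.0]]

weights, context = attend(current, history)
print(weights, context)
```

Inspecting `weights` is what allows an analysis of how much each preceding utterance contributes; in this toy input the most recent utterance happens to be the most similar and so receives the largest weight.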
Deep Dialog Act Recognition using Multiple Token, Segment, and Context Information Representations
Dialog act (DA) recognition is a task that has been widely explored over the
years. Recently, most approaches to the task explored different DNN
architectures to combine the representations of the words in a segment and
generate a segment representation that provides cues for intention. In this
study, we explore means to generate more informative segment representations,
not only by exploring different network architectures, but also by considering
different token representations, not only at the word level, but also at the
character and functional levels. At the word level, in addition to the commonly
used uncontextualized embeddings, we explore the use of contextualized
representations, which provide information concerning word sense and segment
structure. Character-level tokenization is important to capture
intention-related morphological aspects that cannot be captured at the word
level. Finally, the functional level provides an abstraction from words, which
shifts the focus to the structure of the segment. We also explore approaches to
enrich the segment representation with context information from the history of
the dialog, both in terms of the classifications of the surrounding segments
and the turn-taking history. This kind of information has already been proved
important for the disambiguation of DAs in previous studies. Nevertheless, we
are able to capture additional information by considering a summary of the
dialog history and a wider turn-taking context. By combining the best
approaches at each step, we achieve results that surpass the previous
state-of-the-art on generic DA recognition on both SwDA and MRDA, two of the
most widely explored corpora for the task. Furthermore, by considering both
past and future context, simulating an annotation scenario, our approach achieves
a performance similar to that of a human annotator on SwDA and surpasses it on
MRDA.
Comment: 38 pages, 7 figures, 9 tables, submitted to JAI