7 research outputs found
An Empirical Study of Topic Transition in Dialogue
Transitioning between topics is a natural component of human-human dialog.
Although topic transition has been studied in dialogue for decades, only a
handful of corpora based studies have been performed to investigate the
subtleties of topic transitions. Thus, this study annotates 215 conversations
from the switchboard corpus and investigates how variables such as length,
number of topic transitions, topic transitions share by participants and
turns/topic are related. This work presents an empirical study on topic
transition in switchboard corpus followed by modelling topic transition with a
precision of 83% for in-domain(id) test set and 82% on 10 out-of-domain}(ood)
test set. It is envisioned that this work will help in emulating human-human
like topic transition in open-domain dialog systems.Comment: 5 pages, 4 figures, 3 table
Inter-Coder Agreement for Computational Linguistics
This article is a survey of methods for measuring agreement among corpus annotators. It exposes the mathematics and underlying assumptions of agreement coefficients, covering Krippendorff's alpha as well as Scott's pi and Cohen's kappa; discusses the use of coefficients in several annotation tasks; and argues that weighted, alpha-like coefficients, traditionally less used than kappa-like measures in computational linguistics, may be more appropriate for many corpus annotation tasksâbut that their use makes the interpretation of the value of the coefficient even harder. </jats:p
A framework for structuring prerequisite relations between concepts in educational textbooks
In our age we are experiencing an increasing availability of digital educational resources and self-regulated learning. In this scenario, the development of automatic strategies for organizing the knowledge embodied in educational resources has a tremendous potential for building personalized learning paths and applications such as intelligent textbooks and recommender systems of learning materials. To this aim, a straightforward approach consists in enriching the educational materials with a concept graph, i.a. a knowledge structure where key concepts of the subject matter are represented as nodes and prerequisite dependencies among such concepts are also explicitly represented. This thesis focuses therefore on prerequisite relations in textbooks and it has two main research goals. The first goal is to define a methodology for systematically annotating prerequisite relations in textbooks, which is functional for analysing the prerequisite phenomenon and for evaluating and training automatic methods of extraction. The second goal concerns the automatic extraction of prerequisite relations from textbooks. These two research goals will guide towards the design of PRET, i.e. a comprehensive framework for supporting researchers involved in this research issue. The framework described in the present thesis allows indeed researchers to conduct the following tasks: 1) manual annotation of educational texts, in order to create datasets to be used for machine learning algorithms or for evaluation as gold standards; 2) annotation analysis, for investigating inter-annotator agreement, graph metrics and in-context linguistic features; 3) data visualization, for visually exploring datasets and gaining insights of the problem that may lead to improve algorithms; 4) automatic extraction of prerequisite relations. As for the automatic extraction, we developed a method that is based on burst analysis of concepts in the textbook and we used the gold dataset with PR annotation for its evaluation, comparing the method with other metrics for PR extraction
Accessing spoken interaction through dialogue processing [online]
Zusammenfassung
Unser Leben, unsere Leistungen und unsere Umgebung, alles wird
derzeit durch Schriftsprache dokumentiert. Die rasante
Fortentwicklung der technischen Möglichkeiten Audio, Bilder und
Video aufzunehmen, abzuspeichern und wiederzugeben kann genutzt
werden um die schriftliche Dokumentation von menschlicher
Kommunikation, zum Beispiel Meetings, zu unterstĂŒtzen, zu
ergÀnzen oder gar zu ersetzen. Diese neuen Technologien können
uns in die Lage versetzen Information aufzunehmen, die
anderweitig verloren gehen, die Kosten der Dokumentation zu
senken und hochwertige Dokumente mit audiovisuellem Material
anzureichern. Die Indizierung solcher Aufnahmen stellt die
Kerntechnologie dar um dieses Potential auszuschöpfen. Diese
Arbeit stellt effektive Alternativen zu schlĂŒsselwortbasierten
Indizes vor, die SuchraumeinschrÀnkungen bewirken und teilweise
mit einfachen Mitteln zu berechnen sind.
Die Indizierung von Sprachdokumenten kann auf verschiedenen
Ebenen erfolgen: Ein Dokument gehört stilistisch einer
bestimmten Datenbasis an, welche durch sehr einfache Merkmale
bei hoher Genauigkeit automatisch bestimmt werden kann.
Durch diese Art von Klassifikation kann eine Reduktion des
Suchraumes um einen Faktor der GröĂenordnung 4Â10 erfolgen. Die
Anwendung von thematischen Merkmalen zur Textklassifikation
bei einer Nachrichtendatenbank resultiert in einer Reduktion um
einen Faktor 18. Da Sprachdokumente sehr lang sein können mĂŒssen
sie in thematische Segmente unterteilt werden. Ein neuer
probabilistischer Ansatz sowie neue Merkmale (SprecherinitiaÂ
tive und Stil) liefern vergleichbare oder bessere Resultate als
traditionelle schlĂŒsselwortbasierte AnsĂ€tze. Diese thematische
Segmente können durch die vorherrschende AktivitÀt
charakterisiert werden (erzÀhlen, diskutieren, planen, ...),
die durch ein neuronales Netz detektiert werden kann. Die
Detektionsraten sind allerdings begrenzt da auch Menschen
diese AktivitÀten nur ungenau bestimmen. Eine maximale
Reduktion des Suchraumes um den Faktor 6 ist bei den verwendeten
Daten theoretisch möglich. Eine thematische Klassifikation
dieser Segmente wurde ebenfalls auf einer Datenbasis
durchgefĂŒhrt, die Detektionsraten fĂŒr diesen Index sind jedoch
gering.
Auf der Ebene der einzelnen ĂuĂerungen können Dialogakte wie
Aussagen, Fragen, RĂŒckmeldungen (aha, ach ja, echt?, ...) usw.
mit einem diskriminativ trainierten Hidden Markov Model erkannt
werden. Dieses Verfahren kann um die Erkennung von kurzen Folgen
wie Frage/AntwortÂSpielen erweitert werden (Dialogspiele).
Dialogakte und Âspiele können eingesetzt werden um
Klassifikatoren fĂŒr globale Sprechstile zu bauen. Ebenso
könnte ein Benutzer sich an eine bestimmte Dialogaktsequenz
erinnern und versuchen, diese in einer grafischen
ReprÀsentation wiederzufinden.
In einer Studie mit sehr pessimistischen Annahmen konnten
Benutzer eines aus vier Àhnlichen und gleichwahrscheinlichen
GesprÀchen mit einer Genauigkeit von ~ 43% durch eine graphische
ReprÀsentation von AktivitÀt bestimmt.
Dialogakte könnte in diesem Szenario ebenso nĂŒtzlich sein, die
Benutzerstudie konnte aufgrund der geringen Datenmenge darĂŒber
keinen endgĂŒltigen AufschluĂ geben. Die Studie konnte allerdings
fĂŒr detailierte Basismerkmale wie FormalitĂ€t und
SprecheridentitÀt keinen Effekt zeigen.
Abstract
Written language is one of our primary means for documenting our
lives, achievements, and environment. Our capabilities to
record, store and retrieve audio, still pictures, and video are
undergoing a revolution and may support, supplement or even
replace written documentation. This technology enables us to
record information that would otherwise be lost, lower the cost
of documentation and enhance highÂquality documents with
original audiovisual material.
The indexing of the audio material is the key technology to
realize those benefits. This work presents effective
alternatives to keyword based indices which restrict the search
space and may in part be calculated with very limited resources.
Indexing speech documents can be done at a various levels:
Stylistically a document belongs to a certain database which can
be determined automatically with high accuracy using very simple
features. The resulting factor in search space reduction is in
the order of 4Â10 while topic classification yielded a factor
of 18 in a news domain.
Since documents can be very long they need to be segmented into
topical regions. A new probabilistic segmentation framework as
well as new features (speaker initiative and style) prove to be
very effective compared to traditional keyword based methods. At
the topical segment level activities (storytelling, discussing,
planning, ...) can be detected using a machine learning approach
with limited accuracy; however even human annotators do not
annotate them very reliably. A maximum search space reduction
factor of 6 is theoretically possible on the databases used. A
topical classification of these regions has been attempted
on one database, the detection accuracy for that index, however,
was very low.
At the utterance level dialogue acts such as statements,
questions, backchannels (aha, yeah, ...), etc. are being
recognized using a novel discriminatively trained HMM procedure.
The procedure can be extended to recognize short sequences such
as question/answer pairs, so called dialogue games.
Dialog acts and games are useful for building classifiers for
speaking style. Similarily a user may remember a certain dialog
act sequence and may search for it in a graphical
representation.
In a study with very pessimistic assumptions users are able to
pick one out of four similar and equiprobable meetings correctly
with an accuracy ~ 43% using graphical activity information.
Dialogue acts may be useful in this situation as well but the
sample size did not allow to draw final conclusions. However the
user study fails to show any effect for detailed basic features
such as formality or speaker identity
Segmenting Conversations by Topic, Initiative and Style
Topical segmentation is a basic tool for information access to audio records of meetings and other types of speech documents which may be fairly long and contain multiple topics. Standard segmentation algorithms are typically based on keywords, pitch contours or pauses. This work demonstrates that speaker initiative and style may be used as segmentation criteria as well. A probabilistic segmentation procedure is presented which allows the integration and modeling of these features in a clean framework with good results
Toward summarization of communicative activities in spoken conversation
This thesis is an inquiry into the nature and structure of face-to-face conversation, with a
special focus on group meetings in the workplace. I argue that conversations are composed
of episodes, each of which corresponds to an identifiable communicative activity such as
giving instructions or telling a story. These activities are important because they are part
of participantsâ commonsense understanding of what happens in a conversation. They
appear in natural summaries of conversations such as meeting minutes, and participants
talk about them within the conversation itself. Episodic communicative activities therefore
represent an essential component of practical, commonsense descriptions of conversations.
The thesis objective is to provide a deeper understanding of how such activities may be
recognized and differentiated from one another, and to develop a computational method
for doing so automatically. The experiments are thus intended as initial steps toward future
applications that will require analysis of such activities, such as an automatic minute-taker
for workplace meetings, a browser for broadcast news archives, or an automatic decision
mapper for planning interactions.
My main theoretical contribution is to propose a novel analytical framework called participant
relational analysis. The proposal argues that communicative activities are principally
indicated through participant-relational features, i.e., expressions of relationships between
participants and the dialogue. Participant-relational features, such as subjective language,
verbal reference to the participants, and the distribution of speech activity amongst
the participants, are therefore argued to be a principal means for analyzing the nature and
structure of communicative activities.
I then apply the proposed framework to two computational problems: automatic discourse
segmentation and automatic discourse segment labeling. The first set of experiments
test whether participant-relational features can serve as a basis for automatically
segmenting conversations into discourse segments, e.g., activity episodes. Results show
that they are effective across different levels of segmentation and different corpora, and indeed sometimes more effective than the commonly-used method of using semantic links
between content words, i.e., lexical cohesion. They also show that feature performance is
highly dependent on segment type, suggesting that human-annotated âtopic segmentsâ are
in fact a multi-dimensional, heterogeneous collection of topic and activity-oriented units.
Analysis of commonly used evaluation measures, performed in conjunction with the
segmentation experiments, reveals that they fail to penalize substantially defective results
due to inherent biases in the measures. I therefore preface the experiments with a comprehensive
analysis of these biases and a proposal for a novel evaluation measure. A reevaluation
of state-of-the-art segmentation algorithms using the novel measure produces
substantially different results from previous studies. This raises serious questions about the
effectiveness of some state-of-the-art algorithms and helps to identify the most appropriate
ones to employ in the subsequent experiments.
I also preface the experiments with an investigation of participant reference, an important
type of participant-relational feature. I propose an annotation scheme with novel distinctions
for vagueness, discourse function, and addressing-based referent inclusion, each
of which are assessed for inter-coder reliability. The produced dataset includes annotations
of 11,000 occasions of person-referring.
The second set of experiments concern the use of participant-relational features to
automatically identify labels for discourse segments. In contrast to assigning semantic topic
labels, such as topical headlines, the proposed algorithm automatically labels segments
according to activity type, e.g., presentation, discussion, and evaluation. The method is
unsupervised and does not learn from annotated ground truth labels. Rather, it induces the
labels through correlations between discourse segment boundaries and the occurrence of
bracketing meta-discourse, i.e., occasions when the participants talk explicitly about what
has just occurred or what is about to occur. Results show that bracketing meta-discourse
is an effective basis for identifying some labels automatically, but that its use is limited if
global correlations to segment features are not employed.
This thesis addresses important pre-requisites to the automatic summarization of conversation.
What I provide is a novel activity-oriented perspective on how summarization
should be approached, and a novel participant-relational approach to conversational analysis.
The experimental results show that analysis of participant-relational features is