Measuring Semantic Textual Similarity and Automatic Answer Assessment in Dialogue Based Tutoring Systems
This dissertation presents methods and resources for measuring semantic textual similarity and their applications to student response understanding in dialogue-based Intelligent Tutoring Systems. To predict the degree of similarity between a given pair of sentences, we have proposed machine learning models using dozens of features, such as scores calculated using optimal multi-level alignment, vector-based compositional semantics, and machine translation evaluation methods. Furthermore, we have proposed models that add an interpretation layer on top of similarity measurement systems. Our models for predicting and interpreting semantic similarity have been the top-performing systems at SemEval (a premier venue for semantic evaluation) for the last three years. The correlations between our models' predictions and human judgments were above 0.80 for several datasets, and our models were more robust than many other top-performing systems. Moreover, we have proposed Bayesian models to adapt similarity models across domains. We have also proposed a novel neural network based word representation mapping approach which allows us to map the vector-based representation of a word found in one model to another model where the word representation is missing, effectively pooling together the vocabularies and corresponding representations across models. Our experiments show that model coverage increased by a few to several times, depending on which model's vocabulary is taken as the reference. Also, the transformed representations were well correlated with the native target-model vectors, showing that the mapped representations can be used with confidence to substitute for missing word representations in the target model. Furthermore, we have proposed methods to improve the assessment of open-ended answers in dialogue-based tutoring systems, which is very challenging because of the variation in student answers, which often are not self-contained and need contextual information (e.g., dialogue history) in order to be assessed correctly. To that end, we have proposed Probabilistic Soft Logic (PSL) models that augment semantic similarity information with other knowledge. To detect intra- and inter-sentential negation scope and focus in tutorial dialogues, we have developed Conditional Random Fields (CRF) models. The results indicate that our approach is very effective in detecting negation scope and focus in the tutorial dialogue context and can be further developed to augment natural language understanding systems. Additionally, we created resources (datasets, models, and tools) to foster research in semantic similarity and student response understanding in conversational tutoring systems
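The word-representation mapping described above can be illustrated with a minimal sketch. This is not the dissertation's neural mapping model; it assumes two hypothetical embedding dictionaries (`src_vecs`, `tgt_vecs`) and learns a simple least-squares linear map on their shared vocabulary, which is the most basic instance of the idea.

```python
# Minimal sketch of cross-model word-vector mapping (illustrative only):
# learn a linear map from a source embedding space to a target space on the
# shared vocabulary, then use it to fill in words missing from the target.
# `src_vecs` and `tgt_vecs` are hypothetical dicts {word: np.ndarray}.
import numpy as np

def learn_mapping(src_vecs, tgt_vecs):
    shared = [w for w in src_vecs if w in tgt_vecs]
    X = np.vstack([src_vecs[w] for w in shared])   # (n, d_src)
    Y = np.vstack([tgt_vecs[w] for w in shared])   # (n, d_tgt)
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)      # least-squares linear map
    return W

def map_word(word, src_vecs, W):
    return src_vecs[word] @ W                      # projected into target space

# Usage: vectors for words absent from the target model can be substituted
# with their mapped source-space representations.
```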
TOWARDS BUILDING INTELLIGENT COLLABORATIVE PROBLEM SOLVING SYSTEMS
Historically, Collaborative Problem Solving (CPS) systems have focused more on Human Computer Interaction (HCI) issues, such as providing a good communication experience among participants, whereas Intelligent Tutoring Systems (ITS) focus both on HCI issues and on leveraging Artificial Intelligence (AI) techniques in their intelligent agents. This dissertation seeks to narrow the gap between CPS systems and ITS by adopting methods used in ITS research. To move towards this goal, we focus on analyzing interactions with textual inputs in online learning systems such as DeepTutor and Virtual Internships (VI) to understand their semantics and underlying intents. To address the problem of assessing student-generated short text, this research first explores data-driven machine learning models coupled with expert-generated as well as general text analysis features. Second, it explores a method that utilizes knowledge graph embeddings for assessing student answers in ITS. Finally, it explores a method using only standard reference examples generated by a human teacher; such a method is useful when a new system has been deployed and no student data are yet available. To handle negation in tutorial dialogue, this research explored a Long Short Term Memory (LSTM) based method. The advantage of this method is that it requires no human-engineered features and performs comparably well with other models that use human-engineered features. Another important analysis in this research is identifying speech acts in the conversational utterances of multiple players in VI. Among various models, a neural network model trained with noisy labels performed best at categorizing the speech acts of the utterances. The learners' professional skill development in VI is characterized by the distribution of SKIVE elements, the components of epistemic frames. Inferring the population distribution of these elements could help to assess the learners' skill development. This research used a Markov method to infer the population distribution of SKIVE elements, namely the stationary distribution of the elements. While studying various aspects of interactions in our targeted learning systems, we motivate our research toward replacing the human mentor or tutor with an intelligent agent. Introducing an intelligent agent in place of a human helps to reduce cost as well as scale up the system
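The stationary-distribution idea mentioned for SKIVE elements can be made concrete with a small worked example. The transition matrix below is a toy, hand-set placeholder rather than data from the dissertation; the sketch only shows how a stationary distribution over elements can be computed by power iteration.

```python
# Sketch: stationary distribution of a Markov chain over (hypothetical) SKIVE
# element transitions, found by power iteration on the transition matrix P,
# where P[i, j] = Pr(next element = j | current element = i).
import numpy as np

def stationary_distribution(P, tol=1e-10, max_iter=10_000):
    n = P.shape[0]
    pi = np.full(n, 1.0 / n)          # start from the uniform distribution
    for _ in range(max_iter):
        nxt = pi @ P                  # one step of the chain
        if np.abs(nxt - pi).sum() < tol:
            break
        pi = nxt
    return pi / pi.sum()

# Toy 3-element chain (rows sum to 1):
P = np.array([[0.7, 0.2, 0.1],
              [0.3, 0.4, 0.3],
              [0.2, 0.3, 0.5]])
print(stationary_distribution(P))     # long-run proportion of each element
```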
Deeper Understanding of Tutorial Dialogues and Student Assessment
Bloom (1984) reported a two standard deviation improvement with human tutoring, which inspired many researchers to develop Intelligent Tutoring Systems (ITSs) that are as effective as human tutoring. However, recent studies suggest that the 2-sigma result was misleading and that current ITSs are already as good as human tutors. Nevertheless, we can treat two standard deviations as the benchmark for the tutoring effectiveness of ideal expert tutors; for ITSs, there remains the possibility of becoming even better than humans. One way to improve ITSs is to identify, understand, and then successfully implement effective tutorial strategies that lead to learning gains. Another step towards improving the effectiveness of ITSs is accurate assessment of student responses. However, evaluating student answers in tutorial dialogues is challenging. Student answers often refer to entities in the previous dialogue turns and the problem description, and should therefore be evaluated taking the dialogue context into account. Moreover, the system should explain which parts of a student answer are correct and which are incorrect. Such explanation capability allows ITSs to provide targeted feedback that helps students reflect upon and correct their knowledge deficits. Furthermore, targeted feedback increases learners' engagement, enabling them to persist in solving the instructional task at hand on their own. In this dissertation, we describe our approach to discovering and understanding effective tutorial strategies employed by effective human tutors while interacting with learners. We also present various approaches to automatically assess students' contributions using general methods that we developed for semantic analysis of short texts. We explain our work using generic semantic similarity approaches to evaluate the semantic similarity between individual learner contributions and ideal answers provided by experts for target instructional tasks. We also describe our method to assess student performance based on the tutorial dialogue context, accounting for linguistic phenomena such as ellipsis and pronouns. We then propose an approach to provide an explanatory capability for assessing student responses. Finally, we recommend a novel method based on concept maps for jointly evaluating and interpreting the correctness of student responses
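As a rough illustration of context-dependent answer assessment, the sketch below prepends recent dialogue turns to the student answer before scoring it against an ideal answer. TF-IDF cosine similarity and the 0.5 threshold are stand-ins chosen for brevity, not the dissertation's actual semantic similarity models.

```python
# Sketch: context-augmented similarity between a student answer and an ideal
# answer, using TF-IDF cosine similarity as a simple stand-in for the semantic
# similarity models described in the abstract.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def assess(student_answer, ideal_answer, dialogue_history=(), threshold=0.5):
    # Prepend recent turns so elliptical answers ("because it accelerates")
    # are interpreted against the context they refer to.
    contextualized = " ".join(list(dialogue_history) + [student_answer])
    vec = TfidfVectorizer().fit([contextualized, ideal_answer])
    sim = cosine_similarity(vec.transform([contextualized]),
                            vec.transform([ideal_answer]))[0, 0]
    return sim, sim >= threshold

score, correct = assess("because it accelerates",
                        "the object accelerates because a net force acts on it",
                        dialogue_history=["Tutor: why does the object speed up?"])
```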
Advancement Auto-Assessment of Students Knowledge States from Natural Language Input
Knowledge assessment is a key element in adaptive instructional systems and in particular in Intelligent Tutoring Systems, because fully adaptive tutoring presupposes accurate assessment. However, this is a challenging research problem, as numerous factors affect the estimation of students' knowledge states, such as the difficulty level of the problem and the time spent solving it. In this research work, we tackle the problem from three perspectives: assessing the prior knowledge of students, assessing students' natural language short and long responses, and knowledge tracing. Prior knowledge assessment is an important component of knowledge assessment as it facilitates adaptation of the instruction from the very beginning, i.e., when the student starts interacting with the (computer) tutor. Grouping students into groups with similar mental models and patterns of prior knowledge allows the system to select the right level of scaffolding for each group of students. While it does not adapt instruction to each individual learner, adapting to groups of students based on a limited number of prior knowledge levels has the advantage of decreasing the authoring costs of the tutoring system. To achieve this goal of identifying or clustering students based on their prior knowledge, we have employed effective clustering algorithms. Automatically assessing open-ended student responses is another challenging aspect of knowledge assessment in ITSs. In dialogue-based ITSs, the main interaction between the learner and the system is natural language dialogue, in which students freely respond to various system prompts or initiate dialogue moves in mixed-initiative dialogue systems. Assessing freely generated student responses in such contexts is challenging, as students can express the same idea in different ways owing to different individual style preferences and varied individual cognitive abilities. To address this challenging task, we have proposed several novel deep learning models, as they are capable of capturing rich high-level semantic features of text. Knowledge tracing (KT) is an important type of knowledge assessment which consists of tracking students' mastery of knowledge over time and predicting their future performance. Despite the state-of-the-art results of deep learning in this task, it has many limitations. For instance, most of the proposed methods ignore pertinent information (e.g., prior knowledge) that can enhance knowledge tracing capability and performance. Working toward this objective, we have proposed a generic deep learning framework that accounts for the engagement level of students, the difficulty of questions, and the semantics of the questions, and uses a novel time series model called the Temporal Convolutional Network for future performance prediction. The advanced auto-assessment methods presented in this dissertation should enable better estimation of learners' knowledge states and, in turn, better adaptive scaffolding from those systems, which should lead to more effective tutoring and better learning gains for students. Furthermore, the proposed methods should enable more scalable development and deployment of ITSs across topics and domains for the benefit of all learners of all ages and backgrounds
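A minimal sketch of the prior-knowledge grouping step is shown below. The pretest matrix, the choice of k-means, and the number of clusters are illustrative assumptions; the dissertation's clustering algorithms and features may differ.

```python
# Sketch: grouping students by prior knowledge with k-means, assuming a
# hypothetical matrix `pretest` of per-item scores (rows = students).
import numpy as np
from sklearn.cluster import KMeans

pretest = np.array([[1, 0, 0, 1],    # toy pretest data: 1 = correct, 0 = incorrect
                    [1, 1, 1, 1],
                    [0, 0, 0, 1],
                    [1, 1, 0, 1],
                    [0, 0, 1, 0]])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(pretest)
groups = kmeans.labels_              # prior-knowledge group per student
# Each group can then receive a scaffolding level authored once per group,
# rather than per individual learner.
```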
Automatic Short Answer Grading Using Transformers
Assessment of short natural language answers is a prevailing trend in any educational environment. It helps teachers to better understand the successes and failures of their students. Other types of questions, such as multiple-choice or fill-in-the-gap questions, do not provide adequate clues for exhaustively evaluating students' proficiency. However, they are common means of student evaluation, especially in Massive Open Online Course (MOOC) environments, largely because they are fairly easy to grade automatically. In contrast, understanding and manually marking short answers is a more challenging and time-consuming task, especially as the number of students in a class grows. Automatic Short Answer Grading, usually abbreviated to ASAG, is a highly demanded solution in this context. In this thesis, we concentrate on classification-based ASAG with nominal grades such as correct or not correct. We propose a reference-based approach built on a deep learning model and evaluate it on four state-of-the-art ASAG datasets, namely SemEval-2013 (SciEntBank and BEETLE), DT-Grade and a Biology dataset. Our approach is based on BERT (cased and uncased) and XLNET (cased) models. Our secondary analysis examines how GLUE (General Language Understanding Evaluation) tasks such as question answering, entailment, paraphrase identification and semantic textual similarity analysis strengthen the ASAG task on the SciEntBank dataset. We show that language models based on transformers such as BERT and XLNET outperform or equal the state-of-the-art feature-based approaches.
We further show that the performance of our BERT model increases substantially when we first fine-tune it on an entailment task such as the GLUE MNLI dataset and then on the ASAG task, compared to models fine-tuned on the other GLUE tasks
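A minimal sketch of reference-based ASAG as sentence-pair classification with Hugging Face transformers is given below. The checkpoint, the two-label scheme, and the example texts are illustrative assumptions rather than the thesis's exact configuration, and the model must be fine-tuned on an ASAG dataset before its scores mean anything.

```python
# Sketch: reference-based ASAG as sentence-pair classification with a BERT
# encoder (Hugging Face transformers). The checkpoint and 2-way label set are
# illustrative; fine-tuning on an ASAG dataset is required before the scores
# are meaningful.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)          # 0 = incorrect, 1 = correct

reference = "Light travels faster than sound."
student = "Sound is slower than light."
inputs = tokenizer(reference, student, truncation=True, return_tensors="pt")

with torch.no_grad():
    logits = model(**inputs).logits
probs = torch.softmax(logits, dim=-1)           # [P(incorrect), P(correct)]
```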
Comprehension based adaptive learning systems
Conversational Intelligent Tutoring Systems aim to mimic the adaptive behaviour
of human tutors by delivering tutorial content as part of a dynamic
exchange of information conducted using natural language.
Deciding when it is beneficial to intervene in a student’s learning process is
an important skill for tutoring. Human tutors use prior knowledge about the
student, discourse content and learner non-verbal behaviour to choose when
intervention will help learners overcome an impasse. Experienced human tutors
adapt discourse and pedagogy based on recognition of comprehension and
non-comprehension indicative learner behaviour.
In this research, non-verbal behaviour is explored as a method of computationally
analysing reading comprehension so as to equip an intelligent
conversational agent with the human-like ability to estimate comprehension
from non-verbal behaviour as a decision making trigger for feedback, prompts
or hints.
This thesis presents research that combines a conversational intelligent
tutoring system (CITS) with near real-time comprehension classification based
on modelling of e-learner non-verbal behaviour to estimate learner comprehension
during on-screen conversational tutoring and to use comprehension
classifications as a trigger for intervening with hints, prompts or feedback for
the learner.
To improve the effectiveness of tuition in e-learning, this research aims to
design, develop and demonstrate novel computational methods for modelling
e-learner comprehension of on-screen information in near real-time and for adapting CITS tutorial discourse and pedagogy in response to perception of
comprehension indicative behaviour. The contribution of this research is to
detail the motivation for, design of, and evaluation of a system which has the
human-like ability to introduce micro-adaptive feedback into tutorial discourse
in response to automatic perception of e-learner reading comprehension.
This research evaluates empirically whether e-learner non-verbal behaviour
can be modelled to classify comprehension in near real-time and presents a
near real-time comprehension classification system which achieves normalised
comprehension classification accuracy of 75%. Understanding e-learner comprehension
creates exciting opportunities for advanced personalisation of materials,
discourse, challenge and the digital environment itself. The research suggests
a benefit is gained from comprehension based adaptation in conversational
intelligent tutoring systems, with a controlled trial of a comprehension based
adaptive CITS called Hendrix 2.0 showing increases in tutorial assessment scores
of up to 17% when comprehension based discourse adaptation is deployed to
scaffold the learning experience
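The comprehension-triggered intervention loop described above can be sketched as a simple decision rule. The classifier interface, the threshold, and the hint text below are placeholders, not Hendrix 2.0's actual implementation.

```python
# Sketch: using a comprehension classifier's output as a trigger for
# micro-adaptive feedback during tutorial discourse.
def maybe_intervene(behaviour_features, classifier, threshold=0.5):
    # Probability that the learner comprehends the on-screen material,
    # estimated from non-verbal behaviour features.
    p_comprehension = classifier.predict_proba([behaviour_features])[0][1]
    if p_comprehension < threshold:
        return "Hint: let's look at that last explanation again."
    return None  # comprehension likely; let the dialogue continue
```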
Short Answer Assessment in Context: The Role of Information Structure
Short Answer Assessment (SAA), the computational task of judging the appro-
priateness of an answer to a question, has received much attention in recent
years (cf., e.g., Dzikovska et al. 2013; Burrows et al. 2015). Most researchers
have approached the problem as one similar to paraphrase recognition (cf.,
e.g., Brockett & Dolan 2005) or textual entailment (Dagan et al., 2006), where
the answer to be evaluated is aligned to another available utterance, such as a
target answer, in a sufficiently abstract way to capture form variation. While
this is a reasonable strategy, it fails to take the explicit context of an answer
into account: the question.
In this thesis, we present an attempt to change this situation by investigating
the role of Information Structure (IS, cf., e.g., Krifka 2007) in SAA. The basic
assumption adapted from IS here will be that the content of a linguistic ex-
pression is structured in a non-arbitrary way depending on its context (here:
the question), and thus it is possible to predetermine to some extent which
part of the expression’s content is relevant. In particular, we will adopt the
Question Under Discussion (QUD) approach advanced by Roberts (2012) where
the information structure of an answer is determined by an explicit or implicit
question in the discourse.
We proceed by first introducing the reader to the necessary prerequisites
in chapters 2 and 3. Since this is a computational linguistics thesis which
is inspired by theoretical linguistic research, we will provide an overview of
relevant work in both areas, discussing SAA and Information Structure (IS) in
sufficient detail, as well as existing attempts at annotating Information Structure
in corpora. After providing the reader with enough background to understand
the remainder of the thesis, we launch into a discussion of which IS notions and
dimensions are most relevant to our goal. We compare the given/new distinction
(information status) to the focus/background distinction and conclude that the
latter is better suited to our needs, as it captures requested information, which
can be either given or new in the context.
In chapter 4, we introduce the empirical basis of this work, the Corpus of
Reading Comprehension Exercises in German (CREG, Ott, Ziai & Meurers
We outline how, as a task-based corpus, CREG is particularly suited to
the analysis of language in context, and how it thus forms the basis of our
efforts in SAA and focus detection. Complementing this empirical basis, we
present the SAA system CoMiC in chapter 5, which is used to integrate focus
into SAA in chapter 8.
Chapter 6 then delves into the creation of a gold standard for automatic
focus detection. We describe what the desiderata for such a gold standard are
and how a subset of the CREG corpus is chosen for manual focus annotation.
Having determined these prerequisites, we proceed in detail to our novel
annotation scheme for focus, and its intrinsic evaluation in terms of inter-
annotator agreement. We also discuss explorations of using crowd-sourcing for
focus annotation.
After establishing the data basis, we turn to the task of automatic focus
detection in short answers in chapter 7. We first define the computational
task as classifying whether a given word of an answer is focused or not. We
experiment with several groups of features and explain in detail the motivation
for each: syntax and lexis of the question and the answer, positional
features and givenness features, taking into account both question and answer
properties. Using the adjudicated gold standard we established in chapter 6, we
show that focus can be detected robustly using these features in a word-based
classifier in comparison to several baselines.
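As a rough illustration of the word-based classification setup just described, the sketch below trains a classifier on toy data using crude stand-ins for the lexical, positional, and givenness features; the thesis's actual feature set is richer and syntax-based.

```python
# Sketch: word-based focus detection as binary classification, with rough
# stand-ins for the lexical, positional and givenness features described above.
from sklearn.feature_extraction import DictVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

def word_features(question, answer, i):
    q_tokens = set(question.lower().split())
    tokens = answer.split()
    return {
        "word": tokens[i].lower(),
        "given": tokens[i].lower() in q_tokens,   # crude givenness feature
        "position": i / max(len(tokens) - 1, 1),  # relative position in answer
        "is_final": i == len(tokens) - 1,
    }

# Toy training data: (question, answer, per-word focus labels)
data = [("Wo wohnt Maria?", "Maria wohnt in Berlin", [0, 0, 1, 1])]
X = [word_features(q, a, i) for q, a, ys in data for i in range(len(ys))]
y = [lab for _, _, ys in data for lab in ys]

clf = make_pipeline(DictVectorizer(), LogisticRegression()).fit(X, y)
```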
In chapter 8, we describe the integration of focus information into SAA,
which is both an extrinsic testbed for focus annotation and detection per se and
the computational task we originally set out to advance. We show that there
are several possible ways of integrating focus information into an alignment-
based SAA system, and discuss each one’s advantages and disadvantages.
We also experiment with using focus vs. using givenness in alignment before
concluding that a combination of both yields superior overall performance.
Finally, chapter 9 presents a summary of our main research findings along
with the contributions of this thesis. We conclude that analyzing focus in
authentic data is not only possible but necessary for a) developing context-
aware SAA approaches and b) grounding and testing linguistic theory. We give
an outlook on where future research needs to go and what particular avenues
could be explored.
Accessing spoken interaction through dialogue processing [online]
Written language is one of our primary means for documenting our
lives, achievements, and environment. Our capabilities to
record, store and retrieve audio, still pictures, and video are
undergoing a revolution and may support, supplement or even
replace written documentation. This technology enables us to
record information that would otherwise be lost, lower the cost
of documentation and enhance high-quality documents with
original audiovisual material.
The indexing of the audio material is the key technology to
realize those benefits. This work presents effective
alternatives to keyword based indices which restrict the search
space and may in part be calculated with very limited resources.
Indexing speech documents can be done at various levels:
Stylistically a document belongs to a certain database which can
be determined automatically with high accuracy using very simple
features. The resulting factor in search space reduction is on
the order of 4 to 10, while topic classification yielded a factor
of 18 in a news domain.
Since documents can be very long, they need to be segmented into
topical regions. A new probabilistic segmentation framework as
well as new features (speaker initiative and style) prove to be
very effective compared to traditional keyword based methods. At
the topical segment level, activities (storytelling, discussing,
planning, ...) can be detected using a machine learning approach
with limited accuracy; however, even human annotators do not
annotate them very reliably. A maximum search space reduction
factor of 6 is theoretically possible on the databases used. A
topical classification of these regions has been attempted
on one database, the detection accuracy for that index, however,
was very low.
At the utterance level, dialogue acts such as statements,
questions, backchannels (aha, yeah, ...), etc. are recognized
using a novel discriminatively trained HMM procedure.
The procedure can be extended to recognize short sequences such
as question/answer pairs, so called dialogue games.
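As a worked illustration of decoding a dialogue-act sequence, the sketch below applies standard Viterbi decoding with toy, hand-set parameters. The thesis's model is a discriminatively trained HMM, which differs in how the parameters are estimated, not in the decoding step shown here.

```python
# Sketch: Viterbi decoding of a dialogue-act sequence under an HMM with toy,
# hand-set transition and start probabilities.
import numpy as np

acts = ["statement", "question", "backchannel"]
log_trans = np.log(np.array([[0.6, 0.3, 0.1],     # P(next act | current act)
                             [0.2, 0.2, 0.6],
                             [0.5, 0.3, 0.2]]))
log_start = np.log(np.array([0.6, 0.3, 0.1]))

def viterbi(log_emission):
    """log_emission[t, k]: log P(observation at turn t | act k)."""
    T, K = log_emission.shape
    delta = log_start + log_emission[0]
    back = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        scores = delta[:, None] + log_trans + log_emission[t]
        back[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0)
    path = [int(delta.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t][path[-1]]))
    return [acts[k] for k in reversed(path)]
```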
Dialogue acts and games are useful for building classifiers for
speaking style. Similarly, a user may remember a certain dialogue
act sequence and search for it in a graphical
representation.
In a study with very pessimistic assumptions, users were able to
pick one out of four similar and equiprobable meetings correctly
with an accuracy of ~43% using graphical activity information.
Dialogue acts may be useful in this situation as well, but the
sample size did not allow final conclusions to be drawn. However, the
user study failed to show any effect for detailed basic features
such as formality or speaker identity
Proceedings of the Eighth Italian Conference on Computational Linguistics CliC-it 2021
The eighth edition of the Italian Conference on Computational Linguistics (CLiC-it 2021) was held at Università degli Studi di Milano-Bicocca from 26th to 28th January 2022. After the 2020 edition, which was held fully virtually due to the health emergency related to Covid-19, CLiC-it 2021 was the first opportunity for the Italian Computational Linguistics research community to meet in person after more than one year of full or partial lockdown