352 research outputs found

    Measuring Semantic Textual Similarity and Automatic Answer Assessment in Dialogue Based Tutoring Systems

    This dissertation presents methods and resources proposed to improve the measurement of semantic textual similarity and their applications to student response understanding in dialogue-based Intelligent Tutoring Systems. In order to predict the extent of similarity between a given pair of sentences, we have proposed machine learning models using dozens of features, such as scores calculated using optimal multi-level alignment, vector-based compositional semantics, and machine translation evaluation methods. Furthermore, we have proposed models towards adding an interpretation layer on top of similarity measurement systems. Our models for predicting and interpreting semantic similarity have been the top-performing systems in SemEval (a premier venue for semantic evaluation) for the last three years. The correlations between our models' predictions and human judgments were above 0.80 for several datasets, and our models were more robust than many other top-performing systems. Moreover, we have proposed Bayesian models to adapt similarity models across domains. We have also proposed a novel neural-network-based word representation mapping approach which allows us to map the vector-based representation of a word found in one model to another model where the word representation is missing, effectively pooling together the vocabularies and corresponding representations across models. Our experiments show that model coverage increased by a few to several times, depending on which model's vocabulary is taken as a reference. Also, the transformed representations were well correlated with the native target model vectors, showing that the mapped representations can be used with confidence to substitute for the missing word representations in the target model.
Furthermore, we have proposed methods to improve open-ended answer assessment in dialogue-based tutoring systems, which is very challenging because of the variations in student answers, which often are not self-contained and need contextual information (e.g., dialogue history) in order to better assess their correctness. To that end, we have proposed Probabilistic Soft Logic (PSL) models augmenting semantic similarity information with other knowledge. To detect intra- and inter-sentential negation scope and focus in tutorial dialogues, we have developed Conditional Random Fields (CRF) models. The results indicate that our approach is very effective in detecting negation scope and focus in tutorial dialogue context and can be further developed to augment natural language understanding systems. Additionally, we created resources (datasets, models, and tools) for fostering research in semantic similarity and student response understanding in conversational tutoring systems.

    TOWARDS BUILDING INTELLIGENT COLLABORATIVE PROBLEM SOLVING SYSTEMS

    Historically, Collaborative Problem Solving (CPS) systems have focused on Human-Computer Interaction (HCI) issues, such as providing a good communication experience among participants, whereas Intelligent Tutoring Systems (ITS) focus both on HCI issues and on leveraging Artificial Intelligence (AI) techniques in their intelligent agents. This dissertation seeks to narrow the gap between CPS systems and ITS by adopting methods used in ITS research. To move towards this goal, we focus on analyzing interactions with textual inputs in online learning systems such as DeepTutor and Virtual Internships (VI) to understand their semantics and underlying intents. To address the problem of assessing student-generated short text, this research first explores data-driven machine learning models coupled with expert-generated as well as general text analysis features. Second, it explores a method that uses knowledge graph embeddings to assess student answers in ITS. Finally, it explores a method that uses only standard reference examples generated by a human teacher. Such a method is useful when a new system has been deployed and no student data are available yet. To handle negation in tutorial dialogue, this research explored a Long Short-Term Memory (LSTM) based method. The advantage of this method is that it requires no human-engineered features and performs comparably to other models that use human-engineered features. Another important analysis in this research is identifying speech acts in the conversation utterances of multiple players in VI. Among various models, a neural network model trained on noisy labels performed better than the others in categorizing the speech acts of the utterances. The learners' professional skill development in VI is characterized by the distribution of SKIVE elements, the components of epistemic frames. Inferring the population distribution of these elements could help to assess the learners' skill development.
This research sought a Markov method to infer the population distribution of SKIVE elements, namely the stationary distribution of the elements. While studying various aspects of interactions in our targeted learning systems, we are motivated by the goal of replacing the human mentor or tutor with an intelligent agent. Introducing an intelligent agent in place of a human helps to reduce cost as well as scale up the system.
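As a rough illustration of the stationary distribution mentioned in the abstract above, the following sketch computes it for a small Markov chain by power iteration. The abstract does not give the dissertation's actual procedure; the three-state transition matrix and the function name `stationary_distribution` are hypothetical and chosen only for illustration.

```python
def stationary_distribution(P, tol=1e-12, max_iter=10_000):
    """Approximate pi such that pi = pi P by repeated multiplication."""
    n = len(P)
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(max_iter):
        # One step of the chain: new_pi[j] = sum_i pi[i] * P[i][j]
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi


# Hypothetical transition matrix over three states (each row sums to 1).
# These numbers are NOT taken from the dissertation.
P = [
    [0.5, 0.3, 0.2],
    [0.2, 0.6, 0.2],
    [0.1, 0.3, 0.6],
]

pi = stationary_distribution(P)
print([round(p, 4) for p in pi])
```

The returned vector sums to one and is left unchanged by a further step of the chain, which is the defining property of a stationary distribution.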

    Deeper Understanding of Tutorial Dialogues and Student Assessment

    Bloom (1984) reported a two standard deviation improvement with human tutoring, which inspired many researchers to develop Intelligent Tutoring Systems (ITSs) that are as effective as human tutoring. However, recent studies suggest that the 2-sigma result was misleading and that current ITSs are as good as human tutors. Nevertheless, we can think of 2 standard deviations as the benchmark for the tutoring effectiveness of ideal expert tutors. In the case of ITSs, there is still the possibility that ITSs could be better than humans. One way to improve ITSs would be identifying, understanding, and then successfully implementing effective tutorial strategies that lead to learning gains. Another step towards improving the effectiveness of ITSs is an accurate assessment of student responses. However, evaluating student answers in tutorial dialogues is challenging. Student answers often refer to entities in previous dialogue turns and in the problem description. Therefore, student answers should be evaluated by taking dialogue context into account. Moreover, the system should explain which parts of the student answer are correct and which are incorrect. Such an explanation capability allows ITSs to provide targeted feedback that helps students reflect upon and correct their knowledge deficits. Furthermore, targeted feedback increases learners' engagement, enabling them to persist in solving the instructional task at hand on their own. In this dissertation, we describe our approach to discovering and understanding effective tutorial strategies employed by effective human tutors while interacting with learners. We also present various approaches to automatically assessing students' contributions using general methods that we developed for semantic analysis of short texts.
We explain our work using generic semantic similarity approaches to evaluate the semantic similarity between individual learner contributions and ideal answers provided by experts for target instructional tasks. We also describe our method for assessing student performance based on tutorial dialogue context, accounting for linguistic phenomena such as ellipsis and pronouns. We then propose an approach that provides an explanatory capability for assessing student responses. Finally, we recommend a novel method based on concept maps for jointly evaluating and interpreting the correctness of student responses.

    Advancement Auto-Assessment of Students Knowledge States from Natural Language Input

    Knowledge assessment is a key element in adaptive instructional systems, and in particular in Intelligent Tutoring Systems (ITSs), because fully adaptive tutoring presupposes accurate assessment. However, this is a challenging research problem, as numerous factors affect the estimation of students' knowledge states, such as the difficulty level of the problem and the time spent solving it. In this research work, we tackle this problem from three perspectives: assessing the prior knowledge of students, assessing students' short and long natural language responses, and knowledge tracing. Prior knowledge assessment is an important component of knowledge assessment, as it facilitates the adaptation of instruction from the very beginning, i.e., when the student starts interacting with the (computer) tutor. Grouping students into groups with similar mental models and patterns of prior knowledge allows the system to select the right level of scaffolding for each group of students. While this does not adapt instruction to each individual learner, adapting to groups of students based on a limited number of prior knowledge levels has the advantage of decreasing the authoring costs of the tutoring system. To achieve this goal of identifying or clustering students based on their prior knowledge, we have employed effective clustering algorithms. Automatically assessing open-ended student responses is another challenging aspect of knowledge assessment in ITSs. In dialogue-based ITSs, the main interaction between the learner and the system is natural language dialogue in which students freely respond to various system prompts or initiate dialogue moves in mixed-initiative dialogue systems. Assessing freely generated student responses in such contexts is challenging, as students can express the same idea in different ways owing to different individual style preferences and varied individual cognitive abilities.
To address this challenging task, we have proposed several novel deep learning models, as they are capable of capturing rich high-level semantic features of text. Knowledge tracing (KT) is an important type of knowledge assessment which consists of tracking students' mastery of knowledge over time and predicting their future performance. Despite the state-of-the-art results of deep learning in this task, it has many limitations. For instance, most of the proposed methods ignore pertinent information (e.g., prior knowledge) that can enhance knowledge tracing capability and performance. Working toward this objective, we have proposed a generic deep learning framework that accounts for the engagement level of students, the difficulty of questions and the semantics of the questions, and uses a novel time series model called the Temporal Convolutional Network for future performance prediction. The advanced auto-assessment methods presented in this dissertation should enable better ways to estimate learners' knowledge states and, in turn, better adaptive scaffolding by those systems, which should lead to more effective tutoring and better learning gains for students. Furthermore, the proposed methods should enable more scalable development and deployment of ITSs across topics and domains for the benefit of learners of all ages and backgrounds.

    Automatic Short Answer Grading Using Transformers

    Assessment of short natural language answers is a prevailing trend in any educational environment. It helps teachers better understand the successes and failures of their students. Other types of questions, such as multiple-choice or fill-in-the-gap questions, do not provide adequate clues for exhaustively evaluating students' proficiency. However, they are common means of student evaluation, especially in Massive Open Online Course (MOOC) environments, largely because they are fairly easy to grade by computer. By contrast, understanding and manually marking short answers is more challenging and time-consuming, especially as the number of students in a class grows. Automatic Short Answer Grading, usually abbreviated to ASAG, is a solution well suited to this problem. In this thesis, we mainly concentrate on classification-based ASAG with nominal grades such as correct or incorrect. We propose a reference-based approach built on a deep learning model, which we train on four state-of-the-art ASAG datasets, namely SemEval-2013 (SciEntBank and BEETLE), Dt-grade and a Biology dataset. Our approach is based on BERT Base (cased and uncased) and XLNET Base (cased) models.
Our secondary analysis examines how GLUE (General Language Understanding Evaluation) tasks such as question answering, entailment, paraphrase identification and semantic textual similarity (STS) analysis strengthen the ASAG task, especially on the SciEntBank dataset. We show that transformer-based language models such as BERT and XLNET outperform or equal the state-of-the-art feature-based approaches. We further show that the performance of our BERT model increases substantially when we first fine-tune it on an entailment task such as the GLUE MNLI dataset and then on the ASAG task, compared to models fine-tuned on the other GLUE tasks.

    Comprehension based adaptive learning systems

    Conversational Intelligent Tutoring Systems aim to mimic the adaptive behaviour of human tutors by delivering tutorial content as part of a dynamic exchange of information conducted using natural language. Deciding when it is beneficial to intervene in a student’s learning process is an important skill for tutoring. Human tutors use prior knowledge about the student, discourse content and learner non-verbal behaviour to choose when intervention will help learners overcome impasse. Experienced human tutors adapt discourse and pedagogy based on recognition of comprehension and non-comprehension indicative learner behaviour. In this research non-verbal behaviour is explored as a method of computationally analysing reading comprehension so as to equip an intelligent conversational agent with the human-like ability to estimate comprehension from non-verbal behaviour as a decision making trigger for feedback, prompts or hints. This thesis presents research that combines a conversational intelligent tutoring system (CITS) with near real-time comprehension classification based on modelling of e-learner non-verbal behaviour to estimate learner comprehension during on-screen conversational tutoring and to use comprehension classifications as a trigger for intervening with hints, prompts or feedback for the learner. To improve the effectiveness of tuition in e-learning, this research aims to design, develop and demonstrate novel computational methods for modelling e-learner comprehension of on-screen information in near real-time and for adapting CITS tutorial discourse and pedagogy in response to perception of comprehension indicative behaviour. The contribution of this research is to detail the motivation for, design of, and evaluation of a system which has the human-like ability to introduce micro-adaptive feedback into tutorial discourse in response to automatic perception of e-learner reading comprehension. 
This research evaluates empirically whether e-learner non-verbal behaviour can be modelled to classify comprehension in near real-time and presents a near real-time comprehension classification system which achieves normalised comprehension classification accuracy of 75%. Understanding e-learner comprehension creates exciting opportunities for advanced personalisation of materials, discourse, challenge and the digital environment itself. The research suggests a benefit is gained from comprehension based adaptation in conversational intelligent tutoring systems, with a controlled trial of a comprehension based adaptive CITS called Hendrix 2.0 showing increases in tutorial assessment scores of up to 17% when comprehension based discourse adaptation is deployed to scaffold the learning experience

    Short Answer Assessment in Context: The Role of Information Structure

    Short Answer Assessment (SAA), the computational task of judging the appropriateness of an answer to a question, has received much attention in recent years (cf., e.g., Dzikovska et al. 2013; Burrows et al. 2015). Most researchers have approached the problem as one similar to paraphrase recognition (cf., e.g., Brockett & Dolan 2005) or textual entailment (Dagan et al., 2006), where the answer to be evaluated is aligned to another available utterance, such as a target answer, in a sufficiently abstract way to capture form variation. While this is a reasonable strategy, it fails to take the explicit context of an answer into account: the question. In this thesis, we present an attempt to change this situation by investigating the role of Information Structure (IS, cf., e.g., Krifka 2007) in SAA. The basic assumption adapted from IS is that the content of a linguistic expression is structured in a non-arbitrary way depending on its context (here: the question), and thus it is possible to predetermine to some extent which part of the expression's content is relevant. In particular, we adopt the Question Under Discussion (QUD) approach advanced by Roberts (2012), where the information structure of an answer is determined by an explicit or implicit question in the discourse.
We proceed by first introducing the reader to the necessary prerequisites in chapters 2 and 3. Since this is a computational linguistics thesis inspired by theoretical linguistic research, we provide an overview of relevant work in both areas, discussing SAA and IS in sufficient detail, as well as existing attempts at annotating Information Structure in corpora. We then discuss which IS notions and dimensions are most relevant to our goal. We compare the given/new distinction (information status) to the focus/background distinction and conclude that the latter is better suited to our needs, as it captures requested information, which can be either given or new in the context.
In chapter 4, we introduce the empirical basis of this work, the Corpus of Reading Comprehension Exercises in German (CREG, Ott, Ziai & Meurers 2012). We outline how, as a task-based corpus, CREG is particularly suited to the analysis of language in context, and how it thus forms the basis of our efforts in SAA and focus detection. Complementing this empirical basis, we present the SAA system CoMiC (Meurers, Ziai, Ott & Kopp, 2011b) in chapter 5, which is used to integrate focus into SAA in chapter 8.
Chapter 6 then delves into the creation of a gold standard for automatic focus detection. We describe the desiderata for such a gold standard and how a subset of the CREG corpus is chosen for manual focus annotation. We then present in detail our novel annotation scheme for focus and its successful intrinsic evaluation in terms of inter-annotator agreement. In addition to expert annotation, we discuss an exploration of crowd-sourcing for focus annotation.
After establishing the data basis, we turn to the task of automatic focus detection in short answers in chapter 7. Following an overview of previous work, we define the computational task as classifying whether a given word of an answer is focused or not. We experiment with several groups of features and explain in detail the motivation for each: syntax and lexis of the question and the answer, positional features and givenness features, taking into account both question and answer properties. Using the adjudicated gold standard established in chapter 6, we show that a word-based classifier using these features detects focus robustly and clearly outperforms several baselines.
In chapter 8, we describe the integration of focus information into SAA, which is both an extrinsic testbed for focus annotation and detection and the computational task we originally set out to advance. We show that there are several possible ways of integrating focus information into an alignment-based SAA system, and discuss the advantages and disadvantages of each. Focus is integrated into CoMiC as a filter on the alignment of learner and target answers, and this configuration is used to examine the influence of manual and automatic focus annotation, with positive results. We also experiment with using focus vs. givenness in alignment and conclude that, given reliable focus information, a combination of both yields better overall performance than either category in isolation. Finally, chapter 9 presents a summary of our main research findings along with the contributions of this thesis. We conclude that analyzing focus in authentic data is not only possible but necessary for a) developing context-aware SAA approaches and b) grounding and testing linguistic theory. We give an outlook on where future research needs to go and what particular avenues could be explored.

    Accessing spoken interaction through dialogue processing [online]

    Written language is one of our primary means for documenting our lives, achievements, and environment. Our capabilities to record, store and retrieve audio, still pictures, and video are undergoing a revolution and may support, supplement or even replace written documentation of human communication such as meetings. This technology enables us to record information that would otherwise be lost, lower the cost of documentation and enhance high-quality documents with original audiovisual material. The indexing of the audio material is the key technology for realizing those benefits. This work presents effective alternatives to keyword-based indices which restrict the search space and may in part be calculated with very limited resources.
Indexing speech documents can be done at various levels. Stylistically, a document belongs to a certain database, which can be determined automatically with high accuracy using very simple features. The resulting search space reduction factor is on the order of 4 to 10, while topic classification yielded a factor of 18 in a news domain. Since documents can be very long, they need to be segmented into topical regions. A new probabilistic segmentation framework as well as new features (speaker initiative and style) yield results comparable to or better than traditional keyword-based methods. At the topical segment level, activities (storytelling, discussing, planning, ...) can be detected using a neural network, though with limited accuracy; even human annotators do not annotate them very reliably. A maximum search space reduction factor of 6 is theoretically possible on the databases used. A topical classification of these regions was attempted on one database; the detection accuracy for that index, however, was very low.
At the utterance level, dialogue acts such as statements, questions, backchannels (aha, yeah, ...), etc. are recognized using a novel discriminatively trained HMM procedure. The procedure can be extended to recognize short sequences such as question/answer pairs, so-called dialogue games. Dialogue acts and games are useful for building classifiers for speaking style. Similarly, a user may remember a certain dialogue act sequence and search for it in a graphical representation. In a study with very pessimistic assumptions, users were able to pick one out of four similar and equiprobable meetings correctly with an accuracy of ~43% using graphical activity information. Dialogue acts may be useful in this situation as well, but the sample size did not allow final conclusions to be drawn.
However, the user study failed to show any effect for detailed basic features such as formality or speaker identity.

    Proceedings of the Eighth Italian Conference on Computational Linguistics CliC-it 2021

    The eighth edition of the Italian Conference on Computational Linguistics (CLiC-it 2021) was held at Università degli Studi di Milano-Bicocca from 26th to 28th January 2022. After the edition of 2020, which was held in fully virtual mode due to the health emergency related to Covid-19, CLiC-it 2021 represented the first moment for the Italian research community of Computational Linguistics to meet in person after more than one year of full/partial lockdown.