653 research outputs found

    Script acquisition : a crowdsourcing and text mining approach

    Get PDF
    According to Grice’s (1975) theory of pragmatics, people tend to omit basic information when participating in a conversation (or writing a narrative) under the assumption that left out details are already known or can be inferred from commonsense knowledge by the hearer (or reader). Writing and understanding of texts makes particular use of a specific kind of common-sense knowledge, referred to as script knowledge. Schank and Abelson (1977) proposed Scripts as a model of human knowledge represented in memory that stores the frequent habitual activities, called scenarios, (e.g. eating in a fast food restaurant, etc.), and the different courses of action in those routines. This thesis addresses measures to provide a sound empirical basis for high-quality script models. We work on three key areas related to script modeling: script knowledge acquisition, script induction and script identification in text. We extend the existing repository of script knowledge bases in two different ways. First, we crowdsource a corpus of 40 scenarios with 100 event sequence descriptions (ESDs) each, thus going beyond the size of previous script collections. Second, the corpus is enriched with partial alignments of ESDs, done by human annotators. The crowdsourced partial alignments are used as prior knowledge to guide the semi-supervised script-induction algorithm proposed in this dissertation. We further present a semi-supervised clustering approach to induce script structure from crowdsourced descriptions of event sequences by grouping event descriptions into paraphrase sets and inducing their temporal order. The proposed semi-supervised clustering model better handles order variation in scripts and extends script representation formalism, Temporal Script graphs, by incorporating "arbitrary order" equivalence classes in order to allow for the flexible event order inherent in scripts. In the third part of this dissertation, we introduce the task of scenario detection, in which we identify references to scripts in narrative texts. We curate a benchmark dataset of annotated narrative texts, with segments labeled according to the scripts they instantiate. The dataset is the first of its kind. The analysis of the annotation shows that one can identify scenario references in text with reasonable reliability. Subsequently, we proposes a benchmark model that automatically segments and identifies text fragments referring to given scenarios. The proposed model achieved promising results, and therefore opens up research on script parsing and wide coverage script acquisition.Gemäß der Grice’schen (1975) Pragmatiktheorie neigen Menschen dazu, grundlegende Informationen auszulassen, wenn sie an einem Gespräch teilnehmen (oder eine Geschichte schreiben). Dies geschieht unter der Annahme, dass die ausgelassenen Details bereits bekannt sind, oder vom Hörer (oder Leser) aus Weltwissen erschlossen werden können. Besonders beim Schreiben und Verstehen von Text wird Verwendung einer spezifischen Art von solchem Weltwissen gemacht, welches auch Skriptwissen genannt wird. Schank und Abelson (1977) erdachten Skripte als ein Modell menschlichen Wissens, welches im menschlichen Gedächtnis gespeichert ist und häufige Alltags-Aktivitäten sowie deren typischen Ablauf beinhaltet. Solche Skript-Aktivitäten werden auch als Szenarios bezeichnet und umfassen zum Beispiel Im Restaurant Essen etc. Diese Dissertation widmet sich der Bereitstellung einer soliden empirischen Grundlage zur Akquisition qualitativ hochwertigen Skriptwissens. Wir betrachten drei zentrale Aspekte im Bereich der Skriptmodellierung: Akquisition ition von Skriptwissen, Skript-Induktion und Skriptidentifizierung in Text. Wir erweitern das bereits bestehende Repertoire und Skript-Datensätzen in 2 Bereichen. Erstens benutzen wir Crowdsourcing zur Erstellung eines Korpus, das 40 Szenarien mit jeweils 100 Ereignissequenzbeschreibungen (Event Sequence Descriptions, ESDs) beinhaltet, und welches somit größer als bestehende Skript- Datensätze ist. Zweitens erweitern wir das Korpus mit partiellen ESD-Alignierungen, die von Hand annotiert werden. Die partiellen Alignierungen werden dann als Vorwissen für einen halbüberwachten Algorithmus zur Skriptinduktion benutzt, der im Rahmen dieser Dissertation vorgestellt wird. Wir präsentieren außerdem einen halbüberwachten Clusteringansatz zur Induktion von Skripten, basierend auf Ereignissequenzen, die via Crowdsourcing gesammelt wurden. Hierbei werden einzelne Ereignisbeschreibungen gruppiert, um Paraphrasenmengen und der deren temporale Ordnung abzuleiten. Der vorgestellte Clusteringalgorithmus ist im Stande, Variationen in der typischen Reihenfolge in Skripte besser abzubilden und erweitert damit einen Formalismus zur Skriptrepräsentation, temporale Skriptgraphen. Dies wird dadurch bewerkstelligt, dass Equivalenzklassen von Beschreibungen mit "arbiträrer Reihenfolge" genutzt werden, die es erlauben, eine flexible Ereignisordnung abzubilden, die inhärent bei Skripten vorhanden ist. Im dritten Teil der vorliegenden Arbeit führen wir den Task der SzenarioIdentifikation ein, also der automatischen Identifikation von Skriptreferenzen in narrativen Texten. Wir erstellen einen Benchmark-Datensatz mit annotierten narrativen Texten, in denen einzelne Segmente im Bezug auf das Skript, welches sie instantiieren, markiert wurden. Dieser Datensatz ist der erste seiner Art. Eine Analyse der Annotation zeigt, dass Referenzen zu Szenarien im Text mit annehmbarer Akkuratheit vorhergesagt werden können. Zusätzlich stellen wir ein Benchmark-Modell vor, welches Textfragmente automatisch erstellt und deren Szenario identifiziert. Das vorgestellte Modell erreicht erfolgversprechende Resultate und öffnet damit einen Forschungszweig im Bereich des Skript-Parsens und der Skript-Akquisition im großen Stil

    Hollywood in Homes: Crowdsourcing Data Collection for Activity Understanding

    Get PDF
    Computer vision has a great potential to help our daily lives by searching for lost keys, watering flowers or reminding us to take a pill. To succeed with such tasks, computer vision methods need to be trained from real and diverse examples of our daily dynamic scenes. While most of such scenes are not particularly exciting, they typically do not appear on YouTube, in movies or TV broadcasts. So how do we collect sufficiently many diverse but boring samples representing our lives? We propose a novel Hollywood in Homes approach to collect such data. Instead of shooting videos in the lab, we ensure diversity by distributing and crowdsourcing the whole process of video creation from script writing to video recording and annotation. Following this procedure we collect a new dataset, Charades, with hundreds of people recording videos in their own homes, acting out casual everyday activities. The dataset is composed of 9,848 annotated videos with an average length of 30 seconds, showing activities of 267 people from three continents. Each video is annotated by multiple free-text descriptions, action labels, action intervals and classes of interacted objects. In total, Charades provides 27,847 video descriptions, 66,500 temporally localized intervals for 157 action classes and 41,104 labels for 46 object classes. Using this rich data, we evaluate and provide baseline results for several tasks including action recognition and automatic description generation. We believe that the realism, diversity, and casual nature of this dataset will present unique challenges and new opportunities for computer vision community

    Are NLP Models Good at Tracing Thoughts: An Overview of Narrative Understanding

    Full text link
    Narrative understanding involves capturing the author's cognitive processes, providing insights into their knowledge, intentions, beliefs, and desires. Although large language models (LLMs) excel in generating grammatically coherent text, their ability to comprehend the author's thoughts remains uncertain. This limitation hinders the practical applications of narrative understanding. In this paper, we conduct a comprehensive survey of narrative understanding tasks, thoroughly examining their key features, definitions, taxonomy, associated datasets, training objectives, evaluation metrics, and limitations. Furthermore, we explore the potential of expanding the capabilities of modularized LLMs to address novel narrative understanding tasks. By framing narrative understanding as the retrieval of the author's imaginative cues that outline the narrative structure, our study introduces a fresh perspective on enhancing narrative comprehension

    A Systematic Survey of ML Datasets for Prime CV Research Areas-Media and Metadata

    Get PDF
    The ever-growing capabilities of computers have enabled pursuing Computer Vision through Machine Learning (i.e., MLCV). ML tools require large amounts of information to learn from (ML datasets). These are costly to produce but have received reduced attention regarding standardization. This prevents the cooperative production and exploitation of these resources, impedes countless synergies, and hinders ML research. No global view exists of the MLCV dataset tissue. Acquiring it is fundamental to enable standardization. We provide an extensive survey of the evolution and current state of MLCV datasets (1994 to 2019) for a set of specific CV areas as well as a quantitative and qualitative analysis of the results. Data were gathered from online scientific databases (e.g., Google Scholar, CiteSeerX). We reveal the heterogeneous plethora that comprises the MLCV dataset tissue; their continuous growth in volume and complexity; the specificities of the evolution of their media and metadata components regarding a range of aspects; and that MLCV progress requires the construction of a global standardized (structuring, manipulating, and sharing) MLCV "library". Accordingly, we formulate a novel interpretation of this dataset collective as a global tissue of synthetic cognitive visual memories and define the immediately necessary steps to advance its standardization and integration

    Citizen Science and Geospatial Capacity Building

    Get PDF
    This book is a collection of the articles published the Special Issue of ISPRS International Journal of Geo-Information on “Citizen Science and Geospatial Capacity Building”. The articles cover a wide range of topics regarding the applications of citizen science from a geospatial technology perspective. Several applications show the importance of Citizen Science (CitSci) and volunteered geographic information (VGI) in various stages of geodata collection, processing, analysis and visualization; and for demonstrating the capabilities, which are covered in the book. Particular emphasis is given to various problems encountered in the CitSci and VGI projects with a geospatial aspect, such as platform, tool and interface design, ontology development, spatial analysis and data quality assessment. The book also points out the needs and future research directions in these subjects, such as; (a) data quality issues especially in the light of big data; (b) ontology studies for geospatial data suited for diverse user backgrounds, data integration, and sharing; (c) development of machine learning and artificial intelligence based online tools for pattern recognition and object identification using existing repositories of CitSci and VGI projects; and (d) open science and open data practices for increasing the efficiency, decreasing the redundancy, and acknowledgement of all stakeholders
    • …
    corecore