
    Leveraging video annotations in video-based e-learning

    The e-learning community has been producing and using video content for a long time, and in recent years the advent of MOOCs has relied heavily on video recordings of teachers' courses. Video annotations are pieces of information that can be anchored in the temporality of the video so as to sustain various processes ranging from active reading to rich media editing. In this position paper we study how video annotations can be used in an e-learning context - especially MOOCs - from the triple point of view of pedagogical processes, the functionalities of current technical platforms, and current challenges. Our analysis is that there is still plenty of room for leveraging video annotations in MOOCs beyond simple active reading, namely live annotation, performance annotation and annotation for assignment, and that new developments are needed to accompany this evolution. Comment: 7th International Conference on Computer Supported Education (CSEDU), Barcelona, Spain (2014)
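    The core data structure the abstract relies on is an annotation anchored in the video's temporality. A minimal sketch of such a time-anchored annotation follows; the field names and the `kind` values (mirroring the annotation uses the paper lists) are illustrative, not a standard schema.

```python
# A time-anchored video annotation: an information piece tied to a span of
# the video's timeline. Field names are illustrative, not a standard schema.
from dataclasses import dataclass

@dataclass
class VideoAnnotation:
    video_id: str
    begin: float          # seconds from the start of the video
    end: float            # seconds; equal to `begin` for a point annotation
    body: str             # the annotation content (note, question, tag, ...)
    author: str
    kind: str = "note"    # e.g. "note", "live", "performance", "assignment"

    def overlaps(self, t: float) -> bool:
        """True if playback time t falls inside the anchored span."""
        return self.begin <= t <= self.end

a = VideoAnnotation("mooc-week1", 12.0, 18.5, "Key definition here", "alice")
print(a.overlaps(15.0))  # → True
```

A player can then display or trigger annotations whose span overlaps the current playback time.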

    TiFi: Taxonomy Induction for Fictional Domains [Extended version]

    Taxonomies are important building blocks of structured knowledge bases, and their construction from text sources and Wikipedia has received much attention. In this paper we focus on the construction of taxonomies for fictional domains, using noisy category systems from fan wikis or text extraction as input. Such fictional domains are archetypes of entity universes that are poorly covered by Wikipedia, as are enterprise-specific knowledge bases or highly specialized verticals. Our fiction-targeted approach, called TiFi, consists of three phases: (i) category cleaning, by identifying candidate categories that truly represent classes in the domain of interest, (ii) edge cleaning, by selecting subcategory relationships that correspond to class subsumption, and (iii) top-level construction, by mapping classes onto a subset of high-level WordNet categories. A comprehensive evaluation shows that TiFi is able to construct taxonomies for a diverse range of fictional domains such as Lord of the Rings, The Simpsons or Greek Mythology with very high precision, and that it outperforms state-of-the-art baselines for taxonomy induction by a substantial margin.
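    The three phases can be sketched as a pipeline over a noisy category graph. This is a toy illustration in the spirit of TiFi; the input categories, the scoring heuristics, and the WordNet mapping are simplified stand-ins, not the paper's actual models.

```python
# Toy input: noisy wiki categories as (category -> parent) edges.
categories = {
    "Hobbits": "Fictional races",
    "Fictional races": "Middle-earth",
    "Images of hobbits": "Hobbits",      # media category, not a true class
    "Middle-earth": "Entities",
}

def is_class(cat: str) -> bool:
    """Phase (i) category cleaning: keep only categories that plausibly
    denote classes (here: a crude heuristic rejecting media/meta categories)."""
    return not cat.lower().startswith(("images of", "list of"))

def is_subsumption(child: str, parent: str) -> bool:
    """Phase (ii) edge cleaning: keep edges that look like class subsumption
    (here: a trivial placeholder accepting edges between surviving classes)."""
    return is_class(child) and is_class(parent)

def attach_top_level(roots, wordnet_top={"Entities": "entity.n.01"}):
    """Phase (iii) top-level construction: map root classes onto a small set
    of high-level WordNet synsets (toy mapping)."""
    return {r: wordnet_top.get(r, "entity.n.01") for r in roots}

# Run the pipeline.
classes = {c for c in categories if is_class(c)}
edges = {(c, p) for c, p in categories.items()
         if p in classes and is_subsumption(c, p)}
roots = classes - {c for c, _ in edges}
taxonomy = {"classes": classes, "edges": edges,
            "top_level": attach_top_level(roots)}
```

In the real system each phase is a learned or rule-based classifier rather than a string heuristic, but the data flow (clean nodes, then edges, then attach a top level) is the same.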

    The mapKurator System: A Complete Pipeline for Extracting and Linking Text from Historical Maps

    Documents often have a spatial focus and valuable locality characteristics. For example, descriptions of listings in real estate or travel blogs contain information about specific local neighborhoods. This information is valuable for characterizing how humans perceive their environment. However, the first step to making use of this information is to identify the spatial focus (e.g., a city) of a document. Traditional approaches for identifying the spatial focus of a document rely on detecting and disambiguating toponyms in the document. This approach requires a vocabulary set of location phrases and ad-hoc rules, which ignore important words related to location. Recent topic modeling approaches using large language models often consider only a few topics, each with broad coverage. In contrast, the spatial focus of a document can be a country, a city, or even a neighborhood, which together form a space much larger than the number of topics considered in these approaches. Additionally, topic modeling methods are often applied to broad topics of news articles where context is easily distinguishable. To identify the geographic focus of a document effectively, we present a simple but effective Joint Embedding of multi-LocaLitY (JELLY), which jointly learns representations with separate encoders for documents and locations. JELLY significantly outperforms state-of-the-art methods for identifying the spatial focus of documents from a number of sources. We also demonstrate case studies on the arithmetic of the learned representations, including identifying cities with similar locality characteristics and zero-shot learning to identify document spatial focus. Comment: 4 pages, 4 figures
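    Inference with such a joint embedding reduces to nearest-neighbor search: encode the document, then pick the location whose embedding is most similar. A minimal sketch follows; the toy vectors stand in for the outputs of trained document and location encoders and are invented for illustration.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

# Toy location embeddings (would come from the trained location encoder).
location_vecs = {
    "Paris":  [0.9, 0.1, 0.0],
    "Tokyo":  [0.1, 0.9, 0.1],
    "Austin": [0.0, 0.2, 0.9],
}

def spatial_focus(doc_vec, locations=location_vecs):
    """Pick the location whose embedding is most similar to the document's."""
    return max(locations, key=lambda name: cosine(doc_vec, locations[name]))

# A document whose (toy) embedding lands near Paris in the joint space.
doc_vec = [0.85, 0.15, 0.05]
print(spatial_focus(doc_vec))  # → Paris
```

Because documents and locations live in one space, the same similarity arithmetic supports the case studies the abstract mentions, e.g. finding cities whose embeddings are close to each other.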

    Evaluating the application of semantic inferencing rules to image annotation

    Semantic annotation of digital objects within large multimedia collections is a difficult and challenging task. We describe a method for semi-automatic annotation of images, and apply and evaluate it on images of pancreatic cells. By comparing the performance of this approach in the pancreatic cell domain with previous results in the fuel cell domain, we aim to determine the characteristics of a domain which indicate whether the method will work in that domain. We conclude by describing the types of images and domains in which we can expect satisfactory results with this approach. Copyright 2005 ACM
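    The general shape of semantic inferencing rules is: low-level observations about an image trigger rules that infer higher-level semantic labels. The tiny rulebase below is invented for illustration and is not the one used in the pancreatic or fuel cell studies.

```python
# Each rule: (set of required low-level features, inferred semantic label).
rules = [
    ({"stained", "cluster"}, "islet of Langerhans"),
    ({"stained"},            "pancreatic tissue"),
]

def infer_annotations(features, rulebase=rules):
    """Fire every rule whose conditions are all present in the features."""
    return {label for cond, label in rulebase if cond <= features}

print(infer_annotations({"stained", "cluster", "microscopy"}))
```

Whether this style of inference works in a given domain depends on how reliably the low-level features can be extracted and how regular the feature-to-concept mapping is, which is exactly the domain characteristic the paper sets out to evaluate.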

    Automatic tagging and geotagging in video collections and communities

    Automatically generated tags and geotags hold great promise to improve access to video collections and online communities. We overview three tasks offered in the MediaEval 2010 benchmarking initiative, describing for each its use scenario, its definition and the data set released. For each task, a reference algorithm that was used within MediaEval 2010 is presented, and comments are included on lessons learned. The Tagging Task (Professional) involves automatically matching episodes in a collection of Dutch television with subject labels drawn from the keyword thesaurus used by the archive staff. The Tagging Task (Wild Wild Web) involves automatically predicting the tags that are assigned by users to their online videos. Finally, the Placing Task requires automatically assigning geo-coordinates to videos. The specification of each task admits the use of the full range of available information, including user-generated metadata, speech recognition transcripts, audio, and visual features.
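    A common baseline for a placing task of this kind is to estimate a test video's coordinates from geotagged training videos that share its tags. The sketch below illustrates that idea with an invented miniature training set; it is not the MediaEval reference algorithm itself.

```python
# Tiny invented training set: (tag set, (latitude, longitude)) per video.
train = [
    ({"eiffel", "paris"},  (48.8584, 2.2945)),
    ({"paris", "louvre"},  (48.8606, 2.3376)),
    ({"shibuya", "tokyo"}, (35.6595, 139.7005)),
]

def place(tags, training=train, default=(0.0, 0.0)):
    """Average the coordinates of training videos sharing at least one tag."""
    matches = [coord for t, coord in training if t & tags]
    if not matches:
        return default  # no textual evidence: fall back to a prior location
    lat = sum(c[0] for c in matches) / len(matches)
    lon = sum(c[1] for c in matches) / len(matches)
    return (lat, lon)

print(place({"paris", "seine"}))  # averages the two Paris videos
```

Real systems refine this with tag weighting, spatial clustering of matches, and visual features, but tag overlap against a geotagged corpus is the core signal.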

    Knowledge-rich Image Gist Understanding Beyond Literal Meaning

    We investigate the problem of understanding the message (gist) conveyed by images and their captions as found, for instance, on websites or in news articles. To this end, we propose a methodology to capture the meaning of image-caption pairs on the basis of large amounts of machine-readable knowledge that has previously been shown to be highly effective for text understanding. Our method identifies the connotation of objects beyond their denotation: where most approaches to image understanding focus on the denotation of objects, i.e., their literal meaning, our work addresses the identification of connotations, i.e., iconic meanings of objects, to understand the message of images. We view image understanding as the task of representing an image-caption pair on the basis of a wide-coverage vocabulary of concepts such as the one provided by Wikipedia, and cast gist detection as a concept-ranking problem with image-caption pairs as queries. To enable a thorough investigation of the problem of gist understanding, we produce a gold standard of over 300 image-caption pairs and over 8,000 gist annotations covering a wide variety of topics at different levels of abstraction. We use this dataset to experimentally benchmark the contribution of signals from heterogeneous sources, namely image and text. The best result, with a Mean Average Precision (MAP) of 0.69, indicates that by combining both dimensions we are able to better understand the meaning of our image-caption pairs than when using language or vision information alone. We test the robustness of our gist detection approach when receiving automatically generated input, i.e., using automatically generated image tags or generated captions, and demonstrate the feasibility of an end-to-end automated process.
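    Casting gist detection as concept ranking means scoring candidate concepts (e.g., Wikipedia entries) from each modality and ranking by a fused score. The sketch below shows that structure with invented placeholder scores; the fusion is a simple linear interpolation, not the paper's actual model.

```python
def rank_concepts(text_scores, vision_scores, alpha=0.5):
    """Fuse per-concept scores from both modalities (linear interpolation)
    and return concepts ranked by descending fused score."""
    concepts = set(text_scores) | set(vision_scores)
    fused = {c: alpha * text_scores.get(c, 0.0)
                + (1 - alpha) * vision_scores.get(c, 0.0)
             for c in concepts}
    return sorted(concepts, key=fused.get, reverse=True)

# Toy signals for one image-caption pair (a polar bear on melting ice).
text_scores = {"Climate change": 0.8, "Polar bear": 0.6, "Arctic": 0.4}
vision_scores = {"Polar bear": 0.9, "Ice": 0.7, "Arctic": 0.5}

ranking = rank_concepts(text_scores, vision_scores)
```

Evaluating such a ranking against gold gist annotations with MAP is what yields the kind of headline number (0.69) the abstract reports for the combined signals.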

    CASAM: Collaborative Human-machine Annotation of Multimedia.

    The CASAM multimedia annotation system implements a model of cooperative annotation between a human annotator and automated components. The aim is that they work asynchronously but together. The system focuses upon the areas where automated recognition and reasoning are most effective and the user is able to work in the areas where their unique skills are required. The system’s reasoning is influenced by the annotations provided by the user and, similarly, the user can see the system’s work and modify and, implicitly, direct it. The CASAM system interacts with the user by providing a window onto the current state of annotation, and by generating requests for information which are important for the final annotation or to constrain its reasoning. The user can modify the annotation, respond to requests and also add their own annotations. The objective is that the human annotator’s time is used more effectively and that the result is an annotation that is both of higher quality and produced more quickly. This can be especially important in circumstances where the annotator has a very restricted amount of time in which to annotate the document. In this paper we describe our prototype system. We expand upon the techniques used for automatically analysing the multimedia document, for reasoning over the annotations generated and for the generation of an effective interaction with the end-user. We also present the results of evaluations undertaken with media professionals in order to validate the approach and gain feedback to drive further research
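    The cooperative loop the abstract describes can be sketched as a shared annotation state that both sides mutate: automated components record confident annotations and raise information requests, while the human answers requests and edits annotations, implicitly directing further reasoning. The classes, thresholds, and heuristics below are illustrative, not CASAM's actual API.

```python
class AnnotationSession:
    def __init__(self):
        self.annotations = {}   # region -> label
        self.requests = []      # questions the system wants the user to answer

    def machine_pass(self, detections):
        """System side: record confident detections, ask about uncertain ones.
        Each detection is (region, label, confidence)."""
        for region, label, confidence in detections:
            if confidence >= 0.8:
                self.annotations[region] = label
            else:
                self.requests.append((region, label))

    def user_answer(self, region, label):
        """Human side: answering a request also adds/overrides an annotation,
        implicitly steering the system's further reasoning."""
        self.requests = [r for r in self.requests if r[0] != region]
        self.annotations[region] = label

session = AnnotationSession()
session.machine_pass([("shot-1", "reporter", 0.95),
                      ("shot-2", "protest", 0.55)])
session.user_answer("shot-2", "demonstration")
```

In the real system the two sides run asynchronously and the machine re-reasons over the user's edits, but the division of labor (confident automation, targeted requests, human override) is the one sketched here.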