Leveraging video annotations in video-based e-learning
The e-learning community has been producing and using video content for a
long time, and in the last years, the advent of MOOCs greatly relied on video
recordings of teacher courses. Video annotations are information pieces that
can be anchored in the temporality of the video so as to sustain various
processes ranging from active reading to rich media editing. In this position
paper we study how video annotations can be used in an e-learning context -
especially MOOCs - from the triple point of view of pedagogical processes,
current technical platforms functionalities, and current challenges. Our
analysis is that there is still plenty of room for leveraging video annotations
in MOOCs beyond simple active reading, namely live annotation, performance
annotation and annotation for assignment; and that new developments are needed
to accompany this evolution.
Comment: 7th International Conference on Computer Supported Education (CSEDU), Barcelona, Spain (2014)
TiFi: Taxonomy Induction for Fictional Domains [Extended version]
Taxonomies are important building blocks of structured knowledge bases, and their construction from text sources and Wikipedia has received much attention. In this paper we focus on the construction of taxonomies for fictional domains, using noisy category systems from fan wikis or text extraction as input. Such fictional domains are archetypes of entity universes that are poorly covered by Wikipedia, as are enterprise-specific knowledge bases or highly specialized verticals. Our fiction-targeted approach, called TiFi, consists of three phases: (i) category cleaning, by identifying candidate categories that truly represent classes in the domain of interest, (ii) edge cleaning, by selecting subcategory relationships that correspond to class subsumption, and (iii) top-level construction, by mapping classes onto a subset of high-level WordNet categories. A comprehensive evaluation shows that TiFi is able to construct taxonomies for a diverse range of fictional domains such as Lord of the Rings, The Simpsons, or Greek Mythology with very high precision, and that it outperforms state-of-the-art baselines for taxonomy induction by a substantial margin.
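The three phases described in the abstract can be pictured as a small pipeline. The sketch below is purely illustrative: the function names, predicates, and toy fan-wiki data are assumptions, not the paper's actual code or heuristics.

```python
# Hypothetical sketch of TiFi's three-phase pipeline; names and data
# structures are illustrative, not taken from the paper's implementation.

def clean_categories(categories, is_class):
    # Phase (i): keep only candidate categories that truly represent
    # classes in the domain (e.g. drop administrative wiki categories).
    return {c for c in categories if is_class(c)}

def clean_edges(edges, classes, is_subsumption):
    # Phase (ii): keep subcategory links that express class subsumption.
    return {(child, parent) for child, parent in edges
            if child in classes and parent in classes
            and is_subsumption(child, parent)}

def attach_top_level(classes, wordnet_map):
    # Phase (iii): map top-level classes onto a small set of
    # high-level WordNet categories ("entity" as fallback).
    return {c: wordnet_map.get(c, "entity") for c in classes}

# Toy run over a tiny fan-wiki category system.
cats = {"Hobbits", "Characters", "Browse"}           # "Browse" is noise
edges = {("Hobbits", "Characters")}
classes = clean_categories(cats, lambda c: c != "Browse")
taxonomy = clean_edges(edges, classes, lambda a, b: True)
roots = attach_top_level({"Characters"}, {"Characters": "person"})
print(sorted(classes), sorted(taxonomy), roots)
```

In the real system each phase would use learned or rule-based signals in place of the toy lambdas; the point here is only the staged cleaning structure.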
The mapKurator System: A Complete Pipeline for Extracting and Linking Text from Historical Maps
Documents hold spatial focus and valuable locality characteristics. For
example, descriptions of listings in real estate or travel blogs contain
information about specific local neighborhoods. This information is valuable to
characterize how humans perceive their environment. However, the first step to
making use of this information is to identify the spatial focus (e.g., a city)
of a document. Traditional approaches for identifying the spatial focus of a
document rely on detecting and disambiguating toponyms from the document. This
approach requires a vocabulary set of location phrases and ad-hoc rules, which
ignore important words related to location. Recent topic modeling approaches
using large language models often consider a few topics, each with broad
coverage. In contrast, the spatial focus of a document can be a country, a
city, or even a neighborhood; together these form a much larger set of candidates than the number of
topics considered in such approaches. Additionally, topic modeling methods are
often applied to broad topics of news articles where context is easily
distinguishable. To identify the geographic focus of a document effectively, we
present a simple but effective Joint Embedding of multi-LocaLitY (JELLY), which
jointly learns representations with separate encoders of document and location.
JELLY significantly outperforms state-of-the-art methods for identifying
spatial focus from documents from a number of sources. We also demonstrate case
studies on the arithmetic of the learned representations, including identifying
cities with similar locality characteristics and zero-shot learning to identify
document spatial focus.
Comment: 4 pages, 4 figures
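At inference time, a joint embedding of the kind the abstract describes reduces to ranking candidate locations by the similarity of their embeddings to the document's embedding. The sketch below assumes toy fixed vectors in place of JELLY's learned document and location encoders, which are not public in this listing.

```python
# Illustrative sketch of a JELLY-style ranking step: the document and
# each candidate location live in one shared embedding space, and the
# spatial focus is the location nearest the document. The 2-d vectors
# here are stand-ins for the learned encoders' outputs.
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def rank_locations(doc_vec, location_vecs):
    # Rank candidate locations by cosine similarity to the document.
    return sorted(location_vecs,
                  key=lambda loc: -cosine(doc_vec, location_vecs[loc]))

doc = [0.9, 0.1]                       # toy document embedding
locs = {"Los Angeles": [1.0, 0.0],     # toy location embeddings
        "Paris": [0.0, 1.0]}
print(rank_locations(doc, locs))       # "Los Angeles" ranks first
```

The same nearest-neighbor arithmetic underlies the case studies the abstract mentions, such as finding cities with similar locality characteristics.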
Evaluating the application of semantic inferencing rules to image annotation
Semantic annotation of digital objects within large multimedia collections is a difficult and challenging task. We describe a method for semi-automatic annotation of images and apply and evaluate it on images of pancreatic cells. By comparing the performance of this approach in the pancreatic cell domain with previous results in the fuel cell domain, we aim to determine characteristics of a domain which indicate whether the method will work in that domain. We conclude by describing the types of images and domains in which we can expect satisfactory results with this approach. Copyright 2005 ACM
Automatic tagging and geotagging in video collections and communities
Automatically generated tags and geotags hold great promise to improve access to video collections and online communities. We give an overview of three tasks offered in the MediaEval 2010 benchmarking initiative, describing for each its use scenario, its definition, and the data set released. For each task, a reference algorithm is presented that was used within MediaEval 2010, and comments are included on lessons learned. The Tagging Task (Professional) involves automatically matching episodes in a collection of Dutch television with subject labels drawn from the keyword thesaurus used by the archive staff. The Tagging Task (Wild Wild Web) involves automatically predicting the tags that are assigned by users to their online videos. Finally, the Placing Task requires automatically assigning geo-coordinates to videos. The specification of each task admits the use of the full range of available information, including user-generated metadata, speech recognition transcripts, audio, and visual features.
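To make the Placing Task concrete, a very simple baseline assigns a test video the coordinates of the training video whose user tags overlap most with its own. This toy baseline is an assumption for illustration only, not the reference algorithm used within MediaEval 2010.

```python
# Toy tag-overlap baseline for the Placing Task: assign each video the
# geo-coordinates of the training video sharing the most user tags.
# Training pairs below are invented examples, not benchmark data.

train = [
    ({"eiffel", "tower", "paris"}, (48.858, 2.294)),
    ({"statue", "liberty", "nyc"}, (40.689, -74.044)),
]

def place(tags):
    # Pick the training video with the largest tag intersection.
    best = max(train, key=lambda item: len(tags & item[0]))
    return best[1]

print(place({"paris", "seine"}))  # coordinates of the Paris example
```

Real systems would combine such metadata signals with the speech transcripts and visual features the task specification admits.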
Knowledge-rich Image Gist Understanding Beyond Literal Meaning
We investigate the problem of understanding the message (gist) conveyed by
images and their captions as found, for instance, on websites or news articles.
To this end, we propose a methodology to capture the meaning of image-caption
pairs on the basis of large amounts of machine-readable knowledge that has
previously been shown to be highly effective for text understanding. Our method
identifies the connotation of objects beyond their denotation: where most
approaches to image understanding focus on the denotation of objects, i.e.,
their literal meaning, our work addresses the identification of connotations,
i.e., iconic meanings of objects, to understand the message of images. We view
image understanding as the task of representing an image-caption pair on the
basis of a wide-coverage vocabulary of concepts such as the one provided by
Wikipedia, and cast gist detection as a concept-ranking problem with
image-caption pairs as queries. To enable a thorough investigation of the
problem of gist understanding, we produce a gold standard of over 300
image-caption pairs and over 8,000 gist annotations covering a wide variety of
topics at different levels of abstraction. We use this dataset to
experimentally benchmark the contribution of signals from heterogeneous
sources, namely image and text. The best result, with a Mean Average Precision
(MAP) of 0.69, indicates that by combining both dimensions we are able to better
understand the meaning of our image-caption pairs than when using language or
vision information alone. We test the robustness of our gist detection approach
when receiving automatically generated input, i.e., using automatically
generated image tags or generated captions, and demonstrate the feasibility of an
end-to-end automated process.
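Since the abstract casts gist detection as a concept-ranking problem evaluated by MAP, the per-query quantity is Average Precision over a ranked concept list against the gold gist annotations. The sketch below shows that computation; the concept names are invented for illustration.

```python
# Average Precision for one image-caption query: rank Wikipedia-style
# concepts, then average precision at each rank where a gold gist
# concept appears. MAP (the paper's 0.69) is the mean of this value
# over all queries. Concept names here are made up.

def average_precision(ranked, gold):
    hits, score = 0, 0.0
    for i, concept in enumerate(ranked, start=1):
        if concept in gold:
            hits += 1
            score += hits / i          # precision at this hit's rank
    return score / len(gold) if gold else 0.0

ranked = ["Freedom", "Statue of Liberty", "Copper", "Immigration"]
gold = {"Freedom", "Immigration"}
print(average_precision(ranked, gold))  # (1/1 + 2/4) / 2 = 0.75
```

Note that a ranker capturing connotation ("Freedom") rather than only denotation ("Copper") scores higher under this metric, which is exactly the distinction the paper draws.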
CASAM: Collaborative Human-machine Annotation of Multimedia.
The CASAM multimedia annotation system implements a model of cooperative annotation between a human annotator and automated components. The aim is that they work asynchronously but together. The system focuses upon the areas where automated recognition and reasoning are most effective and the user is able to work in the areas where their unique skills are required. The system’s reasoning is influenced by the annotations provided by the user and, similarly, the user can see the system’s work and modify and, implicitly, direct it. The CASAM system interacts with the user by providing a window onto the current state of annotation, and by generating requests for information which are important for the final annotation or to constrain its reasoning. The user can modify the annotation, respond to requests and also add their own annotations. The objective is that the human annotator’s time is used more effectively and that the result is an annotation that is both of higher quality and produced more quickly. This can be especially important in circumstances where the annotator has a very restricted amount of time in which to annotate the document. In this paper we describe our prototype system. We expand upon the techniques used for automatically analysing the multimedia document, for reasoning over the annotations generated and for the generation of an effective interaction with the end-user. We also present the results of evaluations undertaken with media professionals in order to validate the approach and gain feedback to drive further research
Proceedings of QG2010: The Third Workshop on Question Generation
These are the peer-reviewed proceedings of "QG2010, The Third Workshop on Question Generation". The workshop included a special track for "QGSTEC2010: The First Question Generation Shared Task and Evaluation Challenge".
QG2010 was held as part of The Tenth International Conference on Intelligent Tutoring Systems (ITS2010).