Crowdsourcing in Computer Vision
Computer vision systems require large amounts of manually annotated data to
properly learn challenging visual concepts. Crowdsourcing platforms offer an
inexpensive method to capture human knowledge and understanding, for a vast
number of visual perception tasks. In this survey, we describe the types of
annotations computer vision researchers have collected using crowdsourcing, and
how they have ensured that this data is of high quality while annotation effort
is minimized. We begin by discussing data collection on both classic (e.g.,
object recognition) and recent (e.g., visual story-telling) vision tasks. We
then summarize key design decisions for creating effective data collection
interfaces and workflows, and present strategies for intelligently selecting
the most important data instances to annotate. Finally, we conclude with some
thoughts on the future of crowdsourcing in computer vision.
Comment: A 69-page meta review of the field, Foundations and Trends in
Computer Graphics and Vision, 201
BlogForever D2.6: Data Extraction Methodology
This report outlines an inquiry into the area of web data extraction, conducted within the context of blog preservation. The report reviews theoretical advances and practical developments for implementing data extraction. The inquiry is extended through an experiment that demonstrates the effectiveness and feasibility of implementing some of the suggested approaches. More specifically, the report discusses an approach based on unsupervised machine learning that employs the RSS feeds and HTML representations of blogs. It outlines the possibilities of extracting semantics available in blogs and demonstrates the benefits of exploiting available standards such as microformats and microdata. The report then proposes a methodology for extracting and processing blog data to further inform the design and development of the BlogForever platform.
Discrepancies Between Database- and Pragmatically Driven NLG: Insights from QUD-Based Annotations
We present annotation findings from using an annotated corpus of driving reports, informational texts with an elaborated pragmatics, for the automatic generation of corresponding texts. The generation process requires access to a database providing the technical details of the vehicles, as well as an annotated corpus for sophisticated, pragmatically motivated text planning. We focus on the annotation results since they are the basic framework for linking text planning with database queries and microplanning. We show that the annotations point to a variety of linguistic phenomena that have received little or no attention in the literature so far, and that they raise corresponding questions regarding access to information from databases for the generation process.
A Type-coherent, Expressive Representation as an Initial Step to Language Understanding
A growing interest in tasks involving language understanding by the NLP
community has led to the need for effective semantic parsing and inference.
Modern NLP systems use semantic representations that do not quite fulfill the
nuanced needs for language understanding: adequately modeling language
semantics, enabling general inferences, and being accurately recoverable. This
document describes underspecified logical forms (ULF) for Episodic Logic (EL),
which is an initial form for a semantic representation that balances these
needs. ULFs fully resolve the semantic type structure while leaving issues such
as quantifier scope, word sense, and anaphora unresolved; they provide a
starting point for further resolution into EL, and enable certain structural
inferences without further resolution. This document also presents preliminary
results of creating a hand-annotated corpus of ULFs for the purpose of training
a precise ULF parser, showing a three-person pairwise interannotator agreement
of 0.88 on confident annotations. We hypothesize that a divide-and-conquer
approach to semantic parsing starting with derivation of ULFs will lead to
semantic analyses that do justice to subtle aspects of linguistic meaning, and
will enable construction of more accurate semantic parsers.
Comment: Accepted for publication at The 13th International Conference on
Computational Semantics (IWCS 2019)