10,638 research outputs found
Crowdsourcing in Computer Vision
Computer vision systems require large amounts of manually annotated data to
properly learn challenging visual concepts. Crowdsourcing platforms offer an
inexpensive method to capture human knowledge and understanding, for a vast
number of visual perception tasks. In this survey, we describe the types of
annotations computer vision researchers have collected using crowdsourcing, and
how they have ensured that this data is of high quality while annotation effort
is minimized. We begin by discussing data collection on both classic (e.g.,
object recognition) and recent (e.g., visual story-telling) vision tasks. We
then summarize key design decisions for creating effective data collection
interfaces and workflows, and present strategies for intelligently selecting
the most important data instances to annotate. Finally, we conclude with some
thoughts on the future of crowdsourcing in computer vision.Comment: A 69-page meta review of the field, Foundations and Trends in
Computer Graphics and Vision, 201
Accurator: Nichesourcing for Cultural Heritage
With more and more cultural heritage data being published online, their
usefulness in this open context depends on the quality and diversity of
descriptive metadata for collection objects. In many cases, existing metadata
is not adequate for a variety of retrieval and research tasks and more specific
annotations are necessary. However, eliciting such annotations is a challenge
since it often requires domain-specific knowledge. Where crowdsourcing can be
successfully used for eliciting simple annotations, identifying people with the
required expertise might prove troublesome for tasks requiring more complex or
domain-specific knowledge. Nichesourcing addresses this problem, by tapping
into the expert knowledge available in niche communities. This paper presents
Accurator, a methodology for conducting nichesourcing campaigns for cultural
heritage institutions, by addressing communities, organizing events and
tailoring a web-based annotation tool to a domain of choice. The contribution
of this paper is threefold: 1) a nichesourcing methodology, 2) an annotation
tool for experts and 3) validation of the methodology and tool in three case
studies. The three domains of the case studies are birds on art, bible prints
and fashion images. We compare the quality and quantity of obtained annotations
in the three case studies, showing that the nichesourcing methodology in
combination with the image annotation tool can be used to collect high quality
annotations in a variety of domains and annotation tasks. A user evaluation
indicates the tool is suited and usable for domain specific annotation tasks
bdbms -- A Database Management System for Biological Data
Biologists are increasingly using databases for storing and managing their
data. Biological databases typically consist of a mixture of raw data,
metadata, sequences, annotations, and related data obtained from various
sources. Current database technology lacks several functionalities that are
needed by biological databases. In this paper, we introduce bdbms, an
extensible prototype database management system for supporting biological data.
bdbms extends the functionalities of current DBMSs to include: (1) Annotation
and provenance management including storage, indexing, manipulation, and
querying of annotation and provenance as first class objects in bdbms, (2)
Local dependency tracking to track the dependencies and derivations among data
items, (3) Update authorization to support data curation via content-based
authorization, in contrast to identity-based authorization, and (4) New access
methods and their supporting operators that support pattern matching on various
types of compressed biological data types. This paper presents the design of
bdbms along with the techniques proposed to support these functionalities
including an extension to SQL. We also outline some open issues in building
bdbms.Comment: This article is published under a Creative Commons License Agreement
(http://creativecommons.org/licenses/by/2.5/.) You may copy, distribute,
display, and perform the work, make derivative works and make commercial use
of the work, but, you must attribute the work to the author and CIDR 2007.
3rd Biennial Conference on Innovative Data Systems Research (CIDR) January
710, 2007, Asilomar, California, US
Crowdsourcing Question-Answer Meaning Representations
We introduce Question-Answer Meaning Representations (QAMRs), which represent
the predicate-argument structure of a sentence as a set of question-answer
pairs. We also develop a crowdsourcing scheme to show that QAMRs can be labeled
with very little training, and gather a dataset with over 5,000 sentences and
100,000 questions. A detailed qualitative analysis demonstrates that the
crowd-generated question-answer pairs cover the vast majority of
predicate-argument relationships in existing datasets (including PropBank,
NomBank, QA-SRL, and AMR) along with many previously under-resourced ones,
including implicit arguments and relations. The QAMR data and annotation code
is made publicly available to enable future work on how best to model these
complex phenomena.Comment: 8 pages, 6 figures, 2 table
Hard to Cheat: A Turing Test based on Answering Questions about Images
Progress in language and image understanding by machines has sparkled the
interest of the research community in more open-ended, holistic tasks, and
refueled an old AI dream of building intelligent machines. We discuss a few
prominent challenges that characterize such holistic tasks and argue for
"question answering about images" as a particular appealing instance of such a
holistic task. In particular, we point out that it is a version of a Turing
Test that is likely to be more robust to over-interpretations and contrast it
with tasks like grounding and generation of descriptions. Finally, we discuss
tools to measure progress in this field.Comment: Presented in AAAI-15 Workshop: Beyond the Turing Tes
Recommended from our members
The quest for information retrieval on the semantic web
Semantic search has been one of the motivations of the Semantic Web since it was envisioned. We propose a model for the exploitation of ontology-based KBs to improve search over large document repositories. The retrieval model is based on an adaptation of the classic vector-space model, including an annotation weighting algorithm, and a ranking algorithm. Semantic search is combined with keyword-based search to achieve tolerance to KB incompleteness. Our proposal has been tested on corpora of significant size, showing promising results with respect to keyword-based search, and providing ground for further analysis and research
- …