Search CORE

10,638 research outputs found

Crowdsourcing in Computer Vision

Author: Fei-Fei Li
Grauman Kristen
Kovashka Adriana
Russakovsky Olga
Publication venue: 'Now Publishers'
Publication date: 01/01/2016
Field of study

Computer vision systems require large amounts of manually annotated data to properly learn challenging visual concepts. Crowdsourcing platforms offer an inexpensive method to capture human knowledge and understanding, for a vast number of visual perception tasks. In this survey, we describe the types of annotations computer vision researchers have collected using crowdsourcing, and how they have ensured that this data is of high quality while annotation effort is minimized. We begin by discussing data collection on both classic (e.g., object recognition) and recent (e.g., visual story-telling) vision tasks. We then summarize key design decisions for creating effective data collection interfaces and workflows, and present strategies for intelligently selecting the most important data instances to annotate. Finally, we conclude with some thoughts on the future of crowdsourcing in computer vision.Comment: A 69-page meta review of the field, Foundations and Trends in Computer Graphics and Vision, 201

arXiv.org e-Print Archive

Crossref

Accurator: Nichesourcing for Cultural Heritage

Author: Aroyo Lora
De Boer Victor
Dijkshoorn Chris
Schreiber Guus
Publication venue
Publication date: 01/01/2017
Field of study

With more and more cultural heritage data being published online, their usefulness in this open context depends on the quality and diversity of descriptive metadata for collection objects. In many cases, existing metadata is not adequate for a variety of retrieval and research tasks and more specific annotations are necessary. However, eliciting such annotations is a challenge since it often requires domain-specific knowledge. Where crowdsourcing can be successfully used for eliciting simple annotations, identifying people with the required expertise might prove troublesome for tasks requiring more complex or domain-specific knowledge. Nichesourcing addresses this problem, by tapping into the expert knowledge available in niche communities. This paper presents Accurator, a methodology for conducting nichesourcing campaigns for cultural heritage institutions, by addressing communities, organizing events and tailoring a web-based annotation tool to a domain of choice. The contribution of this paper is threefold: 1) a nichesourcing methodology, 2) an annotation tool for experts and 3) validation of the methodology and tool in three case studies. The three domains of the case studies are birds on art, bible prints and fashion images. We compare the quality and quantity of obtained annotations in the three case studies, showing that the nichesourcing methodology in combination with the image annotation tool can be used to collect high quality annotations in a variety of domains and annotation tasks. A user evaluation indicates the tool is suited and usable for domain specific annotation tasks

arXiv.org e-Print Archive

VU Research Portal

bdbms -- A Database Management System for Biological Data

Author: Aref Walid G.
Eltabakh Mohamed Y.
Ouzzani Mourad
Publication venue
Publication date: 01/12/2006
Field of study

Biologists are increasingly using databases for storing and managing their data. Biological databases typically consist of a mixture of raw data, metadata, sequences, annotations, and related data obtained from various sources. Current database technology lacks several functionalities that are needed by biological databases. In this paper, we introduce bdbms, an extensible prototype database management system for supporting biological data. bdbms extends the functionalities of current DBMSs to include: (1) Annotation and provenance management including storage, indexing, manipulation, and querying of annotation and provenance as first class objects in bdbms, (2) Local dependency tracking to track the dependencies and derivations among data items, (3) Update authorization to support data curation via content-based authorization, in contrast to identity-based authorization, and (4) New access methods and their supporting operators that support pattern matching on various types of compressed biological data types. This paper presents the design of bdbms along with the techniques proposed to support these functionalities including an extension to SQL. We also outline some open issues in building bdbms.Comment: This article is published under a Creative Commons License Agreement (http://creativecommons.org/licenses/by/2.5/.) You may copy, distribute, display, and perform the work, make derivative works and make commercial use of the work, but, you must attribute the work to the author and CIDR 2007. 3rd Biennial Conference on Innovative Data Systems Research (CIDR) January 710, 2007, Asilomar, California, US

arXiv.org e-Print Archive

CiteSeerX

Purdue E-Pubs

Crowdsourcing Question-Answer Meaning Representations

Author: Dagan Ido
He Luheng
Michael Julian
Stanovsky Gabriel
Zettlemoyer Luke
Publication venue
Publication date: 15/11/2017
Field of study

We introduce Question-Answer Meaning Representations (QAMRs), which represent the predicate-argument structure of a sentence as a set of question-answer pairs. We also develop a crowdsourcing scheme to show that QAMRs can be labeled with very little training, and gather a dataset with over 5,000 sentences and 100,000 questions. A detailed qualitative analysis demonstrates that the crowd-generated question-answer pairs cover the vast majority of predicate-argument relationships in existing datasets (including PropBank, NomBank, QA-SRL, and AMR) along with many previously under-resourced ones, including implicit arguments and relations. The QAMR data and annotation code is made publicly available to enable future work on how best to model these complex phenomena.Comment: 8 pages, 6 figures, 2 table

arXiv.org e-Print Archive

Crossref

Hard to Cheat: A Turing Test based on Answering Questions about Images

Author: Fritz Mario
Malinowski Mateusz
Publication venue
Publication date: 01/01/2015
Field of study

Progress in language and image understanding by machines has sparkled the interest of the research community in more open-ended, holistic tasks, and refueled an old AI dream of building intelligent machines. We discuss a few prominent challenges that characterize such holistic tasks and argue for "question answering about images" as a particular appealing instance of such a holistic task. In particular, we point out that it is a version of a Turing Test that is likely to be more robust to over-interpretations and contrast it with tasks like grounding and generation of descriptions. Finally, we discuss tools to measure progress in this field.Comment: Presented in AAAI-15 Workshop: Beyond the Turing Tes

arXiv.org e-Print Archive

CISPA – Helmholtz-Zentrum für Informationssicherheit

MPG.PuRe

Recommended from our members

The quest for information retrieval on the semantic web

Author: Castells-Azpilicueta Pablo
Fernández-Sánchez Miriam
Vallet-Weadon David
Publication venue
Publication date: 01/12/2005
Field of study

Semantic search has been one of the motivations of the Semantic Web since it was envisioned. We propose a model for the exploitation of ontology-based KBs to improve search over large document repositories. The retrieval model is based on an adaptation of the classic vector-space model, including an annotation weighting algorithm, and a ranking algorithm. Semantic search is combined with keyword-based search to achieve tolerance to KB incompleteness. Our proposal has been tested on corpora of significant size, showing promising results with respect to keyword-based search, and providing ground for further analysis and research

Open Research Online (The Open University)