57,078 research outputs found
Vision of a Visipedia
The web is not perfect: while text is easily
searched and organized, pictures (the vast majority of the bits
that one can find online) are not. In order to see how one could
improve the web and make pictures first-class citizens of the
web, I explore the idea of Visipedia, a visual interface for
Wikipedia that is able to answer visual queries and enables
experts to contribute and organize visual knowledge. Five
distinct groups of humans would interact through Visipedia:
users, experts, editors, visual workers, and machine vision
scientists. The latter would gradually build automata able to
interpret images. I explore some of the technical challenges
involved in making Visipedia happen. I argue that Visipedia will
likely grow organically, combining state-of-the-art machine
vision with human labor
Crowdsourcing in Computer Vision
Computer vision systems require large amounts of manually annotated data to
properly learn challenging visual concepts. Crowdsourcing platforms offer an
inexpensive method to capture human knowledge and understanding, for a vast
number of visual perception tasks. In this survey, we describe the types of
annotations computer vision researchers have collected using crowdsourcing, and
how they have ensured that this data is of high quality while annotation effort
is minimized. We begin by discussing data collection on both classic (e.g.,
object recognition) and recent (e.g., visual story-telling) vision tasks. We
then summarize key design decisions for creating effective data collection
interfaces and workflows, and present strategies for intelligently selecting
the most important data instances to annotate. Finally, we conclude with some
thoughts on the future of crowdsourcing in computer vision.Comment: A 69-page meta review of the field, Foundations and Trends in
Computer Graphics and Vision, 201
Towards Avatars with Artificial Minds: Role of Semantic Memory
he first step towards creating avatars with human-like artificial minds is to give them human-like memory structures with an access to general knowledge about the world. This type of knowledge is stored in semantic memory. Although many approaches to modeling of semantic memories have been proposed they are not very useful in real life applications because they lack knowledge comparable to the common sense that humans have, and they cannot be implemented in a computationally efficient way. The most drastic simplification of semantic memory leading to the simplest knowledge representation that is sufficient for many applications is based on the Concept Description Vectors (CDVs) that store, for each concept, an information whether a given property is applicable to this concept or not. Unfortunately even such simple information about real objects or concepts is not available. Experiments with automatic creation of concept description vectors from various sources, including ontologies, dictionaries, encyclopedias and unstructured text sources are described. Haptek-based talking head that has an access to this memory has been created as an example of a humanized interface (HIT) that can interact with web pages and exchange information in a natural way. A few examples of applications of an avatar with semantic memory are given, including the twenty questions game and automatic creation of word puzzles
What Makes a Computation Unconventional?
A coherent mathematical overview of computation and its generalisations is
described. This conceptual framework is sufficient to comfortably host a wide
range of contemporary thinking on embodied computation and its models.Comment: Based on an invited lecture for the 'Symposium on
Natural/Unconventional Computing and Its Philosophical Significance' at the
AISB/IACAP World Congress 2012, University of Birmingham, July 2-6, 201
Evaluating Visual Conversational Agents via Cooperative Human-AI Games
As AI continues to advance, human-AI teams are inevitable. However, progress
in AI is routinely measured in isolation, without a human in the loop. It is
crucial to benchmark progress in AI, not just in isolation, but also in terms
of how it translates to helping humans perform certain tasks, i.e., the
performance of human-AI teams.
In this work, we design a cooperative game - GuessWhich - to measure human-AI
team performance in the specific context of the AI being a visual
conversational agent. GuessWhich involves live interaction between the human
and the AI. The AI, which we call ALICE, is provided an image which is unseen
by the human. Following a brief description of the image, the human questions
ALICE about this secret image to identify it from a fixed pool of images.
We measure performance of the human-ALICE team by the number of guesses it
takes the human to correctly identify the secret image after a fixed number of
dialog rounds with ALICE. We compare performance of the human-ALICE teams for
two versions of ALICE. Our human studies suggest a counterintuitive trend -
that while AI literature shows that one version outperforms the other when
paired with an AI questioner bot, we find that this improvement in AI-AI
performance does not translate to improved human-AI performance. This suggests
a mismatch between benchmarking of AI in isolation and in the context of
human-AI teams.Comment: HCOMP 201
Induction of models under uncertainty
This paper outlines a procedure for performing induction under uncertainty. This procedure uses a probabilistic representation and uses Bayes' theorem to decide between alternative hypotheses (theories). This procedure is illustrated by a robot with no prior world experience performing induction on data it has gathered about the world. The particular inductive problem is the formation of class descriptions both for the tutored and untutored cases. The resulting class definitions are inherently probabilistic and so do not have any sharply defined membership criterion. This robot example raises some fundamental problems about induction; particularly, it is shown that inductively formed theories are not the best way to make predictions. Another difficulty is the need to provide prior probabilities for the set of possible theories. The main criterion for such priors is a pragmatic one aimed at keeping the theory structure as simple as possible, while still reflecting any structure discovered in the data
- …