Symbol Emergence in Robotics: A Survey
Humans can learn the use of language through physical interaction with their
environment and semiotic communication with other people. It is very important
to obtain a computational understanding of how humans can form a symbol system
and acquire semiotic skills through their autonomous mental development.
Recently, many studies have been conducted on the construction of robotic
systems and machine-learning methods that can learn the use of language through
embodied multimodal interaction with their environment and other systems.
Understanding human social interactions and developing a robot that can
smoothly communicate with human users over the long term both require an
understanding of the dynamics of symbol systems. The
embodied cognition and social interaction of participants gradually change a
symbol system in a constructive manner. In this paper, we introduce a field of
research called symbol emergence in robotics (SER). SER is a constructive
approach towards an emergent symbol system. The emergent symbol system is
socially self-organized through both semiotic communications and physical
interactions with autonomous cognitive developmental agents, i.e., humans and
developmental robots. Specifically, we describe state-of-the-art research
topics in SER, e.g., multimodal categorization, word discovery, and
double articulation analysis, which enable a robot to acquire words and their
embodied meanings from raw sensory-motor information, including visual,
haptic, and auditory information and acoustic speech signals, in a fully
unsupervised manner. Finally, we suggest future directions for research in
SER. (Comment: submitted to Advanced Robotics.)
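Word discovery of the kind surveyed here, segmenting continuous unsegmented input into word-like units without supervision, can be illustrated with the classic successor-variety heuristic. This is our own toy sketch, not a specific method from the survey; the corpus, threshold, and n-gram length are all illustrative choices:

```python
from collections import defaultdict

def successor_variety(corpus, n=2):
    """For each character n-gram, count how many distinct characters
    ever follow it in the (unsegmented) corpus."""
    succ = defaultdict(set)
    for i in range(len(corpus) - n):
        succ[corpus[i:i + n]].add(corpus[i + n])
    return {gram: len(chars) for gram, chars in succ.items()}

def segment(utterance, variety, n=2, threshold=2):
    """Place a word boundary wherever the preceding n-gram is followed
    by many different characters across the corpus: high branching
    suggests the end of a word."""
    words, start = [], 0
    for i in range(n, len(utterance)):
        if variety.get(utterance[i - n:i], 0) >= threshold:
            words.append(utterance[start:i])
            start = i
    words.append(utterance[start:])
    return words

corpus = "thedogsatthecatsatthedogranthecatran"
variety = successor_variety(corpus)
print(segment("thedogsat", variety))  # ['the', 'dog', 'sat']
```

The same idea, applied to phoneme or sensory sequences rather than characters, underlies the statistical segmentation that double articulation analysis performs at two levels at once.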
Why my photos look sideways or upside down? Detecting Canonical Orientation of Images using Convolutional Neural Networks
Image orientation detection requires high-level scene understanding. Humans
use object recognition and contextual scene information to correctly orient
images. In the literature, the problem of image orientation detection is mostly
addressed using low-level vision features, while some approaches
incorporate a few easily detectable semantic cues to gain minor improvements. The
vast amount of semantic content in images makes orientation detection
challenging, and therefore there is a large semantic gap between existing
methods and human behavior. Moreover, existing methods in the literature report highly
discrepant detection rates, which is mainly due to large differences in
datasets and limited variety of test images used for evaluation. In this work,
for the first time, we leverage the power of deep learning and adapt
pre-trained convolutional neural networks, using the largest training dataset
to date, to the image orientation detection task. An extensive evaluation of
our model on different public datasets shows that it remarkably generalizes to
correctly orient a large set of unconstrained images; it also significantly
outperforms the state-of-the-art and achieves accuracy very close to that of
humans.
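A convenient property of the orientation task is that training labels come for free: rotating any correctly oriented photo by quarter-turns yields labeled examples. The sketch below shows that data-generation trick with the CNN classifier abstracted away as a `predict` callable; all names here are our own, not from the paper:

```python
import numpy as np

ORIENTATIONS = (0, 90, 180, 270)  # counterclockwise quarter-turn labels 0..3

def make_training_pairs(image):
    """Every correctly oriented photo yields four labeled training
    examples for free: rotate it k quarter-turns and use k as the label."""
    return [(np.rot90(image, k=k), k) for k in range(len(ORIENTATIONS))]

def correct_orientation(image, predict):
    """Undo the predicted rotation: if the classifier says the photo
    was turned k quarter-turns, rotate it back by -k."""
    k = predict(image)
    return np.rot90(image, k=-k)
```

In the paper the classifier is a fine-tuned pre-trained CNN; here `predict` stands in for any callable that returns the predicted rotation class.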
Data-Driven Shape Analysis and Processing
Data-driven methods play an increasingly important role in discovering
geometric, structural, and semantic relationships between 3D shapes in
collections, and applying this analysis to support intelligent modeling,
editing, and visualization of geometric data. In contrast to traditional
approaches, a key feature of data-driven approaches is that they aggregate
information from a collection of shapes to improve the analysis and processing
of individual shapes. In addition, they are able to learn models that reason
about properties and relationships of shapes without relying on hard-coded
rules or explicitly programmed instructions. We provide an overview of the main
concepts and components of these techniques, and discuss their application to
shape classification, segmentation, matching, reconstruction, modeling, and
exploration, as well as scene analysis and synthesis, by reviewing the
literature and relating existing works through both qualitative and numerical
comparisons. We conclude our report with ideas that can inspire future research
in data-driven shape analysis and processing. (Comment: 10 pages, 19 figures.)
A Review of Codebook Models in Patch-Based Visual Object Recognition
The codebook model-based approach, while ignoring any structural aspect in vision, nonetheless provides state-of-the-art performance on current datasets. The key role of a visual codebook is to provide a way to map low-level features into a fixed-length vector in histogram space, to which standard classifiers can be directly applied. The discriminative power of such a visual codebook determines the quality of the codebook model, whereas the size of the codebook controls the complexity of the model. Thus, the construction of a codebook is an important step, usually done by cluster analysis. However, clustering is a process that retains regions of high density in a distribution, and it follows that the resulting codebook need not have discriminant properties. Clustering is also recognised as a computational bottleneck of such systems. In our recent work, we proposed a resource-allocating codebook that constructs a discriminant codebook in a one-pass design procedure, slightly outperforming more traditional approaches at drastically reduced computing times. In this review, we survey several approaches proposed over the last decade, covering their feature detectors, descriptors, codebook construction schemes, choice of classifiers for recognising objects, and the datasets used to evaluate the proposed methods.
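The two steps the abstract describes, building a codebook by cluster analysis and mapping an image's variable-size descriptor set to a fixed-length histogram, can be sketched with toy k-means. This is a minimal illustration, not the resource-allocating codebook the review proposes; `k`, `iters`, and the descriptors are illustrative:

```python
import numpy as np

def build_codebook(descriptors, k, iters=20, seed=0):
    """Toy k-means: cluster local feature descriptors; the k centroids
    become the 'visual words' of the codebook."""
    rng = np.random.default_rng(seed)
    codebook = descriptors[rng.choice(len(descriptors), k, replace=False)]
    for _ in range(iters):
        # assign every descriptor to its nearest centroid
        d = np.linalg.norm(descriptors[:, None] - codebook[None], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # keep empty clusters unchanged
                codebook[j] = descriptors[labels == j].mean(axis=0)
    return codebook

def encode(descriptors, codebook):
    """Map an image's descriptor set to a fixed-length, L1-normalized
    histogram of visual-word counts, ready for a standard classifier."""
    d = np.linalg.norm(descriptors[:, None] - codebook[None], axis=2)
    hist = np.bincount(d.argmin(axis=1), minlength=len(codebook)).astype(float)
    return hist / hist.sum()
```

In a real pipeline the descriptors would come from a detector/descriptor pair such as SIFT, and the histogram would feed an SVM or similar classifier.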
Feature discovery and visualization of robot mission data using convolutional autoencoders and Bayesian nonparametric topic models
The gap between our ability to collect interesting data and our ability to
analyze these data is growing at an unprecedented rate. Recent algorithmic
attempts to fill this gap have employed unsupervised tools to discover
structure in data. Some of the most successful approaches have used
probabilistic models to uncover latent thematic structure in discrete data.
Despite the success of these models on textual data, they have not generalized
as well to image data, in part because of the spatial and temporal structure
that may exist in an image stream.
We introduce a novel unsupervised machine learning framework that
incorporates the ability of convolutional autoencoders to discover features
from images that directly encode spatial information, within a Bayesian
nonparametric topic model that discovers meaningful latent patterns within
discrete data. By using this hybrid framework, we overcome the fundamental
dependency of traditional topic models on rigidly hand-coded data
representations, while simultaneously encoding spatial dependency in our topics
without adding model complexity. We apply this model to the motivating
application of high-level scene understanding and mission summarization for
exploratory marine robots. Our experiments on a seafloor dataset collected by a
marine robot show that the proposed hybrid framework outperforms current
state-of-the-art approaches on the task of unsupervised seafloor terrain
characterization. (Comment: 8 pages.)
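The topic-modeling side of the framework, uncovering latent themes in discrete data, can be illustrated with a collapsed Gibbs sampler for plain LDA. This is a parametric stand-in for the Bayesian nonparametric (HDP-style) model in the paper, and the hyperparameters and toy corpus are our own assumptions:

```python
import numpy as np

def lda_gibbs(docs, n_topics, vocab_size, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for vanilla LDA over documents given as
    lists of integer word ids. Returns topic-word and doc-topic counts."""
    rng = np.random.default_rng(seed)
    n_dk = np.zeros((len(docs), n_topics))   # doc-topic counts
    n_kw = np.zeros((n_topics, vocab_size))  # topic-word counts
    n_k = np.zeros(n_topics)                 # tokens per topic
    z = []
    for d, doc in enumerate(docs):           # random initial assignments
        zd = rng.integers(n_topics, size=len(doc))
        z.append(zd)
        for w, t in zip(doc, zd):
            n_dk[d, t] += 1; n_kw[t, w] += 1; n_k[t] += 1
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                t = z[d][i]                  # remove token, resample, restore
                n_dk[d, t] -= 1; n_kw[t, w] -= 1; n_k[t] -= 1
                p = (n_dk[d] + alpha) * (n_kw[:, w] + beta) / (n_k + vocab_size * beta)
                t = rng.choice(n_topics, p=p / p.sum())
                z[d][i] = t
                n_dk[d, t] += 1; n_kw[t, w] += 1; n_k[t] += 1
    return n_kw, n_dk
```

In the hybrid framework, the integer word ids would be quantized convolutional-autoencoder features rather than text tokens, and the fixed `n_topics` would be replaced by the nonparametric prior.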