7,857 research outputs found
WordFences: Text localization and recognition
En col·laboració amb la Universitat de Barcelona (UB) i la Universitat Rovira i Virgili (URV)In recent years, text recognition has achieved remarkable success in recognizing scanned
document text. However, word recognition in natural images is still an open problem,
which generally requires time consuming post-processing steps. We present a novel architecture
for individual word detection in scene images based on semantic segmentation.
Our contributions are twofold: the concept of WordFence, which detects border areas
surrounding each individual word and a unique pixelwise weighted softmax loss function
which penalizes background and emphasizes small text regions. WordFence ensures that
each word is detected individually, and the new loss function provides a strong training
signal to both text and word border localization. The proposed technique avoids intensive
post-processing by combining semantic word segmentation with a voting scheme
for merging segmentations of multiple scales, producing an end-to-end word detection
system. We achieve superior localization recall on common benchmark datasets - 92%
recall on ICDAR11 and ICDAR13 and 63% recall on SVT. Furthermore, end-to-end
word recognition achieves state-of-the-art 86% F-Score on ICDAR13
Deep Semantic Classification for 3D LiDAR Data
Robots are expected to operate autonomously in dynamic environments.
Understanding the underlying dynamic characteristics of objects is a key
enabler for achieving this goal. In this paper, we propose a method for
pointwise semantic classification of 3D LiDAR data into three classes:
non-movable, movable and dynamic. We concentrate on understanding these
specific semantics because they characterize important information required for
an autonomous system. Non-movable points in the scene belong to unchanging
segments of the environment, whereas the remaining classes corresponds to the
changing parts of the scene. The difference between the movable and dynamic
class is their motion state. The dynamic points can be perceived as moving,
whereas movable objects can move, but are perceived as static. To learn the
distinction between movable and non-movable points in the environment, we
introduce an approach based on deep neural network and for detecting the
dynamic points, we estimate pointwise motion. We propose a Bayes filter
framework for combining the learned semantic cues with the motion cues to infer
the required semantic classification. In extensive experiments, we compare our
approach with other methods on a standard benchmark dataset and report
competitive results in comparison to the existing state-of-the-art.
Furthermore, we show an improvement in the classification of points by
combining the semantic cues retrieved from the neural network with the motion
cues.Comment: 8 pages to be published in IROS 201
A Comprehensive Survey of Deep Learning in Remote Sensing: Theories, Tools and Challenges for the Community
In recent years, deep learning (DL), a re-branding of neural networks (NNs),
has risen to the top in numerous areas, namely computer vision (CV), speech
recognition, natural language processing, etc. Whereas remote sensing (RS)
possesses a number of unique challenges, primarily related to sensors and
applications, inevitably RS draws from many of the same theories as CV; e.g.,
statistics, fusion, and machine learning, to name a few. This means that the RS
community should be aware of, if not at the leading edge of, of advancements
like DL. Herein, we provide the most comprehensive survey of state-of-the-art
RS DL research. We also review recent new developments in the DL field that can
be used in DL for RS. Namely, we focus on theories, tools and challenges for
the RS community. Specifically, we focus on unsolved challenges and
opportunities as it relates to (i) inadequate data sets, (ii)
human-understandable solutions for modelling physical phenomena, (iii) Big
Data, (iv) non-traditional heterogeneous data sources, (v) DL architectures and
learning algorithms for spectral, spatial and temporal data, (vi) transfer
learning, (vii) an improved theoretical understanding of DL systems, (viii)
high barriers to entry, and (ix) training and optimizing the DL.Comment: 64 pages, 411 references. To appear in Journal of Applied Remote
Sensin
Symbol Emergence in Robotics: A Survey
Humans can learn the use of language through physical interaction with their
environment and semiotic communication with other people. It is very important
to obtain a computational understanding of how humans can form a symbol system
and obtain semiotic skills through their autonomous mental development.
Recently, many studies have been conducted on the construction of robotic
systems and machine-learning methods that can learn the use of language through
embodied multimodal interaction with their environment and other systems.
Understanding human social interactions and developing a robot that can
smoothly communicate with human users in the long term, requires an
understanding of the dynamics of symbol systems and is crucially important. The
embodied cognition and social interaction of participants gradually change a
symbol system in a constructive manner. In this paper, we introduce a field of
research called symbol emergence in robotics (SER). SER is a constructive
approach towards an emergent symbol system. The emergent symbol system is
socially self-organized through both semiotic communications and physical
interactions with autonomous cognitive developmental agents, i.e., humans and
developmental robots. Specifically, we describe some state-of-art research
topics concerning SER, e.g., multimodal categorization, word discovery, and a
double articulation analysis, that enable a robot to obtain words and their
embodied meanings from raw sensory--motor information, including visual
information, haptic information, auditory information, and acoustic speech
signals, in a totally unsupervised manner. Finally, we suggest future
directions of research in SER.Comment: submitted to Advanced Robotic
- …