Learning Robust Object Recognition Using Composed Scenes from Generative Models
Recurrent feedback connections in the mammalian visual system have been
hypothesized to play a role in synthesizing input in the theoretical framework
of analysis by synthesis. The comparison of internally synthesized
representation with that of the input provides a validation mechanism during
perceptual inference and learning. Inspired by these ideas, we proposed that
the synthesis machinery can compose new, unobserved images by imagination to
train the network itself so as to increase the robustness of the system in
novel scenarios. As a proof of concept, we investigated whether images composed
by imagination could help an object recognition system to deal with occlusion,
which is challenging for the current state-of-the-art deep convolutional neural
networks. We fine-tuned a network on images containing objects in various
occlusion scenarios that are imagined or self-generated through a deep
generator network. Trained on imagined occluded scenarios under the object
persistence constraint, our network discovered more subtle and localized image
features that were neglected by the original network for object classification,
obtaining better separability of different object classes in the feature space.
This leads to a significant improvement in object recognition under occlusion for
our network relative to the original network trained only on un-occluded
images. In addition to providing practical benefits in object recognition under
occlusion, this work demonstrates that the use of self-generated composition of
visual scenes through the synthesis loop, combined with the object persistence
constraint, can provide opportunities for neural networks to discover new
relevant patterns in the data, and become more flexible in dealing with novel
situations.
Comment: Accepted by 14th Conference on Computer and Robot Vision
Mirrored Light Field Video Camera Adapter
This paper proposes the design of a custom mirror-based light field camera
adapter that is cheap, simple in construction, and accessible. Mirrors of
different shapes and orientations reflect the scene into an upward-facing camera
to create an array of virtual cameras with overlapping fields of view at
specified depths, and deliver video frame rate light fields. We describe the
design, construction, decoding and calibration processes of our mirror-based
light field camera adapter in preparation for an open-source release to benefit
the robotic vision community.
Comment: tech report, v0.5, 15 pages, 6 figures
Active Metric-Semantic Mapping by Multiple Aerial Robots
Traditional approaches for active mapping focus on building geometric maps.
For most real-world applications, however, actionable information is related to
semantically meaningful objects in the environment. We propose an approach to
the active metric-semantic mapping problem that enables multiple heterogeneous
robots to collaboratively build a map of the environment. The robots actively
explore to minimize the uncertainties in both semantic (object classification)
and geometric (object modeling) information. We represent the environment using
informative but sparse object models, each consisting of a basic shape and a
semantic class label, and characterize uncertainties empirically using a large
amount of real-world data. Given a prior map, we use this model to select
actions for each robot to minimize uncertainties. The performance of our
algorithm is demonstrated through multi-robot experiments in diverse real-world
environments. The proposed framework is applicable to a wide range of
real-world problems, such as precision agriculture, infrastructure inspection,
and asset mapping in factories. A demo video can be found at
https://youtu.be/S86SgXi54oU.
Comment: ICRA 2023 (2023 International Conference on Robotics and Automation)