81,461 research outputs found
Real-Time Seamless Single Shot 6D Object Pose Prediction
We propose a single-shot approach for simultaneously detecting an object in
an RGB image and predicting its 6D pose without requiring multiple stages or
having to examine multiple hypotheses. Unlike a recently proposed single-shot
technique for this task (Kehl et al., ICCV'17) that only predicts an
approximate 6D pose that must then be refined, ours is accurate enough not to
require additional post-processing. As a result, it is much faster - 50 fps on
a Titan X (Pascal) GPU - and more suitable for real-time processing. The key
component of our method is a new CNN architecture inspired by the YOLO network
design that directly predicts the 2D image locations of the projected vertices
of the object's 3D bounding box. The object's 6D pose is then estimated using a
PnP algorithm.
For single object and multiple object pose estimation on the LINEMOD and
OCCLUSION datasets, our approach substantially outperforms other recent
CNN-based approaches when they are all used without post-processing. During
post-processing, a pose refinement step can be used to boost the accuracy of
the existing methods, but at 10 fps or less, they are much slower than our
method.Comment: CVPR 201
Bio-Inspired Stereo Vision Calibration for Dynamic Vision Sensors
Many advances have been made in the eld of computer vision. Several recent research trends
have focused on mimicking human vision by using a stereo vision system. In multi-camera systems, a
calibration process is usually implemented to improve the results accuracy. However, these systems generate
a large amount of data to be processed; therefore, a powerful computer is required and, in many cases,
this cannot be done in real time. Neuromorphic Engineering attempts to create bio-inspired systems that
mimic the information processing that takes place in the human brain. This information is encoded using
pulses (or spikes) and the generated systems are much simpler (in computational operations and resources),
which allows them to perform similar tasks with much lower power consumption, thus these processes
can be developed over specialized hardware with real-time processing. In this work, a bio-inspired stereovision
system is presented, where a calibration mechanism for this system is implemented and evaluated
using several tests. The result is a novel calibration technique for a neuromorphic stereo vision system,
implemented over specialized hardware (FPGA - Field-Programmable Gate Array), which allows obtaining
reduced latencies on hardware implementation for stand-alone systems, and working in real time.Ministerio de EconomĂa y Competitividad TEC2016-77785-PMinisterio de EconomĂa y Competitividad TIN2016-80644-
Devito: Towards a generic Finite Difference DSL using Symbolic Python
Domain specific languages (DSL) have been used in a variety of fields to
express complex scientific problems in a concise manner and provide automated
performance optimization for a range of computational architectures. As such
DSLs provide a powerful mechanism to speed up scientific Python computation
that goes beyond traditional vectorization and pre-compilation approaches,
while allowing domain scientists to build applications within the comforts of
the Python software ecosystem. In this paper we present Devito, a new finite
difference DSL that provides optimized stencil computation from high-level
problem specifications based on symbolic Python expressions. We demonstrate
Devito's symbolic API and performance advantages over traditional Python
acceleration methods before highlighting its use in the scientific context of
seismic inversion problems.Comment: pyHPC 2016 conference submissio
VConv-DAE: Deep Volumetric Shape Learning Without Object Labels
With the advent of affordable depth sensors, 3D capture becomes more and more
ubiquitous and already has made its way into commercial products. Yet,
capturing the geometry or complete shapes of everyday objects using scanning
devices (e.g. Kinect) still comes with several challenges that result in noise
or even incomplete shapes. Recent success in deep learning has shown how to
learn complex shape distributions in a data-driven way from large scale 3D CAD
Model collections and to utilize them for 3D processing on volumetric
representations and thereby circumventing problems of topology and
tessellation. Prior work has shown encouraging results on problems ranging from
shape completion to recognition. We provide an analysis of such approaches and
discover that training as well as the resulting representation are strongly and
unnecessarily tied to the notion of object labels. Thus, we propose a full
convolutional volumetric auto encoder that learns volumetric representation
from noisy data by estimating the voxel occupancy grids. The proposed method
outperforms prior work on challenging tasks like denoising and shape
completion. We also show that the obtained deep embedding gives competitive
performance when used for classification and promising results for shape
interpolation
- âŠ