9,490 research outputs found
RASPV: A robotics framework for augmented simulated prosthetic vision
One of the main challenges of visual prostheses is to augment the perceived information to improve the experience of its wearers. Given the limited access to implanted patients, in order to facilitate the experimentation of new techniques, this is often evaluated via Simulated Prosthetic Vision (SPV) with sighted people. In this work, we introduce a novel SPV framework and implementation that presents major advantages with respect to previous approaches. First, it is integrated into a robotics framework, which allows us to benefit from a wide range of methods and algorithms from the field (e.g. object recognition, obstacle avoidance, autonomous navigation, deep learning). Second, we go beyond traditional image processing with 3D point clouds processing using an RGB-D camera, allowing us to robustly detect the floor, obstacles and the structure of the scene. Third, it works either with a real camera or in a virtual environment, which gives us endless possibilities for immersive experimentation through a head-mounted display. Fourth, we incorporate a validated temporal phosphene model that replicates time effects into the generation of visual stimuli. Finally, we have proposed, developed and tested several applications within this framework, such as avoiding moving obstacles, providing a general understanding of the scene, staircase detection, helping the subject to navigate an unfamiliar space, and object and person detection. We provide experimental results in real and virtual environments. The code is publicly available at https://www.github.com/aperezyus/RASP
Frustum PointNets for 3D Object Detection from RGB-D Data
In this work, we study 3D object detection from RGB-D data in both indoor and
outdoor scenes. While previous methods focus on images or 3D voxels, often
obscuring natural 3D patterns and invariances of 3D data, we directly operate
on raw point clouds by popping up RGB-D scans. However, a key challenge of this
approach is how to efficiently localize objects in point clouds of large-scale
scenes (region proposal). Instead of solely relying on 3D proposals, our method
leverages both mature 2D object detectors and advanced 3D deep learning for
object localization, achieving efficiency as well as high recall for even small
objects. Benefited from learning directly in raw point clouds, our method is
also able to precisely estimate 3D bounding boxes even under strong occlusion
or with very sparse points. Evaluated on KITTI and SUN RGB-D 3D detection
benchmarks, our method outperforms the state of the art by remarkable margins
while having real-time capability.Comment: 15 pages, 12 figures, 14 table
Complexer-YOLO: Real-Time 3D Object Detection and Tracking on Semantic Point Clouds
Accurate detection of 3D objects is a fundamental problem in computer vision
and has an enormous impact on autonomous cars, augmented/virtual reality and
many applications in robotics. In this work we present a novel fusion of neural
network based state-of-the-art 3D detector and visual semantic segmentation in
the context of autonomous driving. Additionally, we introduce
Scale-Rotation-Translation score (SRTs), a fast and highly parameterizable
evaluation metric for comparison of object detections, which speeds up our
inference time up to 20\% and halves training time. On top, we apply
state-of-the-art online multi target feature tracking on the object
measurements to further increase accuracy and robustness utilizing temporal
information. Our experiments on KITTI show that we achieve same results as
state-of-the-art in all related categories, while maintaining the performance
and accuracy trade-off and still run in real-time. Furthermore, our model is
the first one that fuses visual semantic with 3D object detection
- …