Search CORE

14 research outputs found

Learning-based Image Enhancement for Visual Odometry in Challenging HDR Environments

Author: Gomez-Ojeda Ruben
Gonzalez-Jimenez Javier
Scaramuzza Davide
Zhang Zichao
Publication venue
Publication date: 09/04/2018
Field of study

One of the main open challenges in visual odometry (VO) is the robustness to difficult illumination conditions or high dynamic range (HDR) environments. The main difficulties in these situations come from both the limitations of the sensors and the inability to perform a successful tracking of interest points because of the bold assumptions in VO, such as brightness constancy. We address this problem from a deep learning perspective, for which we first fine-tune a Deep Neural Network (DNN) with the purpose of obtaining enhanced representations of the sequences for VO. Then, we demonstrate how the insertion of Long Short Term Memory (LSTM) allows us to obtain temporally consistent sequences, as the estimation depends on previous states. However, the use of very deep networks does not allow the insertion into a real-time VO framework; therefore, we also propose a Convolutional Neural Network (CNN) of reduced size capable of performing faster. Finally, we validate the enhanced representations by evaluating the sequences produced by the two architectures in several state-of-art VO algorithms, such as ORB-SLAM and DSO

arXiv.org e-Print Archive

ZORA

Automatic Model Based Dataset Generation for Fast and Accurate Crop and Weeds Detection

Author: Di Cicco Maurilio
Grisetti Giorgio
Potena Ciro
Pretto Alberto
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2017
Field of study

Selective weeding is one of the key challenges in the field of agriculture robotics. To accomplish this task, a farm robot should be able to accurately detect plants and to distinguish them between crop and weeds. Most of the promising state-of-the-art approaches make use of appearance-based models trained on large annotated datasets. Unfortunately, creating large agricultural datasets with pixel-level annotations is an extremely time consuming task, actually penalizing the usage of data-driven techniques. In this paper, we face this problem by proposing a novel and effective approach that aims to dramatically minimize the human intervention needed to train the detection and classification algorithms. The idea is to procedurally generate large synthetic training datasets randomizing the key features of the target environment (i.e., crop and weed species, type of soil, light conditions). More specifically, by tuning these model parameters, and exploiting a few real-world textures, it is possible to render a large amount of realistic views of an artificial agricultural scenario with no effort. The generated data can be directly used to train the model or to supplement real-world images. We validate the proposed methodology by using as testbed a modern deep learning based image segmentation architecture. We compare the classification results obtained using both real and synthetic images as training data. The reported results confirm the effectiveness and the potentiality of our approach.Comment: To appear in IEEE/RSJ IROS 201

arXiv.org e-Print Archive

Crossref

Archivio della ricerca- Università di Roma La Sapienza

Archivio istituzionale della ricerca - Università di Padova

Depth estimation of a single RGB image with semisupervised two-stage regression

Author: Chi Jun
Dong Junyu
Gao Jie
Qi Lin
Yu Hui
Zhang Shu
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 15/11/2019
Field of study

Crossref

Portsmouth University Research Portal (Pure)

FastDepth: Fast Monocular Depth Estimation on Embedded Systems

Author: Karaman Sertac
Ma Fangchang
Sze Vivienne
Wofk Diana
Yang Tien-Ju
Publication venue
Publication date: 07/03/2019
Field of study

Depth sensing is a critical function for robotic tasks such as localization, mapping and obstacle detection. There has been a significant and growing interest in depth estimation from a single RGB image, due to the relatively low cost and size of monocular cameras. However, state-of-the-art single-view depth estimation algorithms are based on fairly complex deep neural networks that are too slow for real-time inference on an embedded platform, for instance, mounted on a micro aerial vehicle. In this paper, we address the problem of fast depth estimation on embedded systems. We propose an efficient and lightweight encoder-decoder network architecture and apply network pruning to further reduce computational complexity and latency. In particular, we focus on the design of a low-latency decoder. Our methodology demonstrates that it is possible to achieve similar accuracy as prior work on depth estimation, but at inference speeds that are an order of magnitude faster. Our proposed network, FastDepth, runs at 178 fps on an NVIDIA Jetson TX2 GPU and at 27 fps when using only the TX2 CPU, with active power consumption under 10 W. FastDepth achieves close to state-of-the-art accuracy on the NYU Depth v2 dataset. To the best of the authors' knowledge, this paper demonstrates real-time monocular depth estimation using a deep neural network with the lowest latency and highest throughput on an embedded platform that can be carried by a micro aerial vehicle.Comment: Accepted for presentation at ICRA 2019. 8 pages, 6 figures, 7 table

arXiv.org e-Print Archive

DSpace@MIT

Crossref

Anytime Stereo Image Depth Estimation on Mobile Devices

Author: Campbell Mark
Huang Gao
Lai Zihang
van der Maaten Laurens
Wang Brian H.
Wang Yan
Weinberger Kilian Q.
Publication venue
Publication date: 05/03/2019
Field of study

Many applications of stereo depth estimation in robotics require the generation of accurate disparity maps in real time under significant computational constraints. Current state-of-the-art algorithms force a choice between either generating accurate mappings at a slow pace, or quickly generating inaccurate ones, and additionally these methods typically require far too many parameters to be usable on power- or memory-constrained devices. Motivated by these shortcomings, we propose a novel approach for disparity prediction in the anytime setting. In contrast to prior work, our end-to-end learned approach can trade off computation and accuracy at inference time. Depth estimation is performed in stages, during which the model can be queried at any time to output its current best estimate. Our final model can process 1242

\times

375 resolution images within a range of 10-35 FPS on an NVIDIA Jetson TX2 module with only marginal increases in error -- using two orders of magnitude fewer parameters than the most competitive baseline. The source code is available at https://github.com/mileyan/AnyNet .Comment: Accepted by ICRA201

arXiv.org e-Print Archive

Crossref