Depth from Monocular Images using a Semi-Parallel Deep Neural Network (SPDNN) Hybrid Architecture
Deep neural networks have been applied to a wide range of problems in recent years. In this work, a Convolutional Neural Network (CNN) is applied to the problem of determining depth from a single camera image (monocular depth). Eight different networks are designed to perform depth estimation, each suited to a particular feature level; networks with different pooling sizes capture different feature levels. Once this set of networks is designed, the models can be combined into a single network topology using graph optimization techniques. This "Semi Parallel Deep Neural Network" (SPDNN) eliminates duplicated common network layers and can be further optimized by retraining to achieve a better model than the individual topologies. In this study, four SPDNN models are trained and evaluated in two stages on the KITTI dataset. In the first part of the experiment, the ground truth images are provided by the benchmark; in the second part, the ground truth images are the depth maps produced by a state-of-the-art stereo matching method. The results of this evaluation demonstrate that using post-processing techniques to refine the target of the network increases the accuracy of depth estimation on individual mono images. The second evaluation shows that using segmentation data alongside the original data as input can improve the depth estimation results to the point where performance is comparable with stereo depth estimation. The computational time is also discussed in this study.
Comment: 44 pages, 25 figures
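As a rough illustration of the merging idea, the sketch below shows a shared trunk feeding several pooling branches whose outputs are fused into a single depth map. Layer sizes, pooling factors, and the fusion head are illustrative assumptions, not the authors' exact SPDNN configuration.

```python
# Minimal sketch of a "semi-parallel" topology: depth-estimation branches with
# different pooling sizes share their common early layers instead of duplicating
# them, and the merged network is retrained end to end. Input spatial dimensions
# are assumed divisible by the largest pooling factor.
import torch
import torch.nn as nn

class SemiParallelDepthNet(nn.Module):
    def __init__(self, pool_sizes=(2, 4, 8)):
        super().__init__()
        # Shared trunk: layers that were identical across the individual
        # networks and are kept only once after merging.
        self.trunk = nn.Sequential(
            nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
        )
        # Parallel branches: one per feature level / pooling size.
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.MaxPool2d(p),
                nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
                nn.Upsample(scale_factor=p, mode="bilinear", align_corners=False),
            )
            for p in pool_sizes
        ])
        # Fuse the branch outputs into a single depth map.
        self.head = nn.Conv2d(32 * len(pool_sizes), 1, 1)

    def forward(self, x):
        shared = self.trunk(x)
        feats = [branch(shared) for branch in self.branches]
        return self.head(torch.cat(feats, dim=1))
```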
Integration of Absolute Orientation Measurements in the KinectFusion Reconstruction Pipeline
In this paper, we show how absolute orientation measurements provided by low-cost but high-fidelity IMU sensors can be integrated into the KinectFusion pipeline. We show that this integration improves the runtime, robustness, and quality of the 3D reconstruction. In particular, we use the orientation data to seed and regularize the ICP registration technique. We also present a technique to filter the pairs of matched 3D points based on the distribution of their distances; this filter is implemented efficiently on the GPU. Estimating the distribution of the distances helps control the number of iterations necessary for the convergence of the ICP algorithm. Finally, we show experimental results that highlight improvements in robustness, a speed-up of almost 12%, and a gain in tracking quality of 53% for the ATE metric on the Freiburg benchmark.
Comment: CVPR Workshop on Visual Odometry and Computer Vision Applications Based on Location Clues 201
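A minimal sketch of the two ideas, assuming simple NumPy arrays of already-matched point pairs: the IMU orientation seeds the rigid alignment, and matches are filtered using the distribution of their distances. The mean-plus-two-sigma cutoff is an assumed choice, not necessarily the paper's exact filter.

```python
import numpy as np

def filter_pairs_by_distance(src_pts, dst_pts, k=2.0):
    """Keep matched pairs whose distance lies within k std deviations of the mean."""
    d = np.linalg.norm(src_pts - dst_pts, axis=1)
    keep = d < d.mean() + k * d.std()
    return src_pts[keep], dst_pts[keep]

def icp_step(src_pts, dst_pts, R_imu):
    """One ICP alignment step, seeded with the absolute IMU orientation R_imu (3x3)."""
    src_seeded = src_pts @ R_imu.T              # pre-rotate source points with the IMU orientation
    src_f, dst_f = filter_pairs_by_distance(src_seeded, dst_pts)
    # Closed-form rigid alignment (Kabsch) on the filtered correspondences.
    mu_s, mu_d = src_f.mean(axis=0), dst_f.mean(axis=0)
    H = (src_f - mu_s).T @ (dst_f - mu_d)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:                    # correct an improper (reflection) solution
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = mu_d - R @ mu_s
    return R @ R_imu, t                          # total rotation applied to the original points
```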
Generating Absolute-Scale Point Cloud Data of Built Infrastructure Scenes Using a Monocular Camera Setting
The global scale of Point Cloud Data (PCD) generated through monocular photo/videogrammetry is unknown and can be calculated using at least one known dimension of the scene. Measuring one or more dimensions for this purpose introduces a manual step into the 3D reconstruction process; this increases the effort and reduces the speed of reconstructing scenes, and introduces substantial human error due to the high level of measurement accuracy needed. Other ways of measuring such dimensions rely on acquiring additional information, either from extra sensors or from specific classes of objects existing in the scene; we found that these solutions are not simple, cost-effective, or general enough to be considered practical for reconstructing both indoor and outdoor built infrastructure scenes. To address the issue, in this paper we propose a novel method for automatically calculating the absolute scale of built infrastructure PCD. We use a pre-measured cube for outdoor scenes and a sheet of paper for indoor environments as the calibration patterns. Assuming that the dimensions of these objects are known, the proposed method extracts the objects' corner points in 2D video frames using a novel algorithm. The extracted corner points are then matched between consecutive frames. Finally, the corresponding corner points are reconstructed along with other features of the scenes to determine the real-world scale. To evaluate the performance of the method, ten indoor and ten outdoor cases were selected and the absolute-scale PCD for each case was computed. Results illustrate that the proposed algorithm is able to reconstruct the predefined objects with a high success rate, while the generated absolute-scale PCD is sufficiently accurate.
This is the accepted manuscript. The final version is available from ASCE at http://dx.doi.org/10.1061/(ASCE)CP.1943-5487.000041
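A minimal sketch of the scale-recovery step, assuming one reconstructed edge of the calibration object with a known physical length; the variable names, placeholder point cloud, and single-edge simplification are illustrative only.

```python
import numpy as np

def absolute_scale(corner_a, corner_b, known_length_m):
    """Scale factor from one reconstructed edge of the calibration pattern."""
    reconstructed_length = np.linalg.norm(corner_a - corner_b)
    return known_length_m / reconstructed_length

def rescale_point_cloud(points, scale):
    """Apply the recovered metric scale to the whole PCD (N x 3 array)."""
    return points * scale

# Usage: a 0.30 m calibration-cube edge reconstructed at arbitrary (unknown) scale.
pcd = np.random.rand(1000, 3)          # placeholder point cloud from the SfM pipeline
corner_a, corner_b = pcd[0], pcd[1]    # placeholder: two reconstructed corner points of the cube
s = absolute_scale(corner_a, corner_b, known_length_m=0.30)
pcd_metric = rescale_point_cloud(pcd, s)
```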
Evaluation of CNN-based Single-Image Depth Estimation Methods
While interest in deep models for single-image depth estimation is increasing, established schemes for their evaluation are still limited. We propose a set of novel quality criteria, allowing for a more detailed analysis by focusing on specific characteristics of depth maps. In particular, we address the preservation of edges and planar regions, depth consistency, and absolute distance accuracy. In order to employ these metrics to evaluate and compare state-of-the-art single-image depth estimation approaches, we provide a new high-quality RGB-D dataset. We used a DSLR camera together with a laser scanner to acquire high-resolution images and highly accurate depth maps. Experimental results show the validity of our proposed evaluation protocol.
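For reference, the sketch below computes standard single-image depth-error measures (absolute relative error, RMSE, threshold accuracy); the paper's additional edge, planarity, and consistency criteria are not reproduced here.

```python
import numpy as np

def depth_metrics(pred, gt, eps=1e-6):
    """pred, gt: depth maps in metres with identical shape; invalid pixels are masked out."""
    mask = gt > eps
    p, g = pred[mask], gt[mask]
    abs_rel = np.mean(np.abs(p - g) / g)             # mean absolute relative error
    rmse = np.sqrt(np.mean((p - g) ** 2))            # root-mean-square error
    ratio = np.maximum(p / g, g / p)
    delta1 = np.mean(ratio < 1.25)                   # fraction of pixels within 25% of ground truth
    return {"abs_rel": abs_rel, "rmse": rmse, "delta<1.25": delta1}
```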
Keyframe-based visual–inertial odometry using nonlinear optimization
Combining visual and inertial measurements has become popular in mobile robotics, since the two sensing modalities offer complementary characteristics that make them the ideal choice for accurate visual–inertial odometry or simultaneous localization and mapping (SLAM). While historically the problem has been addressed with filtering, advancements in visual estimation suggest that nonlinear optimization offers superior accuracy while remaining tractable in complexity thanks to the sparsity of the underlying problem. Taking inspiration from these findings, we formulate a rigorously probabilistic cost function that combines reprojection errors of landmarks and inertial terms. The problem is kept tractable, thus ensuring real-time operation, by limiting the optimization to a bounded window of keyframes through marginalization. Keyframes may be spaced in time by arbitrary intervals, while still being related by linearized inertial terms. We present evaluation results on complementary datasets recorded with our custom-built stereo visual–inertial hardware, which accurately synchronizes accelerometer and gyroscope measurements with imagery. A comparison of both a stereo and a monocular version of our algorithm, with and without online extrinsics estimation, is shown with respect to ground truth. Furthermore, we compare the performance to an implementation of a state-of-the-art stochastic cloning sliding-window filter; this competitive reference implementation performs tightly coupled filtering-based visual–inertial odometry. While our approach declaredly demands more computation, we show its superior performance in terms of accuracy.
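Schematically, such a cost can be written as a weighted sum of visual reprojection residuals and inertial residuals between consecutive keyframes; the notation below is ours and not necessarily the paper's exact formulation.

```latex
% Schematic visual-inertial cost: reprojection residuals e_r over cameras i,
% keyframes k, and visible landmarks j, plus inertial residuals e_s between
% consecutive keyframes, each weighted by its information matrix W.
J(\mathbf{x}) =
  \sum_{i}\sum_{k}\sum_{j \in \mathcal{J}(i,k)}
    {\mathbf{e}_{r}^{i,j,k}}^{\top} \mathbf{W}_{r}^{i,j,k}\, \mathbf{e}_{r}^{i,j,k}
  \; + \;
  \sum_{k}
    {\mathbf{e}_{s}^{k}}^{\top} \mathbf{W}_{s}^{k}\, \mathbf{e}_{s}^{k}
```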
PresSim: An End-to-end Framework for Dynamic Ground Pressure Profile Generation from Monocular Videos Using Physics-based 3D Simulation
Ground pressure exerted by the human body is a valuable source of information for human activity recognition (HAR) in unobtrusive pervasive sensing. Because collecting data from pressure sensors to develop HAR solutions requires significant resources and effort, we present PresSim, a novel end-to-end framework that synthesizes sensor data from videos of human activities to reduce this effort significantly. PresSim adopts a three-stage process: first, 3D activity information is extracted from videos with computer vision architectures; then, floor-mesh deformation profiles are simulated based on the 3D activity information and a gravity-included physics simulation; lastly, the simulated pressure sensor data are generated with deep learning models. We explored two approaches for obtaining the 3D activity information: inverse kinematics with mesh retargeting, and volumetric pose and shape estimation. We validated PresSim with an experimental setup in which a monocular camera provided the input and a pressure-sensing fitness mat (80×28 spatial resolution) provided the sensor ground truth, with nine participants performing a set of predefined yoga sequences.
Comment: Percom2023 workshop (UMUM2023)
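A high-level sketch of the three stages, with hypothetical function names and placeholder outputs standing in for the paper's vision, physics-simulation, and learning components.

```python
import numpy as np

def extract_3d_activity(video_frames):
    """Stage 1: 3D pose/shape per frame (placeholder: 24 assumed joint positions per frame)."""
    return [np.random.rand(24, 3) for _ in video_frames]

def simulate_floor_deformation(pose_sequence):
    """Stage 2: gravity-included physics simulation of floor-mesh deformation (placeholder)."""
    return [np.zeros((80, 28)) for _ in pose_sequence]   # mesh laid out like the 80x28 mat

def deformation_to_pressure(deformation_maps):
    """Stage 3: learned deformation-to-pressure mapping (placeholder for the trained model)."""
    return [np.clip(d, 0.0, None) for d in deformation_maps]

def pressim(video_frames):
    """End-to-end pipeline: video frames in, simulated 80x28 pressure frames out."""
    poses = extract_3d_activity(video_frames)
    deformation = simulate_floor_deformation(poses)
    return deformation_to_pressure(deformation)
```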