53 research outputs found
3D-BEVIS: Bird's-Eye-View Instance Segmentation
Recent deep learning models achieve impressive results on 3D scene analysis
tasks by operating directly on unstructured point clouds. Much progress has
been made in object classification and semantic segmentation. However,
the task of instance segmentation is less explored. In this work, we present
3D-BEVIS, a deep learning framework for 3D semantic instance segmentation on
point clouds. Following the idea of previous proposal-free instance
segmentation approaches, our model learns a feature embedding and groups the
obtained feature space into semantic instances. Current point-based methods
scale linearly with the number of points by processing local sub-parts of a
scene individually. However, to perform instance segmentation by clustering,
globally consistent features are required. Therefore, we propose to combine
local point geometry with global context information from an intermediate
bird's-eye view representation.
Comment: camera-ready version for GCPR '1
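As a rough illustration of the proposal-free grouping step described in the abstract, the sketch below clusters per-point feature embeddings by greedy distance thresholding. The embedding values, the threshold `tau`, and the greedy strategy are illustrative assumptions, not the paper's learned metric or actual clustering procedure.

```python
import numpy as np

def cluster_embeddings(emb, tau=0.5):
    """Greedily group per-point feature embeddings into instances.

    emb: (N, D) array of learned per-point features. A point joins the
    nearest existing cluster if its embedding lies within distance tau
    of that cluster's first member; otherwise it starts a new cluster.
    """
    labels = -np.ones(len(emb), dtype=int)
    centers = []
    for i, e in enumerate(emb):
        if centers:
            d = np.linalg.norm(np.asarray(centers) - e, axis=1)
            j = int(np.argmin(d))
            if d[j] < tau:
                labels[i] = j
                continue
        centers.append(e)
        labels[i] = len(centers) - 1
    return labels

# two well-separated embedding blobs should yield two instances
emb = np.vstack([np.zeros((5, 3)), np.full((5, 3), 3.0)])
labels = cluster_embeddings(emb, tau=1.0)
```

For clustering to work this way, the features must be globally consistent across the scene, which is exactly why the paper injects global context from the bird's-eye view.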
Joint geometry and color point cloud denoising based on graph wavelets
A point cloud is an effective 3D geometric representation of data paired with different attributes, such as the transparency, normal and color of each point. The imperfect acquisition process of a 3D point cloud usually generates a significant amount of noise; hence, point cloud denoising has received a lot of attention. Most existing techniques perform point cloud denoising based only on the geometry of the neighbouring points; very few works consider the denoising of the color attributes of a point cloud or take advantage of the correlation between geometry and color. In this article, we introduce a novel non-iterative setup for point cloud denoising based on the spectral graph wavelet transform (SGW) that jointly exploits geometry and color to denoise both attributes in the graph spectral domain. The designed framework is based on the construction of a joint geometry and color graph that compacts the energy of smooth graph signals in the low-frequency bands. The noise is then removed from the spectral graph wavelet coefficients by applying data-driven adaptive soft-thresholding. Extensive simulation results show that the proposed denoising technique significantly outperforms state-of-the-art methods on both subjective and objective quality metrics.
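The soft-thresholding step applied to the wavelet coefficients can be illustrated with the standard soft-thresholding operator. The coefficient values and the fixed threshold below are made up; the paper's data-driven, adaptive threshold selection is not reproduced here.

```python
import numpy as np

def soft_threshold(coeffs, t):
    """Soft-thresholding: shrink coefficients toward zero by t,
    zeroing out anything with magnitude below t (assumed noise)."""
    return np.sign(coeffs) * np.maximum(np.abs(coeffs) - t, 0.0)

# toy spectral graph wavelet coefficients and an illustrative threshold
c = np.array([-2.0, -0.3, 0.1, 1.5])
shrunk = soft_threshold(c, 0.5)  # small coefficients vanish, large ones shrink
```

In the actual pipeline this operator would be applied band by band to SGW coefficients of the joint geometry-and-color graph signal before inverting the transform.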
RDFC-GAN: RGB-Depth Fusion CycleGAN for Indoor Depth Completion
The raw depth image captured by indoor depth sensors usually has an extensive
range of missing depth values due to inherent limitations such as the inability
to perceive transparent objects and the limited distance range. The incomplete
depth map with missing values burdens many downstream vision tasks, and a
rising number of depth completion methods have been proposed to alleviate this
issue. While most existing methods can generate accurate dense depth maps from
sparse and uniformly sampled depth maps, they are not suitable for
complementing large contiguous regions of missing depth values, which is common
and critical in images captured in indoor environments. To overcome these
challenges, we design a novel two-branch end-to-end fusion network named
RDFC-GAN, which takes a pair of RGB and incomplete depth images as input to
predict a dense and completed depth map. The first branch employs an
encoder-decoder structure, by adhering to the Manhattan world assumption and
utilizing normal maps from RGB-D information as guidance, to regress the local
dense depth values from the raw depth map. In the other branch, we propose an
RGB-depth fusion CycleGAN to transfer the RGB image to the fine-grained
textured depth map. We adopt adaptive fusion modules named W-AdaIN to propagate
the features across the two branches, and we append a confidence fusion head to
fuse the two outputs of the branches for the final depth map. Extensive
experiments on NYU-Depth V2 and SUN RGB-D demonstrate that our proposed method
clearly improves the depth completion performance, especially in a more
realistic setting of indoor environments, with the help of our proposed pseudo
depth maps in training.
Comment: Haowen Wang and Zhengping Che contributed equally. Under review. An earlier version has been accepted by CVPR 2022 (arXiv:2203.10856
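The W-AdaIN modules are not detailed in the abstract. As a hedged sketch, plain adaptive instance normalization (AdaIN), on which such fusion modules are commonly built, re-styles one branch's features with the per-channel statistics of the other branch; the array shapes and values below are illustrative only.

```python
import numpy as np

def adain(x, y, eps=1e-5):
    """Adaptive instance normalization for (C, H, W) feature maps:
    normalize x per channel, then rescale/shift it with the
    per-channel mean and std of y."""
    mu_x = x.mean(axis=(1, 2), keepdims=True)
    sd_x = x.std(axis=(1, 2), keepdims=True) + eps
    mu_y = y.mean(axis=(1, 2), keepdims=True)
    sd_y = y.std(axis=(1, 2), keepdims=True) + eps
    return sd_y * (x - mu_x) / sd_x + mu_y

# toy RGB-branch and depth-branch feature maps
rng = np.random.default_rng(0)
x = rng.normal(size=(2, 4, 4))
y = rng.normal(loc=2.0, size=(2, 4, 4))
z = adain(x, y)  # x's content, y's per-channel statistics
```

By construction the output inherits y's channel statistics, which is one simple way to propagate information between two network branches.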
Robust Change Detection Based on Neural Descriptor Fields
The ability to reason about changes in the environment is crucial for robots
operating over extended periods of time. Agents are expected to capture changes
during operation so that actions can be followed to ensure a smooth progression
of the working session. However, varying viewing angles and accumulated
localization errors make it easy for robots to falsely detect changes in the
surrounding world due to low observation overlap and drifted object
associations. In this paper, based on the recently proposed category-level
Neural Descriptor Fields (NDFs), we develop an object-level online change
detection approach that is robust to partially overlapping observations and
noisy localization results. Utilizing the shape completion capability and
SE(3)-equivariance of NDFs, we represent objects with compact shape codes
encoding full object shapes from partial observations. The objects are then
organized in a spatial tree structure based on object centers recovered from
NDFs for fast queries of object neighborhoods. By associating objects via shape
code similarity and comparing local object-neighbor spatial layout, our
proposed approach demonstrates robustness to low observation overlap and
localization noise. We conduct experiments on both synthetic and real-world
sequences and achieve improved change detection results compared to multiple
baseline methods. Project webpage: https://yilundu.github.io/ndf_change
Comment: 8 pages, 8 figures, and 2 tables. Accepted to IROS 2022.
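The association step, matching objects across observations by shape-code similarity among spatial neighbors, might be sketched as below. The radius, similarity threshold, and brute-force neighbor search are simplifying assumptions; the paper organizes object centers in a spatial tree for fast neighborhood queries.

```python
import numpy as np

def associate(codes_a, codes_b, centers_a, centers_b, r=0.5, s_min=0.9):
    """Match objects between observations A and B: a candidate pair must
    have centers within radius r and shape codes with cosine
    similarity >= s_min. Returns a list of (index_a, index_b) pairs."""
    matches = []
    for i, (c, p) in enumerate(zip(codes_a, centers_a)):
        d = np.linalg.norm(centers_b - p, axis=1)  # distances to B centers
        for j in np.where(d < r)[0]:
            sim = codes_b[j] @ c / (np.linalg.norm(codes_b[j]) * np.linalg.norm(c))
            if sim >= s_min:
                matches.append((i, int(j)))
    return matches

# toy example: two objects, slightly shifted centers, similar shape codes
centers_a = np.array([[0.0, 0.0], [5.0, 0.0]])
centers_b = np.array([[0.1, 0.0], [5.1, 0.0]])
codes_a = np.array([[1.0, 0.0], [0.0, 1.0]])
codes_b = np.array([[1.0, 0.1], [0.0, 1.0]])
matches = associate(codes_a, codes_b, centers_a, centers_b)
```

Because the shape codes encode full object shapes recovered from partial views, matching on code similarity tolerates low observation overlap better than matching raw partial geometry would.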
3D data fusion from multiple sensors and its applications
The introduction of depth cameras in the mass market helped make computer vision applicable to many real-world applications, such as human interaction in virtual environments, autonomous driving, robotics and 3D reconstruction. All these problems were originally tackled by means of standard cameras, but the intrinsic ambiguity of bidimensional images led to the development of depth camera technologies. Stereo vision was first introduced to provide an estimate of the 3D geometry of the scene. Structured light depth cameras were developed to exploit the same concepts as stereo vision while overcoming some of the problems of passive technologies. Finally, Time-of-Flight (ToF) depth cameras solve the same depth estimation problem by using a different technology.
This thesis focuses on the acquisition of depth data from multiple sensors and presents techniques to efficiently combine the information of different acquisition systems. The three main technologies developed to provide depth estimation are first reviewed, presenting the operating principles and practical issues of each family of sensors. The use of multiple sensors is then investigated, providing practical solutions to the problems of 3D reconstruction and gesture recognition. Data from stereo vision systems and ToF depth cameras are combined to provide a higher-quality depth map. A confidence measure of the depth data from the two systems is used to guide the depth data fusion. The lack of datasets with data from multiple sensors is addressed by proposing a system for the collection of data and ground truth depth, and a tool to generate synthetic data from standard cameras and ToF depth cameras. For gesture recognition, a depth camera is paired with a Leap Motion device to boost the performance of the recognition task. A set of features from the two devices is used in a classification framework based on Support Vector Machines and Random Forests.
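The confidence-guided fusion of stereo and ToF depth maps could be sketched as a per-pixel weighted average; the confidence maps below are toy values, not the thesis's actual confidence measures.

```python
import numpy as np

def fuse_depth(d_stereo, d_tof, c_stereo, c_tof, eps=1e-6):
    """Per-pixel confidence-weighted fusion of two depth maps:
    pixels trust whichever sensor reports higher confidence."""
    w = c_stereo + c_tof + eps  # eps avoids division by zero
    return (c_stereo * d_stereo + c_tof * d_tof) / w

# toy 2x2 maps: stereo is confident in the top row, ToF in the bottom
d1 = np.full((2, 2), 1.0)
d2 = np.full((2, 2), 3.0)
c1 = np.array([[1.0, 1.0], [0.0, 0.0]])
c2 = np.array([[0.0, 0.0], [1.0, 1.0]])
fused = fuse_depth(d1, d2, c1, c2)
```

In practice the confidence maps would themselves be estimated from sensor-specific cues (e.g. matching cost for stereo, amplitude for ToF), which is where most of the real design effort lies.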
Point-based Gesture Recognition Techniques
Gesture recognition is a computing process that attempts to recognize and interpret human gestures through the use of mathematical algorithms. In this paper, we describe point-based gesture recognition along with point cloud nearest-neighbor search and sampling techniques. We also examine these techniques in relation to previous studies.
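One widely used point cloud sampling technique in this setting is farthest point sampling; the sketch below is a generic implementation under that assumption, not tied to any specific method from the paper.

```python
import numpy as np

def farthest_point_sampling(pts, k):
    """Select k points that greedily maximize mutual spacing: start from
    point 0, then repeatedly pick the point farthest from all chosen ones."""
    idx = [0]
    d = np.linalg.norm(pts - pts[0], axis=1)  # distance to nearest chosen point
    for _ in range(k - 1):
        nxt = int(np.argmax(d))
        idx.append(nxt)
        d = np.minimum(d, np.linalg.norm(pts - pts[nxt], axis=1))
    return np.array(idx)

# 1D toy cloud: the farthest pair (0 and 10) should be chosen first
pts = np.array([[0.0], [1.0], [2.0], [10.0]])
idx = farthest_point_sampling(pts, 2)
```

Such sampling keeps a gesture's spatial extent well covered with few points, after which nearest-neighbor features can be computed on the reduced set.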
3D modeling by low-cost range cameras: methods and potentialities
Nowadays the demand for 3D models for the documentation and visualization of objects and environments is continually increasing. However, the traditional 3D modeling techniques and systems (i.e. photogrammetry and laser scanners) can be very expensive and/or onerous, as they often need qualified technicians and specific post-processing phases. Thus, it is important to find new instruments able to provide low-cost 3D data in real time and in a user-friendly way.
Range cameras seem one of the most promising tools to achieve this goal: they are low-cost 3D scanners, able to easily collect dense point clouds at high frame rate, in a short range (few meters) from the imaged objects.
Such sensors, though, still remain a relatively new 3D measurement technology, not yet exhaustively studied. Thus, it is essential to assess the metric quality of the depth data retrieved by these devices.
This thesis is set precisely in this context: the aim is to evaluate the potentialities of range cameras for geomatic applications and to provide useful indications for their practical use. Therefore, the three most popular and/or promising low-cost range cameras, namely the Microsoft Kinect v1, the Microsoft Kinect v2 and the Occipital Structure Sensor, were first characterized from a geomatic point of view in order to assess the metric quality of the depth data they retrieve.
These investigations showed that such sensors exhibit a depth precision and accuracy ranging from a few millimeters to a few centimeters, depending both on the operational principle adopted by the single device (Structured Light or Time-of-Flight) and on the depth itself.
On this basis, two different models were identified for precision and accuracy vs. depth: parabolic for the Structured Light sensors (the Kinect v1 and the Structure Sensor) and linear for the Time-of-Flight sensor (the Kinect v2). The accuracy models were then shown to be globally consistent with the corresponding precision models for all three sensors.
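The two error-vs-depth model families can be illustrated with a simple least-squares fit; the depth and precision values below are synthetic stand-ins, not the thesis's measured data.

```python
import numpy as np

# hypothetical precision-vs-depth samples (depth in m, precision in mm)
depth = np.array([1.0, 2.0, 3.0, 4.0])
prec_sl = 2.0 * depth**2 + 1.0   # structured-light-like: grows parabolically
prec_tof = 3.0 * depth + 0.5     # ToF-like: grows linearly

# fit a parabolic model for the SL sensors, a linear one for the ToF sensor
p2 = np.polyfit(depth, prec_sl, 2)
p1 = np.polyfit(depth, prec_tof, 1)
```

With real measurements, residuals from both fits would be compared to decide which model family each sensor follows.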
Furthermore, the proposed calibration model was validated for the Structure Sensor: with calibration, the overall RMSE decreased from 27 to 16 mm.
Finally four case studies were carried out in order to evaluate:
• the performances of the Kinect v2 sensor for monitoring oscillatory motions (relevant for structural and/or industrial monitoring), demonstrating a good ability of the system to detect movements and displacements;
• the integration feasibility of the Kinect v2 with a classical stereo system, highlighting the need to integrate range cameras into classical 3D photogrammetric systems, especially to overcome limitations due to acquisition completeness;
• the potentialities of the Structure Sensor for the 3D surveying of indoor environments, showing a more than sufficient accuracy for most applications;
• the potentialities of the Structure Sensor to document small archaeological finds, where the metric accuracy seems rather good, while the textured models show some misalignments.
In conclusion, although the experimental results demonstrated that range cameras can give good and encouraging results, the performance of traditional 3D modeling techniques in terms of accuracy and precision is still superior and must be preferred when the accuracy requirements are restrictive.
However, for a very wide and continuously increasing range of applications, where the required accuracy can range from a few millimeters (very close range) to a few centimeters, range cameras can be a valuable alternative, especially when non-expert users are involved. Furthermore, the technology on which these sensors are based is continually evolving, driven also by the new generation of AR/VR kits, and their geometric performance will certainly improve soon.
In-Field Estimation of Orange Number and Size by 3D Laser Scanning
The estimation of the fruit load of an orchard prior to harvest is useful for planning harvest logistics and trading decisions. Manual fruit counting and determination of the harvesting capacity of the field are expensive and time-consuming. The automatic counting of fruits and their geometric characterization with 3D LiDAR models can be an interesting alternative. Field research was conducted in the province of Cordoba (Southern Spain) on 24 ‘Salustiana’ variety orange trees—Citrus sinensis (L.) Osbeck—(12 pruned and 12 unpruned). The size and number of the fruits at harvest were recorded. Likewise, the unitary weight of the fruits and their diameter were determined (N = 160). The orange trees were also modelled with 3D LiDAR with colour capture for subsequent segmentation and fruit detection using a K-means algorithm. In the case of pruned trees, a significant regression was obtained between the real and modelled fruit number (R2 = 0.63, p = 0.01). The opposite occurred for the unpruned ones (p = 0.18) due to leaf occlusion. The mean diameters provided by the algorithm (72.15 ± 22.62 mm) did not differ significantly (p = 0.35) from those measured on the fruits (72.68 ± 5.728 mm). Even though the use of 3D LiDAR scans is time-consuming, the harvest size estimation obtained in this research is very accurate.
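The K-means segmentation of the coloured point cloud might be sketched as follows; the RGB values and the deterministic initialization are illustrative assumptions, not the study's actual data or implementation.

```python
import numpy as np

def kmeans(X, k, iters=20):
    """Plain K-means on feature vectors X (N, D): alternate between
    assigning points to the nearest center and recomputing centers."""
    # deterministic init for the sketch: evenly spaced samples as centers
    centers = X[np.linspace(0, len(X) - 1, k).astype(int)].copy()
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels, centers

# hypothetical RGB samples: orange-fruit colours vs green foliage
orange = np.tile([230.0, 120.0, 30.0], (5, 1)) + np.arange(5)[:, None]
green = np.tile([40.0, 120.0, 40.0], (5, 1)) + np.arange(5)[:, None]
X = np.vstack([orange, green])
labels, centers = kmeans(X, 2)
```

Clustering on colour separates fruit points from foliage; fruit diameters can then be estimated from the geometry of each fruit cluster.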
A Survey of Surface Reconstruction from Point Clouds
The area of surface reconstruction has seen substantial progress in the past two decades. The traditional problem addressed by surface reconstruction is to recover the digital representation of a physical shape that has been scanned, where the scanned data contains a wide variety of defects. While much of the earlier work focused on reconstructing a piecewise-smooth representation of the original shape, recent work has taken on more specialized priors to address significantly challenging data imperfections, where the reconstruction can take on different representations, not necessarily the explicit geometry. We survey the field of surface reconstruction and provide a categorization with respect to priors, data imperfections, and reconstruction output. By considering a holistic view of surface reconstruction, we show a detailed characterization of the field, highlight similarities between diverse reconstruction techniques, and provide directions for future work in surface reconstruction.