Search CORE

6 research outputs found

Deep Reinforcement Learning of Volume-guided Progressive View Inpainting for 3D Point Scene Completion from a Single Depth Image

Author: Cui Shuguang
Du Dong
Han Xiaoguang
Liu Ligang
Pan Pan
Xiong Zixiang
Yang Mingdai
Yang Xin
Yu Jingming
Zhang Zhaoxuan
Publication venue
Publication date: 11/03/2019
Field of study

We present a deep reinforcement learning method of progressive view inpainting for 3D point scene completion under volume guidance, achieving high-quality scene reconstruction from only a single depth image with severe occlusion. Our approach is end-to-end, consisting of three modules: 3D scene volume reconstruction, 2D depth map inpainting, and multi-view selection for completion. Given a single depth image, our method first goes through the 3D volume branch to obtain a volumetric scene reconstruction as a guide to the next view inpainting step, which attempts to make up the missing information; the third step involves projecting the volume under the same view of the input, concatenating them to complete the current view depth, and integrating all depth into the point cloud. Since the occluded areas are unavailable, we resort to a deep Q-Network to glance around and pick the next best view for large hole completion progressively until a scene is adequately reconstructed while guaranteeing validity. All steps are learned jointly to achieve robust and consistent results. We perform qualitative and quantitative evaluations with extensive experiments on the SUNCG data, obtaining better results than the state of the art.Comment: Accepted as CVPR 2019 Ora

arXiv.org e-Print Archive

Next-Best View Policy for 3D Reconstruction

Author: Aguilar Justine Aletta
Atienza Rowel
Cajote Rhandley
Casimiro Joel
Nilles Aldrin Michael
Peralta Daryl
Publication venue
Publication date: 06/09/2020
Field of study

Manually selecting viewpoints or using commonly available flight planners like circular path for large-scale 3D reconstruction using drones often results in incomplete 3D models. Recent works have relied on hand-engineered heuristics such as information gain to select the Next-Best Views. In this work, we present a learning-based algorithm called Scan-RL to learn a Next-Best View (NBV) Policy. To train and evaluate the agent, we created Houses3K, a dataset of 3D house models. Our experiments show that using Scan-RL, the agent can scan houses with fewer number of steps and a shorter distance compared to our baseline circular path. Experimental results also demonstrate that a single NBV policy can be used to scan multiple houses including those that were not seen during training. The link to Scan-RL is available at https://github.com/darylperalta/ScanRL and Houses3K dataset can be found at https://github.com/darylperalta/Houses3K.Comment: To be published in ECCV 2020 Workshops; typos in abstract correcte

arXiv.org e-Print Archive

Robust Image Matching By Dynamic Feature Selection

Author: Chen Jianchun
Fang Yi
Huang Hao
Li Xiang
Wang Lingjing
Publication venue
Publication date: 13/08/2020
Field of study

Estimating dense correspondences between images is a long-standing image under-standing task. Recent works introduce convolutional neural networks (CNNs) to extract high-level feature maps and find correspondences through feature matching. However,high-level feature maps are in low spatial resolution and therefore insufficient to provide accurate and fine-grained features to distinguish intra-class variations for correspondence matching. To address this problem, we generate robust features by dynamically selecting features at different scales. To resolve two critical issues in feature selection,i.e.,how many and which scales of features to be selected, we frame the feature selection process as a sequential Markov decision-making process (MDP) and introduce an optimal selection strategy using reinforcement learning (RL). We define an RL environment for image matching in which each individual action either requires new features or terminates the selection episode by referring a matching score. Deep neural networks are incorporated into our method and trained for decision making. Experimental results show that our method achieves comparable/superior performance with state-of-the-art methods on three benchmarks, demonstrating the effectiveness of our feature selection strategy

arXiv.org e-Print Archive

3D-NVS: A 3D Supervision Approach for Next View Selection

Author: Ashutosh Kumar
Chaudhuri Subhasis
Kumar Saurabh
Publication venue
Publication date: 03/12/2020
Field of study

We present a classification based approach for the next best view selection and show how we can plausibly obtain a supervisory signal for this task. The proposed approach is end-to-end trainable and aims to get the best possible 3D reconstruction quality with a pair of passively acquired 2D views. The proposed model consists of two stages: a classifier and a reconstructor network trained jointly via the indirect 3D supervision from ground truth voxels. While testing, the proposed method assumes no prior knowledge of the underlying 3D shape for selecting the next best view. We demonstrate the proposed method's effectiveness via detailed experiments on synthetic and real images and show how it provides improved reconstruction quality than the existing state of the art 3D reconstruction and the next best view prediction techniques.Comment: Submitted to CVPR-2

arXiv.org e-Print Archive

KAPLAN: A 3D Point Descriptor for Shape Completion

Author: Cherabier Ian
Oswald Martin R.
Pollefeys Marc
Richard Audrey
Schindler Konrad
Publication venue
Publication date: 31/07/2020
Field of study

We present a novel 3D shape completion method that operates directly on unstructured point clouds, thus avoiding resource-intensive data structures like voxel grids. To this end, we introduce KAPLAN, a 3D point descriptor that aggregates local shape information via a series of 2D convolutions. The key idea is to project the points in a local neighborhood onto multiple planes with different orientations. In each of those planes, point properties like normals or point-to-plane distances are aggregated into a 2D grid and abstracted into a feature representation with an efficient 2D convolutional encoder. Since all planes are encoded jointly, the resulting representation nevertheless can capture their correlations and retains knowledge about the underlying 3D shape, without expensive 3D convolutions. Experiments on public datasets show that KAPLAN achieves state-of-the-art performance for 3D shape completion.Comment: 18 pages, 15 figure

arXiv.org e-Print Archive

Synthetic Data for Deep Learning

Author: Nikolenko Sergey I.
Publication venue
Publication date: 25/09/2019
Field of study

Synthetic data is an increasingly popular tool for training deep learning models, especially in computer vision but also in other areas. In this work, we attempt to provide a comprehensive survey of the various directions in the development and application of synthetic data. First, we discuss synthetic datasets for basic computer vision problems, both low-level (e.g., optical flow estimation) and high-level (e.g., semantic segmentation), synthetic environments and datasets for outdoor and urban scenes (autonomous driving), indoor scenes (indoor navigation), aerial navigation, simulation environments for robotics, applications of synthetic data outside computer vision (in neural programming, bioinformatics, NLP, and more); we also survey the work on improving synthetic data development and alternative ways to produce it such as GANs. Second, we discuss in detail the synthetic-to-real domain adaptation problem that inevitably arises in applications of synthetic data, including synthetic-to-real refinement with GAN-based models and domain adaptation at the feature/model level without explicit data transformations. Third, we turn to privacy-related applications of synthetic data and review the work on generating synthetic datasets with differential privacy guarantees. We conclude by highlighting the most promising directions for further work in synthetic data studies.Comment: 156 pages, 24 figures, 719 reference

arXiv.org e-Print Archive