Distributed bundle adjustment with block-based sparse matrix compression for super large scale datasets

Author: Chen Nengcheng, Jiang Yuyao, Lu Xingyue, Qiu Huanbin, Qu Hao, Zeng Xiaoru, Zheng Maoteng, Zhu Junfeng
Publication date: 13/08/2023

We propose a distributed bundle adjustment (DBA) method using the exact Levenberg-Marquardt (LM) algorithm for super large-scale datasets. Most existing methods partition the global map into smaller submaps and conduct bundle adjustment within each submap. To fit the parallel framework, they use approximate solutions instead of the LM algorithm. However, those methods often give sub-optimal results. In contrast, we use the exact LM algorithm to conduct global bundle adjustment, where the formation of the reduced camera system (RCS) is parallelized and executed in a distributed way. To store the large RCS, we compress it with a block-based sparse matrix compression format (BSMC), which fully exploits its block structure. The BSMC format also enables the distributed storage and updating of the global RCS. The proposed method is extensively evaluated and compared with state-of-the-art pipelines using both synthetic and real datasets. Preliminary results demonstrate the efficient memory usage and vast scalability of the proposed method compared with the baselines. For the first time, we conducted parallel bundle adjustment using the LM algorithm on a real dataset with 1.18 million images and a synthetic dataset with 10 million images (about 500 times that of the state-of-the-art LM-based BA) on a distributed computing system.

Comment: camera ready version for ICCV202

arXiv.org e-Print Archive

Object Detection: Current and Future Directions

Author: Javier Ruiz-del-Solar, Rodrigo Verschae
Publication venue: Frontiers Media SA
Publication date: 01/01/2015

Frontiers - Publisher Connector

Discrete Visual Perception

Author: Komodakis Nikos, Paragios Nikos
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 25/08/2014

Computational vision and biomedical image analysis have made tremendous progress over the past decade. This is mostly due to the development of efficient learning and inference algorithms, which allow better, faster and richer modeling of visual perception tasks. Graph-based representations are among the most prominent tools for addressing such tasks, casting perception as a graph optimization problem. In this paper, we briefly explain why such representations are of interest, discuss their strengths and limitations, and present their application to a variety of problems in computer vision and biomedical image analysis.

HAL-CentraleSupelec

CiteSeerX

Crossref

INRIA a CCSD electronic archive server

HAL-Ecole des Ponts ParisTech

HAL - UPEC / UPEM

HAL-Rennes 1

Visual Geometry Grounded Deep Structure From Motion

Author: Karaev Nikita, Novotny David, Rupprecht Christian, Wang Jianyuan
Publication date: 07/12/2023

Structure-from-motion (SfM) is a long-standing problem in the computer vision community, which aims to reconstruct the camera poses and 3D structure of a scene from a set of unconstrained 2D images. Classical frameworks solve this problem in an incremental manner by detecting and matching keypoints, registering images, triangulating 3D points, and conducting bundle adjustment. Recent research efforts have predominantly revolved around harnessing the power of deep learning techniques to enhance specific elements (e.g., keypoint matching), but are still based on the original, non-differentiable pipeline. Instead, we propose a new deep pipeline VGGSfM, where each component is fully differentiable and thus can be trained in an end-to-end manner. To this end, we introduce new mechanisms and simplifications. First, we build on recent advances in deep 2D point tracking to extract reliable pixel-accurate tracks, which eliminates the need for chaining pairwise matches. Furthermore, we recover all cameras simultaneously based on the image and track features instead of gradually registering cameras. Finally, we optimise the cameras and triangulate 3D points via a differentiable bundle adjustment layer. We attain state-of-the-art performance on three popular datasets, CO3D, IMC Phototourism, and ETH3D.

Comment: 8 figures. Project page: https://vggsfm.github.io

arXiv.org e-Print Archive

Large-Scale Mapping of Small Roads in Lidar Images Using Deep Convolutional Neural Networks

Author: Kampffmeyer Michael C., Salberg Arnt Børre, Trier Øivind Due
Publication venue: Springer Nature
Publication date: 01/01/2017

Detailed and complete mapping of forest roads is important for the forest industry since they are used for timber transport by trucks with long trailers. This paper proposes a new automatic method for large-scale mapping of forest roads from airborne laser scanning data. The method is based on a fully convolutional neural network that performs end-to-end segmentation. The network is trained on a large set of image patches with corresponding road label information. The final network is then applied to detect and map forest roads from lidar data covering the Etnedal municipality in Norway. The results show that we are able to map the forest roads with an overall accuracy of 97.2%. We conclude that the method has strong potential for large-scale operational mapping of forest roads.

Crossref

Munin - Open Research Archive

DIOR: Dataset for Indoor-Outdoor Reidentification -- Long Range 3D/2D Skeleton Gait Collection Pipeline, Semi-Automated Gait Keypoint Labeling and Baseline Evaluation Methods

Author: Chen Yuyang, Dantu Karthik, Jawade Bhavin, Masilamani Praveen Raj, Setlur Srirangaraj
Publication date: 21/09/2023

In recent times, there has been increased interest in the identification and re-identification of people at long distances, such as from rooftop cameras, UAV cameras, street cams, and others. Such recognition needs to go beyond the face and use whole-body markers such as gait. However, datasets to train and test such recognition algorithms are not widely available, and fewer still are labeled. This paper introduces DIOR, a framework for data collection and semi-automated annotation, and provides a dataset with 14 subjects and 1.649 million RGB frames with 3D/2D skeleton gait labels, including 200 thousand frames from a long-range camera. Our approach leverages advanced 3D computer vision techniques to attain pixel-level accuracy in indoor settings with motion capture systems. Additionally, for outdoor long-range settings, we remove the dependency on motion capture systems and adopt a low-cost, hybrid 3D computer vision and learning pipeline with only 4 low-cost RGB cameras, successfully achieving precise skeleton labeling on far-away subjects, even when their height is limited to a mere 20-25 pixels within an RGB frame. On publication, we will make our pipeline open for others to use.

arXiv.org e-Print Archive

HDMNet: A Hierarchical Matching Network with Double Attention for Large-scale Outdoor LiDAR Point Cloud Registration

Author: Chen Guang, Lu Fan, Xue Weiyi
Publication date: 28/10/2023

Outdoor LiDAR point clouds are typically large-scale and complexly distributed. To achieve efficient and accurate registration, it is of utmost importance to emphasize the similarity among local regions and to prioritize global local-to-local matching, after which accuracy can be enhanced through cost-effective fine registration. In this paper, a novel hierarchical neural network with double attention named HDMNet is proposed for large-scale outdoor LiDAR point cloud registration. Specifically, a novel feature-consistency-enhanced double-soft matching network is introduced to achieve two-stage matching with high flexibility while efficiently enlarging the receptive field in a patch-to-patch manner, which significantly improves the registration performance. Moreover, to further utilize the sparse matching information from the deeper layer, we develop a novel trainable embedding mask that incorporates the confidence scores of correspondences obtained from pose estimation in the deeper layer, eliminating additional computations. The high-confidence keypoints in the sparser point cloud of the deeper layer correspond to a high-confidence spatial neighborhood region in the shallower layer, which will receive more attention, while the features of non-key regions will be masked. Extensive experiments are conducted on two large-scale outdoor LiDAR point cloud datasets to demonstrate the high accuracy and efficiency of the proposed HDMNet.

Comment: Accepted by WACV202

arXiv.org e-Print Archive