22,403 research outputs found
Dynamic Body VSLAM with Semantic Constraints
Image based reconstruction of urban environments is a challenging problem
that deals with optimization of large number of variables, and has several
sources of errors like the presence of dynamic objects. Since most large scale
approaches make the assumption of observing static scenes, dynamic objects are
relegated to the noise modeling section of such systems. This is an approach of
convenience since the RANSAC based framework used to compute most multiview
geometric quantities for static scenes naturally confine dynamic objects to the
class of outlier measurements. However, reconstructing dynamic objects along
with the static environment helps us get a complete picture of an urban
environment. Such understanding can then be used for important robotic tasks
like path planning for autonomous navigation, obstacle tracking and avoidance,
and other areas. In this paper, we propose a system for robust SLAM that works
in both static and dynamic environments. To overcome the challenge of dynamic
objects in the scene, we propose a new model to incorporate semantic
constraints into the reconstruction algorithm. While some of these constraints
are based on multi-layered dense CRFs trained over appearance as well as motion
cues, other proposed constraints can be expressed as additional terms in the
bundle adjustment optimization process that does iterative refinement of 3D
structure and camera / object motion trajectories. We show results on the
challenging KITTI urban dataset for accuracy of motion segmentation and
reconstruction of the trajectory and shape of moving objects relative to ground
truth. We are able to show average relative error reduction by a significant
amount for moving object trajectory reconstruction relative to state-of-the-art
methods like VISO 2, as well as standard bundle adjustment algorithms
Real-time Spatial Detection and Tracking of Resources in a Construction Environment
Construction accidents with heavy equipment and bad decision making can be based on poor knowledge of the site environment and in both cases may lead to work interruptions and costly delays. Supporting the construction environment with real-time generated three-dimensional (3D) models can help preventing accidents as well as support management by modeling infrastructure assets in 3D. Such models can be integrated in the path planning of construction equipment operations for obstacle avoidance or in a 4D model that simulates construction processes. Detecting and guiding resources, such as personnel, machines and materials in and to the right place on time requires methods and technologies supplying information in real-time. This paper presents research in real-time 3D laser scanning and modeling using high range frame update rate scanning technology. Existing and emerging sensors and techniques in three-dimensional modeling are explained. The presented research successfully developed computational models and algorithms for the real-time detection, tracking, and three-dimensional modeling of static and dynamic construction resources, such as workforce, machines, equipment, and materials based on a 3D video range camera. In particular, the proposed algorithm for rapidly modeling three-dimensional scenes is explained. Laboratory and outdoor field experiments that were conducted to validate the algorithm’s performance and results are discussed
DynMF: Neural Motion Factorization for Real-time Dynamic View Synthesis with 3D Gaussian Splatting
Accurately and efficiently modeling dynamic scenes and motions is considered
so challenging a task due to temporal dynamics and motion complexity. To
address these challenges, we propose DynMF, a compact and efficient
representation that decomposes a dynamic scene into a few neural trajectories.
We argue that the per-point motions of a dynamic scene can be decomposed into a
small set of explicit or learned trajectories. Our carefully designed neural
framework consisting of a tiny set of learned basis queried only in time allows
for rendering speed similar to 3D Gaussian Splatting, surpassing 120 FPS, while
at the same time, requiring only double the storage compared to static scenes.
Our neural representation adequately constrains the inherently underconstrained
motion field of a dynamic scene leading to effective and fast optimization.
This is done by biding each point to motion coefficients that enforce the
per-point sharing of basis trajectories. By carefully applying a sparsity loss
to the motion coefficients, we are able to disentangle the motions that
comprise the scene, independently control them, and generate novel motion
combinations that have never been seen before. We can reach state-of-the-art
render quality within just 5 minutes of training and in less than half an hour,
we can synthesize novel views of dynamic scenes with superior photorealistic
quality. Our representation is interpretable, efficient, and expressive enough
to offer real-time view synthesis of complex dynamic scene motions, in
monocular and multi-view scenarios.Comment: Project page: https://agelosk.github.io/dynmf
RGBD Datasets: Past, Present and Future
Since the launch of the Microsoft Kinect, scores of RGBD datasets have been
released. These have propelled advances in areas from reconstruction to gesture
recognition. In this paper we explore the field, reviewing datasets across
eight categories: semantics, object pose estimation, camera tracking, scene
reconstruction, object tracking, human actions, faces and identification. By
extracting relevant information in each category we help researchers to find
appropriate data for their needs, and we consider which datasets have succeeded
in driving computer vision forward and why.
Finally, we examine the future of RGBD datasets. We identify key areas which
are currently underexplored, and suggest that future directions may include
synthetic data and dense reconstructions of static and dynamic scenes.Comment: 8 pages excluding references (CVPR style
General Dynamic Scene Reconstruction from Multiple View Video
This paper introduces a general approach to dynamic scene reconstruction from
multiple moving cameras without prior knowledge or limiting constraints on the
scene structure, appearance, or illumination. Existing techniques for dynamic
scene reconstruction from multiple wide-baseline camera views primarily focus
on accurate reconstruction in controlled environments, where the cameras are
fixed and calibrated and background is known. These approaches are not robust
for general dynamic scenes captured with sparse moving cameras. Previous
approaches for outdoor dynamic scene reconstruction assume prior knowledge of
the static background appearance and structure. The primary contributions of
this paper are twofold: an automatic method for initial coarse dynamic scene
segmentation and reconstruction without prior knowledge of background
appearance or structure; and a general robust approach for joint segmentation
refinement and dense reconstruction of dynamic scenes from multiple
wide-baseline static or moving cameras. Evaluation is performed on a variety of
indoor and outdoor scenes with cluttered backgrounds and multiple dynamic
non-rigid objects such as people. Comparison with state-of-the-art approaches
demonstrates improved accuracy in both multiple view segmentation and dense
reconstruction. The proposed approach also eliminates the requirement for prior
knowledge of scene structure and appearance
A Framework for Designing 3d Virtual Environments
The process of design and development of virtual environments can be supported by tools and frameworks, to save time in technical aspects and focusing on the content. In this paper we present an academic framework which provides several levels of abstraction to ease this work. It includes state-of-the-art components we devised or integrated adopting open-source solutions in order to face specific problems. Its architecture is modular and customizable, the code is open-source.\u
- …