Monocular-Based Pose Determination of Uncooperative Known and Unknown Space Objects
To support spacecraft proximity operations, such as on-orbit servicing and spacecraft formation flying, several vision-based techniques exist to determine the relative pose of an uncooperative orbiting object with respect to the spacecraft. Depending on whether the object is known or unknown, a shape model of the orbiting target object may have to be constructed autonomously using only optical measurements. In this paper, we investigate two vision-based approaches for pose estimation of uncooperative orbiting targets: one that is general and versatile in that it requires no a priori knowledge of the target, and another that requires knowledge of the target's shape geometry. The former uses an estimation algorithm of translational and rotational dynamics to sequentially perform simultaneous pose determination and 3D shape reconstruction of the unknown target, while the latter relies on a known 3D model of the target's geometry to provide a point-by-point pose solution. The architecture and implementation of both methods are presented and their achievable performance is evaluated through numerical simulations. In addition, a computer vision processing strategy for feature detection and matching and the Structure from Motion (SfM) algorithm for on-board 3D reconstruction are discussed and validated using a dataset of images that are synthetically generated according to a chaser/target relative motion in Geosynchronous Orbit (GEO).
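The model-based approach described above scores a candidate pose by how well the known 3D model, projected into the image, agrees with the detected features. The following is a minimal sketch of that reprojection residual, not the paper's implementation. It assumes an ideal pinhole camera, a rotation restricted to the camera's optical axis for brevity, and already-matched model and image points. All function names, model points, and camera parameters are hypothetical.

```python
import math

def rotate_z(point, angle_rad):
    """Rotate a 3D point about the z-axis (simplified one-axis attitude)."""
    x, y, z = point
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return (c * x - s * y, s * x + c * y, z)

def project(point_cam, focal_px, cx, cy):
    """Pinhole projection of a camera-frame point to pixel coordinates."""
    x, y, z = point_cam
    if z <= 0:
        raise ValueError("point is behind the camera")
    return (focal_px * x / z + cx, focal_px * y / z + cy)

def predict_pixels(model_points, angle_rad, translation, focal_px, cx, cy):
    """Apply the candidate pose to every model point and project it."""
    pixels = []
    for p in model_points:
        rx, ry, rz = rotate_z(p, angle_rad)
        cam = (rx + translation[0], ry + translation[1], rz + translation[2])
        pixels.append(project(cam, focal_px, cx, cy))
    return pixels

def reprojection_rmse(model_points, observed_px, angle_rad, translation,
                      focal_px, cx, cy):
    """Root-mean-square pixel distance between predicted and observed features."""
    predicted = predict_pixels(model_points, angle_rad, translation,
                               focal_px, cx, cy)
    total = 0.0
    for (u, v), (u_obs, v_obs) in zip(predicted, observed_px):
        total += (u - u_obs) ** 2 + (v - v_obs) ** 2
    return math.sqrt(total / len(model_points))

# Hypothetical target model (metres) and camera intrinsics (pixels).
model = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (-1.0, 0.0, 0.0),
         (0.0, -1.0, 0.0), (0.0, 0.0, 1.0)]
focal, cx, cy = 800.0, 320.0, 240.0

# Synthetic observations generated from a "true" pose.
true_angle, true_t = 0.3, (0.1, -0.2, 10.0)
observed = predict_pixels(model, true_angle, true_t, focal, cx, cy)

rmse_true = reprojection_rmse(model, observed, true_angle, true_t, focal, cx, cy)
rmse_wrong = reprojection_rmse(model, observed, 0.0, true_t, focal, cx, cy)
```

A pose solver would search over the pose parameters to minimise this residual. In the sketch, the true pose gives a zero residual and the perturbed pose a strictly larger one.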
Monocular-Based Pose Determination of Uncooperative Space Objects
Vision-based methods to determine the relative pose of an uncooperative orbiting object are investigated for spacecraft proximity operations, such as on-orbit servicing, spacecraft formation flying, and small-body exploration. Depending on whether the object is known or unknown, a shape model of the orbiting target object may have to be constructed autonomously in real time using only optical measurements. The Simultaneous Estimation of Pose and Shape (SEPS) algorithm, which does not require a priori knowledge of the pose and shape of the target, is presented. It makes use of a novel measurement equation and filter that can efficiently use optical flow information along with a star tracker to estimate the target's relative angular and translational velocities as well as its center of gravity. Depending on the mission constraints, SEPS can be augmented by a more accurate offline, on-board 3D reconstruction of the target shape, which allows the pose to be estimated as for a known target. The use of Structure from Motion (SfM) for this purpose is discussed. A model-based approach for pose estimation of known targets is also presented. The architecture and implementation of both proposed approaches are described and their performance is evaluated through numerical simulations using a dataset of images that are synthetically generated according to a chaser/target relative motion in Geosynchronous Orbit (GEO).
3 Dimensional Dense Reconstruction: A Review of Algorithms and Dataset
3D dense reconstruction refers to the process of obtaining the complete shape
and texture features of 3D objects from 2D planar images. 3D reconstruction is
an important and extensively studied problem, but it is far from being solved.
This work systematically introduces classical methods of 3D dense
reconstruction based on geometric and optical models, as well as methods based
on deep learning. It also introduces datasets for deep learning and the
performance, advantages, and disadvantages that deep learning
methods demonstrate on these datasets. Comment: 16 pages
Distributed scene reconstruction from multiple mobile platforms
Recent research on mobile robotics has produced new designs that provide
household robots with omnidirectional motion. The image sensor embedded
in these devices motivates the application of 3D vision techniques on them
for navigation and mapping purposes. In addition, distributed cheap sensing
systems acting as a unitary entity have recently emerged as an
efficient alternative to expensive mobile equipment.
In this work we present an implementation of a visual reconstruction method,
structure from motion (SfM), on a low-budget, omnidirectional mobile platform,
and extend this method to distributed 3D scene reconstruction with
several instances of such a platform.
Our approach overcomes the challenges posed by the platform. The unprecedented
levels of noise produced by the image compression typical of
the platform are handled by our feature filtering methods, which ensure
suitable feature matching populations for epipolar geometry estimation by
means of a strict quality-based feature selection. The robust pose estimation
algorithms implemented, along with a novel feature tracking system,
enable our incremental SfM approach to handle ill-conditioned
inter-image configurations provoked by the omnidirectional motion. The
feature tracking system developed efficiently manages the feature scarcity
produced by noise and outputs quality feature tracks, which allow robust
3D mapping of a given scene even if, due to noise, their length is shorter
than what is usually assumed for performing stable 3D reconstructions.
The distributed reconstruction from multiple instances of SfM is attained
by applying loop-closing techniques. Our multiple reconstruction system
merges individual 3D structures and resolves the global scale problem with
minimal overlaps, whereas in the literature 3D mapping is obtained by overlapping
stretches of sequences. The performance of this system is demonstrated
in the 2-session case.
The management of noise, the stability against ill-configurations and the
robustness of our SfM system are validated in a number of experiments and
compared with state-of-the-art approaches. Possible future research areas
are also discussed.
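The global scale problem mentioned above arises because each monocular SfM session recovers structure only up to an unknown scale factor. The following is a minimal sketch of one simple way to relate two sessions, not the method used in the work. It assumes a few 3D points are reconstructed in both sessions and are already matched and noise-free. The function names and sample points are hypothetical. The relative scale is taken as the ratio of the point spreads about their centroids, which is unaffected by the rotation and translation between sessions.

```python
import math

def centroid(points):
    """Mean of a list of 3D points."""
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(3))

def rms_spread(points):
    """Root-mean-square distance of the points from their centroid."""
    c = centroid(points)
    sq = sum(sum((p[i] - c[i]) ** 2 for i in range(3)) for p in points)
    return math.sqrt(sq / len(points))

def relative_scale(points_a, points_b):
    """Scale s such that session B is approximately s times session A."""
    spread_a = rms_spread(points_a)
    if spread_a == 0.0:
        raise ValueError("points_a are coincident; scale is undefined")
    return rms_spread(points_b) / spread_a

# Hypothetical shared points as reconstructed by session A.
session_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
# The same points in session B: scaled by 2.5 and shifted.
session_b = [(2.5 * x + 1.0, 2.5 * y - 2.0, 2.5 * z + 0.5)
             for x, y, z in session_a]

scale = relative_scale(session_a, session_b)
```

With noisy data, a least-squares similarity alignment over all shared points would be a more robust choice than this ratio.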
Single View Modeling and View Synthesis
This thesis develops new algorithms to produce 3D content from a single camera. Today, amateurs can use hand-held camcorders to capture and display the 3D world in 2D, using mature technologies. However, there is always a strong desire to record and re-explore the 3D world in 3D. To achieve this goal, current approaches usually make use of a camera array, which suffers from tedious setup and calibration processes, as well as lack of portability, limiting its application to lab experiments.
In this thesis, I aim to produce 3D content using a single camera, making it as simple as shooting pictures. This requires a new front-end capture device rather than a regular camcorder, as well as more sophisticated algorithms. First, in order to capture highly detailed object surfaces, I designed and developed a depth camera based on a novel technique called light fall-off stereo (LFS). The LFS depth camera outputs color+depth image sequences and achieves 30 fps, which is necessary for capturing dynamic scenes. Based on the output color+depth images, I developed a new approach that builds 3D models of dynamic and deformable objects. While the camera can only capture part of a whole object at any instant, partial surfaces are assembled to form a complete 3D model by a novel warping algorithm.
Inspired by the success of single view 3D modeling, I extended my exploration into 2D-3D video conversion that does not utilize a depth camera. I developed a semi-automatic system that converts monocular videos into stereoscopic videos via view synthesis. It combines motion analysis with user interaction, aiming to transfer as much of the depth-inference work as possible from the user to the computer. I developed two new methods that analyze the optical flow in order to provide additional qualitative depth constraints. The automatically extracted depth information is presented in the user interface to assist with user labeling work.
In this thesis, I developed new algorithms to produce 3D content from a single camera. Depending on the input data, my algorithm can build high-fidelity 3D models of dynamic and deformable objects if depth maps are provided. Otherwise, it can turn the video clips into stereoscopic video.
Structure-From-Motion Photogrammetry of Antarctic Historical Aerial Photographs in Conjunction with Ground Control Derived from Satellite Data
A longer temporal scale of Antarctic observations is vital to better understanding glacier dynamics and improving ice sheet model projections. One underutilized data source that expands the temporal scale is aerial photography, specifically imagery collected prior to 1990. However, processing Antarctic historical aerial imagery using modern photogrammetry software is difficult, as it requires precise information about the data collection process and extensive in situ ground control. Often, the necessary orientation metadata for older aerial imagery is lost, and in situ data in regions like Antarctica are extremely difficult to collect, limiting the use of traditional photogrammetric methods. Here, we test an alternative methodology to generate elevations from historical Antarctic aerial imagery. Instead of relying on pre-existing ground control, we use structure-from-motion photogrammetry techniques to process the imagery with ground control manually derived from high-resolution satellite imagery. This case study is based on vertical aerial image sets collected over Byrd Glacier, East Antarctica in December 1978 and January 1979. Our results are the oldest, highest resolution digital elevation models (DEMs) ever generated for an Antarctic glacier. We use these DEMs to estimate glacier dynamics and show that the surface elevation of Byrd Glacier has been constant for the past ∼40 years.
Towards Live 3D Reconstruction from Wearable Video: An Evaluation of V-SLAM, NeRF, and Videogrammetry Techniques
Mixed reality (MR) is a key technology which promises to change the future of
warfare. An MR hybrid of physical outdoor environments and virtual military
training will enable engagements with long distance enemies, both real and
simulated. To enable this technology, a large-scale 3D model of a physical
environment must be maintained based on live sensor observations. 3D
reconstruction algorithms should utilize the low cost and pervasiveness of
video camera sensors, from both overhead and soldier-level perspectives.
Mapping speed and 3D quality can be balanced to enable live MR training in
dynamic environments. Given these requirements, we survey several 3D
reconstruction algorithms for large-scale mapping for military applications
given only live video. We measure 3D reconstruction performance from common
structure from motion, visual-SLAM, and photogrammetry techniques. This
includes the open source algorithms COLMAP, ORB-SLAM3, and NeRF using
Instant-NGP. We utilize the autonomous driving academic benchmark KITTI, which
includes both dashboard camera video and lidar produced 3D ground truth. With
the KITTI data, our primary contribution is a quantitative evaluation of 3D
reconstruction computational speed when considering live video. Comment: Accepted to 2022 Interservice/Industry Training, Simulation, and
Education Conference (I/ITSEC), 13 pages
TMO: Textured Mesh Acquisition of Objects with a Mobile Device by using Differentiable Rendering
We present a new pipeline for acquiring a textured mesh in the wild with a
single smartphone which offers access to images, depth maps, and valid poses.
Our method first introduces an RGBD-aided structure from motion, which can
yield filtered depth maps and refines camera poses guided by corresponding
depth. Then, we adopt a neural implicit surface reconstruction method, which
yields a high-quality mesh, and develop a new training process that applies a
regularization provided by classical multi-view stereo methods. Moreover, we
apply a differentiable rendering to fine-tune incomplete texture maps and
generate textures which are perceptually closer to the original scene. Our
pipeline can be applied to any common objects in the real world without the
need for either in-the-lab environments or accurate mask images. We demonstrate
results of captured objects with complex shapes and validate our method
numerically against existing 3D reconstruction and texture mapping methods. Comment: Accepted to CVPR23. Project Page: https://jh-choi.github.io/TMO
Doctor of Philosophy dissertation
3D reconstruction from image pairs relies on finding corresponding points between images and using the corresponding points to estimate a dense disparity map. Today's correspondence-finding algorithms primarily use image features or pixel intensities common between image pairs. Some 3D computer vision applications, however, don't produce the desired results using correspondences derived from image features or pixel intensities. Two examples are the multimodal camera rig and the center region of a coaxial camera rig. Additionally, traditional stereo correspondence-finding techniques which use image features or pixel intensities sometimes produce inaccurate results. This thesis presents a novel image correspondence-finding technique that aligns pairs of image sequences using the optical flow fields. The optical flow fields provide information about the structure and motion of the scene which is not available in still images, but which can be used to align images taken from different camera positions. The method applies where there is inherent motion between the camera rig and the scene and where the scene has enough visual texture to produce optical flow. We apply the technique to a traditional binocular stereo rig consisting of an RGB/IR camera pair and to a coaxial camera rig. We present results for synthetic flow fields and for real image sequences with accuracy metrics and reconstructed depth maps.
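For the binocular case, a dense disparity map is turned into a depth map with the standard rectified-stereo relation Z = f * B / d. The following is a minimal sketch of that conversion only, not the dissertation's correspondence method. It assumes rectified images, a focal length in pixels, and a baseline in metres. The function names and sample values are hypothetical, and the relation does not carry over unchanged to a coaxial rig.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Depth in metres for one pixel of a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        return None  # no valid correspondence, or point at infinity
    return focal_length_px * baseline_m / disparity_px

def depth_map(disparity_map, focal_length_px, baseline_m):
    """Convert a 2D list of disparities (pixels) into a 2D list of depths."""
    return [
        [depth_from_disparity(d, focal_length_px, baseline_m) for d in row]
        for row in disparity_map
    ]

# Hypothetical 2x2 disparity map; 0.0 marks a pixel with no match.
disparities = [[8.0, 16.0], [0.0, 32.0]]
depths = depth_map(disparities, focal_length_px=800.0, baseline_m=0.1)
```

Because depth is inversely proportional to disparity, a fixed matching error of one pixel costs far more depth accuracy for distant points than for near ones.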