
    Monocular-Based Pose Determination of Uncooperative Known and Unknown Space Objects

    In order to support spacecraft proximity operations, such as on-orbit servicing and spacecraft formation flying, several vision-based techniques exist to determine the relative pose of an uncooperative orbiting object with respect to the spacecraft. Depending on whether the object is known or unknown, a shape model of the orbiting target object may have to be constructed autonomously by making use of only optical measurements. In this paper, we investigate two vision-based approaches for pose estimation of uncooperative orbiting targets: one that is general and versatile, requiring no a priori knowledge of the target, and another that requires knowledge of the target's shape geometry. The former uses an estimation algorithm of translational and rotational dynamics to sequentially perform simultaneous pose determination and 3D shape reconstruction of the unknown target, while the latter relies on a known 3D model of the target's geometry to provide a point-by-point pose solution. The architecture and implementation of both methods are presented and their achievable performance is evaluated through numerical simulations. In addition, a computer vision processing strategy for feature detection and matching and the Structure from Motion (SfM) algorithm for on-board 3D reconstruction are discussed and validated by using a dataset of images that are synthetically generated according to a chaser/target relative motion in Geosynchronous Orbit (GEO).

    Monocular-Based Pose Determination of Uncooperative Space Objects

    Vision-based methods to determine the relative pose of an uncooperative orbiting object are investigated for applications in spacecraft proximity operations, such as on-orbit servicing, spacecraft formation flying, and small-body exploration. Depending on whether the object is known or unknown, a shape model of the orbiting target object may have to be constructed autonomously in real time by making use of only optical measurements. The Simultaneous Estimation of Pose and Shape (SEPS) algorithm, which does not require a priori knowledge of the pose and shape of the target, is presented. It makes use of a novel measurement equation and filter that can efficiently use optical flow information along with a star tracker to estimate the target's relative angular and translational velocity as well as its center of gravity. Depending on the mission constraints, SEPS can be augmented by a more accurate offline, on-board 3D reconstruction of the target shape, which allows for the estimation of the pose as a known target. The use of Structure from Motion (SfM) for this purpose is discussed. A model-based approach for pose estimation of known targets is also presented. The architecture and implementation of both proposed approaches are described and their performance metrics are evaluated through numerical simulations by using a dataset of images that are synthetically generated according to a chaser/target relative motion in Geosynchronous Orbit (GEO).

    3 Dimensional Dense Reconstruction: A Review of Algorithms and Dataset

    3D dense reconstruction refers to the process of obtaining the complete shape and texture features of 3D objects from 2D planar images. 3D reconstruction is an important and extensively studied problem, but it is far from being solved. This work systematically introduces classical methods of 3D dense reconstruction based on geometric and optical models, as well as methods based on deep learning. It also introduces datasets for deep learning and the performance, advantages, and disadvantages that deep learning methods demonstrate on these datasets. Comment: 16 pages.

    Distributed scene reconstruction from multiple mobile platforms

    Recent research on mobile robotics has produced new designs that provide household robots with omnidirectional motion. The image sensor embedded in these devices motivates the application of 3D vision techniques on them for navigation and mapping purposes. In addition, distributed cheap sensing systems acting as a unitary entity have recently been found to be an efficient alternative to expensive mobile equipment. In this work we present an implementation of a visual reconstruction method, structure from motion (SfM), on a low-budget, omnidirectional mobile platform, and extend this method to distributed 3D scene reconstruction with several instances of such a platform. Our approach overcomes the challenges posed by the platform. The unprecedented levels of noise produced by the image compression typical of the platform are handled by our feature filtering methods, which ensure suitable feature matching populations for epipolar geometry estimation by means of a strict quality-based feature selection. The robust pose estimation algorithms implemented, along with a novel feature tracking system, enable our incremental SfM approach to handle ill-conditioned inter-image configurations provoked by the omnidirectional motion. The feature tracking system developed efficiently manages the feature scarcity produced by noise and outputs quality feature tracks, which allow robust 3D mapping of a given scene even if, due to noise, their length is shorter than is usually assumed for performing stable 3D reconstructions. The distributed reconstruction from multiple instances of SfM is attained by applying loop-closing techniques. Our multiple reconstruction system merges individual 3D structures and resolves the global scale problem with minimal overlaps, whereas in the literature 3D mapping is obtained by overlapping stretches of sequences. The performance of this system is demonstrated in the 2-session case.
The management of noise, the stability against ill-conditioned configurations, and the robustness of our SfM system are validated in a number of experiments and compared with state-of-the-art approaches. Possible future research areas are also discussed.
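
The abstract above mentions selecting feature matches for epipolar geometry estimation. As a rough illustration only (not the authors' filtering method), the sketch below shows the underlying constraint: for a relative pose (R, t) with X2 = R X1 + t, a correct match in normalized image coordinates satisfies x2^T E x1 = 0, where E = [t]x R. All function names and numeric values here are hypothetical and chosen for the example.

```python
import math


def cross_matrix(t):
    """Skew-symmetric matrix [t]x such that [t]x v equals the cross product t x v."""
    tx, ty, tz = t
    return [[0.0, -tz, ty], [tz, 0.0, -tx], [-ty, tx, 0.0]]


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def matvec(a, v):
    """Product of a 3x3 matrix and a 3-vector."""
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]


def essential_matrix(R, t):
    """E = [t]x R for the convention X2 = R X1 + t (assumed here)."""
    return matmul(cross_matrix(t), R)


def epipolar_residual(E, x1, x2):
    """Algebraic residual x2^T E x1; close to zero for a geometrically consistent match."""
    Ex1 = matvec(E, x1)
    return sum(x2[i] * Ex1[i] for i in range(3))


# Hypothetical relative pose: small rotation about the y axis plus a translation.
theta = 0.1
c, s = math.cos(theta), math.sin(theta)
R = [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]
t = [0.5, 0.0, 0.1]

# Project one synthetic 3D point into both views (normalized coordinates).
X1 = [0.3, -0.2, 4.0]
X2 = [p + q for p, q in zip(matvec(R, X1), t)]
x1 = [X1[0] / X1[2], X1[1] / X1[2], 1.0]
x2 = [X2[0] / X2[2], X2[1] / X2[2], 1.0]

E = essential_matrix(R, t)
good = epipolar_residual(E, x1, x2)
# Simulate a wrong match by shifting the second observation vertically.
bad = epipolar_residual(E, x1, [x2[0], x2[1] + 0.1, 1.0])
```

In practice E is unknown and is estimated from many matches with a robust scheme; a residual like this one (or a geometric variant of it) is then used to reject outlier matches.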

    Single View Modeling and View Synthesis

    This thesis develops new algorithms to produce 3D content from a single camera. Today, amateurs can use hand-held camcorders to capture and display the 3D world in 2D, using mature technologies. However, there is always a strong desire to record and re-explore the 3D world in 3D. To achieve this goal, current approaches usually make use of a camera array, which suffers from tedious setup and calibration processes, as well as lack of portability, limiting its application to lab experiments. In this thesis, I try to produce 3D content using a single camera, making it as simple as shooting pictures. It requires a new front-end capturing device rather than a regular camcorder, as well as more sophisticated algorithms. First, in order to capture highly detailed object surfaces, I designed and developed a depth camera based on a novel technique called light fall-off stereo (LFS). The LFS depth camera outputs color+depth image sequences and achieves 30 fps, which is necessary for capturing dynamic scenes. Based on the output color+depth images, I developed a new approach that builds 3D models of dynamic and deformable objects. While the camera can only capture part of a whole object at any instant, partial surfaces are assembled together to form a complete 3D model by a novel warping algorithm. Inspired by the success of single view 3D modeling, I extended my exploration into 2D-3D video conversion that does not utilize a depth camera. I developed a semi-automatic system that converts monocular videos into stereoscopic videos via view synthesis. It combines motion analysis with user interaction, aiming to transfer as much depth-inference work as possible from the user to the computer. I developed two new methods that analyze the optical flow in order to provide additional qualitative depth constraints. The automatically extracted depth information is presented in the user interface to assist with user labeling work.
In this thesis, I developed new algorithms to produce 3D content from a single camera. Depending on the input data, my algorithm can build high-fidelity 3D models of dynamic and deformable objects if depth maps are provided. Otherwise, it can turn video clips into stereoscopic video.

    Towards Live 3D Reconstruction from Wearable Video: An Evaluation of V-SLAM, NeRF, and Videogrammetry Techniques

    Mixed reality (MR) is a key technology which promises to change the future of warfare. An MR hybrid of physical outdoor environments and virtual military training will enable engagements with long-distance enemies, both real and simulated. To enable this technology, a large-scale 3D model of a physical environment must be maintained based on live sensor observations. 3D reconstruction algorithms should utilize the low cost and pervasiveness of video camera sensors, from both overhead and soldier-level perspectives. Mapping speed and 3D quality can be balanced to enable live MR training in dynamic environments. Given these requirements, we survey several 3D reconstruction algorithms for large-scale mapping for military applications given only live video. We measure 3D reconstruction performance from common structure from motion, visual-SLAM, and photogrammetry techniques. This includes the open source algorithms COLMAP, ORB-SLAM3, and NeRF using Instant-NGP. We utilize the autonomous driving academic benchmark KITTI, which includes both dashboard camera video and lidar-produced 3D ground truth. With the KITTI data, our primary contribution is a quantitative evaluation of 3D reconstruction computational speed when considering live video. Comment: Accepted to 2022 Interservice/Industry Training, Simulation, and Education Conference (I/ITSEC), 13 pages.

    TMO: Textured Mesh Acquisition of Objects with a Mobile Device by using Differentiable Rendering

    We present a new pipeline for acquiring a textured mesh in the wild with a single smartphone which offers access to images, depth maps, and valid poses. Our method first introduces an RGBD-aided structure from motion, which yields filtered depth maps and refines camera poses guided by the corresponding depth. Then, we adopt a neural implicit surface reconstruction method, which allows for a high-quality mesh, and develop a new training process for applying a regularization provided by classical multi-view stereo methods. Moreover, we apply differentiable rendering to fine-tune incomplete texture maps and generate textures which are perceptually closer to the original scene. Our pipeline can be applied to any common objects in the real world without the need for either in-the-lab environments or accurate mask images. We demonstrate results of captured objects with complex shapes and validate our method numerically against existing 3D reconstruction and texture mapping methods. Comment: Accepted to CVPR23. Project Page: https://jh-choi.github.io/TMO

    Doctor of Philosophy

    Dissertation. 3D reconstruction from image pairs relies on finding corresponding points between images and using the corresponding points to estimate a dense disparity map. Today's correspondence-finding algorithms primarily use image features or pixel intensities common between image pairs. Some 3D computer vision applications, however, don't produce the desired results using correspondences derived from image features or pixel intensities. Two examples are the multimodal camera rig and the center region of a coaxial camera rig. Additionally, traditional stereo correspondence-finding techniques which use image features or pixel intensities sometimes produce inaccurate results. This thesis presents a novel image correspondence-finding technique that aligns pairs of image sequences using the optical flow fields. The optical flow fields provide information about the structure and motion of the scene which is not available in still images, but which can be used to align images taken from different camera positions. The method applies to applications where there is inherent motion between the camera rig and the scene and where the scene has enough visual texture to produce optical flow. We apply the technique to a traditional binocular stereo rig consisting of an RGB/IR camera pair and to a coaxial camera rig. We present results for synthetic flow fields and for real image sequences with accuracy metrics and reconstructed depth maps.
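
The abstract above refers to estimating a dense disparity map from correspondences. As a minimal sketch of what such a map is used for (not the dissertation's optical-flow alignment technique), the code below converts disparity to depth under the standard assumption of a rectified pinhole stereo pair, where Z = f * B / d. The function names, focal length, and baseline are hypothetical values chosen for the example.

```python
def disparity_to_depth(disparity_px, focal_px, baseline_m):
    """Depth in metres for one pixel of a rectified stereo pair: Z = f * B / d.

    disparity_px: horizontal shift of the point between the two images (pixels)
    focal_px:     focal length expressed in pixels (assumed equal for both cameras)
    baseline_m:   distance between the two camera centres (metres)
    """
    if disparity_px <= 0:
        # Zero (or invalid) disparity: treat the point as infinitely far away.
        return float("inf")
    return focal_px * baseline_m / disparity_px


def depth_map(disparity_map, focal_px, baseline_m):
    """Apply the conversion to every entry of a 2D list of disparities."""
    return [[disparity_to_depth(d, focal_px, baseline_m) for d in row]
            for row in disparity_map]


# Hypothetical rig: 1000 px focal length, 10 cm baseline.
example = depth_map([[50.0, 25.0], [0.0, 100.0]], focal_px=1000.0, baseline_m=0.1)
```

Because depth is inversely proportional to disparity, a fixed error in the disparity estimate causes a larger depth error for distant points than for near ones, which is one reason correspondence accuracy matters so much.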