4,968 research outputs found

    Building with Drones: Accurate 3D Facade Reconstruction using MAVs

    Full text link
    Automatic reconstruction of 3D models from images using multi-view Structure-from-Motion methods has been one of the most fruitful outcomes of computer vision. These advances combined with the growing popularity of Micro Aerial Vehicles as an autonomous imaging platform, have made 3D vision tools ubiquitous for large number of Architecture, Engineering and Construction applications among audiences, mostly unskilled in computer vision. However, to obtain high-resolution and accurate reconstructions from a large-scale object using SfM, there are many critical constraints on the quality of image data, which often become sources of inaccuracy as the current 3D reconstruction pipelines do not facilitate the users to determine the fidelity of input data during the image acquisition. In this paper, we present and advocate a closed-loop interactive approach that performs incremental reconstruction in real-time and gives users an online feedback about the quality parameters like Ground Sampling Distance (GSD), image redundancy, etc on a surface mesh. We also propose a novel multi-scale camera network design to prevent scene drift caused by incremental map building, and release the first multi-scale image sequence dataset as a benchmark. Further, we evaluate our system on real outdoor scenes, and show that our interactive pipeline combined with a multi-scale camera network approach provides compelling accuracy in multi-view reconstruction tasks when compared against the state-of-the-art methods.Comment: 8 Pages, 2015 IEEE International Conference on Robotics and Automation (ICRA '15), Seattle, WA, US

    Opt: A Domain Specific Language for Non-linear Least Squares Optimization in Graphics and Imaging

    Full text link
    Many graphics and vision problems can be expressed as non-linear least squares optimizations of objective functions over visual data, such as images and meshes. The mathematical descriptions of these functions are extremely concise, but their implementation in real code is tedious, especially when optimized for real-time performance on modern GPUs in interactive applications. In this work, we propose a new language, Opt (available under http://optlang.org), for writing these objective functions over image- or graph-structured unknowns concisely and at a high level. Our compiler automatically transforms these specifications into state-of-the-art GPU solvers based on Gauss-Newton or Levenberg-Marquardt methods. Opt can generate different variations of the solver, so users can easily explore tradeoffs in numerical precision, matrix-free methods, and solver approaches. In our results, we implement a variety of real-world graphics and vision applications. Their energy functions are expressible in tens of lines of code, and produce highly-optimized GPU solver implementations. These solver have performance competitive with the best published hand-tuned, application-specific GPU solvers, and orders of magnitude beyond a general-purpose auto-generated solver

    Shape basis interpretation for monocular deformable 3D reconstruction

    Get PDF
    © 2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.In this paper, we propose a novel interpretable shape model to encode object non-rigidity. We first use the initial frames of a monocular video to recover a rest shape, used later to compute a dissimilarity measure based on a distance matrix measurement. Spectral analysis is then applied to this matrix to obtain a reduced shape basis, that in contrast to existing approaches, can be physically interpreted. In turn, these pre-computed shape bases are used to linearly span the deformation of a wide variety of objects. We introduce the low-rank basis into a sequential approach to recover both camera motion and non-rigid shape from the monocular video, by simply optimizing the weights of the linear combination using bundle adjustment. Since the number of parameters to optimize per frame is relatively small, specially when physical priors are considered, our approach is fast and can potentially run in real time. Validation is done in a wide variety of real-world objects, undergoing both inextensible and extensible deformations. Our approach achieves remarkable robustness to artifacts such as noisy and missing measurements and shows an improved performance to competing methods.Peer ReviewedPostprint (author's final draft

    Theseus: A Library for Differentiable Nonlinear Optimization

    Full text link
    We present Theseus, an efficient application-agnostic open source library for differentiable nonlinear least squares (DNLS) optimization built on PyTorch, providing a common framework for end-to-end structured learning in robotics and vision. Existing DNLS implementations are application specific and do not always incorporate many ingredients important for efficiency. Theseus is application-agnostic, as we illustrate with several example applications that are built using the same underlying differentiable components, such as second-order optimizers, standard costs functions, and Lie groups. For efficiency, Theseus incorporates support for sparse solvers, automatic vectorization, batching, GPU acceleration, and gradient computation with implicit differentiation and direct loss minimization. We do extensive performance evaluation in a set of applications, demonstrating significant efficiency gains and better scalability when these features are incorporated. Project page: https://sites.google.com/view/theseus-a

    Advances in Simultaneous Localization and Mapping in Confined Underwater Environments Using Sonar and Optical Imaging.

    Full text link
    This thesis reports on the incorporation of surface information into a probabilistic simultaneous localization and mapping (SLAM) framework used on an autonomous underwater vehicle (AUV) designed for underwater inspection. AUVs operating in cluttered underwater environments, such as ship hulls or dams, are commonly equipped with Doppler-based sensors, which---in addition to navigation---provide a sparse representation of the environment in the form of a three-dimensional (3D) point cloud. The goal of this thesis is to develop perceptual algorithms that take full advantage of these sparse observations for correcting navigational drift and building a model of the environment. In particular, we focus on three objectives. First, we introduce a novel representation of this 3D point cloud as collections of planar features arranged in a factor graph. This factor graph representation probabalistically infers the spatial arrangement of each planar segment and can effectively model smooth surfaces (such as a ship hull). Second, we show how this technique can produce 3D models that serve as input to our pipeline that produces the first-ever 3D photomosaics using a two-dimensional (2D) imaging sonar. Finally, we propose a model-assisted bundle adjustment (BA) framework that allows for robust registration between surfaces observed from a Doppler sensor and visual features detected from optical images. Throughout this thesis, we show methods that produce 3D photomosaics using a combination of triangular meshes (derived from our SLAM framework or given a-priori), optical images, and sonar images. Overall, the contributions of this thesis greatly increase the accuracy, reliability, and utility of in-water ship hull inspection with AUVs despite the challenges they face in underwater environments. We provide results using the Hovering Autonomous Underwater Vehicle (HAUV) for autonomous ship hull inspection, which serves as the primary testbed for the algorithms presented in this thesis. The sensor payload of the HAUV consists primarily of: a Doppler velocity log (DVL) for underwater navigation and ranging, monocular and stereo cameras, and---for some applications---an imaging sonar.PhDElectrical Engineering: SystemsUniversity of Michigan, Horace H. Rackham School of Graduate Studieshttp://deepblue.lib.umich.edu/bitstream/2027.42/120750/1/paulozog_1.pd

    Structure from Motion with Higher-level Environment Representations

    Get PDF
    Computer vision is an important area focusing on understanding, extracting and using the information from vision-based sensor. It has many applications such as vision-based 3D reconstruction, simultaneous localization and mapping(SLAM) and data-driven understanding of the real world. Vision is a fundamental sensing modality in many different fields of application. While the traditional structure from motion mostly uses sparse point-based feature, this thesis aims to explore the possibility of using higher order feature representation. It starts with a joint work which uses straight line for feature representation and performs bundle adjustment with straight line parameterization. Then, we further try an even higher order representation where we use Bezier spline for parameterization. We start with a simple case where all contours are lying on the plane and uses Bezier splines to parametrize the curves in the background and optimize on both camera position and Bezier splines. For application, we present a complete end-to-end pipeline which produces meaningful dense 3D models from natural data of a 3D object: the target object is placed on a structured but unknown planar background that is modeled with splines. The data is captured using only a hand-held monocular camera. However, this application is limited to a planar scenario and we manage to push the parameterizations into real 3D. Following the potential of this idea, we introduce a more flexible higher-order extension of points that provide a general model for structural edges in the environment, no matter if straight or curved. Our model relies on linked B´ezier curves, the geometric intuition of which proves great benefits during parameter initialization and regularization. We present the first fully automatic pipeline that is able to generate spline-based representations without any human supervision. Besides a full graphical formulation of the problem, we introduce both geometric and photometric cues as well as higher-level concepts such overall curve visibility and viewing angle restrictions to automatically manage the correspondences in the graph. Results prove that curve-based structure from motion with splines is able to outperform state-of-the-art sparse feature-based methods, as well as to model curved edges in the environment

    Enhancement to Camera Calibration: Representation, Robust Statistics, and 3D Calibration Tool

    Get PDF
    This thesis demonstrates theenhancement to camera calibrationin three aspects: representation of pose, robust statistics and 3D calibration tool. Camera calibration is the reconstruction of digital camera information based on digital images of an object in 3D space, since the digital images are 2D projections of a 3D object onto the camera sensor. Camera calibration is the estimation of the interior orientation (IO) parameters and exterior orientation (EO) parameters of a digital camera. Camera calibration is an essential part of image metrology. If the quality of camera calibration cannot be guaranteed, neither can the reliability of the subsequent analysis and applications based on digital images. The first enhancement of camera calibration is in representation of pose. A formal definition of singularity of representation is given mathematically. An example is offered to show how singularity can lead to difficulty or failure in optimization. The spherical coordinate system is introduced as a representation method instead of other widely-used representations. Thespherical coordinate systemrepresents camera poses according to camera calibration tool images in digital image processing. With the introduction of the v frame in digital images, the singularities of spherical coordinate system are demonstrated mathematically. The application ofrobust statisticsin optimization is the second enhancement of camera calibration. In photogrammetry, it is typical to collect thousands of observed data points for bundle adjustment. Unexpected outliers in observed data are unavoidable, and thus, the algorithm accuracy may not reach our goal. The least squares estimator is a widely used estimation method in camera calibration, but its sensitivity to outliers makes the algorithm unreliable, and it can even fail to fit the observations. By closely analyzing and comparing the characteristics of the least squares estimator, robust estimators with alternative assumptions are shown to detect and de-weight outliers that are not well processed with the classical assumptions, and provide a reliable fit to the observations. Among all possible robust estimators, two robust estimators from M-estimator family are applied to optimization in existing camera calibration algorithm. The robustified method can considerably improve accuracy for camera calibration estimation. Anew metric \bar{D}is introduced, which is the distance between two camera calibrations considering all of the estimated camera IO parameters. \bar{D} can be used to evaluate the performance among various estimators. After applying the robust estimator, the system improves the accuracy and performance in camera calibration up to 25\%. The influence of a robustified estimator modification is also considered. It is established that the modification has impact on the estimation accuracy. The third enhancement is the design and application of a3D calibration toolfor data collection. An all-new 3D calibration tool is designed to improve camera calibration accuracy over the 2D calibration tool. The comparison of the 3D and 2D calibration tools is conducted experimentally and theoretically. The experimental analysis is based on camera calibration results and the corresponding \bar{D} matrix, which shows that the 3D calibration tool improves accuracy. The mathematical analysis is based on the calculated covariance matrix of camera calibration without other impact factors. The experimental and theoretical analyses show that the 3D calibration tool can obtain more accurate calibration results compared with the 2D calibration tool, establishing that a carefully designed 3D calibration tool will yield better estimates than a 2D calibration tool
    • …
    corecore