
    Affine Approximation for Direct Batch Recovery of Euclidean Motion From Sparse Data

    We present a batch method for recovering Euclidean camera motion from sparse image data. The main purpose of the algorithm is to recover the motion parameters using as much of the available information and as few computational steps as possible. The algorithm thus places itself in the gap between factorisation schemes, which make use of all available information in the initial recovery step, and sequential approaches, which are able to handle sparseness in the image data. Euclidean camera matrices are approximated via the affine camera model, making the recovery direct in the sense that no intermediate projective reconstruction is made. Using a little-known closure constraint, the FA-closure, we are able to formulate the camera coefficients linearly in the entries of the affine fundamental matrices. The novelty of the presented work is twofold: firstly, the presented formulation allows not only for particularly good conditioning of the estimation of the initial motion parameters but also for an unprecedented diversity in the choice of possible regularisation terms; secondly, the new autocalibration scheme presented here is in practice guaranteed to yield a least-squares estimate of the calibration parameters. As a by-product, the affine camera model is rehabilitated as a useful model for most cameras and scene configurations, e.g. wide-angle lenses observing a scene at close range. Experiments on real and synthetic data demonstrate the ability to reconstruct scenes which are very problematic for previous structure-from-motion techniques due to local ambiguities and error accumulation.
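The linear role of the affine fundamental matrix in the abstract can be illustrated with a standard least-squares fit (a minimal sketch of the classical affine epipolar estimation that the FA-closure builds on, not the paper's closure constraint itself; all names are illustrative):

```python
import numpy as np

def affine_fundamental(p1, p2):
    """Least-squares fit of an affine fundamental matrix from point
    correspondences p1 <-> p2 (each an (N, 2) array). Every match
    satisfies a*x2 + b*y2 + c*x1 + d*y1 + e = 0, which is linear in
    the five unknown coefficients."""
    A = np.column_stack([p2[:, 0], p2[:, 1], p1[:, 0], p1[:, 1],
                         np.ones(len(p1))])
    _, _, Vt = np.linalg.svd(A)
    a, b, c, d, e = Vt[-1]       # null vector = smallest right singular vector
    return np.array([[0.0, 0.0, a],
                     [0.0, 0.0, b],
                     [c,   d,   e]])
```

For noise-free affine projections the stacked constraint matrix has rank 4, so the smallest singular vector recovers the coefficients exactly; with noisy matches the same code gives the least-squares estimate.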

    Detail Enhancing Denoising of Digitized 3D Models from a Mobile Scanning System

    The acquisition process of digitizing a large-scale environment produces an enormous amount of raw geometry data. This data is corrupted by system noise, which leads to 3D surfaces that are not smooth and details that are distorted. Any scanning system has noise associated with the scanning hardware, both digital quantization errors and measurement inaccuracies, but a mobile scanning system has additional system noise introduced by the pose estimation of the hardware during data acquisition. The combined system noise generates data that is not handled well by existing noise reduction and smoothing techniques. This research is focused on enhancing the 3D models acquired by mobile scanning systems used to digitize large-scale environments. These digitization systems combine a variety of sensors – including laser range scanners, video cameras, and pose estimation hardware – on a mobile platform for the quick acquisition of 3D models of real world environments. The data acquired by such systems are extremely noisy, often with significant details being on the same order of magnitude as the system noise. By utilizing a unique 3D signal analysis tool, a denoising algorithm was developed that identifies regions of detail and enhances their geometry, while removing the effects of noise on the overall model. The developed algorithm can be useful for a variety of digitized 3D models, not just those involving mobile scanning systems. The challenges faced in this study were the automatic processing needs of the enhancement algorithm, and the need to fill a gap in the area of 3D model analysis in order to reduce the effect of system noise on the 3D models. In this context, our main contributions are the automation and integration of a data enhancement method not well known to the computer vision community, and the development of a novel 3D signal decomposition and analysis tool. 
The new technologies featured in this document are intuitive extensions of existing methods to new dimensionality and applications. The totality of the research has been applied towards detail-enhancing denoising of scanned data from a mobile range scanning system, and results from both synthetic and real models are presented.

    Adorym: A multi-platform generic x-ray image reconstruction framework based on automatic differentiation

    We describe and demonstrate an optimization-based x-ray image reconstruction framework called Adorym. Our framework provides a generic forward model, allowing one code framework to be used for a wide range of imaging methods ranging from near-field holography to fly-scan ptychographic tomography. By using automatic differentiation for optimization, Adorym has the flexibility to refine experimental parameters including probe positions, multiple hologram alignment, and object tilts. It is written with strong support for parallel processing, allowing large datasets to be processed on high-performance computing systems. We demonstrate its use on several experimental datasets to show improved image quality through parameter refinement.
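The optimization-based reconstruction idea can be sketched with a toy linear forward model. In the sketch below the gradient is written by hand; Adorym instead differentiates its full physical forward model automatically, which is what makes refining probe positions, alignments, and tilts practical. The model and sizes are illustrative, not from the paper:

```python
import numpy as np

# Recover an object x from measurements b = A @ x by gradient descent
# on the squared data misfit 0.5*||A x - b||^2.
rng = np.random.default_rng(0)
A = rng.standard_normal((40, 20))        # illustrative forward model
x_true = rng.standard_normal(20)         # unknown object
b = A @ x_true                           # noiseless measurements

x = np.zeros(20)
step = 1.0 / np.linalg.norm(A, 2) ** 2   # safe step size for this objective
for _ in range(10000):
    grad = A.T @ (A @ x - b)             # hand-written gradient of the misfit
    x -= step * grad                     # descend toward the reconstruction
```

The same loop structure carries over when the forward model is nonlinear (wave propagation, probes, geometry); automatic differentiation then supplies `grad` without any manual derivation.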

    Statistical Models and Optimization Algorithms for High-Dimensional Computer Vision Problems

    Data-driven and computational approaches are showing significant promise in solving several challenging problems in various fields such as bioinformatics, finance and many branches of engineering. In this dissertation, we explore the potential of these approaches, specifically statistical data models and optimization algorithms, for solving several challenging problems in computer vision. In doing so, we contribute to the literatures of both statistical data models and computer vision. In the context of statistical data models, we propose principled approaches for solving robust regression problems, both linear and kernel, and the missing data matrix factorization problem. In computer vision, we propose statistically optimal and efficient algorithms for solving the remote face recognition and structure from motion (SfM) problems. The goal of robust regression is to estimate the functional relation between two variables from a given data set which might be contaminated with outliers. Under the reasonable assumption that there are fewer outliers than inliers in a data set, we formulate the robust linear regression problem as a sparse learning problem, which can be solved using efficient polynomial-time algorithms. We also provide sufficient conditions under which the proposed algorithms correctly solve the robust regression problem. We then extend our robust formulation to the case of kernel regression, specifically to propose a robust version of relevance vector machine (RVM) regression. Matrix factorization is used for finding a low-dimensional representation for data embedded in a high-dimensional space. Singular value decomposition is the standard algorithm for solving this problem. However, when the matrix has many missing elements this is a hard problem to solve. 
We formulate the missing data matrix factorization problem as a low-rank semidefinite programming problem (essentially a rank-constrained SDP), which allows us to find accurate and efficient solutions for large-scale factorization problems. Face recognition from remotely acquired images is a challenging problem because of variations due to blur and illumination. Using the convolution model for blur, we show that the set of all images obtained by blurring a given image forms a convex set. We then use convex optimization techniques to find the distances between a given blurred (probe) image and the gallery images to find the best match. Further, using a low-dimensional linear subspace model for illumination variations, we extend our theory in a similar fashion to recognize blurred and poorly illuminated faces. Bundle adjustment is the final optimization step of the SfM problem, where the goal is to obtain the 3-D structure of the observed scene and the camera parameters from multiple images of the scene. The traditional bundle adjustment algorithm, based on minimizing the l_2 norm of the image re-projection error, has cubic complexity in the number of unknowns. We propose an algorithm, based on minimizing the l_infinity norm of the re-projection error, that has quadratic complexity in the number of unknowns. This is achieved by reducing the large-scale optimization problem to many small-scale sub-problems, each of which can be solved using second-order cone programming.
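The sparse-outlier view of robust linear regression described above can be sketched as follows. This is a minimal alternating scheme for the model y = Xβ + s + noise with sparse outlier vector s, illustrating the formulation rather than reproducing the dissertation's exact polynomial-time algorithm; the function name and parameters are illustrative:

```python
import numpy as np

def robust_lstsq(X, y, lam=1.0, iters=100):
    """Robust linear regression under y = X @ beta + s + noise with
    sparse outliers s, by alternating minimization of
    0.5*||y - X beta - s||^2 + lam*||s||_1:
    a least-squares step for beta, a soft-threshold step for s."""
    s = np.zeros_like(y)
    for _ in range(iters):
        beta = np.linalg.lstsq(X, y - s, rcond=None)[0]
        r = y - X @ beta
        s = np.sign(r) * np.maximum(np.abs(r) - lam, 0.0)  # soft threshold
    return beta, s
```

Inlier residuals fall below the threshold and leave s at zero, while gross outliers are absorbed into s, so the least-squares step sees nearly clean data.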

    A Novel Feature-Based Approach for Indoor Monocular SLAM

    Camera tracking and the construction of a robust and accurate map in unknown environments are still challenging tasks in computer vision and robotic applications. Visual Simultaneous Localization and Mapping (SLAM) along with Augmented Reality (AR) are two important applications, and their performance is entirely dependent on the accuracy of the camera tracking routine. This paper presents a novel feature-based approach for the monocular SLAM problem using a hand-held camera in room-sized workspaces with a maximum scene depth of 4–5 m. At the core of the proposed method, there is a Particle Filter (PF) responsible for the estimation of extrinsic parameters of the camera. In addition, contrary to key-frame based methods, the proposed system tracks the camera frame by frame and constructs a robust and accurate map incrementally. Moreover, the proposed algorithm initially constructs a metric sparse map. To this end, a chessboard pattern with a known cell size has been placed in front of the camera for a few frames. This enables the algorithm to accurately compute the pose of the camera and therefore, the depths of the initially detected natural feature points are easily calculated. Afterwards, camera pose estimation for each new incoming frame is carried out in a framework that is merely working with a set of visible natural landmarks. Moreover, to recover the depth of the newly detected landmarks, a delayed approach based on linear triangulation is used. The proposed method is applied to a real-world VGA-quality video (640 × 480 pixels) where the translation error of the camera pose is less than 2 cm on average and the orientation error is less than 3 degrees, which indicates the effectiveness and accuracy of the developed algorithm.
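The delayed depth recovery mentioned above relies on linear triangulation. A standard DLT triangulation sketch (the generic textbook construction, not the paper's specific implementation) looks like this:

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation: recover a 3D point from its
    projections x1, x2 (image coordinates) under two 3x4 camera
    matrices P1, P2. Each view contributes two rows of A from the
    cross-product constraint x × (P X) = 0."""
    A = np.array([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]                 # homogeneous solution = smallest singular vector
    return X[:3] / X[3]        # dehomogenize
```

For noise-free projections the recovery is exact; with noisy detections the SVD gives the algebraic least-squares point, which is why such estimates are usually deferred until the baseline between views is large enough.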

    Deep learning for inverse problems in remote sensing: super-resolution and SAR despeckling

    The abstract is in the attachment.

    International Conference on Continuous Optimization (ICCOPT) 2019 Conference Book

    The Sixth International Conference on Continuous Optimization took place on the campus of the Technical University of Berlin, August 3-8, 2019. The ICCOPT is a flagship conference of the Mathematical Optimization Society (MOS), organized every three years. ICCOPT 2019 was hosted by the Weierstrass Institute for Applied Analysis and Stochastics (WIAS) Berlin. It included a Summer School and a Conference with a series of plenary and semi-plenary talks, organized and contributed sessions, and poster sessions. This book comprises the full conference program. It contains, in particular, the scientific program both in overview and in full detail, as well as information on the social program, the venue, special meetings, and more.

    Robust and Accurate Structure from Motion of Rigid and Nonrigid Objects

    As a central theme in computer vision, the problem of 3D structure and motion recovery from image sequences has been widely studied during the past three decades, and considerable progress has been made in theory, as well as in practice. However, there are still several challenges remaining, including algorithm robustness and accuracy, especially for nonrigid modeling. The thesis focuses on solving these challenges and several new robust and accurate algorithms have been proposed. The first part of the thesis reviews the state-of-the-art techniques of structure and motion factorization. First, an introduction of structure from motion and some mathematical background of the technique is presented. Then, the general idea and different formulations of structure from motion for rigid and nonrigid objects are discussed. The second part covers the proposed quasi-perspective projection model and its application to structure and motion factorization. Previous algorithms are based on either a simplified affine assumption or a complicated full perspective projection model. The affine model is widely adopted due to its simplicity, whereas the extension to full perspective suffers from recovering projective depths. A quasi-perspective model is proposed to fill the gap between the two models. It is more accurate than the affine model from both theoretical analysis and experimental studies. More geometric properties of the model are investigated in the context of one- and two-view geometry. Finally, the model was applied to structure from motion and a framework of rigid and nonrigid factorization under quasi-perspective assumption is established. The last part of the thesis is focused on the robustness and three new algorithms are proposed. First, a spatial-and-temporal-weighted factorization algorithm is proposed to handle significant image noise, where the uncertainty of image measurement is estimated from a new perspective by virtue of reprojection residuals. 
Second, a rank-4 affine factorization algorithm is proposed to avoid the difficulty of image alignment with erroneous data, followed by a robust factorization scheme that can work with missing and outlying data. Third, the robust algorithm is extended to nonrigid scenarios and a new augmented nonrigid factorization algorithm is proposed to handle imperfect tracking data. The main contributions of the thesis are as follows: The proposed quasi-perspective projection model fills the gap between the simplicity of the affine model and the accuracy of the perspective model. Its application to structure and motion factorization greatly increases the efficiency and accuracy of the algorithm. The proposed robust algorithms do not require prior information of image measurement and greatly improve the overall accuracy and robustness of previous approaches. Moreover, the algorithms can also be applied directly to structure from motion of nonrigid objects.
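The affine factorization baseline that the thesis builds on can be sketched as a rank-4 SVD truncation of the stacked measurement matrix (a generic illustration of classical affine factorization, not the thesis's robust or quasi-perspective variants):

```python
import numpy as np

def affine_factorize(W):
    """Rank-4 factorization of a 2F x P measurement matrix W holding
    the image coordinates of P points tracked through F affine views.
    With the per-frame translations kept in the motion part, W = M @ S
    is (at most) rank 4, so a truncated SVD recovers motion M (2F x 4)
    and structure S (4 x P), up to a 4x4 affine ambiguity."""
    U, sv, Vt = np.linalg.svd(W, full_matrices=False)
    M = U[:, :4] * sv[:4]    # absorb singular values into the motion part
    S = Vt[:4]
    return M, S
```

On noisy or outlier-contaminated tracks the plain SVD step degrades quickly, which is the failure mode the thesis's robust and weighted schemes are designed to address.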

    Super resolution and dynamic range enhancement of image sequences

    Camera producers try to increase the spatial resolution of a camera by reducing the size of the sites on the sensor array. However, shot noise causes the signal-to-noise ratio to drop as sensor sites get smaller. This fact motivates resolution enhancement to be performed in software. Super resolution (SR) image reconstruction aims to combine degraded images of a scene in order to form an image which has a higher resolution than all observations. There is a demand for high-resolution images in biomedical imaging, surveillance, aerial/satellite imaging and high-definition TV (HDTV) technology. Although extensive research has been conducted in SR, attention has not been given to increasing the resolution of images under illumination changes. In this study, a unique framework is proposed to increase the spatial resolution and dynamic range of a video sequence using Bayesian and Projection onto Convex Sets (POCS) methods. Incorporating camera response function estimation into image reconstruction allows dynamic range enhancement along with spatial resolution improvement. Photometrically varying input images complicate the process of projecting observations onto a common grid by violating brightness constancy. A contrast-invariant feature transform is proposed in this thesis to register input images with high illumination variation. The proposed algorithm increases the repeatability rate of detected features among the frames of a video. The repeatability rate is increased by computing the autocorrelation matrix using the gradients of contrast-stretched input images. The presented contrast-invariant feature detection improves the repeatability rate of the Harris corner detector by around 25% on average. Joint multi-frame demosaicking and resolution enhancement is also investigated in this thesis. A color constancy constraint set is devised and incorporated into the POCS framework for increasing the resolution of color-filter-array sampled images. 
The proposed method produces fewer demosaicking artifacts than the existing POCS method and a higher visual quality in the final image.
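The POCS idea of repeatedly projecting an estimate onto data-consistency sets can be illustrated on a 1D toy super-resolution problem, where each low-resolution sample is the average of two high-resolution samples and two shifted LR frames are fused. This is a hedged sketch of the projection mechanics only; the thesis works with 2D images, motion registration, camera response functions, and color-filter arrays:

```python
import numpy as np

def project_pairs(x, y, offset):
    """Project x onto the affine set {x : mean(x[offset+2k], x[offset+2k+1])
    = y[k]}. The pairs do not overlap, so one sweep is an exact projection."""
    x = x.copy()
    for k in range(len(y)):
        i = offset + 2 * k
        c = y[k] - 0.5 * (x[i] + x[i + 1])   # consistency violation
        x[i] += c                             # split the correction equally
        x[i + 1] += c
    return x

rng = np.random.default_rng(1)
hr = rng.standard_normal(16)              # unknown high-resolution signal
y0 = 0.5 * (hr[0::2] + hr[1::2])          # LR frame 1: unshifted averages
y1 = 0.5 * (hr[1:-1:2] + hr[2::2])        # LR frame 2: half-pixel shift

x = np.zeros(16)                          # initial HR estimate
for _ in range(5000):                     # POCS: alternate the projections
    x = project_pairs(x, y0, 0)
    x = project_pairs(x, y1, 1)
```

Because both constraint sets are affine and contain the true signal, the alternating projections converge to an estimate consistent with every observation; additional convex sets (e.g. color constancy) slot into the same loop.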

    Robust motion segmentation with subspace constraints

    Motion segmentation is an important task in computer vision with many applications such as dynamic scene understanding and multi-body structure from motion. When the point correspondences across frames are given, motion segmentation can be addressed as a subspace clustering problem under an affine camera model. In the first two parts of this thesis, we target the general subspace clustering problem and propose two novel methods, namely Efficient Dense Subspace Clustering (EDSC) and the Robust Shape Interaction Matrix (RSIM) method. Instead of following the standard compressive sensing approach, in EDSC we formulate subspace clustering as a Frobenius norm minimization problem, which inherently yields denser connections between data points. While in the noise-free case we rely on the self-expressiveness of the observations, in the presence of noise we recover a clean dictionary to represent the data. Our formulation lets us solve the subspace clustering problem efficiently. More specifically, for outlier-free observations, the solution can be obtained in closed-form, and in the presence of outliers, we solve the problem by performing a series of linear operations. Furthermore, we show that our Frobenius norm formulation shares the same solution as the popular nuclear norm minimization approach when the data is free of any noise. In RSIM, we revisit the Shape Interaction Matrix (SIM) method, one of the earliest approaches for motion segmentation (or subspace clustering), and reveal its connections to several recent subspace clustering methods. We derive a simple, yet effective algorithm to robustify the SIM method and make it applicable to real-world scenarios where the data is corrupted by noise. We validate the proposed method with intuitive examples and justify it with matrix perturbation theory. Moreover, we show that RSIM can be extended to handle missing data with a Grassmannian gradient descent method. 
The above subspace clustering methods work well for motion segmentation, yet they require that point trajectories across frames are known a priori. However, finding point correspondences is in itself a challenging task. Existing approaches tackle the correspondence estimation and motion segmentation problems separately. In the third part of this thesis, given a set of feature points detected in each frame of the sequence, we develop an approach which simultaneously performs motion segmentation and finds point correspondences across the frames. We formulate this problem in terms of Partial Permutation Matrices (PPMs) and aim to match feature descriptors while simultaneously encouraging point trajectories to satisfy subspace constraints. This lets us handle outliers in both point locations and feature appearance. The resulting optimization problem is solved via the Alternating Direction Method of Multipliers (ADMM), where each subproblem has an efficient solution. In particular, we show that most of the subproblems can be solved in closed-form, and one binary assignment subproblem can be solved by the Hungarian algorithm. Obtaining reliable feature tracks in a frame-by-frame manner is desirable in applications such as online motion segmentation. In the final part of the thesis, we introduce a novel multi-body feature tracker that exploits a multi-body rigidity assumption to improve tracking robustness under a general perspective camera model. A conventional approach to addressing this problem would consist of alternating between solving two subtasks: motion segmentation and feature tracking under rigidity constraints for each segment. This approach, however, requires knowing the number of motions, as well as assigning points to motion groups, which is typically sensitive to motion estimates. 
By contrast, we introduce a segmentation-free solution to multi-body feature tracking that bypasses the motion assignment step and reduces to solving a series of subproblems with closed-form solutions. In summary, in this thesis, we exploit powerful subspace constraints and develop robust motion segmentation methods for different challenging scenarios where the trajectories are either given as input or unknown beforehand. We also present a general robust multi-body feature tracker which can be used as the first step of motion segmentation to obtain reliable trajectories.
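The Frobenius-norm self-expressive formulation at the heart of EDSC admits a closed-form solution in the outlier-free case, as the abstract notes. A minimal sketch with illustrative data, sizes, and regularization (two well-separated lines in R^3 standing in for motion subspaces):

```python
import numpy as np

rng = np.random.default_rng(0)
# 10 points on each of two 1-D subspaces (lines) of R^3
d1 = np.array([1.0, 0.2, 0.0])
d2 = np.array([0.0, 1.0, 0.3])
X = np.column_stack([np.outer(d1, rng.standard_normal(10)),
                     np.outer(d2, rng.standard_normal(10))])

# Closed-form self-expressive coding:
#   C = argmin ||X - X C||_F^2 + lam * ||C||_F^2 = (X'X + lam I)^{-1} X'X
lam = 0.1
G = X.T @ X
C = np.linalg.solve(G + lam * np.eye(20), G)

# Symmetrized affinity: points on the same subspace connect strongly,
# so the matrix is approximately block-diagonal under the true grouping.
A = np.abs(C) + np.abs(C).T
within = 0.5 * (A[:10, :10].mean() + A[10:, 10:].mean())
cross = A[:10, 10:].mean()
```

Spectral clustering on the affinity A would then yield the segmentation; the thesis's contribution lies in making this pipeline efficient and robust to noise, outliers, and missing trajectories.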