6 research outputs found

    Dense matching of multiple wide-baseline views

    An iterative scheme for motion-based scene segmentation

    We present an approach for dense estimation of the motion and depth of a scene containing multiple differently moving objects, with the camera system itself in motion. The estimates are used to segment the image sequence into independently moving objects by assigning the object hypothesis with maximum a posteriori (MAP) probability to each image point. Unlike previous approaches to 3-dimensional (3D) scene analysis, we tackle this task by first simultaneously estimating motion and depth for a salient set of feature points in a recursive manner. Based on the evolving set of estimated motion profiles, the scene depth is recovered densely from spatially and temporally separated views. Given the dense depth map and the set of tracked motion estimates, the likelihood that each image point belongs to one of the distinct motion profiles can be determined, and dense scene segmentation can be performed. Within our probabilistic model, the expectation-maximization (EM) algorithm is used to solve the inherent missing data problem. A Markov Random Field (MRF) is used to express our expectations on the spatial and temporal continuity of objects.
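    The per-pixel MAP assignment described in this abstract can be illustrated with a minimal sketch. It assumes the per-pixel likelihoods for each motion hypothesis are already available as an array, and it leaves out the EM iterations and the MRF smoothness term entirely. The function name `map_labels` and the array layout are hypothetical and are not taken from the paper.

```python
import numpy as np


def map_labels(likelihoods, priors):
    """Assign each pixel the motion hypothesis with the highest posterior.

    likelihoods: array of shape (K, H, W), p(observation | hypothesis k) per pixel
                 (assumed to be given; the paper derives them from depth and motion).
    priors:      array of shape (K,), prior probability of each hypothesis.
    Returns (labels, posterior) with shapes (H, W) and (K, H, W).
    """
    likelihoods = np.asarray(likelihoods, dtype=float)
    priors = np.asarray(priors, dtype=float)

    # Bayes' rule per pixel: posterior is proportional to likelihood times prior.
    joint = likelihoods * priors[:, None, None]
    evidence = joint.sum(axis=0, keepdims=True)
    posterior = joint / np.maximum(evidence, 1e-12)  # guard against division by zero

    # MAP decision: pick the hypothesis index with the largest posterior.
    labels = posterior.argmax(axis=0)
    return labels, posterior
```

    In a full pipeline, the posterior computed here would correspond to a simplified E-step, and a spatial prior such as an MRF would replace the fixed `priors` vector.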

    Enhanced Image-Based Visual Servoing Dealing with Uncertainties

    The applications of robots in industrial automation have increased considerably. There is increasing demand for dexterous and intelligent robots that can work in unstructured environments. Visual servoing has been developed to meet this need by integrating vision sensors into robotic systems. Although there has been significant development in visual servoing, some challenges remain in making it fully functional in industrial environments. The nonlinear nature of visual servoing and system uncertainties are among the problems affecting its control performance. The projection of the 3D scene onto the 2D image, which occurs in the camera, creates a source of uncertainty in the system. Another source of uncertainty lies in the parameters of the camera and the robot manipulator. Moreover, the limited field of view (FOV) of the camera is another issue influencing the control performance. There are two main types of visual servoing: position-based and image-based. This project aims to develop a series of new methods of image-based visual servoing (IBVS) that address the nonlinearity and uncertainty issues and improve the visual servoing performance of industrial robots. The first method is an adaptive switch IBVS controller for industrial robots in which the adaptive law deals with the uncertainties of the monocular camera in an eye-in-hand configuration. The proposed switch control algorithm decouples the rotational and translational camera motions and decomposes the IBVS control into three separate stages with different gains. This method can increase the system response speed and improve the tracking performance of IBVS while dealing with camera uncertainties. The second method is an image feature reconstruction algorithm based on the Kalman filter, which is proposed to handle the situation where the image features go outside the camera's FOV. 
The combination of the switch controller and the feature reconstruction algorithm not only improves the system response speed and tracking performance of IBVS, but also ensures the success of servoing in the case of feature loss. Next, in order to deal with external disturbances and uncertainties due to the depth of the features, a third new control method is designed that combines proportional derivative (PD) control with sliding mode control (SMC) on a 6-DOF manipulator. The properly tuned PD controller ensures fast tracking performance, and SMC deals with the external disturbances and depth uncertainties. In the last stage of the thesis, a fourth new method, a semi off-line trajectory planning method, is developed to perform IBVS tasks for a 6-DOF robotic manipulator system. In this method, the camera's velocity screw is parametrized using time-based profiles. The parameters of the velocity profile are then determined such that the velocity profile takes the robot to its desired position. This is done by minimizing the error between the initial and desired features. The algorithm for planning the orientation of the robot is decoupled from the position planning of the robot. This yields a convex optimization problem, which leads to a faster and more efficient algorithm. The merit of the proposed method is that it respects all of the system constraints. This method also considers the limitation caused by the camera's FOV. All the algorithms developed in the thesis are validated via tests on a 6-DOF Denso robot in an eye-in-hand configuration.
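    The Kalman-filter-based feature reconstruction mentioned in this abstract can be illustrated with a minimal sketch. It assumes a constant-velocity model for a single image feature in pixel coordinates; the thesis may use a different state or motion model. The class name `FeatureKalman` and the noise parameters `q` and `r` are hypothetical. While the feature is visible, `update` fuses each measurement; once it leaves the FOV, calling `predict` alone extrapolates its image position.

```python
import numpy as np


class FeatureKalman:
    """Constant-velocity Kalman filter for one image feature (u, v) in pixels."""

    def __init__(self, u, v, dt=1.0, q=1e-2, r=1.0):
        # State: [u, v, du, dv]; velocity is unknown at start, so it is set to zero.
        self.x = np.array([u, v, 0.0, 0.0], dtype=float)
        self.P = np.eye(4) * 10.0          # initial state uncertainty (assumed)
        self.F = np.eye(4)                 # state transition for constant velocity
        self.F[0, 2] = dt
        self.F[1, 3] = dt
        self.H = np.zeros((2, 4))          # only the position is measured
        self.H[0, 0] = 1.0
        self.H[1, 1] = 1.0
        self.Q = np.eye(4) * q             # process noise (assumed)
        self.R = np.eye(2) * r             # measurement noise (assumed)

    def predict(self):
        """Propagate the state one step; usable on its own when the feature is lost."""
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:2].copy()

    def update(self, z):
        """Correct the predicted state with a measured feature position z = (u, v)."""
        innovation = np.asarray(z, dtype=float) - self.H @ self.x
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ innovation
        self.P = (np.eye(4) - K @ self.H) @ self.P
        return self.x[:2].copy()
```

    A servo loop would call `predict` every cycle and `update` only when the feature is detected, feeding the returned position to the controller in either case.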

    Depth-Assisted Semantic Segmentation, Image Enhancement and Parametric Modeling

    This dissertation addresses the problem of employing 3D depth information to solve a number of traditionally challenging computer vision/graphics problems. Humans can perceive depth in the 3D world, which enables them to reconstruct layouts, recognize objects and understand the geometric space and semantic meanings of the visual world. It is therefore important to explore how 3D depth information can be utilized by computer vision systems to mimic these human abilities. This dissertation aims at employing 3D depth information to solve vision/graphics problems in the following areas: scene understanding, image enhancement, and 3D reconstruction and modeling. In addressing the scene understanding problem, we present a framework for semantic segmentation and object recognition on urban video sequences using only dense depth maps recovered from the video. Five view-independent 3D features that vary with object class are extracted from dense depth maps and used for segmenting and recognizing different object classes in street scene images. We demonstrate a scene parsing algorithm that uses only dense 3D depth information and outperforms approaches using sparse 3D or 2D appearance features. In addressing the image enhancement problem, we present a framework to overcome the imperfections of personal photographs of tourist sites using the rich information provided by large-scale internet photo collections (IPCs). By augmenting personal 2D images with 3D information reconstructed from IPCs, we address a number of traditionally challenging image enhancement tasks and achieve high-quality results using simple and robust algorithms. In addressing the 3D reconstruction and modeling problem, we focus on parametric modeling of flower petals, the most distinctive part of a plant. The complex structure, severe occlusions and wide variations make the reconstruction of their 3D models a challenging task. 
We overcome these challenges by combining data-driven modeling techniques with domain knowledge from botany. Taking a 3D point cloud of an input flower scanned from a single view, each segmented petal is fitted with a scale-invariant morphable petal shape model, which is constructed from individually scanned 3D exemplar petals. Novel constraints based on botany studies are incorporated into the fitting process to realistically reconstruct occluded regions and maintain correct 3D spatial relations. The main contribution of the dissertation is the intelligent use of 3D depth information to solve traditionally challenging vision/graphics problems. By developing advanced algorithms that run either automatically or with minimal user interaction, this dissertation aims to demonstrate that the 3D depth computed from multiple images contains rich information about the visual world and can therefore be intelligently utilized to recognize and understand the semantic meanings of scenes, efficiently enhance and augment single 2D images, and reconstruct high-quality 3D models.

    Computer vision in the space of light rays: plenoptic videogeometry and polydioptric camera design

    Most of the cameras used in computer vision, computer graphics, and image processing applications are designed to capture images that are similar to the images we see with our eyes. This enables an easy interpretation of the visual information by a human observer. Nowadays, though, more and more processing of visual information is done by computers. Thus, it is worth questioning whether these human-inspired "eyes" are the optimal choice for processing visual information using a machine. In this thesis I will describe how one can study problems in computer vision without reference to a specific camera model by studying the geometry and statistics of the space of light rays that surrounds us. The study of the geometry will allow us to determine all the possible constraints that exist in the visual input and could be utilized if we had a perfect sensor. Since no perfect sensor exists, we use signal processing techniques to examine how well the constraints between different sets of light rays can be exploited given a specific camera model. A camera is modeled as a spatio-temporal filter in the space of light rays, which lets us express the image formation process in a function approximation framework. This framework then allows us to relate the geometry of the imaging camera to the performance of the vision system with regard to the given task. In this thesis I apply this framework to the problem of camera motion estimation. I show how, by choosing the right camera design, we can solve for the camera motion using linear, scene-independent constraints that allow for robust solutions. This is compared to motion estimation using conventional cameras. In addition, we show how we can extract spatio-temporal models from multiple video sequences using multi-resolution subdivision surfaces.

    Motion — Stereo Integration for Depth Estimation
