
    A sliding mode approach to visual motion estimation

    The problem of estimating motion from a sequence of images has been a major research theme in machine vision for many years and remains one of the most challenging ones. In this work, we use sliding mode observers to estimate the motion of a moving body with the aid of a CCD camera. We consider a variety of dynamical systems which arise in machine vision applications and develop a novel identification procedure for the estimation of both constant and time-varying parameters. The basic procedure introduced for parameter estimation is to recast the image feature dynamics linearly in terms of the unknown parameters, construct a sliding mode observer that produces asymptotically correct estimates of the observed image features, and then use the “equivalent control” to explicitly compute the parameters. Much of our analysis has been substantiated by computer simulations and real experiments.
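    The “equivalent control” idea can be sketched on a scalar system whose feature dynamics are linear in one unknown parameter, y' = φ(t)·θ: a switching observer drives the tracking error to zero, and low-pass filtering the switching term recovers φ(t)·θ, from which θ is read off. The gains, filter constant and test signal below are illustrative assumptions, not values from the paper.

```python
import numpy as np

def sliding_mode_parameter_estimate(phi, theta_true, dt=1e-4, T=2.0, M=5.0, tau=0.05):
    """Sliding mode observer sketch: track y, then low-pass filter the
    switching injection to obtain the 'equivalent control' phi(t)*theta."""
    n = int(T / dt)
    y, y_hat, u_lp, theta_hat = 0.0, 0.0, 0.0, 0.0
    for k in range(n):
        t = k * dt
        p = phi(t)
        y += p * theta_true * dt          # true feature dynamics: y' = phi(t)*theta
        u = M * np.sign(y - y_hat)        # switching injection (M bounds |phi*theta|)
        y_hat += u * dt                   # observer driven only by the injection
        u_lp += (u - u_lp) * dt / tau     # low-pass filter -> equivalent control
        if abs(p) > 0.2:                  # avoid dividing where phi is near zero
            theta_hat = u_lp / p
    return theta_hat

# Recover a constant parameter from a time-varying regressor phi(t)
theta_est = sliding_mode_parameter_estimate(lambda t: 1.0 + 0.5 * np.sin(2 * t),
                                            theta_true=0.8)
```

The low-pass filter trades chattering ripple against lag, so the estimate is asymptotic rather than exact.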

    Electronic Image Stabilization for Mobile Robotic Vision Systems

    When a camera is affixed to a dynamic mobile robot, image stabilization is the first step towards more complex analysis of the video feed. This thesis presents a novel electronic image stabilization (EIS) algorithm for small, inexpensive, highly dynamic mobile robotic platforms with onboard camera systems. The algorithm combines optical flow motion parameter estimation with angular rate data provided by a strapdown inertial measurement unit (IMU). A discrete Kalman filter in feedforward configuration is used for optimal fusion of the two data sources. Performance evaluations are conducted using a simulated video truth model (capturing the effects of image translation, rotation, blurring, and moving objects) and live test data. Live data was collected from a camera and IMU affixed to the DAGSI Whegs™ mobile robotic platform as it navigated through a hallway. Template matching, feature detection, optical flow, and inertial measurement techniques are compared and analyzed to determine the most suitable algorithm for this specific type of image stabilization. Pyramidal Lucas-Kanade optical flow using Shi-Tomasi good features, in combination with inertial measurement, is found to be the superior EIS algorithm. In the presence of moving objects, fusion of inertial measurement reduces the root-mean-squared (RMS) error of the optical flow motion parameter estimates by 40%. No previous image stabilization algorithm directly fuses optical flow estimation with inertial measurement by way of Kalman filtering.
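    The fusion step can be sketched with a scalar Kalman filter over a random-walk rate model. The thesis uses a feedforward configuration with a full IMU; this sketch simply fuses a noisy optical-flow rate and a less noisy gyro rate each frame, and all variances are illustrative assumptions.

```python
import numpy as np

def fuse_rates(flow_rates, gyro_rates, r_flow=4.0, r_gyro=1.0, q=0.01):
    """Scalar Kalman filter fusing two rate measurements per frame.
    r_flow, r_gyro are measurement variances; q is process noise."""
    x, p = 0.0, 1.0
    out = []
    for z_f, z_g in zip(flow_rates, gyro_rates):
        p += q                              # predict: random-walk rate model
        for z, r in ((z_f, r_flow), (z_g, r_gyro)):
            k = p / (p + r)                 # Kalman gain for this measurement
            x += k * (z - x)                # measurement update
            p *= (1 - k)
        out.append(x)
    return np.array(out)

# Synthetic check: constant true rate, flow noisier than gyro
rng = np.random.default_rng(0)
true_rate = np.full(500, 0.3)
fused = fuse_rates(true_rate + rng.normal(0, 2.0, 500),
                   true_rate + rng.normal(0, 1.0, 500))
```

Weighting each source by its variance is what lets the gyro dominate when moving objects corrupt the optical-flow estimate.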

    Bayesian Methods for Radiometric Calibration in Motion Picture Encoding Workflows

    A method for estimating the Camera Response Function (CRF) of an electronic motion picture camera is presented in this work. Accurate estimation of the CRF allows for proper encoding of camera exposures into motion picture post-production workflows, such as the Academy Color Encoding Specification (ACES); this is a necessary step to correctly combine images from different capture sources into one cohesive final production and to minimize non-creative manual adjustments. Although there are well-known standard CRFs implemented in typical video camera workflows, motion picture workflows and newer High Dynamic Range (HDR) imaging workflows have introduced new standard CRFs as well as custom and proprietary CRFs that need to be known for proper post-production encoding of the camera footage. Current methods to estimate this function rely on the use of measurement charts, use multiple static images taken under different exposures or lighting conditions, or assume a simplistic model of the function’s shape. All these methods become problematic and difficult to fit into motion picture production and post-production workflows, where the use of test charts and varying camera or scene setups is impractical and where a method based solely on camera footage, comprising a single image or a series of images, would be advantageous. This work presents a methodology, initially based on the work of Lin, Gu, Yamazaki and Shum, that takes into account edge color mixtures in an image or image sequence, which are affected by the non-linearity introduced by a CRF. In addition, a novel feature based on image noise is introduced to overcome some of the limitations of edge color mixtures. These features provide information that is included in the likelihood probability distribution in a Bayesian framework to estimate the CRF as the expected value of a posterior probability distribution, which is itself approximated by a Markov Chain Monte Carlo (MCMC) sampling algorithm. 
This allows for a more complete description of the CRF than methods like Maximum Likelihood (ML) and Maximum A Posteriori (MAP) estimation. The CRF is modeled by Principal Component Analysis (PCA) of the Database of Response Functions (DoRF) compiled by Grossberg and Nayar, and the prior probability distribution is modeled by a Gaussian Mixture Model (GMM) of the PCA coefficients for the responses in the DoRF. CRF estimation results are presented for an ARRI electronic motion picture camera, showing the improved estimation accuracy and practicality of this method over previous methods for motion picture post-production workflows.
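    The posterior-mean-by-MCMC step can be sketched with a Metropolis-Hastings sampler. The actual method parameterizes the CRF with PCA coefficients of the DoRF and a GMM prior; here a one-parameter gamma curve f(x) = x**g with a Gaussian prior stands in, and all constants are illustrative assumptions.

```python
import numpy as np

def mh_posterior_mean_gamma(xs, ys, sigma=0.02, n_iter=4000, step=0.05, seed=1):
    """Metropolis-Hastings sketch: sample the posterior over the gamma
    exponent g and return its mean (not a point MAP estimate)."""
    rng = np.random.default_rng(seed)

    def log_post(g):
        if g <= 0:
            return -np.inf
        resid = ys - xs ** g                       # Gaussian likelihood
        return (-0.5 * np.sum(resid ** 2) / sigma ** 2
                - 0.5 * (g - 2.0) ** 2)            # prior: g ~ N(2, 1)

    g, lp = 2.0, log_post(2.0)
    samples = []
    for i in range(n_iter):
        g_new = g + step * rng.normal()            # random-walk proposal
        lp_new = log_post(g_new)
        if np.log(rng.random()) < lp_new - lp:     # accept/reject
            g, lp = g_new, lp_new
        if i >= n_iter // 2:                       # discard burn-in
            samples.append(g)
    return np.mean(samples)                        # posterior expectation

# Recover g = 2.2 from noisy samples of the curve
xs = np.linspace(0.05, 1.0, 50)
rng = np.random.default_rng(0)
g_hat = mh_posterior_mean_gamma(xs, xs ** 2.2 + rng.normal(0, 0.02, 50))
```

Averaging the chain, rather than taking its mode, is what distinguishes the posterior-mean estimate from an ML or MAP fit.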

    Motion-based Segmentation and Classification of Video Objects

    In this thesis, novel algorithms for the segmentation and classification of video objects are developed. The segmentation procedure is based on motion and is able to extract moving objects acquired by either a static or a moving camera. The classification of those objects is performed by matching their outlines, gathered from a number of consecutive frames of the video, with preprocessed views of prototypical objects stored in a database. This thesis contributes to four areas of image processing and computer vision: motion analysis, implicit active contour models, motion-based segmentation, and object classification. In detail, in the field of motion analysis, the tensor-based motion estimation approach is extended by a non-maximum suppression scheme, which significantly improves the identification of relevant image structures. In order to analyze videos that contain large image displacements, a feature-based motion estimation method is developed. In addition, to include camera operations in the segmentation process, a robust camera motion estimator based on least trimmed squares regression is presented. In the area of implicit active contour models, a model that unifies geometric and geodesic active contours is developed. For this model, an efficient numerical implementation based on a new narrow-band method and a semi-implicit discretization is provided. Compared to standard algorithms, these optimizations reduce the computational complexity significantly. Integrating the results of the motion analysis into the fast active contour implementation, novel algorithms for motion-based segmentation are developed. In the field of object classification, a shape-based classification approach is extended and adapted to image sequence processing. Finally, a system for video object classification is derived by combining the proposed motion-based segmentation algorithms with the shape-based classification approach.
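    The least trimmed squares (LTS) regression used for robust camera motion estimation can be sketched on a 1-D line fit: minimize the sum of the h smallest squared residuals so gross outliers (here standing in for independently moving objects) cannot influence the fit. Random two-point starts with concentration steps give a simplified FAST-LTS; all constants are illustrative assumptions.

```python
import numpy as np

def least_trimmed_squares_line(x, y, h_frac=0.6, n_starts=200, seed=0):
    """LTS sketch: keep only the h best-fitting points when scoring a fit,
    refining each random start by refitting on its current inlier set."""
    rng = np.random.default_rng(seed)
    n = len(x)
    h = int(h_frac * n)
    A = np.column_stack([x, np.ones(n)])
    best = (np.inf, None)
    for _ in range(n_starts):
        i, j = rng.choice(n, 2, replace=False)     # random elemental start
        if x[i] == x[j]:
            continue
        a = (y[j] - y[i]) / (x[j] - x[i])
        coef = np.array([a, y[i] - a * x[i]])
        for _ in range(3):                         # concentration steps
            r2 = (y - A @ coef) ** 2
            keep = np.argsort(r2)[:h]              # h smallest residuals
            coef, *_ = np.linalg.lstsq(A[keep], y[keep], rcond=None)
        cost = np.sort((y - A @ coef) ** 2)[:h].sum()
        if cost < best[0]:
            best = (cost, coef)
    return best[1]

# 20% gross outliers do not pull the fit off the true line y = 3x + 1
x = np.linspace(0, 1, 100)
y = 3 * x + 1.0
y[::5] += 10.0
slope, intercept = least_trimmed_squares_line(x, y)
```

The same trimming idea applies when the model is a camera motion (e.g. affine) fit to feature correspondences instead of a line.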

    Vision-based 3D Pose Retrieval and Reconstruction

    The analysis of people and the understanding of their motions are key components in many applications such as sports science, biomechanics, medical rehabilitation, animated movie production and the game industry. In this context, retrieval and reconstruction of articulated 3D human poses are significant sub-problems. In this dissertation, we address the problem of retrieving and reconstructing 3D poses from a monocular video or even from a single RGB image. We propose several data-driven pipelines to retrieve and reconstruct 3D poses by exploiting motion capture data as a prior. The main focus of our proposed approaches is to bridge the gap between the separate media of 3D marker-based recording and the capture of motions or photographs using a simple RGB camera. In principle, we leverage both media together efficiently for 3D pose estimation. We show that our proposed methodologies do not require any synchronized 3D-2D pose-image pairs to retrieve and reconstruct the final 3D poses, and are flexible enough to capture motion in any studio-like indoor environment or outdoor natural environment. In the first part of the dissertation, we propose model-based approaches for full-body human motion reconstruction from video input by employing just the 2D joint positions of the four end effectors and the head. We resolve the 3D-2D pose-image cross-model correspondence by developing an intermediate container, a knowledge base built from the motion capture data, which contains information about how people move. It includes the 3D normalized pose space and the corresponding synchronized 2D normalized pose space, created by utilizing a number of virtual cameras. We first detect and track the features of these five joints from the input motion sequences using SURF, MSER and colorMSER feature detectors, which vote for the possible 2D locations of these joints in the video. 
The extraction of suitable feature sets from both the input control signals and the motion capture data enables us to retrieve the closest instances from the motion capture dataset using fast searching and retrieval techniques. We develop a graphical structure, the online lazy neighbourhood graph, to make the similarity search more accurate and robust by exploiting the temporal coherence of the input control signals. The retrieved prior poses are further exploited to stabilize the feature detection and tracking process. Finally, the 3D motion sequences are reconstructed by a non-linear optimizer that takes into account multiple energy terms. We evaluate our approaches with a series of experimental scenarios designed in terms of performing actors, camera viewpoints and noisy inputs. Only a little preprocessing is needed by our methods, and the reconstruction process runs close to real time. The second part of the dissertation is dedicated to 3D human pose estimation from a single monocular image. First, we propose an efficient 3D pose retrieval strategy which leads to a novel data-driven approach for reconstructing a 3D human pose from a monocular still image. We design multiple feature sets for global similarity search. At runtime, we search for similar poses in a motion capture dataset in a feature space made up of specific joints. We introduce a two-fold method for camera estimation, where we exploit the view directions at which we sample the MoCap dataset, as well as the MoCap priors, to minimize the projection error. We also benefit from the MoCap priors and the joints' weights to learn a low-dimensional local 3D pose model, which is further constrained by multiple energies to infer the final 3D human pose. We thoroughly evaluate our approach on synthetically generated examples, real internet images and hand-drawn sketches. 
We achieve state-of-the-art results when the test and MoCap data come from the same dataset, and obtain competitive results when the motion capture data is taken from a different dataset. Second, we propose a dual-source approach for 3D pose estimation from a single RGB image. One major challenge for 3D pose estimation from a single RGB image is the acquisition of sufficient training data. In particular, collecting large amounts of training data that contain unconstrained images annotated with accurate 3D poses is infeasible. We therefore propose to use two independent training sources: the first consists of images with annotated 2D poses, and the second consists of accurate 3D motion capture data. To integrate both sources, we propose a dual-source approach that combines 2D pose estimation with efficient and robust 3D pose retrieval. In our experiments, we show that our approach achieves state-of-the-art results and remains competitive even when the skeleton structures of the two sources differ substantially. In the last part of the dissertation, we focus on how the different techniques developed for human motion capture, retrieval and reconstruction can be adapted to handle quadruped motion capture data, and which new applications this enables. We discuss some particularities which must be considered when capturing large animal motions. For retrieval, we derive suitable feature sets to perform fast searches in the MoCap dataset for similar motion segments. Finally, we present a data-driven approach to reconstruct quadruped motions from video input data.
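    The retrieval core of such a pipeline can be sketched as a nearest-neighbour search over normalized 2D projections of MoCap poses. The dissertation feeds the retrieved neighbours into a constrained local pose model; this sketch merely averages their 3D joints as a crude prior, and all data below is synthetic.

```python
import numpy as np

def retrieve_poses(query_2d, db_2d, db_3d, k=3):
    """Find the k MoCap poses whose normalized 2D projections are closest
    to the normalized query, and return their indices plus the mean of
    their 3D joints as a simple pose prior."""
    def normalize(p):                       # root-center, then scale-normalize
        p = p - p.mean(axis=0)
        return p / np.linalg.norm(p)

    q = normalize(query_2d).ravel()
    feats = np.stack([normalize(p).ravel() for p in db_2d])
    dists = np.linalg.norm(feats - q, axis=1)
    idx = np.argsort(dists)[:k]
    return idx, db_3d[idx].mean(axis=0)

# Synthetic database: 50 random poses of 15 joints, orthographic projection
rng = np.random.default_rng(0)
db_3d = rng.normal(size=(50, 15, 3))
db_2d = db_3d[:, :, :2]
idx, pose_prior = retrieve_poses(db_2d[7], db_2d, db_3d)
```

Normalizing for root position and scale is what makes the search invariant to where the subject stands in the image.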

    Homography-based ground plane detection using a single on-board camera

    This study presents a robust method for ground plane detection in vision-based systems with a non-stationary camera. The proposed method is based on the reliable estimation of the homography between ground planes in successive images. This homography is computed using a feature matching approach, which, in contrast to classical approaches to on-board motion estimation, does not require explicit ego-motion calculation. Instead, a novel homography calculation method based on a linear estimation framework is presented. This framework provides predictions of the ground plane transformation matrix that are dynamically updated with new measurements. The method is especially suited for challenging environments, in particular traffic scenarios, in which information is scarce and the homography computed from the images is often inaccurate or erroneous. The proposed estimation framework is able to remove erroneous measurements and to correct inaccurate ones, hence producing a reliable homography estimate at each instant. It is based on evaluating the difference between the predicted and the observed transformations, measured by the spectral norm of the associated matrix of differences. Moreover, an example is provided of how to use the information extracted from ground plane estimation to achieve object detection and tracking. The method has been successfully demonstrated for the detection of moving vehicles in traffic environments.
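    The spectral-norm gating step can be sketched directly: normalize the predicted and observed homographies to a common scale and reject the measurement when the largest singular value of their difference exceeds a threshold. The tolerance value below is an illustrative assumption.

```python
import numpy as np

def accept_homography(H_pred, H_obs, tol=0.1):
    """Gate a measured homography against the prediction using the
    spectral norm (largest singular value) of their difference.
    Both matrices are scale-normalized by their (2, 2) entry first."""
    Hp = H_pred / H_pred[2, 2]
    Ho = H_obs / H_obs[2, 2]
    return np.linalg.norm(Hp - Ho, ord=2) <= tol   # ord=2: spectral norm

# A small perturbation passes the gate; a gross error is rejected
H_pred = np.eye(3)
H_good = np.eye(3) + 0.01 * np.ones((3, 3))
H_bad = np.diag([1.5, 1.5, 1.0])
```

Normalizing by the (2, 2) entry removes the projective scale ambiguity, so the norm compares the transformations rather than arbitrary matrix scalings.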