    Model-Based High-Dimensional Pose Estimation with Application to Hand Tracking

    This thesis presents novel techniques for computer-vision-based full-DOF human hand motion estimation. Our main contributions are: a robust skin color estimation approach; a novel resolution-independent and memory-efficient representation of hand pose silhouettes, which allows us to compute area-based similarity measures in near-constant time; a set of new segmentation-based similarity measures; a new class of similarity measures that work for nearly arbitrary input modalities; a novel edge-based similarity measure that avoids any problematic thresholding or discretization and can be computed very efficiently in Fourier space; a template hierarchy to minimize the number of similarity computations needed for finding the most likely hand pose observed; and finally, a novel image space search method, which we naturally combine with our hierarchy. Consequently, matching can efficiently be formulated as a simultaneous template tree traversal and function maximization.

    Automated Markerless Extraction of Walking People Using Deformable Contour Models

    We develop a new automated markerless motion capture system for the analysis of walking people. We employ global evidence gathering techniques guided by biomechanical analysis to robustly extract articulated motion. This forms a basis for new deformable contour models, using local image cues to capture shape and motion at a more detailed level. We extend the greedy snake formulation to include temporal constraints and occlusion modelling, increasing the capability of this technique when dealing with cluttered and self-occluding extraction targets. This approach is evaluated on a large database of indoor and outdoor video data, demonstrating fast and autonomous motion capture for walking people.

    On the merits of the Gaussian Mixture as a model for oriented edgel distributions

    The aim of this report is to establish the credibility of the Gaussian Mixture Model (GMM) as a model for the distributions of oriented edgels of rigid and biological objects in noisy images. This is tackled in two stages: first, the response of the Sobel filter to noisy pixels is analysed to show that the result holds for smooth rigid objects. Second, arguments are presented to support the proposition that the model can also effectively capture the added uncertainty introduced by natural shape variation, as found in images of biological objects. The result has particular application in the extension of the Generalized Hough Transform (GHT) to deformable shapes; in particular, it offers a tailored and manipulable alternative to the non-parametric kernel density estimate used by Ecabert and Thiran.
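    As a rough illustration of what an "oriented edgel" is, the sketch below applies the standard 3x3 Sobel kernels to a tiny synthetic image and returns each edge element with its gradient magnitude and orientation. It is not taken from the report: the function name `sobel_edgels`, the `min_magnitude` threshold and the test image are assumptions made for this example, and fitting a GMM to the resulting edgels is not shown.

```python
import math

# Standard 3x3 Sobel kernels (horizontal and vertical derivative estimates).
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_edgels(image, min_magnitude=1.0):
    """Return (row, col, magnitude, orientation) tuples for interior pixels.

    `image` is a list of equal-length rows of numbers. `min_magnitude` is a
    hypothetical threshold chosen for this sketch, not a value from the report.
    The kernels are applied as a correlation (no kernel flip).
    """
    rows, cols = len(image), len(image[0])
    edgels = []
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            gx = sum(SOBEL_X[i][j] * image[r - 1 + i][c - 1 + j]
                     for i in range(3) for j in range(3))
            gy = sum(SOBEL_Y[i][j] * image[r - 1 + i][c - 1 + j]
                     for i in range(3) for j in range(3))
            magnitude = math.hypot(gx, gy)
            if magnitude >= min_magnitude:
                # Orientation of the gradient in radians, in (-pi, pi].
                edgels.append((r, c, magnitude, math.atan2(gy, gx)))
    return edgels


# Synthetic 4x4 image with a vertical step edge (dark left, bright right).
image = [[0, 0, 10, 10] for _ in range(4)]
edgels = sobel_edgels(image)
for row, col, magnitude, orientation in edgels:
    print(row, col, magnitude, orientation)
```

    With image noise added, the magnitudes and orientations would scatter around these values; the distribution of that scatter is what the report proposes to model with a Gaussian mixture.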

    Automatic Lumbar Vertebrae Segmentation in Fluoroscopic Images via Optimised Concurrent Hough Transform

    Low back pain is a very common problem in industrialised countries and its associated cost is enormous. Diagnosis of the underlying causes can be extremely difficult. Many studies have focused on mechanical disorders of the spine. Digital videofluoroscopy (DVF) has been widely used to obtain images for motion studies. This can provide motion sequences of the lumbar spine, but the images obtained often suffer from noise, exacerbated by the very low radiation dosage. Thus determining vertebrae position within the image sequence presents a considerable challenge. In this paper, we show how our new approach can automatically detect the positions and borders of vertebrae concurrently, relieving many of the problems experienced in other approaches. First, we use phase congruency to relieve the difficulty associated with threshold selection in edge detection of the illumination-variant DVF images. Then, our new Hough transform approach is applied to determine the moving vertebrae concurrently. We include optimisation via a genetic algorithm, as without it the extraction of multiple moving vertebrae is computationally daunting. Our results show that this new approach can indeed provide extractions of position and rotation which appear to be of sufficient quality to aid therapy and diagnosis of spinal disorders.

    Blending Learning and Inference in Structured Prediction

    In this paper we derive an efficient algorithm to learn the parameters of structured predictors in general graphical models. This algorithm blends the learning and inference tasks, which results in a significant speedup over traditional approaches, such as conditional random fields and structured support vector machines. For this purpose we utilize the structures of the predictors to describe a low dimensional structured prediction task which encourages local consistencies within the different structures while learning the parameters of the model. Convexity of the learning task provides the means to enforce the consistencies between the different parts. The inference-learning blending algorithm that we propose is guaranteed to converge to the optimum of the low dimensional primal and dual programs. Unlike many of the existing approaches, the inference-learning blending allows us to efficiently learn high-order graphical models, over regions of any size, and with a very large number of parameters. We demonstrate the effectiveness of our approach, while presenting state-of-the-art results in stereo estimation, semantic segmentation, shape reconstruction, and indoor scene understanding.

    ROAM: a Rich Object Appearance Model with Application to Rotoscoping

    Rotoscoping, the detailed delineation of scene elements through a video shot, is a painstaking task of tremendous importance in professional post-production pipelines. While pixel-wise segmentation techniques can help with this task, professional rotoscoping tools rely on parametric curves that offer artists much better interactive control over the definition, editing and manipulation of the segments of interest. Sticking to this prevalent rotoscoping paradigm, we propose a novel framework to capture and track the visual aspect of an arbitrary object in a scene, given a first closed outline of this object. This model combines a collection of local foreground/background appearance models spread along the outline, a global appearance model of the enclosed object and a set of distinctive foreground landmarks. The structure of this rich appearance model allows simple initialization, efficient iterative optimization with exact minimization at each step, and on-line adaptation in videos. We demonstrate qualitatively and quantitatively the merit of this framework through comparisons with tools based on either dynamic segmentation with a closed curve or pixel-wise binary labelling.

    A Combinatorial Solution to Non-Rigid 3D Shape-to-Image Matching

    We propose a combinatorial solution for the problem of non-rigidly matching a 3D shape to 3D image data. To this end, we model the shape as a triangular mesh and allow each triangle of this mesh to be rigidly transformed to achieve a suitable matching to the image. By penalising the distance and the relative rotation between neighbouring triangles, our matching compromises between image and shape information. In this paper, we resolve two major challenges: Firstly, we address the resulting large and NP-hard combinatorial problem with a suitable graph-theoretic approach. Secondly, we propose an efficient discretisation of the unbounded 6-dimensional Lie group SE(3). To our knowledge this is the first combinatorial formulation for non-rigid 3D shape-to-image matching. In contrast to existing local (gradient descent) optimisation methods, we obtain solutions that do not require a good initialisation and that are within a bound of the optimal solution. We evaluate the proposed method on the two problems of non-rigid 3D shape-to-shape and non-rigid 3D shape-to-image registration and demonstrate that it provides promising results. Comment: 10 pages, 7 figures.