130 research outputs found

    Cognitive Robotics in Industrial Environments

    Human Pose Estimation from Monocular Images : a Comprehensive Survey

    Human pose estimation refers to the estimation of the location of body parts and how they are connected in an image. Human pose estimation from monocular images has wide applications (e.g., image indexing). Several surveys on human pose estimation can be found in the literature, but they focus on a certain category; for example, model-based approaches or human motion analysis, etc. As far as we know, an overall review of this problem domain has yet to be provided. Furthermore, recent advancements based on deep learning have brought novel algorithms for this problem. In this paper, a comprehensive survey of human pose estimation from monocular images is carried out including milestone works and recent advancements. Based on one standard pipeline for the solution of computer vision problems, this survey splits the problem into several modules: feature extraction and description, human body models, and modeling methods. Problem modeling methods are approached based on two means of categorization in this survey. One way to categorize includes top-down and bottom-up methods, and another way includes generative and discriminative methods. Considering the fact that one direct application of human pose estimation is to provide initialization for automatic video surveillance, there are additional sections for motion-related methods in all modules: motion features, motion models, and motion-based methods. Finally, the paper also collects 26 publicly available data sets for validation and provides error measurement methods that are frequently used.

    Inferring Human Pose and Motion from Images

    As optical gesture recognition technology advances, touchless human-computer interfaces of the future will soon become a reality. One particular technology, markerless motion capture, has gained a large amount of attention, with widespread application in diverse disciplines, including medical science, sports analysis, advanced user interfaces, and virtual arts. However, the complexity of human anatomy makes markerless motion capture a non-trivial problem: I) parameterised pose configuration exhibits high dimensionality, and II) there is considerable ambiguity in surjective inverse mapping from observation to pose configuration spaces with a limited number of camera views. These factors together lead to multimodality in high dimensional space, making markerless motion capture an ill-posed problem. This study challenges these difficulties by introducing a new framework. It begins with automatically modelling specific subject template models and calibrating posture at the initial stage. Subsequent tracking is accomplished by embedding naturally-inspired global optimisation into the sequential Bayesian filtering framework. Tracking is enhanced by several robust evaluation improvements. Sparsity of images is managed by compressive evaluation, further accelerating computational efficiency in high dimensional space.

    A Multi-camera Network System for Markerless 3D Human Body Voxel Reconstruction

    This paper presents a fully automated system for real-time 3D human visual hull reconstruction and skeleton voxel extraction. The main contributions include: (1) A novel network-based system is presented, which uses AXIS network cameras as video capture devices and runs data capture, 3D voxel reconstruction and display in parallel. (2) A new human visual hull reconstruction algorithm is given. This approach first segments the foreground accurately using an efficient Gaussian Mixture Model (GMM) and a shadow model in HSV color space, then extends the standard Shape-From-Silhouette (SFS) algorithm with online Region-of-Interest (ROI) estimation and binary searching, and finally constructs a skeleton probability visual hull with a distance transform. Experiments with real video sequences show that the system can process eleven 640x480 video sequences at a frame rate of 15fps, and construct human body voxels reliably in complex scenarios with cast shadows, various body configurations and multiple persons.
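The Shape-From-Silhouette step named in the abstract above can be illustrated with a minimal sketch: a voxel is kept only if it projects onto foreground in every camera's silhouette. The toy orthographic cameras, the 4x4 masks, and the names `carve_visual_hull`, `front`, and `side` are assumptions made for illustration; the paper's GMM segmentation, ROI estimation, and binary searching are not reproduced here.

```python
from itertools import product


def carve_visual_hull(silhouettes, projections, grid_points):
    """Return the grid points that project onto foreground in every view."""
    hull = []
    for point in grid_points:
        inside_all = True
        for mask, project in zip(silhouettes, projections):
            u, v = project(point)  # pixel column, pixel row
            in_image = 0 <= v < len(mask) and 0 <= u < len(mask[0])
            if not (in_image and mask[v][u]):
                inside_all = False  # carved away by this view
                break
        if inside_all:
            hull.append(point)
    return hull


# Toy silhouette (assumed): a 2x2 foreground square inside a 4x4 image.
mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]


def front(point):
    """Hypothetical orthographic camera looking along the z axis."""
    x, y, z = point
    return x, y


def side(point):
    """Hypothetical orthographic camera looking along the x axis."""
    x, y, z = point
    return z, y


grid = list(product(range(4), repeat=3))
hull = carve_visual_hull([mask, mask], [front, side], grid)
print(len(hull))  # 8 voxels survive: the central 2x2x2 cube
```

A real system would replace the orthographic lambdas with calibrated perspective projections, but the intersection test itself stays the same.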

    Dynamic shape capture using multi-view photometric stereo

    General Automatic Human Shape and Motion Capture Using Volumetric Contour Cues

    Markerless motion capture algorithms require a 3D body with properly personalized skeleton dimension and/or body shape and appearance to successfully track a person. Unfortunately, many tracking methods consider model personalization a different problem and use manual or semi-automatic model initialization, which greatly reduces applicability. In this paper, we propose a fully automatic algorithm that jointly creates a rigged actor model commonly used for animation - skeleton, volumetric shape, appearance, and optionally a body surface - and estimates the actor's motion from multi-view video input only. The approach is rigorously designed to work on footage of general outdoor scenes recorded with very few cameras and without background subtraction. Our method uses a new image formation model with analytic visibility and analytically differentiable alignment energy. For reconstruction, 3D body shape is approximated as a Gaussian density field. For pose and shape estimation, we minimize a new edge-based alignment energy inspired by volume raycasting in an absorbing medium. We further propose a new statistical human body model that represents the body surface, volumetric Gaussian density, as well as variability in skeleton shape. Given any multi-view sequence, our method jointly optimizes the pose and shape parameters of this model fully automatically in a spatiotemporal way.
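To make the idea of a Gaussian density field concrete, the sketch below evaluates a density defined as a weighted sum of isotropic Gaussians at a 3D point. The unnormalised isotropic form, the `(center, sigma, weight)` tuple layout, and the name `gaussian_density` are assumptions for illustration, not the paper's actual body model or energy.

```python
import math


def gaussian_density(point, blobs):
    """Sum of unnormalised isotropic Gaussians evaluated at a 3D point.

    Each blob is an assumed (center, sigma, weight) tuple.
    """
    total = 0.0
    for center, sigma, weight in blobs:
        sq_dist = sum((p - c) ** 2 for p, c in zip(point, center))
        total += weight * math.exp(-sq_dist / (2.0 * sigma ** 2))
    return total


# Two hypothetical blobs standing in for body segments.
blobs = [
    ((0.0, 0.0, 0.0), 1.0, 1.0),
    ((10.0, 0.0, 0.0), 1.0, 0.5),
]

at_first_center = gaussian_density((0.0, 0.0, 0.0), blobs)
far_away = gaussian_density((100.0, 100.0, 100.0), blobs)
```

Because every term is a smooth function of the blob centers, such a field is differentiable with respect to pose parameters, which is the property the abstract relies on.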

    Single camera pose estimation using Bayesian filtering and Kinect motion priors

    Traditional approaches to upper body pose estimation using monocular vision rely on complex body models and a large variety of geometric constraints. We argue that this is not ideal and somewhat inelegant as it results in large processing burdens, and instead attempt to incorporate these constraints through priors obtained directly from training data. A prior distribution covering the probability of a human pose occurring is used to incorporate likely human poses. This distribution is obtained offline, by fitting a Gaussian mixture model to a large dataset of recorded human body poses, tracked using a Kinect sensor. We combine this prior information with a random walk transition model to obtain an upper body model, suitable for use within a recursive Bayesian filtering framework. Our model can be viewed as a mixture of discrete Ornstein-Uhlenbeck processes, in that states behave as random walks, but drift towards a set of typically observed poses. This model is combined with measurements of the human head and hand positions, using recursive Bayesian estimation to incorporate temporal information. Measurements are obtained using face detection and a simple skin colour hand detector, trained using the detected face. The suggested model is designed with analytical tractability in mind and we show that the pose tracking can be Rao-Blackwellised using the mixture Kalman filter, allowing for computational efficiency while still incorporating bio-mechanical properties of the upper body. In addition, the use of the proposed upper body model allows reliable three-dimensional pose estimates to be obtained indirectly for a number of joints that are often difficult to detect using traditional object recognition strategies. Comparisons with Kinect sensor results and the state of the art in 2D pose estimation highlight the efficacy of the proposed approach. Comment: 25 pages, Technical report, related to Burke and Lasenby, AMDO 2014 conference paper.
Code sample: https://github.com/mgb45/SignerBodyPose Video: https://www.youtube.com/watch?v=dJMTSo7-uF
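The drift-towards-typical-poses behaviour described in the abstract above can be sketched with a scalar discrete Ornstein-Uhlenbeck transition combined with a standard Kalman update. This is a single-component, one-dimensional simplification under assumed parameter names (`mu`, `theta`, `q`, `r`); it is not the authors' mixture Kalman filter and is unrelated to the linked repository's code.

```python
def ou_predict(mean, var, mu, theta, q):
    """Predict step for x_t = x_{t-1} + theta * (mu - x_{t-1}) + w, w ~ N(0, q)."""
    a = 1.0 - theta  # fraction of the previous state that is retained
    return a * mean + theta * mu, a * a * var + q


def ou_kalman_step(mean, var, z, mu, theta, q, r):
    """One predict/update cycle with measurement z = x + v, v ~ N(0, r)."""
    pred_mean, pred_var = ou_predict(mean, var, mu, theta, q)
    gain = pred_var / (pred_var + r)  # scalar Kalman gain
    new_mean = pred_mean + gain * (z - pred_mean)
    new_var = (1.0 - gain) * pred_var
    return new_mean, new_var


# Without measurements, the state estimate drifts towards the prior mean mu.
mean, var = 0.0, 1.0
for _ in range(50):
    mean, var = ou_predict(mean, var, mu=2.0, theta=0.5, q=0.01)
drifted_mean = mean

# A single filtering step with a measurement.
step_mean, step_var = ou_kalman_step(0.0, 1.0, z=1.0, mu=1.0, theta=0.5, q=0.0, r=0.25)
```

With `theta = 0` the transition reduces to a plain random walk; larger `theta` pulls the state more strongly towards `mu`, which in the paper's setting would be one component mean of the fitted pose prior.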

    Developing a pipeline for gait analysis with a side-view depth sensor

    This thesis presents computational methods for conducting gait analysis with a side-view depth sensor. First, a method to segment human body parts in a depth image is presented. A standard supervised segmentation algorithm is run on a novel graph representation of the depth image. It is demonstrated that the new graph structure improves the accuracy of the segmentation. This contribution is intended to allow fast labelling of depth images for training a human joint predictor. Next, a method is presented to select accurate 3D positions of human joints from multiple proposals. These proposals are generated by a predictor from a side-view depth image. Finally, a gait analysis system is built on the joint selection process. The system calculates standard parameters used in clinical gait analysis. Walking trials have been measured concurrently by a pressure-sensitive walkway and a side-view depth sensor. The estimated gait parameters are validated against the ground truth parameters from the walkway. As future work, the initial segmentation process could be applied to multi-view depth images for training a view-invariant joint predictor. The proposed gait analysis system can then be applied to the predicted joints.

    A framework for digitisation of manual manufacturing task knowledge using gaming interface technology

    Intense market competition and the global skill supply crunch are hurting the manufacturing industry, which is heavily dependent on skilled labour. Companies must look for innovative ways to acquire manufacturing skills from their experts and transfer them to novices and eventually to machines to remain competitive. There is a lack of systematic processes in the manufacturing industry and research for cost-effective capture and transfer of human skills. Therefore, the aim of this research is to develop a framework for digitisation of manual manufacturing task knowledge, a major constituent of which is human skill. The proposed digitisation framework is based on the theory of human-workpiece interactions that is developed in this research. The unique aspect of the framework is the use of consumer-grade gaming interface technology to capture and record manual manufacturing tasks in digital form to enable the extraction, decoding and transfer of manufacturing knowledge constituents that are associated with the task. The framework is implemented, tested and refined using 5 case studies, including 1 toy assembly task, 2 real-life-like assembly tasks, 1 simulated assembly task and 1 real-life composite layup task. It is successfully validated based on the outcomes of the case studies and a benchmarking exercise that was conducted to evaluate its performance. 
This research contributes to knowledge in five main areas, namely, (1) the theory of human-workpiece interactions to decipher human behaviour in manual manufacturing tasks, (2) a cohesive and holistic framework to digitise manual manufacturing task knowledge, especially tacit knowledge such as human action and reaction skills, (3) the use of low-cost gaming interface technology to capture human actions and the effect of those actions on workpieces during a manufacturing task, (4) a new way to use hidden Markov modelling to produce digital skill models to represent human ability to perform complex tasks, and (5) extraction and decoding of manufacturing knowledge constituents from the digital skill models.