51 research outputs found

    Inferring Human Pose and Motion from Images

    As optical gesture recognition technology advances, touchless human-computer interfaces will soon become a reality. One particular technology, markerless motion capture, has gained a large amount of attention, with widespread application in diverse disciplines, including medical science, sports analysis, advanced user interfaces, and virtual arts. However, the complexity of human anatomy makes markerless motion capture a non-trivial problem: I) the parameterised pose configuration exhibits high dimensionality, and II) there is considerable ambiguity in the surjective inverse mapping from observation to pose configuration spaces when only a limited number of camera views is available. Together, these factors lead to multimodality in a high-dimensional space, making markerless motion capture an ill-posed problem. This study addresses these difficulties by introducing a new framework. It begins by automatically building subject-specific template models and calibrating posture at the initial stage. Subsequent tracking is accomplished by embedding nature-inspired global optimisation into the sequential Bayesian filtering framework. Tracking is enhanced by several robust evaluation improvements. Sparsity of images is managed by compressive evaluation, further improving computational efficiency in high-dimensional space.

    Cognitive Robotics in Industrial Environments

    Metaheuristic Optimization Techniques for Articulated Human Tracking

    Four adaptive metaheuristic optimization algorithms are proposed and demonstrated: Adaptive Parameter Particle Swarm Optimization (AP-PSO), Modified Artificial Bat (MAB), Differential Mutated Artificial Immune System (DM-AIS) and hybrid Particle Swarm Accelerated Artificial Immune System (PSO-AIS). The algorithms adapt their search parameters on the basis of the fitness of obtained solutions such that a good fitness value favors local search, while a poor fitness value favors global search. This efficient feedback of the solution quality imparts excellent global and local search characteristics to the proposed algorithms. The algorithms are tested on the challenging Articulated Human Tracking (AHT) problem, whose objective is to infer human pose, expressed in terms of joint angles, from a continuous video stream. The Particle Filter (PF) algorithms, widely applied in generative model based AHT, suffer from the 'curse of dimensionality' and 'degeneracy' challenges. The four proposed algorithms show stable performance throughout the course of numerical experiments. DM-AIS performs best among the proposed algorithms, followed in order by PSO-AIS, AP-PSO, and MAB, in terms of Most Appropriate Pose (MAP) tracking error. The MAP tracking error of the proposed algorithms is compared with four heuristic approaches: generic PF, Annealed Particle Filter (APF), Partitioned Sampled Annealed Particle Filter (PSAPF) and Hierarchical Particle Swarm Optimization (HPSO). The proposed algorithms are found to outperform generic PF with a confidence level of 95%, and PSAPF and HPSO with a confidence level of 85%, while DM-AIS and PSO-AIS also outperform APF with a confidence level of 80%. Further, it is noted that the proposed algorithms outperform PSAPF and HPSO using a significantly lower number of function evaluations, 2500 versus 7200. The proposed algorithms demonstrate reduced particle requirements, hence improving computational efficiency and helping to alleviate the 'curse of dimensionality'. 
The adaptive nature of the algorithms is found to guide the whole swarm towards the optimal solution by sharing information and exploring a wider solution space, which resolves the 'degeneracy' challenge. Furthermore, the decentralized structure of the algorithms renders them insensitive to accumulation of error and allows them to recover from catastrophic failures due to loss of image data, sudden changes in motion pattern or discrete instances of algorithmic failure. The performance enhancements demonstrated by the proposed algorithms, attributed to their balanced local and global search capabilities, make real-time AHT applications feasible. Finally, the utility of the proposed algorithms in low-dimensional system identification problems as well as high-dimensional AHT problems demonstrates their applicability in various problem domains.
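
The fitness-feedback idea in this abstract (good fitness favors local search, poor fitness favors global search) can be illustrated with a minimal particle swarm sketch. This is not the AP-PSO algorithm from the thesis: the linear mapping from relative cost to inertia weight, the parameter names (`w_min`, `w_max`, `c1`, `c2`, `bound`) and the toy `sphere` cost function are all assumptions chosen for illustration.

```python
import random


def sphere(x):
    """Toy cost function (assumed for illustration); minimum 0.0 at the origin."""
    return sum(v * v for v in x)


def adaptive_pso(cost, dim=5, n_particles=20, iters=100, seed=0,
                 w_min=0.3, w_max=0.9, c1=1.5, c2=1.5, bound=5.0):
    """Minimize `cost` with a PSO whose inertia adapts to each particle's fitness."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-bound, bound) for _ in range(dim)]
           for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_cost = [cost(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_cost[i])
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]

    for _ in range(iters):
        costs = [cost(p) for p in pos]
        lo, hi = min(costs), max(costs)
        span = (hi - lo) or 1.0  # avoid division by zero if all costs are equal
        for i in range(n_particles):
            # Relative quality in [0, 1]; 0 is the best particle of this iteration.
            q = (costs[i] - lo) / span
            # Assumed rule: good fitness -> small inertia (local search),
            # poor fitness -> large inertia (global search).
            w = w_min + (w_max - w_min) * q
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp positions to the search box.
                pos[i][d] = max(-bound, min(bound, pos[i][d] + vel[i][d]))
            c = cost(pos[i])
            if c < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], c
                if c < gbest_cost:
                    gbest, gbest_cost = pos[i][:], c
    return gbest, gbest_cost


if __name__ == "__main__":
    best, best_cost = adaptive_pso(sphere)
    print(best_cost)
```

In a tracking setting the cost function would score a hypothesized pose against the image evidence; here a toy function stands in so the sketch stays self-contained.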

    Model-based human upper body tracking using interest points in real-time video

    Vision-based human motion analysis has received huge attention from researchers because of the number of applications, such as automated surveillance, video indexing, human-machine interaction, traffic monitoring, and vehicle navigation. However, it contains several open problems. To date, despite very promising proposed approaches, no explicit solution has been found to solve these open problems efficiently. In this regard, this thesis presents a model-based human upper body pose estimation and tracking system using interest points (IPs) in real-time video. In the first stage, we propose a novel IP-based background-subtraction algorithm to segment the foreground IPs of each frame from the background ones. Afterwards, the foreground IPs of any two consecutive frames are matched to each other using a dynamic hybrid local-spatial IP matching algorithm proposed in this research. The IP matching algorithm starts by using the local feature descriptors of the IPs to find an initial set of possible matches. Then two filtering steps are applied to the results to increase the precision by deleting the mismatched pairs. To improve the recall, a spatial matching process is applied to the remaining unmatched points. Finally, a two-stage hierarchical-global model-based pose estimation and tracking algorithm based on Particle Swarm Optimization (PSO) is proposed to track the human upper body through consecutive frames. Given the pose and the foreground IPs in the previous frame and the matched points in the current frame, the proposed PSO-based algorithm first estimates the current pose hierarchically by minimizing the discrepancy between the hypothesized pose and the matched observed points. Then a global PSO is applied to the pose estimated by the first stage to perform a consistency check and pose refinement.
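
The matching pipeline described in this abstract (descriptor-based initial matches, two filtering steps, then spatial matching of the leftovers) can be sketched roughly as follows. The abstract does not say which filters the thesis uses, so the ratio test, the mutual nearest-neighbour check, the nearest-position fallback, and the names `match_points`, `ratio` and `radius` are all assumptions made for illustration.

```python
import math


def match_points(prev, curr, ratio=0.8, radius=10.0):
    """Match interest points between two frames.

    prev, curr: lists of (xy, descriptor) tuples, both as numeric tuples.
    Returns a dict mapping indices in prev to indices in curr.
    """
    def by_descriptor(desc, pool):
        # Indices of `pool`, sorted by descriptor distance to `desc`.
        return sorted(range(len(pool)),
                      key=lambda j: math.dist(desc, pool[j][1]))

    matches = {}
    for i, (_, desc) in enumerate(prev):
        order = by_descriptor(desc, curr)
        if not order:
            continue
        best = order[0]
        # Filter 1 (assumed): ratio test, reject ambiguous nearest neighbours.
        if len(order) > 1:
            d1 = math.dist(desc, curr[best][1])
            d2 = math.dist(desc, curr[order[1]][1])
            if d1 > ratio * d2:
                continue
        # Filter 2 (assumed): keep only mutual nearest neighbours.
        if by_descriptor(curr[best][1], prev)[0] == i:
            matches[i] = best

    # Spatial step (assumed): pair each unmatched point with the closest
    # still-unused point in the current frame, if it lies within `radius`.
    used = set(matches.values())
    for i, (xy, _) in enumerate(prev):
        if i in matches:
            continue
        free = [j for j in range(len(curr)) if j not in used]
        if not free:
            break
        j = min(free, key=lambda k: math.dist(xy, curr[k][0]))
        if math.dist(xy, curr[j][0]) <= radius:
            matches[i] = j
            used.add(j)
    return matches


if __name__ == "__main__":
    prev = [((0, 0), (1.0, 0.0)), ((10, 10), (0.0, 1.0)), ((20, 20), (0.5, 0.5))]
    curr = [((1, 1), (0.9, 0.1)), ((11, 10), (0.1, 0.9)), ((21, 21), (0.9, 0.9))]
    print(match_points(prev, curr))
```

In this toy input the third point has an ambiguous descriptor, so the filters reject it and only the spatial step can recover it, which mirrors the precision-then-recall ordering described in the abstract.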

    Parallel bio-inspired methods for model optimization and pattern recognition

    Nature-based computational models are usually inherently parallel. The collaborative intelligence in those models emerges from the simultaneous instruction processing by simple independent units (neurons, ants, swarm members, etc.). This dissertation investigates the benefits of such parallel models in terms of efficiency and accuracy. First, the viability of a parallel implementation of bio-inspired metaheuristics for function optimization on consumer-level graphics cards is studied in detail. Then, in an effort to expose those parallel methods to the research community, the metaheuristic implementations were abstracted and grouped in an open source parameter/function optimization library, libCudaOptimize. The library was verified against a well-known benchmark for mathematical function minimization, and showed significant gains in both execution time and minimization accuracy. Moving to the application side, a parallel model of the human neocortex was developed. This model is able to detect, classify, and predict patterns in time-series data in an unsupervised way. Finally, libCudaOptimize was used to find the best parameters for this neocortex model, adapting it to gesture recognition within publicly available datasets.

    Single View Human Pose Tracking

    Recovery of human pose from videos has become a highly active research area in the last decade because of many attractive potential applications, such as surveillance, non-intrusive motion analysis and natural human machine interaction. Video based full body pose estimation is a very challenging task, because of the high degree of articulation of the human body, the large variety of possible human motions, and the diversity of human appearances. Methods for tackling this problem can be roughly categorized as either discriminative or generative. Discriminative methods can work on single images, and are able to recover the human poses efficiently. However, the accuracy and generality largely depend on the training data. Generative approaches usually formulate the problem as a tracking problem and adopt an explicit human model. Although arbitrary motions can be tracked, such systems usually have difficulties in adapting to different subjects and in dealing with tracking failures. In this thesis, an accurate, efficient and robust human pose tracking system from a single view camera is developed, mainly following a generative approach. A novel discriminative feature is also proposed and integrated into the tracking framework to improve the tracking performance. The human pose tracking system is proposed within a particle filtering framework. A reconfigurable skeleton model is constructed based on the Acclaim Skeleton File convention. A basic particle filter is first implemented for upper body tracking, which fuses time efficient cues from monocular sequences and achieves real-time tracking for constrained motions. Next, a 3D surface model is added to the skeleton model, and a full body tracking system is developed for more general and complex motions, assuming a stereo camera input. Partitioned sampling is adopted to deal with the high dimensionality problem, and the system is capable of running in near real-time. 
Multiple visual cues are investigated and compared, including a newly developed explicit depth cue. Based on the comparative analysis of cues, which reveals the importance of depth and good bottom-up features, a novel algorithm for detecting and identifying endpoint body parts from depth images is proposed. Inspired by the shape context concept, this thesis proposes a novel Local Shape Context (LSC) descriptor specifically for describing the shape features of body parts in depth images. This descriptor describes the local shape of different body parts with respect to a given reference point on a human silhouette, and is shown to be effective at detecting and classifying endpoint body parts. A new type of interest point is defined based on the LSC descriptor, and a hierarchical interest point selection algorithm is designed to further conserve computational resources. The detected endpoint body parts are then classified according to learned models based on the LSC feature. The algorithm is tested using a public dataset and achieves good accuracy with a 100 Hz processing speed on a standard PC. Finally, the LSC descriptor is generalized so that both the endpoint body parts and the limbs are detected simultaneously. The generalized algorithm is integrated into the tracking framework, where it provides a very strong cue and enables tracking failure recovery. The skeleton model is also simplified to further increase the system efficiency. To evaluate the system on arbitrary motions quantitatively, a new dataset is designed and collected using a synchronized Kinect sensor and a marker-based motion capture system, including 22 different motions from 5 human subjects. The system is capable of tracking full body motions accurately using a simple skeleton-only model in near real-time on a laptop PC before optimization.
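
For readers unfamiliar with the particle filtering framework this abstract builds on, the following is a minimal sketch of one predict/update/resample cycle for a one-dimensional state. It is a generic bootstrap particle filter, not the tracker from the thesis: the random-walk motion model, the Gaussian observation likelihood, the effective-sample-size threshold, and the names `motion_std` and `obs_std` are all assumptions made for illustration.

```python
import math
import random


def systematic_resample(particles, weights, rng):
    """Draw len(particles) samples in proportion to the normalized weights."""
    n = len(particles)
    start = rng.random() / n
    out, cum, i = [], weights[0], 0
    for k in range(n):
        u = start + k / n
        while u > cum and i < n - 1:
            i += 1
            cum += weights[i]
        out.append(particles[i])
    return out


def particle_filter_step(particles, weights, observation, rng,
                         motion_std=0.5, obs_std=1.0):
    """One bootstrap particle filter cycle; returns (particles, weights)."""
    n = len(particles)
    # Predict: assumed random-walk motion model.
    particles = [p + rng.gauss(0.0, motion_std) for p in particles]
    # Update: reweight by an assumed Gaussian observation likelihood.
    weights = [w * math.exp(-0.5 * ((observation - p) / obs_std) ** 2)
               for w, p in zip(weights, particles)]
    total = sum(weights)
    if total == 0.0:
        weights = [1.0 / n] * n  # all likelihoods underflowed; reset to uniform
    else:
        weights = [w / total for w in weights]
    # Resample when the effective sample size drops, to counter degeneracy.
    ess = 1.0 / sum(w * w for w in weights)
    if ess < n / 2:
        particles = systematic_resample(particles, weights, rng)
        weights = [1.0 / n] * n
    return particles, weights


def estimate(particles, weights):
    """Weighted mean of the particle set."""
    return sum(p * w for p, w in zip(particles, weights))


if __name__ == "__main__":
    rng = random.Random(0)
    n = 200
    particles = [rng.uniform(-10.0, 10.0) for _ in range(n)]
    weights = [1.0 / n] * n
    for _ in range(20):
        particles, weights = particle_filter_step(particles, weights, 3.0, rng)
    print(estimate(particles, weights))
```

In a pose tracker the scalar state would be replaced by a vector of joint angles and the likelihood by an image-based cue, which is where the dimensionality problem mentioned in the abstract arises.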