22 research outputs found

    Inferring Human Pose and Motion from Images

    No full text
    As optical gesture recognition technology advances, touchless human-computer interfaces will soon become a reality. One such technology, markerless motion capture, has attracted considerable attention, with widespread application across diverse disciplines including medical science, sports analysis, advanced user interfaces, and the virtual arts. However, the complexity of human anatomy makes markerless motion capture a non-trivial problem: (i) the parameterised pose configuration is high-dimensional, and (ii) with a limited number of camera views, the inverse mapping from observation space to pose configuration space is surjective and highly ambiguous. Together, these factors lead to multimodality in a high-dimensional space, making markerless motion capture an ill-posed problem. This study addresses these difficulties by introducing a new framework. It begins by automatically building subject-specific template models and calibrating the initial posture. Subsequent tracking is accomplished by embedding nature-inspired global optimisation into a sequential Bayesian filtering framework, and is enhanced by several robust improvements to the likelihood evaluation. Image sparsity is handled by compressive evaluation, which further improves computational efficiency in the high-dimensional space.
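
    As an illustration of the sequential Bayesian filtering idea behind such tracking frameworks (not the thesis's actual algorithm), the minimal Python sketch below performs one resample-diffuse-reweight step over a set of candidate pose vectors; the likelihood function, noise scale and pose dimensionality are placeholder assumptions.

        import numpy as np

        def particle_filter_step(particles, weights, likelihood_fn, motion_noise=0.05, rng=None):
            """One sequential-Bayesian (particle filter) update for a pose vector.

            particles : (N, D) array of candidate pose configurations
            weights   : (N,) importance weights from the previous frame
            likelihood_fn : maps a pose vector to an (unnormalised) image likelihood
            """
            rng = np.random.default_rng() if rng is None else rng
            n, d = particles.shape
            # Resample according to the previous weights.
            idx = rng.choice(n, size=n, p=weights / weights.sum())
            particles = particles[idx]
            # Diffuse: a crude stand-in for the annealing / global optimisation stage.
            particles = particles + rng.normal(scale=motion_noise, size=(n, d))
            # Re-weight against the current observation.
            weights = np.array([likelihood_fn(p) for p in particles])
            weights = weights / weights.sum()
            estimate = (weights[:, None] * particles).sum(axis=0)
            return particles, weights, estimate

        # Toy usage with a 6-D pose and a Gaussian likelihood around a known pose.
        rng = np.random.default_rng(0)
        true_pose = np.zeros(6)
        parts = rng.normal(size=(500, 6))
        w = np.ones(500)
        parts, w, est = particle_filter_step(parts, w, lambda p: np.exp(-np.sum((p - true_pose) ** 2)), rng=rng)
        print(est)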

    A framework for digitisation of manual manufacturing task knowledge using gaming interface technology

    Get PDF
    Intense market competition and the global skill-supply crunch are hurting the manufacturing industry, which is heavily dependent on skilled labour. To remain competitive, companies must find innovative ways to acquire manufacturing skills from their experts and transfer them to novices and, eventually, to machines. Both industry and research lack systematic, cost-effective processes for capturing and transferring human skills. The aim of this research is therefore to develop a framework for the digitisation of manual manufacturing task knowledge, a major constituent of which is human skill. The proposed digitisation framework is based on a theory of human-workpiece interactions developed in this research. Its distinctive aspect is the use of consumer-grade gaming interface technology to capture and record manual manufacturing tasks in digital form, enabling the extraction, decoding and transfer of the manufacturing knowledge constituents associated with the task. The framework is implemented, tested and refined using five case studies: one toy assembly task, two real-life-like assembly tasks, one simulated assembly task and one real-life composite layup task. It is successfully validated based on the outcomes of the case studies and a benchmarking exercise conducted to evaluate its performance. This research contributes to knowledge in five main areas: (1) a theory of human-workpiece interactions for deciphering human behaviour in manual manufacturing tasks; (2) a cohesive and holistic framework for digitising manual manufacturing task knowledge, especially tacit knowledge such as human action and reaction skills; (3) the use of low-cost gaming interface technology to capture human actions and their effect on workpieces during a manufacturing task; (4) a new way of using hidden Markov modelling to produce digital skill models that represent the human ability to perform complex tasks; and (5) the extraction and decoding of manufacturing knowledge constituents from the digital skill models.
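
    As a rough illustration of how a hidden Markov model can serve as a "digital skill model" over captured task sequences, the sketch below fits a Gaussian HMM to per-frame feature vectors and scores a new attempt against it; the hmmlearn library, the feature dimensionality and the placeholder data are assumptions, not details taken from the thesis.

        import numpy as np
        from hmmlearn import hmm  # assumed library choice; the thesis does not name one

        # Each demonstration is a sequence of per-frame features captured by the gaming
        # interface (e.g. hand position, orientation, contact flags).  Placeholder data.
        demos = [np.random.randn(200, 6), np.random.randn(180, 6)]
        X = np.vstack(demos)
        lengths = [len(d) for d in demos]

        # Fit a Gaussian HMM as a "digital skill model" of the task.
        skill_model = hmm.GaussianHMM(n_components=5, covariance_type="diag", n_iter=50)
        skill_model.fit(X, lengths)

        # Score a new attempt: a higher log-likelihood means it is closer to the expert model.
        new_attempt = np.random.randn(190, 6)
        print(skill_model.score(new_attempt))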

    Hand eye coordination in surgery

    Get PDF
    The coordination of the hand in response to visual target selection has long been regarded as an essential quality in a range of professional activities. This quality has so far eluded objective scientific measurement, being usually subsumed within the overall performance of the individual. Parallels can be drawn to surgery, especially Minimally Invasive Surgery (MIS), where the physical constraints imposed by the arrangement of the instruments and the visualisation methods demand coordination skills that are unprecedented. With the current paradigm shift towards early specialisation in surgical training and shortened, focused training time, the selection process should identify trainees with the highest potential in specific skills. Although significant effort has been devoted to the objective assessment of surgical skills, it is currently only possible to measure a surgeon's ability at the time of assessment; it has been particularly difficult to quantify the specific details of hand-eye coordination and to assess the innate capacity for future skills development. The purpose of this thesis is to examine hand-eye coordination in laboratory-based simulations, with a particular emphasis on details that are important to MIS. To understand the challenges of visuomotor coordination, movement trajectory errors are used to provide an insight into the brain's innate coordinate mapping. In MIS, novel spatial transformations, arising from a combination of distorted endoscopic image projections and the "fulcrum" effect of the instruments, accentuate movement generation errors. Clear differences in the quality of movement trajectories have been observed between novices and experts in MIS; however, these are difficult to measure quantitatively. A Hidden Markov Model (HMM) is used in this thesis to reveal the underlying characteristic movement details of a particular MIS manoeuvre and how such features are exaggerated by the introduction of rotation in the endoscopic camera. The proposed method demonstrates the feasibility of measuring movement trajectory quality with machine learning techniques, without prior arbitrary classification of expertise. Experimental results highlight these changes in novice laparoscopic surgeons even after a short period of training. It has previously been observed that the intricate relationship between the hands and the eyes changes as a skilled visuomotor task is learnt. Reactive eye movement, in which visual input is used primarily as a feedback mechanism for error correction, implies difficulty with hand-eye coordination; as the brain adapts to the new coordinate map, eye movements become predictive of the action being generated. The concept of measuring this spatiotemporal relationship is introduced as a measure of hand-eye coordination in MIS, by comparing the Target Distance Function (TDF) between the eye fixation point and the instrument tip position on the laparoscopic screen. Further validation of this concept using high-fidelity experimental tasks is presented, where higher cognitive influence and multiple target selection increase the complexity of the data analysis. To this end, Granger causality is presented as a measure of how well instrument movement is predicted by the eye fixation pattern. Partial Directed Coherence (PDC), a frequency-domain variant of Granger causality, is used for the first time to measure hand-eye coordination. Experimental results are used to establish the strengths and potential pitfalls of the technique.
To further enhance the accuracy of this measurement, a modified Jensen-Shannon Divergence (JSD) measure has been developed to improve the signal matching algorithm and trajectory segmentation. The proposed framework incorporates filtering of high-frequency noise, which represents non-purposeful hand and eye movements. The accuracy of the technique is demonstrated by quantitative measurement of multiple laparoscopic tasks performed by expert and novice surgeons. Experimental results supporting visual search behavioural theory are presented, as this underpins the target selection process immediately prior to visuomotor action generation. The effects of specialisation and experience on visual search patterns are also examined. Finally, pilot results from functional brain imaging are presented, in which Posterior Parietal Cortex (PPC) activation is measured using optical spectroscopy techniques. The PPC has been shown to be involved in computing the coordinate transformations between the visual and motor systems, which opens up exciting possibilities for future studies of hand-eye coordination.
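
    As a simplified illustration of using Granger causality to ask whether eye fixation predicts instrument movement (the thesis's PDC analysis is a frequency-domain refinement of this idea), the sketch below applies statsmodels' Granger test to two toy target-distance signals; the signals, lag range and library choice are assumptions for illustration only.

        import numpy as np
        from statsmodels.tsa.stattools import grangercausalitytests

        # Toy signals: distance of the eye fixation and of the instrument tip to the current
        # target, sampled at the same rate (a simplified Target Distance Function setting).
        t = np.arange(500)
        eye = np.sin(t / 20.0) + 0.1 * np.random.randn(500)
        tool = np.roll(eye, 15) + 0.1 * np.random.randn(500)   # the tool lags the eye

        # Test whether eye fixation (2nd column) helps predict tool movement (1st column).
        # Small p-values at some lag suggest predictive, rather than reactive, eye movement.
        results = grangercausalitytests(np.column_stack([tool, eye]), maxlag=20)
        print(results[15][0]["ssr_ftest"][1])   # p-value at a 15-sample lag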

    Real-time Immersive human-computer interaction based on tracking and recognition of dynamic hand gestures

    Get PDF
    With the fast development and ever-growing use of computer-based technologies, human-computer interaction (HCI) plays an increasingly pivotal role. In virtual reality (VR), HCI technologies provide not only a better understanding of three-dimensional shapes and spaces, but also sensory immersion and physical interaction. Since hand-based HCI is a key modality for object manipulation and gesture-based communication, it is challenging to provide users with a natural, intuitive, effortless, precise and real-time method of HCI based on dynamic hand gestures, owing to the complexity of hand postures formed by multiple joints with high degrees of freedom, the speed of hand movements with highly variable trajectories and rapid direction changes, and the precision required for interaction between hands and objects in the virtual world. Presented in this thesis is the design and development of a novel real-time HCI system based on a unique combination of a pair of data gloves using fibre-optic curvature sensors to acquire finger joint angles, a hybrid inertial and ultrasonic tracking system to capture hand position and orientation, and a stereoscopic display system to provide immersive visual feedback. The potential and effectiveness of the proposed system is demonstrated through a number of applications, namely hand-gesture-based virtual object manipulation and visualisation, hand-gesture-based direct sign writing, and hand-gesture-based finger spelling. For virtual object manipulation and visualisation, the system allows a user to select, translate, rotate, scale, release and visualise virtual objects (presented using graphics and volume data) in three-dimensional space using natural hand gestures in real time. For direct sign writing, the system immediately displays the SignWriting symbols signed by a user across three different signing sequences and a range of complex hand gestures, which consist of various combinations of hand postures (with each finger open, half-bent, closed, adducted or abducted), eight hand orientations in the horizontal/vertical planes, three palm-facing directions, and various hand movements (which can have eight directions in the horizontal/vertical planes and can be repetitive, straight/curved, or clockwise/anti-clockwise). The development includes a special visual interface that gives not only a stereoscopic view of hand gestures and movements, but also structured visual feedback at each stage of the signing sequence. This forms an excellent basis for developing a full HCI based on all human gestures by integrating the proposed system with facial expression and body posture recognition methods. Furthermore, for finger spelling, the system is shown to recognise, in real time, five vowels signed with two hands using British Sign Language.
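
    As a toy illustration of recognising a static hand posture from data-glove joint angles (a far simpler scheme than the recognition pipeline described in the thesis), the sketch below matches a five-finger joint-angle reading against hypothetical posture templates; the template values and distance threshold are invented for the example.

        import numpy as np

        # Hypothetical joint-angle templates (degrees) for a few static hand postures,
        # as might be read from fibre-optic curvature sensors in a data glove.
        templates = {
            "open":      np.array([ 5,  5,  5,  5,  5]),
            "closed":    np.array([90, 95, 95, 90, 85]),
            "half_bent": np.array([45, 50, 50, 45, 40]),
        }

        def classify_posture(joint_angles, max_distance=60.0):
            """Nearest-template matching of a five-finger joint-angle reading."""
            best, best_dist = None, np.inf
            for name, ref in templates.items():
                dist = np.linalg.norm(joint_angles - ref)
                if dist < best_dist:
                    best, best_dist = name, dist
            return best if best_dist <= max_distance else "unknown"

        print(classify_posture(np.array([80, 90, 92, 88, 84])))   # -> "closed"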

    3D human action recognition and motion analysis using selective representations

    Get PDF
    With the advent of marker-based motion capture, attempts have been made to recognise and quantify attributes of "type", "content" and "behaviour" from motion data. Current work seeks quick and easy identification of human motion for use in multiple settings, such as healthcare and gaming, using activity monitors, wearable technology and low-cost accelerometers. Yet analysing human motion and generating representative features that enable efficient and comprehensive recognition and analysis has so far proved elusive. This thesis proposes practical solutions based on insights from clinicians and on attributes learnt from motion capture data itself, culminating in an application framework that learns the type, content and behaviour of human motion for recognition, quantitative clinical analysis and outcome measures. While marker-based motion capture has many uses, it also has major limitations that are explored in this thesis, not least its hardware costs and practical usability. These drawbacks have led to the creation of depth sensors capable of providing a robust, accurate and low-cost solution for detecting and tracking anatomical landmarks on the human body without physical markers. This advancement has enabled researchers to develop low-cost solutions to important healthcare tasks, such as human motion analysis as a clinical aid in preventive care. In this thesis, a variety of obstacles in handling markerless motion capture are identified and overcome by employing axis-angle parameterisation, applying transformations from Euler angles to exponential maps, and using appropriate distance measures between postures. While developing an efficient, usable and deployable application framework for clinicians, this thesis introduces techniques to recognise, analyse and quantify human motion in the context of identifying age-related change and mobility. The central theme of this thesis is the creation of discriminative representations of the human body using novel encoding and extraction approaches applicable to both marker-based and markerless motion capture data. The encoding of the human pose is modelled on its spatio-temporal characteristics to generate a compact, efficient parameterisation, allowing multiple known and unknown motions to be detected in real time. In the context of benchmarking, however, a major drawback exists: the lack of a clinically valid and relevant dataset. Without such a dataset, it is difficult to validate algorithms aimed at healthcare applications. To this end, this thesis introduces a dataset that will enable the computer science community to benchmark healthcare-related algorithms.
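
    As a small illustration of the rotation handling mentioned above, the sketch below converts Euler-angle joint rotations to exponential maps (rotation vectors) and computes a geodesic angle between two postures using SciPy; the 'xyz' rotation order and the example angles are assumptions, not the thesis's conventions.

        import numpy as np
        from scipy.spatial.transform import Rotation as R

        def euler_to_expmap(euler_xyz_deg):
            """Convert an Euler-angle joint rotation to an exponential map (rotation vector)."""
            return R.from_euler("xyz", euler_xyz_deg, degrees=True).as_rotvec()

        def joint_distance(euler_a, euler_b):
            """Geodesic angle (radians) between two joint rotations, a simple posture distance."""
            ra = R.from_euler("xyz", euler_a, degrees=True)
            rb = R.from_euler("xyz", euler_b, degrees=True)
            return (ra.inv() * rb).magnitude()

        print(euler_to_expmap([30.0, 0.0, 0.0]))
        print(joint_distance([30.0, 0.0, 0.0], [45.0, 10.0, 0.0]))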

    Gradual sampling and mutual information maximisation for markerless motion capture

    No full text
    The major issue in markerless motion capture is finding the global optimum in a multimodal setting where distinct gestures may have similar likelihood values. Instead of focusing solely on effective search, as many existing works do, our approach resolves gesture ambiguity by designing a better-behaved observation likelihood. We extend Annealed Particle Filtering with a novel gradual sampling scheme that allows evaluations to concentrate on large mismatches of the tracked subject. Noting the limitation of silhouettes in resolving gesture ambiguity, we incorporate appearance information in an illumination-invariant way by maximising the Mutual Information between an appearance model and the observation, which in turn strengthens the effectiveness of the better-behaved likelihood. Experiments on benchmark datasets show that our tracking performance is comparable to or better than the state of the art, but with a simpler setting and higher computational efficiency.
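
    As an illustrative sketch of the mutual-information idea (not the paper's implementation), the snippet below computes histogram-based mutual information between an appearance patch and an observed patch, which stays high under a global intensity change; the bin count and patch data are placeholder assumptions.

        import numpy as np

        def mutual_information(img_a, img_b, bins=32):
            """Histogram-based mutual information between two equally sized image regions.

            MI depends on joint intensity statistics rather than on absolute pixel
            differences, which makes it tolerant to global illumination changes.
            """
            hist_2d, _, _ = np.histogram2d(img_a.ravel(), img_b.ravel(), bins=bins)
            pxy = hist_2d / hist_2d.sum()
            px = pxy.sum(axis=1, keepdims=True)
            py = pxy.sum(axis=0, keepdims=True)
            nz = pxy > 0
            return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

        # A linear intensity change keeps MI high; unrelated noise gives a much lower value.
        patch = np.random.rand(64, 64)
        print(mutual_information(patch, patch * 0.5 + 0.2))
        print(mutual_information(patch, np.random.rand(64, 64)))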

    Using biomechanical constraints to improve video-based motion capture

    Get PDF
    In motion capture applications that aim to recover human body postures from various inputs, the high dimensionality of the problem makes it desirable to reduce the size of the search space by eliminating a priori impossible configurations. This can be done by constraining the posture recovery process in various ways. Most recent work in this area has focused on applying camera-viewpoint-related constraints to eliminate erroneous solutions. When camera calibration parameters are available, they provide an extremely efficient tool for disambiguating not only posture estimation but also 3D reconstruction and data segmentation. Enforcing such constraints does indeed increase robustness, as we demonstrate in the context of an optical motion capture framework. Our contribution here lies in applying these constraints consistently to each main step of the motion capture process, namely marker reconstruction and segmentation followed by posture recovery, and in making these steps interdependent so that each constrains the other. A more application-independent approach is to encode constraints directly within the human body model, such as limits on the rotational joints. As this is an almost unexplored research subject, our efforts were mainly directed at devising a new method for measuring, representing and applying such joint limits. To date, the few existing representations of the range-of-motion boundary have severe drawbacks that call for an alternative formulation. The joint limits paradigm we propose not only overcomes these drawbacks, but also captures intra- and inter-joint rotation dependencies, which are essential to realistic joint motion representation. The range-of-motion boundary is defined by an implicit surface whose analytical expression lets us readily establish whether a given joint rotation is valid, and whose continuous and differentiable nature provides a means of elegantly incorporating the constraint within an optimisation process for posture recovery. Applying constrained optimisation to our body model and to stereo data extracted from video sequences, we demonstrate a clear decrease in posture estimation errors. As a bonus, we have integrated our joint limits representation into character animation packages to show how motion can be naturally constrained in this manner.
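
    As a minimal sketch of the implicit-surface joint-limit idea (with a toy ellipsoidal boundary standing in for the learnt surface described above), the snippet below checks whether a joint rotation lies inside the valid region and uses the same function as an inequality constraint in a small posture-fitting optimisation; the boundary shape, limits and target values are assumptions for illustration.

        import numpy as np
        from scipy.optimize import minimize

        # Toy implicit range-of-motion boundary: a rotation q = (flexion, abduction, twist)
        # in radians is valid when f(q) <= 0.  An ellipsoid stands in for the learnt surface.
        limits = np.array([1.4, 0.8, 0.6])

        def boundary(q):
            return np.sum((q / limits) ** 2) - 1.0     # <= 0 inside the valid region

        def is_valid(q, tol=1e-6):
            return boundary(q) <= tol

        # Posture recovery as constrained optimisation: pull the joint towards an observed
        # (possibly invalid) target rotation while staying inside the boundary.
        target = np.array([1.6, 0.2, 0.1])
        result = minimize(
            lambda q: np.sum((q - target) ** 2),
            x0=np.zeros(3),
            method="SLSQP",
            constraints=[{"type": "ineq", "fun": lambda q: -boundary(q)}],   # -f(q) >= 0
        )
        print(is_valid(result.x), result.x)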

    Early postnatal development of neocortex-wide activity patterns in GABAergic and pyramidal neurons

    Get PDF
    Before the onset of sensory experience, developing circuits generate synchronised activity that will not only influence their wiring but ultimately contribute to behaviour. These complex functions rely on widely distributed cortical circuits that operate simultaneously at multiple spatiotemporal scales. The timing of GABAergic maturation appears to align with the developmental trajectories of cortical regions, playing a crucial role in the functional development of individual brain areas. While local connectivity in cortical microcircuits has been studied extensively, the dynamics of brain-wide functional maturation, especially of GABAergic populations, remain underexplored. In this project, a dual-colour widefield calcium imaging approach was developed to examine the neocortex-wide dynamics of cortical GABAergic and excitatory neurons simultaneously across early postnatal development. This study provides the first broad description of neocortex-wide GABAergic developmental trajectories and their cross-talk with excitatory dynamics during the second and third postnatal weeks. The observed spontaneous activity revealed discrete activity domains, reflecting the modular organisation of the cortex. Both excitatory and GABAergic populations exhibited an increase in the size and frequency of activity motifs, as well as changes in motif variability; however, as they matured, the distributions of these spatiotemporal properties followed divergent trajectories across populations and regions. These findings suggest fundamental differences in the spatial organisation of the two populations, indicating potentially distinct roles in the development of cortical network function. Moreover, while excitatory and GABAergic dynamics were highly correlated, brief deviations from perfect timing were observed. These correlation patterns changed significantly during development and across regions, with the two populations becoming gradually more correlated as they matured. Manipulating inhibition in vivo disrupted these fluctuations, affecting both local activity and the wider functional network. These findings provide valuable insights into the developmental trajectories of spontaneous activity patterns in excitatory and GABAergic cell populations during early postnatal development. The interplay between the two neuronal populations plays a critical role in shaping activity patterns, and understanding the underlying mechanisms of their development can provide valuable insights into neurodevelopmental disorders.
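
    As a simple illustration of quantifying the timing relationship between the two populations (not the analysis pipeline used in the thesis), the sketch below computes lagged Pearson correlations between a GABAergic and an excitatory population trace; the toy traces, lag range and frame units are assumptions.

        import numpy as np

        def lagged_correlation(gaba, excit, max_lag=20):
            """Pearson correlation between two population traces over a range of frame lags."""
            gaba = (gaba - gaba.mean()) / gaba.std()
            excit = (excit - excit.mean()) / excit.std()
            lags = np.arange(-max_lag, max_lag + 1)
            corrs = []
            for lag in lags:
                if lag < 0:
                    c = np.corrcoef(gaba[:lag], excit[-lag:])[0, 1]
                elif lag > 0:
                    c = np.corrcoef(gaba[lag:], excit[:-lag])[0, 1]
                else:
                    c = np.corrcoef(gaba, excit)[0, 1]
                corrs.append(c)
            return lags, np.array(corrs)

        # Toy traces: the GABAergic signal trails the excitatory one by a few frames.
        t = np.arange(1000)
        excit = np.sin(t / 30.0) + 0.2 * np.random.randn(1000)
        gaba = np.roll(excit, 3) + 0.2 * np.random.randn(1000)
        lags, corrs = lagged_correlation(gaba, excit)
        print(lags[np.argmax(corrs)])   # lag (in frames) at which the populations align best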