197 research outputs found

    Markerless Human Motion Analysis

    Get PDF
    Measuring and understanding human motion is crucial in several domains, ranging from neuroscience, to rehabilitation and sports biomechanics. Quantitative information about human motion is fundamental to study how our Central Nervous System controls and organizes movements to functionally evaluate motor performance and deficits. In the last decades, the research in this field has made considerable progress. State-of-the-art technologies that provide useful and accurate quantitative measures rely on marker-based systems. Unfortunately, markers are intrusive and their number and location must be determined a priori. Also, marker-based systems require expensive laboratory settings with several infrared cameras. This could modify the naturalness of a subject\u2019s movements and induce discomfort. Last, but not less important, they are computationally expensive in time and space. Recent advances on markerless pose estimation based on computer vision and deep neural networks are opening the possibility of adopting efficient video-based methods for extracting movement information from RGB video data. In this contest, this thesis presents original contributions to the following objectives: (i) the implementation of a video-based markerless pipeline to quantitatively characterize human motion; (ii) the assessment of its accuracy if compared with a gold standard marker-based system; (iii) the application of the pipeline to different domains in order to verify its versatility, with a special focus on the characterization of the motion of preterm infants and on gait analysis. With the proposed approach we highlight that, starting only from RGB videos and leveraging computer vision and machine learning techniques, it is possible to extract reliable information characterizing human motion comparable to that obtained with gold standard marker-based systems

    Biomechanical Markerless Motion Classification Based On Stick Model Development For Shop Floor Operator

    Get PDF
    Motion classification system marks a new era of industrial technology to monitor task performance and validate the quality of manual processes using automation. However, the current study trend pointed towards the marker-based motion capture system that demanded the expensive and extensive equipment setup. The markerless motion classification model is still underdeveloped in the manufacturing industry. Therefore, this research is purposed to develop a markerless motion classification model of shopfloor operators using stick model augmentation on the motion video and identify the best data mining strategy for the industrial motion classification. Eight participants within 23 to 24 years old participated in an experiment to perform four distinct motion sequences: moving box, moving pail, sweeping and mopping the floor, recorded in separate videos. All videos were augmented with a stick model made up of keypoints and lines using the programming model. The programming model incorporated the COCO dataset and OpenCV module to estimate the coordinates and body joints for a stick model overlay. The data extracted from the stick model featured the initial velocity, cumulative velocity and acceleration for each body joint. Motion data mining process included the data normalization, random subsampling method and data classification to discover the best information for separating motion classes. The motion vector data extracted were normalized with three different techniques: the decimal scaling normalization, min-max normalization, and Z-score normalization, to create three datasets for further data mining. All the datasets were experimented with eight classifiers to determine the best machine learning classifier and normalization technique to classify the model data. The eight tested classifiers were ZeroR, OneR, J48, random forest, random tree, Naïve Bayes, K-nearest neighbours (K = 5) and multilayer perceptron. The result showed that the random forest classifier scored the best performance with the highest recorded data classification accuracy in its min-max normalized dataset, 81.75% for the dataset before random subsampling and 92.37% for the resampled dataset. The min-max normalization gives only a slight advantage over the other normalization techniques using the same dataset. However, the random subsampling method dramatically improves the classification accuracy by eliminating the noise data and replacing them with replicated instances to balance the class. The best normalization method and data mining classifier were inserted into the motion classification model to complete the development process

    Stochastic optimization and interactive machine learning for human motion analysis

    Get PDF
    The analysis of human motion from visual data is a central issue in the computer vision research community as it enables a wide range of applications and it still remains a challenging problem when dealing with unconstrained scenarios and general conditions. Human motion analysis is used in the entertainment industry for movies or videogame production, in medical applications for rehabilitation or biomechanical studies. It is also used for human computer interaction in any kind of environment, and moreover, it is used for big data analysis from social networks such as Youtube or Flickr, to mention some of its use cases. In this thesis we have studied human motion analysis techniques with a focus on its application for smart room environments. That is, we have studied methods that will support the analysis of people behavior in the room, allowing interaction with computers in a natural manner and in general, methods that introduce computers in human activity environments to enable new kind of services but in an unobstrusive mode. The thesis is structured in two parts, where we study the problem of 3D pose estimation from multiple views and the recognition of gestures using range sensors. First, we propose a generic framework for hierarchically layered particle filtering (HPF) specially suited for motion capture tasks. Human motion capture problem generally involve tracking or optimization of high-dimensional state vectors where also one have to deal with multi-modal pdfs. HPF allow to overcome the problem by means of multiple passes through substate space variables. Then, based on the HPF framework, we propose a method to estimate the anthropometry of the subject, which at the end allows to obtain a human body model adjusted to the subject. Moreover, we introduce a new weighting function strategy for approximate partitioning of observations and a method that employs body part detections to improve particle propagation and weight evaluation, both integrated within the HPF framework. The second part of this thesis is centered in the detection of gestures, and we have focused the problem of reducing annotation and training efforts required to train a specific gesture. In order to reduce the efforts required to train a gesture detector, we propose a solution based on online random forests that allows training in real-time, while receiving new data in sequence. The main aspect that makes the solution effective is the method we propose to collect the hard negatives examples while training the forests. The method uses the detector trained up to the current frame to test on that frame, and then collects samples based on the response of the detector such that they will be more relevant for training. In this manner, training is more effective in terms of the number of annotated frames required.L'anàlisi del moviment humà a partir de dades visuals és un tema central en la recerca en visió per computador, per una banda perquè habilita un ampli espectre d'aplicacions i per altra perquè encara és un problema no resolt quan és aplicat en escenaris no controlats. L'analisi del moviment humà s'utilitza a l'indústria de l'entreteniment per la producció de pel·lícules i videojocs, en aplicacions mèdiques per rehabilitació o per estudis bio-mecànics. També s'utilitza en el camp de la interacció amb computadors o també per l'analisi de grans volums de dades de xarxes socials com Youtube o Flickr, per mencionar alguns exemples. En aquesta tesi s'han estudiat tècniques per l'anàlisi de moviment humà enfocant la seva aplicació en entorns de sales intel·ligents. És a dir, s'ha enfocat a mètodes que puguin permetre l'anàlisi del comportament de les persones a la sala, que permetin la interacció amb els dispositius d'una manera natural i, en general, mètodes que incorporin les computadores en espais on hi ha activitat de persones, per habilitar nous serveis de manera que no interfereixin en la activitat. A la primera part, es proposa un marc genèric per l'ús de filtres de partícules jeràrquics (HPF) especialment adequat per tasques de captura de moviment humà. La captura de moviment humà generalment implica seguiment i optimització de vectors d'estat de molt alta dimensió on a la vegada també s'han de tractar pdf's multi-modals. Els HPF permeten tractar aquest problema mitjançant multiples passades en subdivisions del vector d'estat. Basant-nos en el marc dels HPF, es proposa un mètode per estimar l'antropometria del subjecte, que a la vegada permet obtenir un model acurat del subjecte. També proposem dos nous mètodes per la captura de moviment humà. Per una banda, el APO es basa en una nova estratègia per les funcions de cost basada en la partició de les observacions. Per altra, el DD-HPF utilitza deteccions de parts del cos per millorar la propagació de partícules i l'avaluació de pesos. Ambdós mètodes són integrats dins el marc dels HPF. La segona part de la tesi es centra en la detecció de gestos, i s'ha enfocat en el problema de reduir els esforços d'anotació i entrenament requerits per entrenar un detector per un gest concret. Per tal de reduir els esforços requerits per entrenar un detector de gestos, proposem una solució basada en online random forests que permet l'entrenament en temps real, mentre es reben noves dades sequencialment. El principal aspecte que fa la solució efectiva és el mètode que proposem per obtenir mostres negatives rellevants, mentre s'entrenen els arbres de decisió. El mètode utilitza el detector entrenat fins al moment per recollir mostres basades en la resposta del detector, de manera que siguin més rellevants per l'entrenament. D'aquesta manera l'entrenament és més efectiu pel que fa al nombre de mostres anotades que es requereixen

    Artificial Intelligence-based Motion Tracking in Cancer Radiotherapy: A Review

    Full text link
    Radiotherapy aims to deliver a prescribed dose to the tumor while sparing neighboring organs at risk (OARs). Increasingly complex treatment techniques such as volumetric modulated arc therapy (VMAT), stereotactic radiosurgery (SRS), stereotactic body radiotherapy (SBRT), and proton therapy have been developed to deliver doses more precisely to the target. While such technologies have improved dose delivery, the implementation of intra-fraction motion management to verify tumor position at the time of treatment has become increasingly relevant. Recently, artificial intelligence (AI) has demonstrated great potential for real-time tracking of tumors during treatment. However, AI-based motion management faces several challenges including bias in training data, poor transparency, difficult data collection, complex workflows and quality assurance, and limited sample sizes. This review serves to present the AI algorithms used for chest, abdomen, and pelvic tumor motion management/tracking for radiotherapy and provide a literature summary on the topic. We will also discuss the limitations of these algorithms and propose potential improvements.Comment: 36 pages, 5 Figures, 4 Table

    Fusion of wearable and visual sensors for human motion analysis

    No full text
    Human motion analysis is concerned with the study of human activity recognition, human motion tracking, and the analysis of human biomechanics. Human motion analysis has applications within areas of entertainment, sports, and healthcare. For example, activity recognition, which aims to understand and identify different tasks from motion can be applied to create records of staff activity in the operating theatre at a hospital; motion tracking is already employed in some games to provide an improved user interaction experience and can be used to study how medical staff interact in the operating theatre; and human biomechanics, which is the study of the structure and function of the human body, can be used to better understand athlete performance, pathologies in certain patients, and assess the surgical skill of medical staff. As health services strive to improve the quality of patient care and meet the growing demands required to care for expanding populations around the world, solutions that can improve patient care, diagnosis of pathology, and the monitoring and training of medical staff are necessary. Surgical workflow analysis, for example, aims to assess and optimise surgical protocols in the operating theatre by evaluating the tasks that staff perform and measurable outcomes. Human motion analysis methods can be used to quantify the activities and performance of staff for surgical workflow analysis; however, a number of challenges must be overcome before routine motion capture of staff in an operating theatre becomes feasible. Current commercial human motion capture technologies have demonstrated that they are capable of acquiring human movement with sub-centimetre accuracy; however, the complicated setup procedures, size, and embodiment of current systems make them cumbersome and unsuited for routine deployment within an operating theatre. Recent advances in pervasive sensing have resulted in camera systems that can detect and analyse human motion, and small wear- able sensors that can measure a variety of parameters from the human body, such as heart rate, fatigue, balance, and motion. The work in this thesis investigates different methods that enable human motion to be more easily, reliably, and accurately captured through ambient and wearable sensor technologies to address some of the main challenges that have limited the use of motion capture technologies in certain areas of study. Sensor embodiment and accuracy of activity recognition is one of the challenges that affect the adoption of wearable devices for monitoring human activity. Using a single inertial sensor, which captures the movement of the subject, a variety of motion characteristics can be measured. For patients, wearable inertial sensors can be used in long-term activity monitoring to better understand the condition of the patient and potentially identify deviations from normal activity. For medical staff, inertial sensors can be used to capture tasks being performed for automated workflow analysis, which is useful for staff training, optimisation of existing processes, and early indications of complications within clinical procedures. Feature extraction and classification methods are introduced in thesis that demonstrate motion classification accuracies of over 90% for five different classes of walking motion using a single ear-worn sensor. To capture human body posture, current capture systems generally require a large number of sensors or reflective reference markers to be worn on the body, which presents a challenge for many applications, such as monitoring human motion in the operating theatre, as they may restrict natural movements and make setup complex and time consuming. To address this, a method is proposed, which uses a regression method to estimate motion using a subset of fewer wearable inertial sensors. This method is demonstrated using three sensors on the upper body and is shown to achieve mean estimation accuracies as low as 1.6cm, 1.1cm, and 1.4cm for the hand, elbow, and shoulders, respectively, when compared with the gold standard optical motion capture system. Using a subset of three sensors, mean errors for hand position reach 15.5cm. Unlike human motion capture systems that rely on vision and reflective reference point markers, commonly known as marker-based optical motion capture, wearable inertial sensors are prone to inaccuracies resulting from an accumulation of inaccurate measurements, which becomes increasingly prevalent over time. Two methods are introduced in this thesis, which aim to solve this challenge using visual rectification of the assumed state of the subject. Using a ceiling-mounted camera, a human detection and human motion tracking method is introduced to improve the average mean accuracy of tracking to within 5.8cm in a laboratory of 3m × 5m. To improve the accuracy of capturing the position of body parts and posture for human biomechanics, a camera is also utilised to track the body part movements and provide visual rectification of human pose estimates from inertial sensing. For most subjects, deviations of less than 10% from the ground truth are achieved for hand positions, which exhibit the greatest error, and the occurrence of sources of other common visual and inertial estimation errors, such as measurement noise, visual occlusion, and sensor calibration are shown to be reduced.Open Acces

    Spatiotemporal analysis of human actions using RGB-D cameras

    Get PDF
    Markerless human motion analysis has strong potential to provide cost-efficient solution for action recognition and body pose estimation. Many applications including humancomputer interaction, video surveillance, content-based video indexing, and automatic annotation among others will benefit from a robust solution to these problems. Depth sensing technologies in recent years have positively changed the climate of the automated vision-based human action recognition problem, deemed to be very difficult due to the various ambiguities inherent to conventional video. In this work, first a large set of invariant spatiotemporal features is extracted from skeleton joints (retrieved from depth sensor) in motion and evaluated as baseline performance. Next we introduce a discriminative Random Decision Forest-based feature selection framework capable of reaching impressive action recognition performance when combined with a linear SVM classifier. This approach improves upon the baseline performance obtained using the whole feature set with a significantly less number of features (one tenth of the original). The approach can also be used to provide insights on the spatiotemporal dynamics of human actions. A novel therapeutic action recognition dataset (WorkoutSU-10) is presented. We took advantage of this dataset as a benchmark in our tests to evaluate the reliability of our proposed methods. Recently the dataset has been published publically as a contribution to the action recognition community. In addition, an interactive action evaluation application is developed by utilizing the proposed methods to help with real life problems such as 'fall detection' in the elderly people or automated therapy program for patients with motor disabilities

    Coupled Action Recognition and Pose Estimation from Multiple Views

    Get PDF
    Action recognition and pose estimation are two closely related topics in understanding human body movements; information from one task can be leveraged to assist the other, yet the two are often treated separately. We present here a framework for coupled action recognition and pose estimation by formulating pose estimation as an optimization over a set of action-specific manifolds. The framework allows for integration of a 2D appearance-based action recognition system as a prior for 3D pose estimation and for refinement of the action labels using relational pose features based on the extracted 3D poses. Our experiments show that our pose estimation system is able to estimate body poses with high degrees of freedom using very few particles and can achieve state-of-the-art results on the HumanEva-II benchmark. We also thoroughly investigate the impact of pose estimation and action recognition accuracy on each other on the challenging TUM kitchen dataset. We demonstrate not only the feasibility of using extracted 3D poses for action recognition, but also improved performance in comparison to action recognition using low-level appearance feature

    Objective Assessment of the Finger Tapping Task in Parkinson's Disease and Control Subjects using Azure Kinect and Machine Learning

    Get PDF
    Parkinson's disease (PD) is characterised by a progressive worsening of motor functionalities. In particular, limited hand dexterity strongly correlates with PD diagnosis and staging. Objective detection of alterations in hand motor skills would allow, for example, prompt identification of the disease, its symptoms and the definition of adequate medical treatments. Among the clinical assessment tasks to diagnose and stage PD from hand impairment, the Finger Tapping (FT) task is a well-established tool. This preliminary study exploits a single RGB-Depth camera (Azure Kinect) and Google MediaPipe Hands to track and assess the Finger Tapping task. The system includes several stages. First, hand movements are tracked from FT video recordings and used to extract a series of clinically-relevant features. Then, the most significant features are selected and used to train and test several Machine Learning (ML) models, to distinguish subjects with PD from healthy controls. To test the proposed system, 35 PD subjects and 60 healthy volunteers were recruited. The best-performing ML model achieved a 94.4% Accuracy and 98.4% Fl score in a Leave-One-Subject-Out validation. Moreover, different clusters with respect to spatial and temporal variability in the FT trials among PD subjects were identified. This result suggests the possibility of exploiting the proposed system to perform an even finer identification of subgroups among the PD population

    A Survey of Applications and Human Motion Recognition with Microsoft Kinect

    Get PDF
    Microsoft Kinect, a low-cost motion sensing device, enables users to interact with computers or game consoles naturally through gestures and spoken commands without any other peripheral equipment. As such, it has commanded intense interests in research and development on the Kinect technology. In this paper, we present, a comprehensive survey on Kinect applications, and the latest research and development on motion recognition using data captured by the Kinect sensor. On the applications front, we review the applications of the Kinect technology in a variety of areas, including healthcare, education and performing arts, robotics, sign language recognition, retail services, workplace safety training, as well as 3D reconstructions. On the technology front, we provide an overview of the main features of both versions of the Kinect sensor together with the depth sensing technologies used, and review literatures on human motion recognition techniques used in Kinect applications. We provide a classification of motion recognition techniques to highlight the different approaches used in human motion recognition. Furthermore, we compile a list of publicly available Kinect datasets. These datasets are valuable resources for researchers to investigate better methods for human motion recognition and lower-level computer vision tasks such as segmentation, object detection and human pose estimation
    corecore