Abnormal Infant Movements Classification With Deep Learning on Pose-Based Features
The pursuit of early diagnosis of cerebral palsy has been an active research area with some very promising results using tools such as the General Movements Assessment (GMA). In our previous work, we explored the feasibility of extracting pose-based features from video sequences to automatically classify infant body movement into two categories, normal and abnormal. The classification was based upon the GMA, which was carried out on the video data by an independent expert reviewer. In this paper, we extend our previous work by extracting the normalised pose-based feature sets, Histograms of Joint Orientation 2D (HOJO2D) and Histograms of Joint Displacement 2D (HOJD2D), for use in new deep learning architectures. We explore the viability of using these pose-based feature sets for automated classification within a deep learning framework by carrying out extensive experiments on five new deep learning architectures. Experimental results show that the proposed fully connected neural network, FCNet, performed robustly across different feature sets. Furthermore, the proposed convolutional neural network architectures demonstrated excellent performance in handling features of higher dimensionality. We make the code, extracted features, and associated GMA labels publicly available.
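The joint-orientation histogram feature described above can be illustrated with a minimal sketch: bin the 2D orientation of one limb segment (parent joint to child joint) across a video sequence into a normalised angular histogram. This is an assumed simplification in the spirit of HOJO2D, not the authors' released code; the joint coordinates and bin count below are made up for illustration.

```python
import numpy as np

def joint_orientation_histogram(parent_xy, child_xy, n_bins=8):
    """Bin the 2D orientation of a limb segment (parent -> child joint)
    over T frames into a normalised angular histogram."""
    vec = np.asarray(child_xy, float) - np.asarray(parent_xy, float)  # (T, 2)
    angles = np.arctan2(vec[:, 1], vec[:, 0])           # angle per frame, [-pi, pi]
    bins = np.linspace(-np.pi, np.pi, n_bins + 1)       # equal-width angular bins
    hist, _ = np.histogram(angles, bins=bins)
    return hist / max(hist.sum(), 1)                    # normalise to sum to 1

# Example: an elbow->wrist vector observed over 4 frames
parent = [[0, 0]] * 4
child = [[1, 0], [0, 1], [-1, 0], [1, 0]]
h = joint_orientation_histogram(parent, child, n_bins=4)
```

Histograms like this are fixed-length regardless of video duration, which is what makes them convenient inputs to fully connected or convolutional classifiers.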
Markerless human pose estimation for biomedical applications: a survey
Markerless Human Pose Estimation (HPE) proved its potential to support decision making and assessment in many fields of application. HPE is often preferred to traditional marker-based Motion Capture systems due to the ease of setup, portability, and affordable cost of the technology. However, the exploitation of HPE in biomedical applications is still under investigation. This review aims to provide an overview of current biomedical applications of HPE. In this paper, we examine the main features of HPE approaches and discuss whether or not those features are of interest to biomedical applications. We also identify those areas where HPE is already in use and present peculiarities and trends followed by researchers and practitioners. We include here 25 approaches to HPE and more than 40 studies of HPE applied to motor development assessment, neuromuscular rehabilitation, and gait & posture analysis. We conclude that markerless HPE offers great potential for extending diagnosis and rehabilitation outside hospitals and clinics, toward the paradigm of remote medical care.
Dynamic Gaussian Splatting from Markerless Motion Capture can Reconstruct Infants Movements
Easy access to precise 3D tracking of movement could benefit many aspects of rehabilitation. A challenge to achieving this goal is that while there are many datasets and pretrained algorithms for able-bodied adults, algorithms trained on these datasets often fail to generalize to clinical populations, including people with disabilities, infants, and neonates. Reliable movement analysis of infants and neonates is important, as spontaneous movement behavior is an important indicator of neurological function and neurodevelopmental disability, which can help guide early interventions. We explored the application of dynamic Gaussian splatting to sparse markerless motion capture (MMC) data. Our approach leverages semantic segmentation masks to focus on the infant, significantly improving the initialization of the scene. Our results demonstrate the potential of this method in rendering novel views of scenes and tracking infant movements. This work paves the way for advanced movement analysis tools that can be applied to diverse clinical populations, with a particular emphasis on early detection in infants.
Towards human-level performance on automatic pose estimation of infant spontaneous movements
Assessment of spontaneous movements can predict long-term developmental disorders in high-risk infants. In order to develop algorithms for automated prediction of later disorders, highly precise localization of segments and joints by infant pose estimation is required. Four types of convolutional neural networks were trained and evaluated on a novel infant pose dataset, covering the large variation in 1,424 videos from an international clinical community. The localization performance of the networks was evaluated as the deviation between the estimated keypoint positions and human expert annotations. The computational efficiency was also assessed to determine the feasibility of the neural networks in clinical practice. The best-performing neural network had a localization error similar to the inter-rater spread of human expert annotations, while still operating efficiently. Overall, the results of our study show that pose estimation of infant spontaneous movements has great potential to support research initiatives on early detection of developmental disorders in children with perinatal brain injuries by quantifying infant movements from video recordings with human-level performance.
Comment: Published in Computerized Medical Imaging and Graphics (CMIG)
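The evaluation described above, deviation between estimated keypoints and expert annotations, can be sketched as a mean Euclidean error plus a threshold-based hit rate. This is an illustrative metric in the spirit of the study, not its exact protocol; the 10-pixel threshold and example coordinates are assumptions.

```python
import numpy as np

def localization_error(pred, gt, thresh=10.0):
    """Mean per-keypoint Euclidean deviation between predicted keypoints
    and reference annotations, plus the fraction of keypoints landing
    within `thresh` (a PCK-style hit rate)."""
    pred, gt = np.asarray(pred, float), np.asarray(gt, float)  # (N, K, 2)
    d = np.linalg.norm(pred - gt, axis=-1)   # per-keypoint error
    return d.mean(), (d < thresh).mean()

# One frame, two keypoints: first is off by a 3-4-5 triangle, second is exact
pred = [[[0, 0], [10, 10]]]
gt = [[[3, 4], [10, 10]]]
mean_err, hit_rate = localization_error(pred, gt)
```

Comparing `mean_err` against the inter-rater spread of the human annotators is what supports the "human-level performance" claim in the abstract.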
Multi-set canonical correlation analysis for 3D abnormal gait behaviour recognition based on virtual sample generation
Small sample datasets and two-dimensional (2D) approaches are challenges to vision-based abnormal gait behaviour recognition (AGBR). The lack of three-dimensional (3D) structure of the human body causes 2D-based methods to be limited in abnormal gait virtual sample generation (VSG). In this paper, 3D AGBR based on VSG and multi-set canonical correlation analysis (3D-AGRBMCCA) is proposed. First, the unstructured point cloud data of gait are obtained by using a structured light sensor. A 3D parametric body model is then deformed to fit the point cloud data, both in shape and posture. The features of the point cloud data are then converted to a high-level structured representation of the body. The parametric body model is used for VSG based on the estimated body pose and shape data. Symmetry virtual samples, pose-perturbation virtual samples, and various body-shape virtual samples with multiple views are generated to extend the training samples. The spatial-temporal features of the abnormal gait behaviour from different views, body pose, and shape parameters are then extracted by a convolutional neural network-based Long Short-Term Memory (CNN-LSTM) network. These are projected onto a uniform pattern space using deep learning-based multi-set canonical correlation analysis. Experiments on four publicly available datasets show the proposed system performs well under various conditions.
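The pose-perturbation step of virtual sample generation can be sketched minimally: jitter the estimated body-pose parameters with small Gaussian noise to multiply a small training set. This is a hedged illustration of the general idea, not the paper's pipeline; the 72-dimensional pose vector and the noise scale `sigma` are assumed values.

```python
import numpy as np

def pose_perturbation_samples(pose, n_samples=5, sigma=0.02, seed=0):
    """Generate virtual training samples by adding small Gaussian noise
    to a body-pose parameter vector (pose-perturbation VSG sketch)."""
    rng = np.random.default_rng(seed)
    pose = np.asarray(pose, float)
    # one noisy copy of the pose vector per virtual sample
    return pose + rng.normal(0.0, sigma, size=(n_samples,) + pose.shape)

# e.g. a 72-dimensional pose vector expanded into 10 virtual samples
virtual = pose_perturbation_samples(np.zeros(72), n_samples=10)
```

Symmetry samples (mirroring left/right joints) and body-shape samples follow the same pattern: deterministic or sampled transforms of the fitted model parameters rather than of the raw images.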
Under the Cover Infant Pose Estimation using Multimodal Data
Infant pose monitoring during sleep has multiple applications in both healthcare and home settings. In a healthcare setting, pose detection can be used for region-of-interest detection and movement detection for noncontact monitoring systems. In a home setting, pose detection can be used to detect sleep positions, which have been shown to have a strong influence on multiple health factors. However, pose monitoring during sleep is challenging due to heavy occlusions from blanket coverings and low lighting. To address this, we present a novel dataset, the Simultaneously-collected multimodal Mannequin Lying pose (SMaL) dataset, for under-the-cover infant pose estimation. We collect depth and pressure imagery of an infant mannequin in different poses under various cover conditions. We successfully infer full body pose under the cover by training state-of-the-art pose estimation methods and leveraging existing multimodal adult pose datasets for transfer learning. We demonstrate a hierarchical pretraining strategy for transformer-based models that significantly improves performance on our dataset. Our best-performing model was able to detect joints under the cover within 25 mm 86% of the time, with an overall mean error of 16.9 mm. Data, code, and models are publicly available at https://github.com/DanielKyr/SMa
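A simple way to picture the multimodal input above is channel stacking: normalise the depth map and the pressure map independently and stack them into one array a pose network could consume. This is an assumed preprocessing sketch, not the SMaL pipeline itself, and it assumes both modalities share one resolution.

```python
import numpy as np

def fuse_modalities(depth, pressure):
    """Stack a depth map and a pressure map (same H x W) into a single
    2-channel input array, min-max normalising each modality first."""
    def norm(x):
        x = np.asarray(x, float)
        rng = x.max() - x.min()
        return (x - x.min()) / rng if rng > 0 else np.zeros_like(x)
    return np.stack([norm(depth), norm(pressure)], axis=0)  # (2, H, W)

# Toy 3x4 frames: a ramp-valued depth map and a uniform pressure map
x = fuse_modalities(np.arange(12).reshape(3, 4), np.ones((3, 4)))
```

Per-modality normalisation matters here because depth (millimetres) and pressure (sensor counts) live on very different scales; without it, one channel would dominate training.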
Generic 3D Representation via Pose Estimation and Matching
Though a large body of computer vision research has investigated developing generic semantic representations, efforts towards developing a similar representation for 3D have been limited. In this paper, we learn a generic 3D representation through solving a set of foundational proxy 3D tasks: object-centric camera pose estimation and wide-baseline feature matching. Our method is based upon the premise that by providing supervision over a set of carefully selected foundational tasks, generalization to novel tasks and abstraction capabilities can be achieved. We empirically show that the internal representation of a multi-task ConvNet trained to solve the above core problems generalizes to novel 3D tasks (e.g., scene layout estimation, object pose estimation, surface normal estimation) without the need for fine-tuning, and shows traits of abstraction abilities (e.g., cross-modality pose estimation). In the context of the core supervised tasks, we demonstrate that our representation achieves state-of-the-art wide-baseline feature matching results without requiring a priori rectification (unlike SIFT and the majority of learned features). We also show 6DOF camera pose estimation given a pair of local image patches. On both supervised tasks, accuracy is comparable to that of humans. Finally, we contribute a large-scale dataset composed of object-centric street view scenes along with point correspondences and camera pose information, and conclude with a discussion on the learned representation and open research questions.
Comment: Published in ECCV16. See the project website http://3drepresentation.stanford.edu/ and dataset website https://github.com/amir32002/3D_Street_Vie