144 research outputs found
Non-contact measures to monitor hand movement of people with rheumatoid arthritis using a monocular RGB camera
Hand movements play an essential role in a person’s ability to interact with the environment. In hand biomechanics, the range of joint motion is a crucial metric to quantify changes due to degenerative pathologies, such as rheumatoid arthritis (RA). RA is a chronic condition where the immune system mistakenly attacks the joints, particularly those in the hands. Optoelectronic motion capture systems are gold-standard tools to quantify changes but are challenging to adopt outside laboratory settings. Deep learning executed on standard video data can capture RA participants in their natural environments, potentially supporting objectivity in remote consultation.
The three main research aims in this thesis were 1) to assess the extent to which current deep learning architectures, which have been validated for quantifying motion of other body segments, can be applied to hand kinematics using monocular RGB cameras, 2) to localise where in videos the hand motions of interest are to be found, and 3) to assess the validity of 1) and 2) for determining disease status in RA.
First, hand kinematics for twelve healthy participants, captured with OpenPose, were benchmarked against those captured using an optoelectronic system, showing acceptable instrument errors below 10°. Then, a gesture classifier was tested to segment video recordings of twenty-two healthy participants, achieving an accuracy of 93.5%. Finally, OpenPose and the classifier were applied to videos of RA participants performing hand exercises to determine disease status. The inferred disease activity agreed with the in-person ground truth in nine out of ten instances, outperforming virtual consultations, which agreed in only six out of ten.
These results indicate that this approach agrees with in-person assessment more often than the disease activity estimates made by human experts during video consultations. The work lays the foundation for a tool that RA participants can use to monitor their disease activity from home.
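The range-of-motion metric discussed in this abstract can be illustrated with a minimal sketch. It assumes each joint angle is computed from three 2D keypoints (for example a proximal joint, the joint of interest, and a distal joint) as the included angle between the two segments. The function names and sample coordinates are hypothetical and are not taken from the thesis, which may define its angles differently (for instance as flexion measured from full extension).

```python
import math


def joint_angle_deg(proximal, joint, distal):
    """Included angle at `joint`, in degrees, between the segments
    joint->proximal and joint->distal. Inputs are (x, y) pixel coordinates."""
    v1 = (proximal[0] - joint[0], proximal[1] - joint[1])
    v2 = (distal[0] - joint[0], distal[1] - joint[1])
    n1 = math.hypot(v1[0], v1[1])
    n2 = math.hypot(v2[0], v2[1])
    if n1 == 0.0 or n2 == 0.0:
        raise ValueError("keypoints must not coincide with the joint")
    cos_a = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_a = max(-1.0, min(1.0, cos_a))
    return math.degrees(math.acos(cos_a))


def range_of_motion(angles_deg):
    """Range of motion as the spread of a joint angle over a recording."""
    return max(angles_deg) - min(angles_deg)


if __name__ == "__main__":
    # Hypothetical keypoints for one finger joint across three video frames.
    frames = [
        ((100, 200), (140, 200), (180, 200)),  # finger extended
        ((100, 200), (140, 200), (170, 230)),  # partly flexed
        ((100, 200), (140, 200), (140, 240)),  # flexed to a right angle
    ]
    angles = [joint_angle_deg(p, j, d) for p, j, d in frames]
    print(angles, range_of_motion(angles))
```

Because the angles come from a single 2D view, they are sensitive to out-of-plane rotation of the hand, which is one reason benchmarking against an optoelectronic system matters.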
Markerless 3D human pose tracking through multiple cameras and AI: Enabling high accuracy, robustness, and real-time performance
Tracking 3D human motion in real-time is crucial for numerous applications
across many fields. Traditional approaches involve attaching artificial
fiducial objects or sensors to the body, limiting their usability and
comfort-of-use and consequently narrowing their application fields. Recent
advances in Artificial Intelligence (AI) have allowed for markerless solutions.
However, most of these methods operate in 2D, while those providing 3D
solutions compromise accuracy and real-time performance. To address this
challenge and unlock the potential of visual pose estimation methods in
real-world scenarios, we propose a markerless framework that combines
multi-camera views and 2D AI-based pose estimation methods to track 3D human
motion. Our approach integrates a Weighted Least Square (WLS) algorithm that
computes 3D human motion from multiple 2D pose estimations provided by an
AI-driven method. The method is integrated within the Open-VICO framework
allowing simulation and real-world execution. Several experiments have been
conducted, which have shown high accuracy and real-time performance,
demonstrating the high level of readiness for real-world applications and the
potential to revolutionize human motion capture. Comment: 19 pages, 7 figures
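The idea of computing a 3D point from several 2D pose estimates with weighted least squares can be sketched as follows. This is a generic linear triangulation, not the paper's formulation, which the abstract does not spell out. It assumes known 3x4 camera projection matrices and uses each detection's confidence score as its weight; the function names `triangulate_wls` and `solve_3x3` are hypothetical.

```python
def solve_3x3(matrix, rhs):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    a = [row[:] + [r] for row, r in zip(matrix, rhs)]
    n = 3
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("degenerate camera configuration")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            f = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= f * a[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = a[i][n] - sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / a[i][i]
    return x


def triangulate_wls(projections, points_2d, weights):
    """Weighted least-squares triangulation of one 3D point.

    projections: list of 3x4 camera matrices (nested lists)
    points_2d:   list of (u, v) detections, one per camera
    weights:     list of non-negative confidences, one per camera (assumption)
    """
    ata = [[0.0] * 3 for _ in range(3)]  # accumulates A^T W A
    atb = [0.0] * 3                      # accumulates A^T W b
    for P, (u, v), w in zip(projections, points_2d, weights):
        for coord, row in ((u, P[0]), (v, P[1])):
            # Each image coordinate gives one linear equation in (x, y, z):
            # (coord * P[2] - row) . (x, y, z, 1) = 0
            a = [coord * P[2][k] - row[k] for k in range(3)]
            b = -(coord * P[2][3] - row[3])
            for i in range(3):
                atb[i] += w * a[i] * b
                for j in range(3):
                    ata[i][j] += w * a[i] * a[j]
    return solve_3x3(ata, atb)


if __name__ == "__main__":
    # Two hypothetical cameras: one at the origin, one shifted 1 unit along x.
    cams = [
        [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]],
        [[1, 0, 0, -1], [0, 1, 0, 0], [0, 0, 1, 0]],
    ]
    detections = [(0.25, 0.1), (-0.25, 0.1)]
    print(triangulate_wls(cams, detections, weights=[0.9, 0.6]))
```

With noise-free detections the weights do not change the result. With noisy detections, views with low confidence pull the estimate less, which is the motivation for weighting.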
Expressive Body Capture: 3D Hands, Face, and Body from a Single Image
To facilitate the analysis of human actions, interactions and emotions, we
compute a 3D model of human body pose, hand pose, and facial expression from a
single monocular image. To achieve this, we use thousands of 3D scans to train
a new, unified, 3D model of the human body, SMPL-X, that extends SMPL with
fully articulated hands and an expressive face. Learning to regress the
parameters of SMPL-X directly from images is challenging without paired images
and 3D ground truth. Consequently, we follow the approach of SMPLify, which
estimates 2D features and then optimizes model parameters to fit the features.
We improve on SMPLify in several significant ways: (1) we detect 2D features
corresponding to the face, hands, and feet and fit the full SMPL-X model to
these; (2) we train a new neural network pose prior using a large MoCap
dataset; (3) we define a new interpenetration penalty that is both fast and
accurate; (4) we automatically detect gender and the appropriate body models
(male, female, or neutral); (5) our PyTorch implementation achieves a speedup
of more than 8x over Chumpy. We use the new method, SMPLify-X, to fit SMPL-X to
both controlled images and images in the wild. We evaluate 3D accuracy on a new
curated dataset comprising 100 images with pseudo ground-truth. This is a step
towards automatic expressive human capture from monocular RGB data. The models,
code, and data are available for research purposes at
https://smpl-x.is.tue.mpg.de. Comment: To appear in CVPR 201
Computer Vision Solutions for Range of Motion Assessment
Joint range of motion (ROM) is an important indicator of physical functionality and musculoskeletal health. In sports, athletes require adequate levels of joint mobility to minimize the risk of injuries and maximize performance, while in rehabilitation, restoring joint ROM is essential for faster recovery and improved physical function. Traditional methods for measuring ROM include goniometry, inclinometry and visual estimation, all of which are limited in accuracy due to the subjective nature of the assessment. With the rapid development of technology, new systems based on computer vision are continuously introduced as a possible solution for more objective and accurate ROM measurements. Therefore, this article aimed to evaluate novel computer vision-based systems on their accuracy and practical applicability for ROM assessment. The review covers a variety of systems, including motion-capture systems (2D and 3D cameras), RGB-Depth cameras, commercial software systems and smartphone apps. Furthermore, this article highlights the potential limitations of these systems and explores their potential future applications in sports and rehabilitation.
EventCap: Monocular 3D Capture of High-Speed Human Motions using an Event Camera
A high frame rate is a critical requirement for capturing fast human motions. In this setting, existing markerless image-based methods are constrained by the lighting requirement, the high data bandwidth and the consequent high computation overhead. In this paper, we propose EventCap, the first approach for 3D capture of high-speed human motions using a single event camera. Our method combines model-based optimization and CNN-based human pose detection to capture high-frequency motion details and to reduce drift in the tracking. As a result, we can capture fast motions at millisecond resolution with significantly higher data efficiency than using high frame rate videos. Experiments on our new event-based fast human motion dataset demonstrate the effectiveness and accuracy of our method, as well as its robustness to challenging lighting conditions.
Total Capture: A 3D Deformation Model for Tracking Faces, Hands, and Bodies
We present a unified deformation model for the markerless capture of multiple
scales of human movement, including facial expressions, body motion, and hand
gestures. An initial model is generated by locally stitching together models of
the individual parts of the human body, which we refer to as the "Frankenstein"
model. This model enables the full expression of part movements, including face
and hands by a single seamless model. Using a large-scale capture of people
wearing everyday clothes, we optimize the Frankenstein model to create "Adam".
Adam is a calibrated model that shares the same skeleton hierarchy as the
initial model but can express hair and clothing geometry, making it directly
usable for fitting people as they normally appear in everyday life. Finally, we
demonstrate the use of these models for total motion tracking, simultaneously
capturing the large-scale body movements and the subtle face and hand motion of
a social group of people.
- …