2,158 research outputs found
Pain Analysis using Adaptive Hierarchical Spatiotemporal Dynamic Imaging
Automatic pain intensity estimation plays a pivotal role in healthcare and
medical fields. While many methods have been developed to gauge human pain
using behavioral or physiological indicators, facial expressions have emerged
as a prominent tool for this purpose. Nevertheless, the dependence on labeled
data for these techniques often renders them expensive and time-consuming. To
tackle this, we introduce the Adaptive Hierarchical Spatio-temporal Dynamic
Image (AHDI) technique. AHDI encodes spatiotemporal changes in facial videos
into a single RGB image, permitting the application of simpler 2D deep models
for video representation. Within this framework, we employ a residual network
to derive generalized facial representations. These representations are
optimized for two tasks: estimating pain intensity and differentiating between
genuine and simulated pain expressions. For the former, a regression model is
trained using the extracted representations, while for the latter, a binary
classifier identifies genuine versus feigned pain displays. Testing our method
on two widely used pain datasets, we observed encouraging results for both
tasks. On the UNBC database, we achieved an MSE of 0.27, outperforming the SOTA
MSE of 0.40. On the BioVid dataset, our model achieved an accuracy
of 89.76%, which is an improvement of 5.37% over the SOTA accuracy. Most
notably, for distinguishing genuine from simulated pain, our accuracy stands at
94.03%, marking a substantial improvement of 8.98%. Our methodology not only
minimizes the need for extensive labeled data but also augments the precision
of pain evaluations, facilitating superior pain management.
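
The abstract does not spell out how AHDI builds its image, so the following is only a minimal sketch of the general idea of collapsing a video into one RGB image. It assumes the standard "dynamic image" construction with linear approximate rank-pooling weights (alpha_t = 2t - T - 1), not the adaptive hierarchical scheme itself; the function name `dynamic_image` and the min-max rescaling are illustrative choices.

```python
import numpy as np


def dynamic_image(frames):
    """Collapse a video of shape (T, H, W, 3) into one uint8 RGB image.

    Assumption: plain approximate rank pooling, used here as a stand-in
    for AHDI, whose adaptive hierarchical details are not given above.
    """
    frames = np.asarray(frames, dtype=np.float64)
    num_frames = frames.shape[0]

    # Linear weights alpha_t = 2t - T - 1 for t = 1..T. Later frames get
    # positive weight and earlier frames negative weight, so the weighted
    # sum emphasises how pixels change over time.
    t = np.arange(1, num_frames + 1)
    alpha = 2.0 * t - num_frames - 1.0

    # Weighted sum over the time axis gives an (H, W, 3) array.
    pooled = np.tensordot(alpha, frames, axes=(0, 0))

    # Rescale to 0..255 so the result can be fed to an ordinary 2D CNN.
    lo, hi = pooled.min(), pooled.max()
    if hi == lo:
        # A perfectly static clip carries no temporal signal.
        return np.zeros(pooled.shape, dtype=np.uint8)
    return np.round(255.0 * (pooled - lo) / (hi - lo)).astype(np.uint8)
```

Because the weights sum to zero, pixels that never change map to the same value regardless of their colour, while pixels that brighten or darken over the clip stand out. The resulting image could then be passed to a 2D residual network as the abstract describes.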
Recommended from our members
The role of HG in the analysis of temporal iteration and interaural correlation
Discriminatively Trained Latent Ordinal Model for Video Classification
We study the problem of video classification for facial analysis and human
action recognition. We propose a novel weakly supervised learning method that
models the video as a sequence of automatically mined, discriminative
sub-events (e.g., onset and offset phases for "smile", running and jumping for
"highjump"). The proposed model is inspired by the recent works on Multiple
Instance Learning and latent SVM/HCRF -- it extends such frameworks to model
the ordinal aspect in the videos, approximately. We obtain consistent
improvements over relevant competitive baselines on four challenging and
publicly available video based facial analysis datasets for prediction of
expression, clinical pain and intent in dyadic conversations and on three
challenging human action datasets. We also validate the method with qualitative
results and show that they largely support the intuitions behind the method.
Comment: Paper accepted in IEEE TPAMI. arXiv admin note: substantial text
overlap with arXiv:1604.0150
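
To make the idea of ordered sub-events concrete, here is a small sketch of one way to score a video once per-frame scores for each sub-event are available. It is an assumption for illustration, not the paper's stated formulation: the function name `ordered_subevent_score`, the strict-ordering constraint, and the exact (rather than approximate) dynamic-programming search are all hypothetical choices, and the learning step is omitted entirely.

```python
import numpy as np


def ordered_subevent_score(frame_scores):
    """Pick one frame per sub-event, in strictly increasing frame order,
    so that the summed score is maximal.

    frame_scores[k, t] is the (hypothetical) score of sub-event k at frame t.
    Returns (best_total, frame_indices).
    """
    scores = np.asarray(frame_scores, dtype=np.float64)
    num_events, num_frames = scores.shape
    if num_events > num_frames:
        raise ValueError("need at least one frame per sub-event")

    # best[k, t]: best total with sub-event k placed exactly at frame t.
    best = np.full((num_events, num_frames), -np.inf)
    back = np.zeros((num_events, num_frames), dtype=int)
    best[0] = scores[0]

    for k in range(1, num_events):
        run_best, run_arg = -np.inf, -1
        for t in range(k, num_frames):
            # Track the best placement of sub-event k-1 at any frame < t.
            if best[k - 1, t - 1] > run_best:
                run_best, run_arg = best[k - 1, t - 1], t - 1
            best[k, t] = run_best + scores[k, t]
            back[k, t] = run_arg

    # Trace back from the best position of the last sub-event.
    t = int(np.argmax(best[num_events - 1]))
    total = float(best[num_events - 1, t])
    path = [t]
    for k in range(num_events - 1, 0, -1):
        t = int(back[k, t])
        path.append(t)
    path.reverse()
    return total, path
```

For example, with two sub-events over four frames and scores `[[1, 5, 0, 0], [9, 0, 2, 3]]`, the high score of 9 for the second sub-event at frame 0 cannot be used, because no earlier frame remains for the first sub-event; the best ordered choice is frames 1 and 3.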