Vision-based Human Fall Detection Systems using Deep Learning: A Review
Human falls are a critical health issue, especially for elderly and disabled people living alone. With the elderly population increasing steadily worldwide, automatic fall detection has become an important enabling technique for assistive living, an area in which deep learning and computer vision are widely applied. In this review article, we discuss state-of-the-art deep learning (DL)-based, non-intrusive (vision-based) fall detection techniques. We also present a survey of fall detection benchmark datasets and, for clarity, briefly discuss the metrics used to evaluate the performance of fall detection systems. The article closes with future directions for vision-based human fall detection techniques.
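Since the review discusses the metrics used to evaluate fall detection systems, a minimal sketch of the standard confusion-matrix metrics (sensitivity, specificity, precision, accuracy) may be helpful; the function name and example counts below are illustrative, not taken from any of the reviewed papers.

```python
# Minimal sketch of metrics commonly reported for fall detection,
# where a "positive" is a true fall event. Counts are hypothetical.

def fall_detection_metrics(tp: int, fp: int, tn: int, fn: int) -> dict:
    """Compute standard metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # recall: falls correctly detected
    specificity = tn / (tn + fp)  # non-falls correctly rejected
    precision = tp / (tp + fp)    # detections that were real falls
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return {"sensitivity": sensitivity, "specificity": specificity,
            "precision": precision, "accuracy": accuracy}

# Hypothetical counts for illustration only.
print(fall_detection_metrics(tp=90, fp=5, tn=100, fn=10))
```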
Home-based physical therapy with an interactive computer vision system
In this paper, we present ExerciseCheck, an interactive computer vision system that is sufficiently modular to work with different sources of human pose estimates, i.e., estimates from deep or traditional models that interpret RGB or RGB-D camera input. In a pilot study, we first compare the pose estimates produced by four deep models based on RGB input with those of the MS Kinect based on RGB-D data. The results indicate a performance gap that led us to choose the MS Kinect when we tested ExerciseCheck with Parkinson’s disease patients in their homes. ExerciseCheck can customize exercises, capture exercise information, evaluate patient performance, provide therapeutic feedback to the patient and the therapist, check the user’s progress over the course of physical therapy, and support the patient throughout this period. We conclude that ExerciseCheck is a user-friendly computer vision application that can assist patients by providing motivation and guidance to ensure correct execution of the required exercises. Our results also suggest that, while there has been considerable progress in pose estimation using deep learning, current deep learning models are not yet ready to replace RGB-D sensors, especially when the exercises involved are complex and the patient population’s “active range of motion” has to be carefully tracked.
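The abstract does not specify how the gap between RGB pose models and the Kinect was measured; one common way to quantify such a gap is mean per-joint position error (MPJPE), sketched below under that assumption. The arrays, joint count, and noise scale are hypothetical.

```python
# A minimal sketch comparing pose estimates from an RGB deep model
# against MS Kinect (RGB-D) joints via mean per-joint position error.
# All data below is synthetic, for illustration only.

import numpy as np

def mpjpe(pred: np.ndarray, ref: np.ndarray) -> float:
    """Mean Euclidean distance per joint.
    pred, ref: (num_frames, num_joints, 3) joint coordinates."""
    return float(np.linalg.norm(pred - ref, axis=-1).mean())

# Hypothetical data: 100 frames, 17 joints, 3-D coordinates in meters.
rng = np.random.default_rng(0)
kinect_joints = rng.normal(size=(100, 17, 3))
rgb_model_joints = kinect_joints + rng.normal(scale=0.05, size=(100, 17, 3))
print(f"MPJPE vs. Kinect reference: {mpjpe(rgb_model_joints, kinect_joints):.3f} m")
```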
Future Person Localization in First-Person Videos
We present a new task that predicts future locations of people observed in
first-person videos. Consider a first-person video stream continuously recorded
by a wearable camera. Given a short clip of a person that is extracted from the
complete stream, we aim to predict that person's location in future frames. To
facilitate this future person localization ability, we make the following three
key observations: a) First-person videos typically involve significant
ego-motion which greatly affects the location of the target person in future
frames; b) Scales of the target person act as a salient cue to estimate a
perspective effect in first-person videos; c) First-person videos often capture
people up-close, making it easier to leverage target poses (e.g., where they
look) for predicting their future locations. We incorporate these three
observations into a prediction framework with a multi-stream
convolution-deconvolution architecture. Experimental results reveal our method
to be effective on our new dataset as well as on a public social interaction
dataset.
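The abstract names a multi-stream convolution-deconvolution architecture without detailing it; the PyTorch sketch below illustrates the general idea under stated assumptions: one temporal-convolution encoder per cue (location/scale, ego-motion, pose), fused by concatenation and decoded into future locations. Channel sizes, layer counts, and input encodings are assumptions; the paper's actual architecture differs.

```python
# A rough sketch of a multi-stream conv-deconv predictor for future
# person locations. Stream inputs and dimensions are hypothetical.

import torch
import torch.nn as nn

class MultiStreamPredictor(nn.Module):
    def __init__(self, hidden=32):
        super().__init__()
        def stream(in_ch):  # small temporal conv encoder per cue
            return nn.Sequential(
                nn.Conv1d(in_ch, hidden, kernel_size=3, padding=1),
                nn.ReLU(),
                nn.Conv1d(hidden, hidden, kernel_size=3, padding=1),
                nn.ReLU(),
            )
        self.loc_scale = stream(3)   # (x, y, scale) per observed frame
        self.ego = stream(6)         # e.g. camera rotation/translation
        self.pose = stream(36)       # e.g. 18 2-D body joints
        self.decoder = nn.Sequential(  # deconv back to future (x, y)
            nn.ConvTranspose1d(3 * hidden, hidden, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv1d(hidden, 2, kernel_size=3, padding=1),
        )

    def forward(self, loc_scale, ego, pose):
        # Each input: (batch, channels, t_obs). This simplified sketch
        # predicts a horizon equal to the observed window length.
        h = torch.cat([self.loc_scale(loc_scale),
                       self.ego(ego),
                       self.pose(pose)], dim=1)
        return self.decoder(h)

model = MultiStreamPredictor()
out = model(torch.randn(4, 3, 10), torch.randn(4, 6, 10), torch.randn(4, 36, 10))
print(out.shape)  # torch.Size([4, 2, 10]): (x, y) for 10 future frames
```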
Multimodal Signal Processing and Learning Aspects of Human-Robot Interaction for an Assistive Bathing Robot
We explore new aspects of assistive living through smart human-robot interaction (HRI), involving the automatic recognition and online validation of speech and gestures in a natural interface that provides social features for HRI. We introduce a complete framework and resources for a real-life scenario in which elderly subjects are supported by an assistive bathing robot, addressing health and hygiene care issues. We contribute a new dataset, a suite of data-acquisition tools, and a state-of-the-art pipeline for multimodal learning within the framework of the I-Support bathing robot, with emphasis on audio and RGB-D visual streams. We consider privacy issues by evaluating the depth visual stream alongside the RGB stream, using Kinect sensors. The audio-gestural recognition task on this new dataset yields recognition rates of up to 84.5%, while the online validation of the I-Support system with elderly users achieves up to 84% when the two modalities are fused. Given the difficulty of the task, these results are promising enough to support further research on multimodal recognition for assistive social HRI. Upon acceptance of the paper, part of the data will be made publicly available.
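The abstract reports improved results when the two modalities are fused but does not describe the fusion scheme; a common baseline is late (score-level) fusion of per-modality classifier posteriors, sketched below. The weighting, class count, and posterior values are hypothetical.

```python
# A minimal sketch of late fusion of audio and gesture recognizers:
# a weighted sum of per-class posteriors, followed by argmax.
# The fusion weight and example posteriors are hypothetical.

import numpy as np

def fuse_scores(audio_probs: np.ndarray, gesture_probs: np.ndarray,
                w_audio: float = 0.5) -> np.ndarray:
    """Weighted sum of per-class posteriors; returns the fused class index."""
    fused = w_audio * audio_probs + (1.0 - w_audio) * gesture_probs
    return np.argmax(fused, axis=-1)

# Hypothetical posteriors over 3 commands for one utterance/gesture pair.
audio = np.array([0.6, 0.3, 0.1])    # from the speech recognizer
gesture = np.array([0.2, 0.7, 0.1])  # from the gesture recognizer
print(fuse_scores(audio, gesture))   # fused decision: class 1
```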