4,592 research outputs found
Estimating Energy Cost of Physical Activities from Video Using 3D-CNN Networks
This research proposes a machine learning model that estimates the energy cost of physical activities from video input. Wearable sensors are currently the common tool for this purpose, but they have limitations in practicality and accuracy. A deep learning model based on a three-dimensional convolutional neural network (3D-CNN) architecture was used to process the video data and predict the energy cost in terms of metabolic equivalents (METs). The proposed model was evaluated on a dataset of physical activity videos and achieved an average accuracy of 71% on the energy category prediction task and a root mean squared error (RMSE) of 1.14 on the energy cost prediction task. The findings suggest that this approach has potential for practical applications in physical activity surveillance, health interventions, and at-home activity monitoring.
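The abstract does not specify the network design, but the general shape of such a model — spatiotemporal convolutions over a video clip feeding both a MET regression head and an energy-category classification head — can be sketched as below. All layer sizes, the class count, and the `METEstimator3DCNN` name are illustrative assumptions, not the authors' architecture.

```python
import torch
import torch.nn as nn

class METEstimator3DCNN(nn.Module):
    """Hypothetical sketch: video clip -> (MET value, energy category logits)."""
    def __init__(self, num_categories=4):  # category count assumed
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(3, 16, kernel_size=3, padding=1),  # spatiotemporal conv over (T, H, W)
            nn.ReLU(),
            nn.MaxPool3d(2),
            nn.Conv3d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool3d(1),  # global pooling over time and space
        )
        self.regressor = nn.Linear(32, 1)                 # continuous MET estimate (RMSE task)
        self.classifier = nn.Linear(32, num_categories)   # energy category logits (accuracy task)

    def forward(self, clip):
        # clip: (batch, channels, frames, height, width)
        x = self.features(clip).flatten(1)
        return self.regressor(x).squeeze(1), self.classifier(x)

# A batch of 2 toy clips: 3 channels, 8 frames, 32x32 pixels
clip = torch.randn(2, 3, 8, 32, 32)
mets, logits = METEstimator3DCNN()(clip)
```

The two heads share one feature extractor, so the MET regression and category classification losses can be trained jointly from the same video features.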
Energy expenditure estimation using visual and inertial sensors
© The Institution of Engineering and Technology 2017. Deriving a person's energy expenditure accurately forms the foundation for tracking physical activity levels across many health and lifestyle monitoring tasks. In this study, the authors present a method for estimating calorific expenditure from combined visual and accelerometer sensors by way of an RGB-Depth camera and a wearable inertial sensor. The proposed individual-independent framework fuses information from both modalities, leading to estimates beyond the accuracy of single-modality and manual metabolic equivalents of task (MET) lookup-table-based methods. For evaluation, the authors introduce a new dataset called SPHERE_RGBD + Inertial_calorie, for which visual and inertial data are simultaneously obtained with indirect calorimetry ground truth measurements based on gas exchange. Experiments show that the fusion of visual and inertial data reduces the estimation error by 8% and 18% compared with the use of visual-only and inertial-only sensing, respectively, and by 33% compared with a MET-based approach. The authors conclude from their results that the proposed approach is suitable for home monitoring in a controlled environment.
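The paper's fusion scheme is not detailed in the abstract; one common way to realize "fusing information from both modalities" is simple feature-level fusion, concatenating per-window visual and inertial features before regression. The sketch below uses synthetic data and a random forest regressor purely for illustration; the feature definitions and model choice are assumptions, not the authors' pipeline.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n = 200
visual_feats = rng.normal(size=(n, 8))    # stand-in for RGB-D descriptors (e.g., motion/pose stats)
inertial_feats = rng.normal(size=(n, 6))  # stand-in for accelerometer window statistics
# Synthetic target: calorific expenditure depends on both modalities plus noise
calories = visual_feats[:, 0] + inertial_feats[:, 0] + rng.normal(scale=0.1, size=n)

# Feature-level fusion: concatenate the two modalities per time window
fused = np.hstack([visual_feats, inertial_feats])
model = RandomForestRegressor(n_estimators=50, random_state=0)
model.fit(fused[:150], calories[:150])
pred = model.predict(fused[150:])
```

With a held-out split like this, one can compare fused against visual-only and inertial-only models to reproduce the kind of error-reduction comparison the abstract reports.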
Multi-sensor physical activity measurement in early childhood
The purpose of this dissertation was to develop, validate, and implement multi-sensor approaches for measuring physical activity and social/contextual covariates in 2-5 year-old children via wearable-, wireless communication-, and infrared-depth camera-based technologies. In Chapter 2, a three-phased study design was used to validate a method for estimating metered distances between wearable devices using accelerometer-derived Bluetooth signals. Results showed that distances, up to 20 meters, can be predicted between a single Bluetooth beacon and receiver using a Random Forest algorithm. When multiple Bluetooth beacons and receivers were used within the same environment, a moving average filter was required to recover observations lost due to noise. Overall, simulation and validation data suggest that accelerometer-derived Bluetooth signals can be used in studies of physical activity co-participation to 1) estimate metered distances between devices using a single beacon-receiver paradigm, as well as to 2) estimate the proportion of time that devices are proximal when using multiple beacons and receivers. Chapter 3 characterized the relationship between objectively measured physical activity and dyadic spatial proximities in 2 year-olds and their parents. Data revealed that the overall proportions of time that children and their parents spent in total physical activity were positively associated, and time series data revealed that this relationship remained consistent when analyzed hour-to-hour. Time spent engaged in sedentary behavior was also positively associated between children and parents; however, there was no association between child and parent moderate-vigorous physical activity volumes. Dyadic proximity results showed that girls spent more time in joint physical activity with their mothers than boys. 
Furthermore, children who engaged in >60 minutes of daily moderate-vigorous physical activity spent an additional 30 minutes in joint total physical activity with their mothers each day, on average, when compared to children who engaged in <60 minutes of daily moderate-vigorous physical activity. Girls who engaged in >60 minutes of daily moderate-vigorous physical activity participated in joint physical activity with their mothers across wider relative distances, on average, than did boys, who engaged in physical activity at closer relative distances to their mothers. In Chapter 4, an original computer vision algorithm was applied to infrared-depth camera data for the purpose of converting three-dimensional videos into triaxial physical activity signals in young children. Physical activity data were collected in 2-5 year-old children during 20-minute semi-structured, indoor child-parent dyadic play sessions. Play session video data were converted into triaxial physical activity signals using a multi-phased computer vision algorithm for each child. Computer vision-derived triaxial physical activity cut points for 2-5 year-olds were calibrated against a direct observation reference system using a machine learning algorithm. Results revealed that triaxial activity signals, as measured by a dual-sensor camera, can be used to estimate both physical activity intensities and volumes in young children without the use of wearable technology. Collectively, these studies show that multi-sensor approaches to physical activity measurement are a valid means by which to measure physical activity and social/contextual covariates in young children using either wearable sensors or computer vision.
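Chapter 2's pipeline — predicting metered distance from Bluetooth signal strength with a Random Forest, then applying a moving-average filter to smooth noisy observations — can be sketched as follows. The log-distance path-loss model, its parameters, and the window size are illustrative assumptions; the dissertation's actual signal features and calibration are not given in the abstract.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(42)

# Hypothetical log-distance path-loss model: RSSI falls off with log10(distance)
def rssi_from_distance(d, tx_power=-40.0, path_loss_exp=2.0, noise_db=2.0):
    return tx_power - 10 * path_loss_exp * np.log10(d) + rng.normal(0, noise_db, size=np.shape(d))

# Train a Random Forest to invert RSSI -> distance, up to 20 m as in the study
distances = rng.uniform(0.5, 20.0, size=500)
rssi = rssi_from_distance(distances)
model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(rssi.reshape(-1, 1), distances)

# Moving-average filter to recover a stable estimate from noisy per-sample predictions
def moving_average(x, k=5):
    return np.convolve(x, np.ones(k) / k, mode="valid")

test_rssi = rssi_from_distance(np.full(30, 10.0))  # beacon held ~10 m from receiver
preds = model.predict(test_rssi.reshape(-1, 1))
smoothed = moving_average(preds)
```

The smoothing step mirrors the abstract's observation that a moving-average filter was needed once multiple noisy beacon-receiver pairs shared an environment.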
Quantifying Physical Activity in Young Children Using a Three-Dimensional Camera
The purpose of this study was to determine the feasibility and validity of using three-dimensional (3D) video data and computer vision to estimate physical activity intensities in young children. Families with children (2–5-years-old) were invited to participate in semi-structured 20-minute play sessions that included a range of indoor play activities. During the play session, children’s physical activity (PA) was recorded using a 3D camera. PA video data were analyzed via direct observation, and 3D PA video data were processed and converted into triaxial PA accelerations using computer vision. PA video data from children (n = 10) were analyzed using direct observation as the ground truth, and the Receiver Operating Characteristic Area Under the Curve (AUC) was calculated in order to determine the classification accuracy of a Classification and Regression Tree (CART) algorithm for estimating PA intensity from video data. A CART algorithm accurately estimated the proportion of time that children spent sedentary (AUC = 0.89), in light PA (AUC = 0.87), and in moderate-vigorous PA (AUC = 0.92) during the play session, and there were no significant differences (p > 0.05) between the directly observed and CART-determined proportions of time spent in each activity intensity. A computer vision algorithm and 3D camera can be used to estimate the proportion of time that children spend in all activity intensities indoors.
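The evaluation described above — a CART classifier mapping triaxial accelerations to three intensity classes, scored with one-vs-rest AUC — can be sketched with synthetic data. The feature construction and intensity cut points below are invented for illustration; only the CART-plus-AUC methodology comes from the abstract.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier  # sklearn's CART implementation
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(7)

# Synthetic per-epoch triaxial "accelerations": higher magnitude -> higher intensity
n = 300
mag = rng.uniform(0, 3, size=n)
X = np.column_stack([mag + rng.normal(0, 0.2, n) for _ in range(3)])  # x, y, z axes
y = np.digitize(mag, bins=[1.0, 2.0])  # 0 = sedentary, 1 = light PA, 2 = MVPA (assumed cut points)

cart = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X[:200], y[:200])
probs = cart.predict_proba(X[200:])
# One-vs-rest AUC over the three intensity classes, as in the study's evaluation
auc = roc_auc_score(y[200:], probs, multi_class="ovr")
```

Reporting a per-class AUC (as the abstract does for sedentary, light, and MVPA) would use the same probabilities, binarizing one class at a time.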
Jointly Learning Energy Expenditures and Activities using Egocentric Multimodal Signals
Physiological signals such as heart rate can provide valuable information about an individual’s state and activity. However, existing work in computer vision has not yet explored leveraging these signals to enhance egocentric video understanding. In this work, we propose a model for reasoning on multimodal data to jointly predict activities and energy expenditures. We use heart rate signals as privileged self-supervision to derive energy expenditure in a training stage, and a multitask objective to jointly optimize the two tasks. Additionally, we introduce a dataset that contains 31 hours of egocentric video augmented with heart rate and acceleration signals. This study can lead to new applications such as a visual calorie counter.
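The multitask objective described here — a shared encoder feeding an activity-classification head and an energy-expenditure regression head, where the regression target is derived from heart rate at training time — can be sketched as below. The encoder, dimensions, and `MultitaskEgoNet` name are stand-ins; the paper's actual architecture is not given in the abstract.

```python
import torch
import torch.nn as nn

class MultitaskEgoNet(nn.Module):
    """Hypothetical sketch: shared features -> activity logits + energy-expenditure estimate."""
    def __init__(self, in_dim=128, feat_dim=64, num_activities=10):
        super().__init__()
        # Stand-in for a real egocentric video + acceleration encoder
        self.encoder = nn.Sequential(nn.Linear(in_dim, feat_dim), nn.ReLU())
        self.activity_head = nn.Linear(feat_dim, num_activities)
        self.energy_head = nn.Linear(feat_dim, 1)

    def forward(self, x):
        h = self.encoder(x)
        return self.activity_head(h), self.energy_head(h).squeeze(1)

model = MultitaskEgoNet()
x = torch.randn(4, 128)                 # 4 clips' fused multimodal features
act_labels = torch.randint(0, 10, (4,))
ee_target = torch.rand(4) * 10          # EE derived from heart rate; privileged, train-time only

act_logits, ee_pred = model(x)
# Joint multitask objective: classification + regression losses share one encoder
loss = nn.functional.cross_entropy(act_logits, act_labels) \
     + nn.functional.mse_loss(ee_pred, ee_target)
loss.backward()
```

Because the heart-rate-derived target is only needed to compute the loss, inference requires video and acceleration alone, which is what makes the supervision "privileged".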