10,648 research outputs found
DeepWalking: Enabling Smartphone-based Walking Speed Estimation Using Deep Learning
Walking speed estimation is an essential component of mobile apps in various fields such as fitness, transportation, navigation, and health care. Most existing solutions focus on specialized medical applications that rely on body-worn motion sensors. These approaches do not serve the common use case, shared by numerous apps, in which a user holding a smartphone wants to estimate his or her walking speed from smartphone sensors alone. However, existing smartphone-based approaches fail to provide acceptable precision for walking speed estimation. This leads to a question: can a smartphone achieve speed estimation accuracy comparable to that of obtrusive solutions based on wearable sensors?
We find the answer in advanced neural networks. In this paper, we present DeepWalking, the first deep learning-based walking speed estimation scheme for smartphones. A deep convolutional neural network (DCNN) is applied to automatically identify and extract the most effective features from a smartphone's accelerometer and gyroscope data and to train the network model for accurate speed estimation. Experiments are performed with 10 participants walking on a treadmill. The average root-mean-squared error (RMSE) of the estimated walking speed is 0.16 m/s, which is comparable to the results obtained by state-of-the-art approaches based on a number of body-worn sensors (i.e., an RMSE of 0.11 m/s). The results indicate that a smartphone can be a strong tool for walking speed estimation if the sensor data are effectively calibrated and supported by advanced deep learning techniques.
Comment: 6 pages, 9 figures, published in IEEE Global Communications Conference (GLOBECOM)
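As a rough illustration of the approach the abstract describes (the paper's actual DeepWalking architecture is not reproduced here), the following minimal PyTorch sketch regresses walking speed from fixed-length windows of 6-axis smartphone IMU data; all layer sizes, the window length, and the sampling rate are assumptions.

    # Minimal sketch, NOT the paper's exact model: a 1-D CNN that regresses
    # walking speed from windows of accelerometer + gyroscope samples.
    import torch
    import torch.nn as nn

    class SpeedRegressor(nn.Module):
        def __init__(self):
            super().__init__()
            # 6 input channels: 3-axis accelerometer + 3-axis gyroscope.
            self.features = nn.Sequential(
                nn.Conv1d(6, 32, kernel_size=7, padding=3), nn.ReLU(),
                nn.MaxPool1d(2),
                nn.Conv1d(32, 64, kernel_size=5, padding=2), nn.ReLU(),
                nn.MaxPool1d(2),
                nn.Conv1d(64, 64, kernel_size=3, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool1d(1),       # collapse the time axis
            )
            self.head = nn.Linear(64, 1)       # scalar walking speed in m/s

        def forward(self, x):
            # x: (batch, 6, window_len) of calibrated sensor samples
            return self.head(self.features(x).squeeze(-1)).squeeze(-1)

    model = SpeedRegressor()
    windows = torch.randn(8, 6, 256)   # 8 windows, e.g. ~2.5 s at 100 Hz (assumed)
    print(model(windows).shape)        # torch.Size([8])

Training such a model against ground-truth treadmill speeds with a mean-squared-error loss yields the RMSE metric the abstract reports.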
Human Pose Estimation using Global and Local Normalization
In this paper, we address the problem of estimating the positions of human
joints, i.e., articulated pose estimation. Recent state-of-the-art solutions
model two key issues, joint detection and spatial configuration refinement,
together using convolutional neural networks. Our work mainly focuses on spatial configuration refinement by statistically reducing the variation of human poses, motivated by the observation that the scattered distribution of relative joint locations (e.g., the left wrist is distributed nearly uniformly in a circular area around the left shoulder) makes convolutional spatial models hard to learn. We present a two-stage
normalization scheme, human body normalization and limb normalization, to make
the distribution of the relative joint locations compact, resulting in easier
learning of convolutional spatial models and more accurate pose estimation. In
addition, our empirical results show that incorporating multi-scale supervision
and multi-scale fusion into the joint detection network is beneficial.
Experimental results demonstrate that our method consistently outperforms state-of-the-art methods on the benchmarks.
Comment: ICCV2017
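To make the limb-normalization idea concrete, here is an illustrative numpy sketch (an assumed formulation, not the authors' exact one): each child joint's offset is expressed in a coordinate frame where the parent limb points along a canonical axis, which compacts the relative-location distribution the abstract describes as scattered.

    # Illustrative sketch of limb normalization (assumed formulation):
    # rotate a child joint's offset so the reference limb points along +x.
    import numpy as np

    def normalize_limb(child_xy, parent_xy, grandparent_xy):
        """Express child relative to parent, in a frame where the
        grandparent->parent limb is aligned with the +x axis."""
        limb = parent_xy - grandparent_xy
        angle = np.arctan2(limb[1], limb[0])
        c, s = np.cos(-angle), np.sin(-angle)
        rot = np.array([[c, -s], [s, c]])      # rotation by -angle
        return rot @ (child_xy - parent_xy)

    # Example: wrist relative to elbow, shoulder->elbow as the reference limb.
    shoulder = np.array([0.0, 0.0])
    elbow = np.array([1.0, 1.0])
    wrist = np.array([2.0, 1.5])
    print(normalize_limb(wrist, elbow, shoulder))

After this transform, anatomically plausible wrist positions cluster in a compact region rather than a near-uniform ring, which is what makes the subsequent convolutional spatial model easier to learn.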
Towards dense object tracking in a 2D honeybee hive
From human crowds to cells in tissue, the detection and efficient tracking of
multiple objects in dense configurations is an important and unsolved problem.
In the past, limitations of image analysis have restricted studies of dense groups to tracking a single individual or a subset of marked individuals, or to coarse-grained group-level dynamics, all of which yield incomplete information.
Here, we combine convolutional neural networks (CNNs) with the model
environment of a honeybee hive to automatically recognize all individuals in a
dense group from raw image data. We create a new, adapted individual-labeling scheme and use the segmentation architecture U-Net with a loss function dependent on both
object identity and orientation. We additionally exploit temporal regularities
of the video recording in a recurrent manner and achieve near human-level
performance while reducing the network size by 94% compared to the original
U-Net architecture. Given our novel application of CNNs, we generate extensive
problem-specific image data in which labeled examples are produced through a
custom interface with Amazon Mechanical Turk. This dataset contains over
375,000 labeled bee instances across 720 video frames at 2 FPS, representing an
extensive resource for the development and testing of tracking methods. We
correctly detect 96% of individuals with a location error of ~7% of a typical body dimension and an orientation error of 12 degrees, approximating the variability of human raters. Our results provide an important step towards efficient image-based dense object tracking by allowing accurate determination of object location and orientation across time-series image data within a single network architecture.
Comment: 15 pages, including supplementary figures. 1 supplemental movie available as an ancillary file
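The abstract's loss, which depends on both object identity and orientation, could plausibly be sketched as below (an assumed PyTorch formulation, not the authors' exact loss): per-pixel cross-entropy for identity plus an angular term on foreground pixels, with orientation regressed as a (cos, sin) unit vector.

    # Assumed sketch of a joint identity + orientation loss for a
    # U-Net-style network; not the authors' exact formulation.
    import torch
    import torch.nn.functional as F

    def identity_orientation_loss(class_logits, orient_pred,
                                  class_target, orient_target,
                                  orient_weight=1.0):
        # class_logits: (B, C, H, W); class_target: (B, H, W) integer
        # labels, where label 0 is assumed to mean background.
        ce = F.cross_entropy(class_logits, class_target)

        # orient_pred / orient_target: (B, 2, H, W) (cos, sin) maps.
        fg = (class_target > 0).unsqueeze(1).float()   # foreground mask
        pred = F.normalize(orient_pred, dim=1, eps=1e-6)
        cos_sim = (pred * orient_target).sum(dim=1, keepdim=True)
        # 1 - cosine similarity, averaged over foreground pixels only.
        ang = ((1.0 - cos_sim) * fg).sum() / fg.sum().clamp(min=1.0)
        return ce + orient_weight * ang

    # Toy shapes: 2 images, 3 classes (incl. background), 64x64 pixels.
    logits = torch.randn(2, 3, 64, 64)
    target = torch.randint(0, 3, (2, 64, 64))
    o_pred = torch.randn(2, 2, 64, 64)
    o_tgt = F.normalize(torch.randn(2, 2, 64, 64), dim=1)
    print(identity_orientation_loss(logits, o_pred, target, o_tgt))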