10,648 research outputs found

    DeepWalking: Enabling Smartphone-based Walking Speed Estimation Using Deep Learning

    Walking speed estimation is an essential component of mobile apps in various fields such as fitness, transportation, navigation, and health care. Most existing solutions focus on specialized medical applications that use body-worn motion sensors. These approaches do not effectively serve the general use case of numerous apps, in which a user holding a smartphone tries to find his or her walking speed based solely on smartphone sensors. However, existing smartphone-based approaches fail to provide acceptable precision for walking speed estimation. This leads to a question: is it possible to achieve speed estimation accuracy with a smartphone that is comparable to that of obtrusive solutions based on wearable sensors? We find the answer in advanced neural networks. In this paper, we present DeepWalking, the first deep learning-based walking speed estimation scheme for smartphones. A deep convolutional neural network (DCNN) is applied to automatically identify and extract the most effective features from the accelerometer and gyroscope data of a smartphone and to train the network model for accurate speed estimation. Experiments are performed with 10 participants using a treadmill. The average root-mean-squared error (RMSE) of the estimated walking speed is 0.16 m/s, which is comparable to the results obtained by state-of-the-art approaches based on a number of body-worn sensors (i.e., RMSE of 0.11 m/s). The results indicate that a smartphone can be a strong tool for walking speed estimation if the sensor data are effectively calibrated and supported by advanced deep learning techniques. Comment: 6 pages, 9 figures, published in IEEE Global Communications Conference (GLOBECOM)

    Human Pose Estimation using Global and Local Normalization

    In this paper, we address the problem of estimating the positions of human joints, i.e., articulated pose estimation. Recent state-of-the-art solutions model two key issues, joint detection and spatial configuration refinement, together using convolutional neural networks. Our work mainly focuses on spatial configuration refinement by statistically reducing variations of human poses. It is motivated by the observation that the scattered distribution of the relative locations of joints (e.g., the left wrist is distributed nearly uniformly in a circular area around the left shoulder) makes the learning of convolutional spatial models hard. We present a two-stage normalization scheme, human body normalization and limb normalization, to make the distribution of the relative joint locations compact, resulting in easier learning of convolutional spatial models and more accurate pose estimation. In addition, our empirical results show that incorporating multi-scale supervision and multi-scale fusion into the joint detection network is beneficial. Experimental results demonstrate that our method consistently outperforms state-of-the-art methods on the benchmarks. Comment: ICCV201

    Towards dense object tracking in a 2D honeybee hive

    From human crowds to cells in tissue, the detection and efficient tracking of multiple objects in dense configurations is an important and unsolved problem. In the past, limitations of image analysis have restricted studies of dense groups to tracking a single marked individual or a subset of marked individuals, or to coarse-grained group-level dynamics, all of which yield incomplete information. Here, we combine convolutional neural networks (CNNs) with the model environment of a honeybee hive to automatically recognize all individuals in a dense group from raw image data. We create new, adapted individual labeling and use the segmentation architecture U-Net with a loss function dependent on both object identity and orientation. We additionally exploit temporal regularities of the video recording in a recurrent manner and achieve near human-level performance while reducing the network size by 94% compared to the original U-Net architecture. Given our novel application of CNNs, we generate extensive problem-specific image data in which labeled examples are produced through a custom interface with Amazon Mechanical Turk. This dataset contains over 375,000 labeled bee instances across 720 video frames at 2 FPS, representing an extensive resource for the development and testing of tracking methods. We correctly detect 96% of individuals with a location error of ~7% of a typical body dimension and an orientation error of 12 degrees, approximating the variability of human raters. Our results provide an important step towards efficient image-based dense object tracking by allowing the accurate determination of object location and orientation across time-series image data within one network architecture. Comment: 15 pages, including supplementary figures. 1 supplemental movie available as an ancillary file