
    Learning from Synthetic Humans

    Estimating human pose, shape, and motion from images and videos are fundamental challenges with many applications. Recent advances in 2D human pose estimation use large amounts of manually-labeled training data for learning convolutional neural networks (CNNs). Such data is time-consuming to acquire and difficult to extend. Moreover, manual labeling of 3D pose, depth and motion is impractical. In this work we present SURREAL (Synthetic hUmans foR REAL tasks): a new large-scale dataset with synthetically-generated but realistic images of people rendered from 3D sequences of human motion capture data. We generate more than 6 million frames together with ground truth pose, depth maps, and segmentation masks. We show that CNNs trained on our synthetic dataset allow for accurate human depth estimation and human part segmentation in real RGB images. Our results and the new dataset open up new possibilities for advancing person analysis using cheap and large-scale synthetic data.
    Comment: Appears in 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR 2017), 9 pages.
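
    As a rough illustration of how such synthetic data can be used, the sketch below trains a small part-segmentation CNN on rendered frames paired with ground-truth label masks. The file layout, the number of part classes, and the tiny network are assumptions for illustration, not the dataset's actual structure or the authors' model.

    # Minimal sketch (assumed layout, not the SURREAL loader): train a per-pixel
    # body-part classifier on synthetic RGB frames and their rendered label masks.
    import glob
    import numpy as np
    import torch
    import torch.nn as nn
    from torch.utils.data import Dataset, DataLoader
    from PIL import Image

    NUM_PARTS = 15  # assumed number of body-part classes (0 = background)

    class SyntheticHumanDataset(Dataset):
        """Pairs of rendered RGB frames and integer part-label masks (same resolution)."""
        def __init__(self, root):
            self.frames = sorted(glob.glob(f"{root}/rgb/*.png"))   # hypothetical paths
            self.masks = sorted(glob.glob(f"{root}/parts/*.png"))

        def __len__(self):
            return len(self.frames)

        def __getitem__(self, i):
            rgb = np.asarray(Image.open(self.frames[i]).convert("RGB"), dtype=np.float32) / 255.0
            mask = np.asarray(Image.open(self.masks[i]), dtype=np.int64)
            return torch.from_numpy(rgb).permute(2, 0, 1), torch.from_numpy(mask)

    # A deliberately small fully convolutional net standing in for the paper's CNN.
    model = nn.Sequential(
        nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
        nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
        nn.Conv2d(32, NUM_PARTS, 1),
    )
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    criterion = nn.CrossEntropyLoss()  # per-pixel classification over part labels

    loader = DataLoader(SyntheticHumanDataset("surreal_subset"), batch_size=4, shuffle=True)
    for rgb, mask in loader:
        optimizer.zero_grad()
        loss = criterion(model(rgb), mask)  # logits (B, NUM_PARTS, H, W) vs. mask (B, H, W)
        loss.backward()
        optimizer.step()

    A model trained this way on synthetic frames can then be applied to real RGB crops, which is the transfer the abstract reports.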

    A method to enhance the deep learning in an aerial image

    © 2017 IEEE. In this paper, we propose a pre-processing method tailored to the characteristics of aerial images that can be applied before deep learning methods. The method combines color and spatial information to perform fast background filtering, which not only increases execution speed but also reduces the false-positive rate.
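
    The abstract describes the pre-filter only at a high level; the sketch below is one hedged interpretation, combining an assumed HSV color range with morphological cleanup to drop background regions before a detector runs. The threshold values and kernel size are illustrative guesses, not the authors' parameters.

    # Hedged sketch: color cue (HSV range) + spatial cue (morphological closing)
    # to mask out background in an aerial frame before running a deep detector.
    import cv2
    import numpy as np

    def prefilter_background(bgr, lo=(35, 40, 40), hi=(90, 255, 255), kernel_size=7):
        """Suppress background by color, then clean the mask up spatially."""
        hsv = cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV)
        background = cv2.inRange(hsv, np.array(lo, np.uint8), np.array(hi, np.uint8))
        kernel = np.ones((kernel_size, kernel_size), np.uint8)
        background = cv2.morphologyEx(background, cv2.MORPH_CLOSE, kernel)
        foreground_mask = cv2.bitwise_not(background)
        return cv2.bitwise_and(bgr, bgr, mask=foreground_mask)

    image = cv2.imread("aerial_frame.jpg")        # hypothetical input file
    filtered = prefilter_background(image)
    # The detector then processes `filtered`, skipping most background pixels.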

    Rescue Method Based on V2X Communication and Human Pose Estimation

    Advances in science and technology not only improve our workflows and quality of life but also change the way emergency rescue operates. The eCall system is evidence of such an evolutionary step. However, no studies address 'smart' rescue methods for an independent person (neither a pedestrian nor a driver) who ends up at the roadside for one reason or another and needs first aid. The research questions were how to increase the quality and speed of the rescue process by using Vehicle-to-Everything (V2X) communication together with human pose estimation, and how to increase the reliability of the transmitted information. In this article, we present an overview of the current state of the art in human pose estimation, discuss current rescue methods as well as the concept and disadvantages of the current approach, and finally propose our own rescue method based on blending vehicular communication networks with advances in human pose estimation. The scientific novelty is a new information management concept in which Autonomous Vehicles (AVs) act as witnesses themselves, without any human intervention. An example SOS packet format has been designed, and we propose a novel view of the future ambulance.
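
    The article defines its own SOS packet format, which is not reproduced here; the sketch below only illustrates the general idea of a length-prefixed message an AV "witness" might broadcast, with assumed fields (position, pose label, confidence, timestamp).

    # Illustrative only: assumed fields for an SOS-style V2X message, not the
    # packet format defined in the article.
    import json
    import struct
    import time
    from dataclasses import dataclass, asdict

    @dataclass
    class SosPacket:
        vehicle_id: str      # reporting AV acting as the witness
        latitude: float
        longitude: float
        pose_label: str      # e.g. "lying" or "sitting", from the pose-estimation module
        confidence: float    # detection/pose confidence in [0, 1]
        timestamp: float

        def to_bytes(self) -> bytes:
            payload = json.dumps(asdict(self)).encode("utf-8")
            return struct.pack("!H", len(payload)) + payload  # length-prefixed frame

    packet = SosPacket("AV-042", 56.95, 24.11, "lying", 0.87, time.time())
    frame = packet.to_bytes()  # handed to the V2X stack for broadcast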

    Real-time factored ConvNets: Extracting the x factor in human parsing

    © 2017. The copyright of this document resides with its authors. We propose a real-time and lightweight multi-task style ConvNet (termed a Factored ConvNet) for human body parsing in images or video. Factored ConvNets have isolated areas which perform known sub-tasks, such as object localization or edge detection. We call this area and sub-task pair an X factor. Unlike multi-task ConvNets, which have independent tasks, the Factored ConvNet’s sub-task has a direct effect on the main task outcome. In this paper we show how to isolate the X factor of foreground/background (f/b) subtraction from the main task of segmenting human body images into 31 different body part types. Knowledge of this X factor leads to a number of benefits for the Factored ConvNet: 1) ease of network transfer to other image domains, 2) the ability to personalize to humans in video, and 3) easy model performance boosts. All are achieved by efficient update or replacement of the X factor whilst avoiding catastrophic forgetting of previously learnt body part dependencies and structure. We show these benefits on a large dataset of images and also on YouTube videos.
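
    To make the X factor idea concrete, the sketch below pairs a small, replaceable foreground/background branch with a frozen body-part branch, so the sub-task output directly gates the main task. The architecture is an assumption for illustration, not the authors' network.

    # Hedged sketch of a factored network: the f/b branch (the X factor) can be
    # swapped or fine-tuned while the part-segmentation branch stays frozen.
    import torch.nn as nn

    NUM_PARTS = 31  # body-part classes, as in the abstract

    class FactoredConvNet(nn.Module):
        def __init__(self):
            super().__init__()
            self.fb_branch = nn.Sequential(        # replaceable X factor: f/b subtraction
                nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                nn.Conv2d(16, 1, 1), nn.Sigmoid(),
            )
            self.part_branch = nn.Sequential(      # main task: 31-way part segmentation
                nn.Conv2d(3, 32, 3, padding=1), nn.ReLU(),
                nn.Conv2d(32, NUM_PARTS, 1),
            )

        def forward(self, x):
            fg = self.fb_branch(x)                 # (B, 1, H, W) foreground probability
            return self.part_branch(x) * fg        # the sub-task directly shapes the outcome

    model = FactoredConvNet()

    # Personalizing or changing image domain: update only the X factor and freeze
    # the part branch, so learnt part dependencies are not forgotten.
    for p in model.part_branch.parameters():
        p.requires_grad = False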