15,023 research outputs found

    Keep it SMPL: Automatic Estimation of 3D Human Pose and Shape from a Single Image

    Full text link
    We describe the first method to automatically estimate the 3D pose of the human body as well as its 3D shape from a single unconstrained image. We estimate a full 3D mesh and show that 2D joints alone carry a surprising amount of information about body shape. The problem is challenging because of the complexity of the human body, articulation, occlusion, clothing, lighting, and the inherent ambiguity in inferring 3D from 2D. To solve this, we first use a recently published CNN-based method, DeepCut, to predict (bottom-up) the 2D body joint locations. We then fit (top-down) a recently published statistical body shape model, called SMPL, to the 2D joints. We do so by minimizing an objective function that penalizes the error between the projected 3D model joints and detected 2D joints. Because SMPL captures correlations in human shape across the population, we are able to robustly fit it to very little data. We further leverage the 3D model to prevent solutions that cause interpenetration. We evaluate our method, SMPLify, on the Leeds Sports, HumanEva, and Human3.6M datasets, showing superior pose accuracy with respect to the state of the art.Comment: To appear in ECCV 201

    It's all Relative: Monocular 3D Human Pose Estimation from Weakly Supervised Data

    Get PDF
    We address the problem of 3D human pose estimation from 2D input images using only weakly supervised training data. Despite showing considerable success for 2D pose estimation, the application of supervised machine learning to 3D pose estimation in real world images is currently hampered by the lack of varied training images with corresponding 3D poses. Most existing 3D pose estimation algorithms train on data that has either been collected in carefully controlled studio settings or has been generated synthetically. Instead, we take a different approach, and propose a 3D human pose estimation algorithm that only requires relative estimates of depth at training time. Such training signal, although noisy, can be easily collected from crowd annotators, and is of sufficient quality for enabling successful training and evaluation of 3D pose algorithms. Our results are competitive with fully supervised regression based approaches on the Human3.6M dataset, despite using significantly weaker training data. Our proposed algorithm opens the door to using existing widespread 2D datasets for 3D pose estimation by allowing fine-tuning with noisy relative constraints, resulting in more accurate 3D poses.Comment: BMVC 2018. Project page available at http://www.vision.caltech.edu/~mronchi/projects/RelativePos

    Multi-Focal Visual Servoing Strategies

    Get PDF
    Multi-focal vision provides two or more vision devices with different fields of view and measurement accuracies. A main advantage of this concept is a flexible allocation of these sensor resources accounting for the current situational and task performance requirements. Particularly, vision devices with large fields of view and low accuracies can be use

    Flexible calibration of a stereo vision system by active display

    Get PDF
    Abstract Camera calibration plays a fundamental role for 3D computer vision since it is the first step to recover reliable metric information from 2D images. The calibration of a stereo-vision system is a two-step process: firstly, the calibration of the individual cameras must be carried out, then the two individual calibrations are combined to retrieve the relative placement between the two cameras, and to refine intrinsic and extrinsic parameters. The most commonly adopted calibration methodology uses multiple images of a physical checkerboard pattern. However, the process is time-consuming since the operator must move the calibration target into different positions, typically from 15 to 20. Moreover, the calibration of different optical setups requires the use of calibration boards, which differ for size and number of target points depending on the desired working volume. This paper proposes an innovative approach to the calibration, which is based on the use of a conventional computer screen to actively display the calibration checkerboard. The potential non-planarity of the screen is compensated by an iterative approach, which also estimate the actual screen shape during the calibration process. The use of an active display greatly enhances the flexibility of the stereo-camera calibration process since the same device can be used to calibrate different optical setups by simply varying number and size of the displayed squared patterns

    Development of a calibration pipeline for a monocular-view structured illumination 3D sensor utilizing an array projector

    Get PDF
    Commercial off-the-shelf digital projection systems are commonly used in active structured illumination photogrammetry of macro-scale surfaces due to their relatively low cost, accessibility, and ease of use. They can be described as inverse pinhole modelled. The calibration pipeline of a 3D sensor utilizing pinhole devices in a projector-camera setup configuration is already well-established. Recently, there have been advances in creating projection systems offering projection speeds greater than that available from conventional off-the-shelf digital projectors. However, they cannot be calibrated using well established techniques based on the pinole assumption. They are chip-less and without projection lens. This work is based on the utilization of unconventional projection systems known as array projectors which contain not one but multiple projection channels that project a temporal sequence of illumination patterns. None of the channels implement a digital projection chip or a projection lens. To workaround the calibration problem, previous realizations of a 3D sensor based on an array projector required a stereo-camera setup. Triangulation took place between the two pinhole modelled cameras instead. However, a monocular setup is desired as a single camera configuration results in decreased cost, weight, and form-factor. This study presents a novel calibration pipeline that realizes a single camera setup. A generalized intrinsic calibration process without model assumptions was developed that directly samples the illumination frustum of each array projection channel. An extrinsic calibration process was then created that determines the pose of the single camera through a downhill simplex optimization initialized by particle swarm. Lastly, a method to store the intrinsic calibration with the aid of an easily realizable calibration jig was developed for re-use in arbitrary measurement camera positions so that intrinsic calibration does not have to be repeated

    Distributed Robotic Vision for Calibration, Localisation, and Mapping

    Get PDF
    This dissertation explores distributed algorithms for calibration, localisation, and mapping in the context of a multi-robot network equipped with cameras and onboard processing, comparing against centralised alternatives where all data is transmitted to a singular external node on which processing occurs. With the rise of large-scale camera networks, and as low-cost on-board processing becomes increasingly feasible in robotics networks, distributed algorithms are becoming important for robustness and scalability. Standard solutions to multi-camera computer vision require the data from all nodes to be processed at a central node which represents a significant single point of failure and incurs infeasible communication costs. Distributed solutions solve these issues by spreading the work over the entire network, operating only on local calculations and direct communication with nearby neighbours. This research considers a framework for a distributed robotic vision platform for calibration, localisation, mapping tasks where three main stages are identified: an initialisation stage where calibration and localisation are performed in a distributed manner, a local tracking stage where visual odometry is performed without inter-robot communication, and a global mapping stage where global alignment and optimisation strategies are applied. In consideration of this framework, this research investigates how algorithms can be developed to produce fundamentally distributed solutions, designed to minimise computational complexity whilst maintaining excellent performance, and designed to operate effectively in the long term. Therefore, three primary objectives are sought aligning with these three stages

    Real-Time, Multiple Pan/Tilt/Zoom Computer Vision Tracking and 3D Positioning System for Unmanned Aerial System Metrology

    Get PDF
    The study of structural characteristics of Unmanned Aerial Systems (UASs) continues to be an important field of research for developing state of the art nano/micro systems. Development of a metrology system using computer vision (CV) tracking and 3D point extraction would provide an avenue for making these theoretical developments. This work provides a portable, scalable system capable of real-time tracking, zooming, and 3D position estimation of a UAS using multiple cameras. Current state-of-the-art photogrammetry systems use retro-reflective markers or single point lasers to obtain object poses and/or positions over time. Using a CV pan/tilt/zoom (PTZ) system has the potential to circumvent their limitations. The system developed in this paper exploits parallel-processing and the GPU for CV-tracking, using optical flow and known camera motion, in order to capture a moving object using two PTU cameras. The parallel-processing technique developed in this work is versatile, allowing the ability to test other CV methods with a PTZ system using known camera motion. Utilizing known camera poses, the object\u27s 3D position is estimated and focal lengths are estimated for filling the image to a desired amount. This system is tested against truth data obtained using an industrial system
    • …
    corecore