2,787 research outputs found

    Visibility Constrained Generative Model for Depth-based 3D Facial Pose Tracking

    In this paper, we propose a generative framework that unifies depth-based 3D facial pose tracking and on-the-fly face model adaptation in unconstrained scenarios with heavy occlusions and arbitrary facial expression variations. Specifically, we introduce a statistical 3D morphable model that flexibly describes the distribution of points on the surface of the face model, with an efficient switchable online adaptation that gradually captures the identity of the tracked subject and rapidly constructs a suitable face model when the subject changes. Moreover, unlike prior art that employed ICP-based facial pose estimation, we improve robustness to occlusions by proposing a ray visibility constraint that regularizes the pose based on the face model's visibility with respect to the input point cloud. Ablation studies and experimental results on the Biwi and ICT-3DHP datasets demonstrate that the proposed framework is effective and outperforms competing state-of-the-art depth-based methods.

    Posing 3D Models from Drawing

    Inferring the 3D pose of a character from a drawing is a complex and under-constrained problem. Solving it may help automate various parts of an animation production pipeline such as pre-visualisation. In this paper, a novel way of inferring the 3D pose from a monocular 2D sketch is proposed. The proposed method does not make any external assumptions about the model, allowing it to be used on different types of characters. The inference of the 3D pose is formulated as an optimisation problem, and a parallel variation of the Particle Swarm Optimisation algorithm called PARAC-LOAPSO is used to search for the minimum. The presented method is evaluated, both in isolation and as part of a larger scene, by posing a lamp, a horse and a human character. The results show that this method is robust, highly scalable and can be extended to various types of models.
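
    The abstract above treats pose inference as an optimisation problem searched with a swarm of candidate solutions. As a rough illustration only, the sketch below runs a plain sequential, global-best Particle Swarm Optimisation on a toy cost. It is not PARAC-LOAPSO, and the cost function, parameter values and all names are assumptions made for this example.

```python
import random

def pso_minimise(cost, dim, bounds, n_particles=30, n_iters=200, seed=0,
                 inertia=0.7, c_personal=1.5, c_global=1.5):
    """Plain global-best PSO; returns (best_position, best_cost)."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]            # each particle's best position so far
    best_cost = [cost(p) for p in pos]
    g = min(range(n_particles), key=best_cost.__getitem__)
    g_pos, g_cost = best_pos[g][:], best_cost[g]   # swarm-wide best so far
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity blends momentum, pull to personal best, pull to global best
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * r1 * (best_pos[i][d] - pos[i][d])
                             + c_global * r2 * (g_pos[d] - pos[i][d]))
                # Move and clamp to the search bounds
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            c = cost(pos[i])
            if c < best_cost[i]:
                best_cost[i], best_pos[i] = c, pos[i][:]
                if c < g_cost:
                    g_cost, g_pos = c, pos[i][:]
    return g_pos, g_cost

# Hypothetical stand-in for a sketch-matching cost: squared distance
# between candidate pose parameters and a hidden target pose.
target = [0.5, -1.0, 2.0]

def toy_cost(x):
    return sum((a - b) ** 2 for a, b in zip(x, target))

best, best_val = pso_minimise(toy_cost, dim=3, bounds=(-5.0, 5.0))
```

    In a real system the toy cost would be replaced by a measure of how well the projected 3D pose matches the 2D drawing; because each particle's cost is evaluated independently, that evaluation is the natural place to parallelise.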

    Perceiving user's intention-for-interaction: A probabilistic multimodal data fusion scheme

    Understanding people's intention, be it action or thought, plays a fundamental role in establishing coherent communication amongst people, especially in non-proactive robotics, where the robot has to understand explicitly when to start an interaction in a natural way. In this work, a novel approach is presented to detect people's intention-for-interaction. The proposed detector fuses multimodal cues, including estimated head pose, shoulder orientation and vocal activity detection, using a probabilistic discrete-state Hidden Markov Model. The multimodal detector achieves up to 80% correct detection rates, improving on purely audio-based and RGB-D-based variants.
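
    To make the fusion idea concrete, the sketch below runs a normalised forward filter over a two-state discrete Hidden Markov Model with three binary cues. It is a minimal illustration, not the paper's detector: the two states, the assumption that cues are conditionally independent given the state, the cue names and every probability value are assumptions chosen for the example.

```python
def forward_filter(observations, prior, transition, emission):
    """Return P(state_t | obs_1..t) for every time step t."""
    n = len(prior)
    belief = list(prior)
    history = []
    for obs in observations:
        # Prediction step: propagate the belief through the transition model
        predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                     for j in range(n)]
        # Update step: multiply in each cue's likelihood (cues are assumed
        # conditionally independent given the hidden state)
        updated = []
        for j in range(n):
            lik = 1.0
            for cue, active in obs.items():
                p = emission[cue][j]          # P(cue active | state j)
                lik *= p if active else 1.0 - p
            updated.append(predicted[j] * lik)
        total = sum(updated)
        belief = [u / total for u in updated]
        history.append(belief)
    return history

# State 0 = no intention to interact, state 1 = intention to interact.
prior = [0.9, 0.1]
transition = [[0.9, 0.1],
              [0.2, 0.8]]
emission = {                      # illustrative values only
    "head_facing":      [0.2, 0.9],
    "shoulders_facing": [0.3, 0.8],
    "voice_active":     [0.1, 0.6],
}
idle = {"head_facing": False, "shoulders_facing": False, "voice_active": False}
engaged = {"head_facing": True, "shoulders_facing": True, "voice_active": True}

beliefs = forward_filter([idle, engaged, engaged, engaged],
                         prior, transition, emission)
```

    With these made-up numbers the belief in the "intention" state stays low for the idle frame and rises quickly once all three cues agree, which is the qualitative behaviour a fused detector is meant to show.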

    Physical Interaction of Autonomous Robots in Complex Environments

    Recent breakthroughs in the fields of computer vision and robotics are firmly changing people's perception of robots. The idea of robots that substitute humans is now turning into robots that collaborate with them. Service robotics considers robots as personal assistants. It safely places robots in domestic environments in order to facilitate humans' daily life. Industrial robotics is now reconsidering its basic idea of the robot as a worker. Currently, the primary method to guarantee personnel safety in industrial environments is the installation of physical barriers around the working area of robots. The development of new technologies and new algorithms in the sensor and robotic fields has led to a new generation of lightweight and collaborative robots. Therefore, industrial robotics has leveraged the intrinsic properties of this kind of robot to generate a robot co-worker that is able to safely coexist, collaborate and interact inside its workspace with both personnel and objects. This Ph.D. dissertation focuses on the generation of a pipeline for fast object pose estimation and distance computation of moving objects, in both structured and unstructured environments, using RGB-D images. This pipeline outputs the command actions which let the robot complete its main task and fulfil the safe human-robot coexistence behaviour at once. The proposed pipeline is divided into an object segmentation part, a 6 D.o.F. object pose estimation part and a real-time collision avoidance part for safe human-robot coexistence. Firstly, the segmentation module finds candidate object clusters in RGB-D images of cluttered scenes using a graph-based image segmentation technique. This segmentation technique generates a cluster of pixels for each object found in the image. The candidate object clusters are then fed as input to the 6 D.o.F. object pose estimation module. The latter is in charge of estimating both the translation and the orientation in 3D space of each candidate object cluster. The object pose is then employed by the robotic arm to compute a suitable grasping policy. The last module generates a force vector field of the environment surrounding the robot, the objects and the humans. This force vector field drives the robot toward its goal while any potential collision with objects and/or humans is safely avoided. This work has been carried out at Politecnico di Torino, in collaboration with Telecom Italia S.p.A.
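
    The abstract does not say how the force vector field is built. One common way to obtain such a field is the classic artificial potential field, in which the goal attracts the robot and nearby obstacles repel it. The 2D sketch below uses that technique purely as an assumption; the function name, gains and influence radius are hypothetical and are not taken from the dissertation.

```python
import math

def potential_field_force(position, goal, obstacles,
                          k_att=1.0, k_rep=0.5, influence_radius=1.0):
    """Return the (fx, fy) force acting on a point robot at `position`."""
    # Attractive term: pulls linearly toward the goal
    fx = k_att * (goal[0] - position[0])
    fy = k_att * (goal[1] - position[1])
    for ox, oy in obstacles:
        dx, dy = position[0] - ox, position[1] - oy
        dist = math.hypot(dx, dy)
        # Repulsive term: active only inside the obstacle's influence radius
        if 0.0 < dist < influence_radius:
            mag = k_rep * (1.0 / dist - 1.0 / influence_radius) / dist ** 2
            fx += mag * dx / dist     # push directly away from the obstacle
            fy += mag * dy / dist
    return fx, fy

# Robot at the origin, goal to the right, one obstacle just above the robot
free_force = potential_field_force((0.0, 0.0), (2.0, 0.0), [])
near_force = potential_field_force((0.0, 0.0), (2.0, 0.0), [(0.0, 0.5)])
far_force = potential_field_force((0.0, 0.0), (2.0, 0.0), [(0.0, 5.0)])
```

    Following the resulting force moves the robot toward the goal while bending its path away from anything inside the influence radius. Plain potential fields can stall in local minima, for example when an obstacle sits directly between robot and goal, so practical systems usually add an escape strategy.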

    Computational intelligence approaches to robotics, automation, and control [Volume guest editors]

    No abstract available

    Capturing Hands in Action using Discriminative Salient Points and Physics Simulation

    Hand motion capture is a popular research field, recently gaining more attention due to the ubiquity of RGB-D sensors. However, even the most recent approaches focus on the case of a single isolated hand. In this work, we focus on hands that interact with other hands or objects and present a framework that successfully captures motion in such interaction scenarios for both rigid and articulated objects. Our framework combines a generative model with discriminatively trained salient points to achieve a low tracking error, and with collision detection and physics simulation to achieve physically plausible estimates even in the case of occlusions and missing visual data. Since all components are unified in a single objective function which is almost everywhere differentiable, it can be optimized with standard optimization techniques. Our approach works for monocular RGB-D sequences as well as setups with multiple synchronized RGB cameras. For a qualitative and quantitative evaluation, we captured 29 sequences with a large variety of interactions and up to 150 degrees of freedom. Comment: Accepted for publication by the International Journal of Computer Vision (IJCV) on 16.02.2016 (submitted on 17.10.14). A combination into a single framework of an ECCV'12 multicamera-RGB and a monocular-RGBD GCPR'14 hand tracking paper with several extensions, additional experiments and detail

    Real-Time Hand Tracking Using a Sum of Anisotropic Gaussians Model

    Real-time marker-less hand tracking is of increasing importance in human-computer interaction. Robust and accurate tracking of arbitrary hand motion is a challenging problem due to the many degrees of freedom, frequent self-occlusions, fast motions, and uniform skin color. In this paper, we propose a new approach that tracks the full skeleton motion of the hand from multiple RGB cameras in real-time. The main contributions include a new generative tracking method which employs an implicit hand shape representation based on a Sum of Anisotropic Gaussians (SAG), and a pose fitting energy that is smooth and analytically differentiable, making fast gradient-based pose optimization possible. This shape representation, together with a full perspective projection model, enables more accurate hand modeling than a related baseline method from the literature. Our method achieves better accuracy than previous methods and runs at 25 fps. We show these improvements both qualitatively and quantitatively on publicly available datasets. Comment: 8 pages, Accepted version of paper published at 3DV 201

    Motion correction of PET/CT images

    Indiana University-Purdue University Indianapolis (IUPUI). Advances in health care technology help physicians make more accurate diagnoses about the health conditions of their patients. Positron Emission Tomography/Computed Tomography (PET/CT) is one of the many tools currently used to diagnose health and disease in patients. PET/CT explorations are typically used to detect cancer, heart disease, and disorders of the central nervous system. Since PET/CT studies can take 60 minutes or more, it is impossible for patients to remain motionless throughout the scanning process. These movements create motion-related artifacts which alter the quantitative and qualitative results produced by the scanning process. The patient's motion results in image blurring, a reduced image signal-to-noise ratio, and reduced image contrast, which could lead to misdiagnoses. In the literature, software- and hardware-based techniques have been studied to implement motion correction over medical files. Techniques based on the use of an external motion tracking system are preferred by researchers because they offer better accuracy. This thesis proposes a motion correction system that uses 3D affine registrations computed with particle swarm optimization and an off-the-shelf Microsoft Kinect camera to eliminate or reduce errors caused by the patient's motion during a medical imaging study.