57 research outputs found

    Real-Time Human Motion Capture with Multiple Depth Cameras

    Commonly used human motion capture systems require the intrusive attachment of markers that are visually tracked with multiple cameras. In this work we present an efficient and inexpensive solution to markerless motion capture using only a few Kinect sensors. Unlike previous work on 3D pose estimation using a single depth camera, we relax constraints on the camera location and do not assume a cooperative user. We apply recent image segmentation techniques to depth images and use curriculum learning to train our system on purely synthetic data. Our method accurately localizes body parts without requiring an explicit shape model. The body joint locations are then recovered by combining evidence from multiple views in real time. We also introduce a dataset of ~6 million synthetic depth frames for pose estimation from multiple cameras and exceed state-of-the-art results on the Berkeley MHAD dataset.
    Comment: Accepted to Computer Robot Vision 201
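    The multi-view fusion step described above can be pictured with a small sketch: each camera contributes its own estimate for every joint, and the world-frame joint location is a confidence-weighted combination of those estimates. This is only an illustrative formulation under assumed calibrated camera extrinsics and per-joint confidences, not the authors' implementation.
```python
import numpy as np

def fuse_joint_estimates(joints_per_view, confidences, extrinsics):
    """Fuse per-camera 3D joint estimates into world-frame joint locations.

    joints_per_view : list of (J, 3) arrays, joints in each camera's frame
    confidences     : list of (J,) arrays, per-joint confidence in [0, 1]
    extrinsics      : list of (4, 4) camera-to-world transforms
    """
    num_joints = joints_per_view[0].shape[0]
    weighted_sum = np.zeros((num_joints, 3))
    weight_total = np.zeros((num_joints, 1))

    for joints, conf, T in zip(joints_per_view, confidences, extrinsics):
        # Transform camera-frame joints into the shared world frame.
        homogeneous = np.hstack([joints, np.ones((num_joints, 1))])
        world = (T @ homogeneous.T).T[:, :3]
        weighted_sum += conf[:, None] * world
        weight_total += conf[:, None]

    # Confidence-weighted average; views that miss a joint contribute little.
    return weighted_sum / np.maximum(weight_total, 1e-6)
```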

    Hybrid One-Shot 3D Hand Pose Estimation by Exploiting Uncertainties

    Model-based approaches to 3D hand tracking have been shown to perform well in a wide range of scenarios. However, they require initialisation and cannot recover easily from tracking failures that occur due to fast hand motions. Data-driven approaches, on the other hand, can quickly deliver a solution, but the results often suffer from lower accuracy or missing anatomical validity compared to those obtained from model-based approaches. In this work we propose a hybrid approach to hand pose estimation from a single depth image. First, a learned regressor is employed to deliver multiple initial hypotheses for the 3D position of each hand joint. Subsequently, the kinematic parameters of a 3D hand model are found by deliberately exploiting the inherent uncertainty of the inferred joint proposals. In this way, the method provides anatomically valid and accurate solutions without requiring manual initialisation or suffering from tracking losses. Quantitative results on several standard datasets demonstrate that the proposed method outperforms state-of-the-art representatives of the model-based, data-driven and hybrid paradigms.
    Comment: BMVC 2015 (oral); see also http://lrs.icg.tugraz.at/research/hybridhape
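    One plausible reading of the hybrid step described above: the regressor supplies several candidate positions per joint together with confidences, and the kinematic parameters of a hand model are optimised so the model's joints agree with those candidates, weighted by their confidence. The sketch below is a hypothetical formulation; `forward_kinematics` and the simple weighted-distance energy are placeholders, not the paper's exact objective.
```python
import numpy as np
from scipy.optimize import minimize

def fit_hand_model(hypotheses, weights, forward_kinematics, theta0):
    """Fit kinematic parameters theta to multiple joint hypotheses.

    hypotheses         : (J, K, 3) array, K candidate 3D positions per joint
    weights            : (J, K) array, confidence of each candidate
    forward_kinematics : callable theta -> (J, 3) model joint positions
    theta0             : initial kinematic parameter vector
    """
    def objective(theta):
        model_joints = forward_kinematics(theta)           # (J, 3)
        diffs = hypotheses - model_joints[:, None, :]      # (J, K, 3)
        dists = np.linalg.norm(diffs, axis=-1)             # (J, K)
        # Each hypothesis is scored by its distance to the model joint,
        # weighted by how confident the regressor was in that hypothesis.
        return np.sum(weights * dists)

    return minimize(objective, theta0, method="Nelder-Mead").x
```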

    Collaborative voting of 3D features for robust gesture estimation

    Human body analysis raises special interest because it enables a wide range of interactive applications. In this paper we present a gesture estimator that discriminates body poses in depth images. A novel collaborative method is proposed to learn 3D features of the human body and, later, to estimate specific gestures. The collaborative estimation framework is inspired by decision forests, where each selected point (anchor point) contributes to the estimation by casting votes. The main idea is to detect a body part by accumulating the inference of other trained body parts, so the contributions of those parts are interpreted as a voting process. The collaborative voting encodes the global context of the human pose, while the 3D features represent local appearance. Experimental results with different 3D features demonstrate the validity of the proposed algorithm.
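    A minimal sketch of the voting idea, assuming each anchor point has already been given a predicted offset toward the target body part and a confidence: the votes are accumulated in a coarse 3D grid and the strongest cell is taken as the detection. The grid-based accumulator and its parameters are illustrative assumptions, not the paper's exact procedure.
```python
import numpy as np

def accumulate_votes(anchor_points, predicted_offsets, vote_weights, grid_res=0.02):
    """Accumulate 3D votes for a body-part location.

    anchor_points     : (N, 3) sampled points on the depth surface
    predicted_offsets : (N, 3) offset each anchor votes with
    vote_weights      : (N,) confidence of each vote
    grid_res          : voxel size (metres) of the accumulator
    """
    votes = anchor_points + predicted_offsets            # where each anchor points to
    keys = np.round(votes / grid_res).astype(np.int64)   # quantise into voxels

    accumulator = {}
    for key, w in zip(map(tuple, keys), vote_weights):
        accumulator[key] = accumulator.get(key, 0.0) + w

    # The voxel with the highest accumulated weight is the detected part.
    best = max(accumulator, key=accumulator.get)
    return np.array(best) * grid_res
```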

    A wearable and non-wearable approach for gesture recognition: initial results

    Gestures are a natural way of communication between humans. Through this type of non-verbal communication, human interaction may change, since it is possible to send a particular message or capture the attention of the other peer. In human-computer interaction, the capture of such gestures has been a topic of interest, with the goal of classifying human gestures in different scenarios. By applying machine learning techniques, one may be able to track and recognize human gestures and use the gathered information to assess the medical condition of a person regarding, for example, motor impairments. Depending on the type of movement and on the target population, different wearable or non-wearable sensors may be used. In this work, we use a hybrid approach for automatically detecting a ball-throwing movement, combining a Microsoft Kinect (non-wearable) and the Pandlet (a set of wearable sensors including an accelerometer and a gyroscope, among others). After creating a dataset of 10 participants, an SVM model with a DTW kernel is trained and used as a classification tool. The system performance was quantified in terms of confusion matrix, accuracy, sensitivity and specificity, Area Under the Curve, and Matthews Correlation Coefficient. The obtained results indicate that the present system is able to recognize the selected throwing gestures and that the overall performance of the Kinect is better than that of the Pandlet.
    This article is a result of the project Deus Ex Machina: NORTE-01-0145-FEDER-000026, supported by the Norte Portugal Regional Operational Programme (NORTE 2020), under the PORTUGAL 2020 Partnership Agreement, through the European Regional Development Fund (ERDF).
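    One common way to combine DTW with an SVM, sketched below, is a Gaussian kernel over pairwise DTW distances fed to scikit-learn's precomputed-kernel interface; the abstract does not specify the exact kernel or features used, so the distance, gamma value and toy sequences here are assumptions for illustration only.
```python
import numpy as np
from sklearn.svm import SVC

def dtw_distance(a, b):
    """Classic dynamic-time-warping distance between two 1-D sequences."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    return cost[n, m]

def dtw_kernel_matrix(X, Y, gamma=0.1):
    """Gaussian kernel over DTW distances (not guaranteed PSD, but widely used)."""
    K = np.zeros((len(X), len(Y)))
    for i, x in enumerate(X):
        for j, y in enumerate(Y):
            K[i, j] = np.exp(-gamma * dtw_distance(x, y))
    return K

# Hypothetical variable-length sensor sequences and labels, for illustration.
X_train = [np.random.randn(50), np.random.randn(64)]
y_train = [0, 1]
clf = SVC(kernel="precomputed")
clf.fit(dtw_kernel_matrix(X_train, X_train), y_train)
```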

    Integrated Multi-view 3D Image Capture and Motion Parallax 3D Display System

    We propose an integrated 3D image capture and display system using a transversely moving camera, a regular 2D display screen and user tracking, which can facilitate the multi-view capture of a real scene or object and display the captured perspective views in 3D. The motion parallax 3D technique is used to capture the depth information of the object and display the corresponding views to the user using head tracking. The system is composed of two parts: the first consists of a horizontally moving camera interfaced with a customized camera control and capture application; the second consists of a regular LCD screen combined with a web camera and a user-tracking application. The 3D multi-view images captured through the imaging setup are relayed to the display based on the user location, and the corresponding view is dynamically displayed on the screen based on the viewing angle of the user with respect to the screen. The developed prototype system provides multi-view capture of 60 views with a step size of 1 cm and greater than 40˚ field-of-view overlap. The display system relays 60 views, providing viewing-angle coverage of ±35˚, where the angular difference between two views is 1.2˚.
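    The view-selection logic implied by the numbers above (60 views, ±35˚ coverage, 1.2˚ spacing) can be sketched as a simple mapping from the tracked head position to a view index. The geometry and the exact mapping used by the prototype are not given in the abstract, so this is an assumed linear mapping for illustration.
```python
import numpy as np

NUM_VIEWS = 60        # captured perspective views
COVERAGE_DEG = 35.0   # display covers roughly +/-35 degrees
STEP_DEG = 1.2        # angular difference between adjacent views

def select_view(head_x, head_z):
    """Pick which captured view to show for a tracked head position.

    head_x, head_z : head position relative to the screen centre (metres),
                     x lateral, z perpendicular distance to the screen.
    """
    angle = np.degrees(np.arctan2(head_x, head_z))         # user's viewing angle
    angle = np.clip(angle, -COVERAGE_DEG, COVERAGE_DEG)    # stay inside coverage
    index = int(round((angle + COVERAGE_DEG) / STEP_DEG))  # map angle -> view index
    return min(index, NUM_VIEWS - 1)
```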

    Human Pose Detection for Robotic-Assisted and Rehabilitation Environments

    Assistance and rehabilitation robotic platforms must have precise sensory systems for human–robot interaction. Therefore, human pose estimation is a current topic of research, especially for the safety of human–robot collaboration and the evaluation of human biomarkers. Within this field of research, the evaluation of the low-cost, markerless human pose estimators OpenPose and Detectron 2 has received much attention because of their diversity of applications, such as surveillance, sports, videogames, and assessment in human motor rehabilitation. This work aimed to evaluate and compare the angles at the elbow and shoulder joints estimated by OpenPose and Detectron 2 during four typical upper-limb rehabilitation exercises: elbow side flexion, elbow flexion, shoulder extension, and shoulder abduction. A setup of two Kinect 2 RGBD cameras was used to obtain the ground truth of the joint and skeleton estimations during the different exercises. Finally, we provided a numerical comparison (RMSE and MAE) among the angle measurements obtained with OpenPose, Detectron 2, and the ground truth. The results showed how OpenPose outperforms Detectron 2 in these types of applications.
    Óscar G. Hernández holds a grant from the Spanish Fundación Carolina, the University of Alicante, and the National Autonomous University of Honduras.
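    A minimal sketch of the kind of comparison described above: the joint angle is computed from three estimated keypoints (e.g. shoulder-elbow-wrist for the elbow angle), and RMSE/MAE are taken against the ground-truth angle series. This is a generic formulation, not the authors' evaluation code.
```python
import numpy as np

def joint_angle(a, b, c):
    """Angle in degrees at joint b formed by keypoints a-b-c
    (e.g. shoulder-elbow-wrist for the elbow angle)."""
    v1, v2 = np.asarray(a) - np.asarray(b), np.asarray(c) - np.asarray(b)
    cos_ang = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    return np.degrees(np.arccos(np.clip(cos_ang, -1.0, 1.0)))

def rmse(estimated, ground_truth):
    e, g = np.asarray(estimated), np.asarray(ground_truth)
    return float(np.sqrt(np.mean((e - g) ** 2)))

def mae(estimated, ground_truth):
    e, g = np.asarray(estimated), np.asarray(ground_truth)
    return float(np.mean(np.abs(e - g)))
```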

    Phase messaging method for time-of-flight cameras

    Ubiquitous light-emitting devices and low-cost commercial digital cameras facilitate optical wireless communication systems such as visual MIMO, where handheld cameras communicate with electronic displays. While intensity-based optical communication is more prevalent in camera-display messaging, we present a novel method that uses modulated light phase for messaging and time-of-flight (ToF) cameras as receivers. With intensity-based methods, light signals can be degraded by reflections and ambient illumination. By comparison, communication using ToF cameras is more robust against challenging lighting conditions. Additionally, the concept of phase messaging can be combined with intensity messaging for a significant data-rate advantage. In this work, we design and construct a phase messaging array (PMA), the first of its kind, to communicate with a ToF depth camera by manipulating the phase of the depth camera's infrared light signal. The array enables message variation spatially, using a plane of infrared light-emitting diodes, and temporally, by varying the induced phase shift. In this manner, the phase messaging array acts as the transmitter by electronically controlling the light signal phase, while the ToF camera acts as the receiver by observing and recording a time-varying depth. We show a complete implementation of a 3×3 prototype array with custom hardware and demonstrate an average bit accuracy as high as 97.8%. The prototype data rate with this approach is 1 kbps, which can be extended to approximately 10 Mbps.
    National Science Foundation (U.S.) (grant CNS-106546#
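    The underlying relation exploited here is that a continuous-wave ToF camera reports distance as d = c·φ / (4π·f_mod), so an induced phase shift shows up as an apparent depth change that can carry bits. The sketch below illustrates that relation and a naive threshold decoder; the modulation frequency and threshold are assumed example values, not the paper's parameters.
```python
import numpy as np

C = 299_792_458.0  # speed of light, m/s

def phase_shift_to_depth(delta_phi, f_mod=20e6):
    """Apparent depth offset a continuous-wave ToF camera reports when the
    returning light's phase is shifted by delta_phi (radians) at modulation
    frequency f_mod (Hz): d = c * phi / (4 * pi * f)."""
    return C * delta_phi / (4.0 * np.pi * f_mod)

def decode_bits(depth_samples, baseline, threshold=0.05):
    """Decode one bit per depth frame: a depth pushed away from the baseline
    by more than the threshold (metres) is read as '1'. Illustrative only."""
    return [1 if abs(d - baseline) > threshold else 0 for d in depth_samples]
```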