
    RGBD Datasets: Past, Present and Future

    Since the launch of the Microsoft Kinect, scores of RGBD datasets have been released. These have propelled advances in areas from reconstruction to gesture recognition. In this paper we explore the field, reviewing datasets across eight categories: semantics, object pose estimation, camera tracking, scene reconstruction, object tracking, human actions, faces and identification. By extracting relevant information in each category we help researchers to find appropriate data for their needs, and we consider which datasets have succeeded in driving computer vision forward and why. Finally, we examine the future of RGBD datasets. We identify key areas which are currently underexplored, and suggest that future directions may include synthetic data and dense reconstructions of static and dynamic scenes.
    Comment: 8 pages excluding references (CVPR style)

    Continuous Human Activity Tracking over a Large Area with Multiple Kinect Sensors

    In recent years, researchers have explored the use of technology to improve the healthcare and wellness of patients with dementia. Dementia symptoms are associated with a decline in thinking skills and memory severe enough to reduce a person’s ability to pay attention and perform daily activities. The progression of dementia can be assessed by monitoring patients’ daily activities. This thesis covers continuous localization and behavioral analysis of a patient’s motion patterns over a wide-area indoor living space using multiple calibrated Kinect sensors connected over a network. The skeleton data from all the sensors is transferred to the host computer via TCP sockets into Unity, where it is integrated into a single world coordinate system using a calibration technique. The cameras are placed with some overlap in their fields of view to allow successful calibration and continuous tracking of the patients. Localization and behavioral data are stored in a CSV file for further analysis.
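
    The integration step described in this abstract can be illustrated with a minimal sketch. It is not the thesis implementation (which runs in Unity): Python is used here only for illustration, and the function names `make_extrinsic`, `to_world` and `write_rows`, the yaw-plus-translation sensor pose, and the CSV column layout are all assumptions. Each sensor is assumed to have a known 4x4 extrinsic matrix that maps its joint positions into the shared world frame.

```python
import csv
import io
import math


def make_extrinsic(yaw_deg, tx, ty, tz):
    """Hypothetical sensor pose: rotation about the vertical (y) axis plus a translation."""
    a = math.radians(yaw_deg)
    c, s = math.cos(a), math.sin(a)
    return [
        [c, 0.0, s, tx],
        [0.0, 1.0, 0.0, ty],
        [-s, 0.0, c, tz],
        [0.0, 0.0, 0.0, 1.0],
    ]


def to_world(extrinsic, joint):
    """Map a joint (x, y, z) from a sensor's frame into the world frame."""
    p = (joint[0], joint[1], joint[2], 1.0)  # homogeneous coordinates
    return tuple(sum(extrinsic[r][k] * p[k] for k in range(4)) for r in range(3))


def write_rows(stream, rows):
    """Write (timestamp, sensor, joint name, world position) records as CSV."""
    writer = csv.writer(stream)
    writer.writerow(["timestamp", "sensor", "joint", "x", "y", "z"])
    for t, sensor, name, pos in rows:
        writer.writerow([t, sensor, name] + ["%.3f" % v for v in pos])


if __name__ == "__main__":
    # Two sensors with overlapping views observe the same head joint.
    sensor_a = make_extrinsic(0, 0.0, 0.0, 0.0)
    sensor_b = make_extrinsic(90, -1.0, 0.0, 2.0)
    head_a = to_world(sensor_a, (1.0, 1.6, 2.0))
    head_b = to_world(sensor_b, (0.0, 1.6, 2.0))
    out = io.StringIO()
    write_rows(out, [(0.0, "A", "head", head_a), (0.0, "B", "head", head_b)])
    print(out.getvalue())
```

    With consistent calibration, a joint seen by two overlapping sensors maps to the same world position, which is what allows tracking to continue as a person moves from one sensor's view into the next.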

    Modeling of Human Upper Body for Sign Language Recognition

    Sign Language Recognition (SLR) systems require classification not only of the hand motion trajectory but also of facial features, the Human Upper Body (HUB), and the hand position with respect to other HUB parts. The head, face, forehead, shoulders and chest are crucial parts that can carry a lot of positioning information about hand gestures in gesture classification. The main contribution of this paper is a fast and robust search algorithm for HUB parts, based on head size, intended for real-time implementations. Scaling of the extracted parts during body orientation was attained using partial estimation of face size. Tracking of the extracted parts in front and side views was achieved using CAMSHIFT [24]. The outcome makes the system applicable to real-time applications such as SLR systems.
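
    As a rough illustration of the tracking step, the sketch below shows only the mean-shift iteration that CAMSHIFT repeats on a per-pixel probability image: a fixed-size search window is moved until it is centred on the probability mass beneath it. It is a simplified sketch and not the paper's implementation. The window-size and orientation adaptation that distinguishes CAMSHIFT from plain mean shift is omitted, and the function name `mean_shift_window` and the list-of-lists image format are assumptions.

```python
def mean_shift_window(prob, window, max_iter=20):
    """Shift a window (x, y, w, h) toward the centroid of the probability inside it.

    prob is a 2D list of non-negative weights, indexed as prob[row][col].
    The window must lie inside the image. Its size stays fixed in this sketch.
    """
    x, y, w, h = window
    rows, cols = len(prob), len(prob[0])
    for _ in range(max_iter):
        # Zeroth and first image moments over the current window.
        m00 = m10 = m01 = 0.0
        for r in range(y, y + h):
            for c in range(x, x + w):
                p = prob[r][c]
                m00 += p
                m10 += p * c
                m01 += p * r
        if m00 == 0:
            break  # no probability mass under the window, nothing to follow
        cx, cy = m10 / m00, m01 / m00  # centroid in image coordinates
        # Re-centre the window on the centroid, clamped to the image bounds.
        nx = min(max(int(round(cx - (w - 1) / 2.0)), 0), cols - w)
        ny = min(max(int(round(cy - (h - 1) / 2.0)), 0), rows - h)
        if (nx, ny) == (x, y):
            break  # converged
        x, y = nx, ny
    return (x, y, w, h)


if __name__ == "__main__":
    # Synthetic 40x40 probability image with one 10x10 blob (rows 20-29, cols 25-34).
    image = [[0.0] * 40 for _ in range(40)]
    for r in range(20, 30):
        for c in range(25, 35):
            image[r][c] = 1.0
    # Start from a window that only partly overlaps the blob.
    print(mean_shift_window(image, (20, 15, 10, 10)))
```

    In a tracker, the probability image would typically come from back-projecting a colour histogram of the target region, and the window found in one frame would seed the search in the next.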

    FaceVR: Real-Time Facial Reenactment and Eye Gaze Control in Virtual Reality

    We introduce FaceVR, a novel method for gaze-aware facial reenactment in the Virtual Reality (VR) context. The key component of FaceVR is a robust algorithm to perform real-time facial motion capture of an actor who is wearing a head-mounted display (HMD), as well as a new data-driven approach for eye tracking from monocular videos. In addition to these face reconstruction components, FaceVR incorporates photo-realistic re-rendering in real time, thus allowing artificial modifications of face and eye appearances. For instance, we can alter facial expressions, change gaze directions, or remove the VR goggles in realistic re-renderings. In a live setup with a source and a target actor, we apply these newly-introduced algorithmic components. We assume that the source actor is wearing a VR device, and we capture his facial expressions and eye movement in real-time. For the target video, we mimic a similar tracking process; however, we use the source input to drive the animations of the target video, thus enabling gaze-aware facial reenactment. To render the modified target video on a stereo display, we augment our capture and reconstruction process with stereo data. In the end, FaceVR produces compelling results for a variety of applications, such as gaze-aware facial reenactment, reenactment in virtual reality, removal of VR goggles, and re-targeting of somebody's gaze direction in a video conferencing call

    Face modeling for face recognition in the wild.

    Face understanding is considered one of the most important topics in computer vision, since the face is a rich source of information in social interaction. The face provides information not only about people's identity, but also about their membership in broad demographic categories (including sex, race, and age) and about their current emotional state. Facial landmark extraction is the cornerstone of many facial analysis and understanding applications. In this dissertation, a novel facial model is designed for facial landmark detection in unconstrained real-life environments across different image modalities, including infrared and visible images. The proposed facial landmark detector combines a part-based model with holistic face information. In the part-based model, the face is modeled by the appearance of different face parts (e.g., right eye, left eye, left eyebrow, nose, mouth) and their geometric relations. The appearance is described by a novel feature referred to as the pixel difference feature. This representation is three times faster than the state of the art in feature representation. To model the geometric relations between the face parts, the complex Bingham distribution is adapted from the statistics community into computer vision. The global information is incorporated with the local part model using a regression model. The model outperforms the state of the art in detecting facial landmarks. The proposed facial landmark detector is tested on two computer vision problems: boosting the performance of face detectors by rejecting pseudo faces, and camera steering in a multi-camera network.
    To highlight the applicability of the proposed model to different image modalities, it has been studied in two face understanding applications: face recognition from visible images, and physiological measurements for autistic individuals from thermal images. Recognizing identities from faces under different poses, expressions and lighting conditions against a complex background is a still unsolved problem, even with accurate landmark detection. Therefore, a learned similarity measure is proposed. The proposed measure responds only to differences in identity and filters out illumination and pose variations. The similarity measure makes use of statistical inference in the image plane. Additionally, the pose challenge is tackled by two new approaches: assigning different weights to different face parts based on their visibility in the image plane at different pose angles, and synthesizing virtual facial images for each subject at different poses from a single frontal image. The proposed framework, evaluated on standard benchmarks for face recognition in the wild, is demonstrated to be competitive with top-performing state-of-the-art methods. The other framework addresses physiological measurements for autistic individuals from infrared images. In this framework, the Superficial Temporal Artery (STA) must be detected and tracked accurately while the subject is moving, playing, and interacting in social communication. Detecting and tracking the STA is very challenging, since the appearance of the STA region changes over time and is not sufficiently discriminative from other areas of the face region. A novel concept in detection, called supporter collaboration, is introduced. In supporter collaboration, the STA is detected and tracked with the help of face landmarks and geometric constraints. This research advanced the field of emotion recognition.

    Modeling of human upper body for sign language recognition

    Sign Language Recognition (SLR) systems require classification not only of the hand motion trajectory but also of facial features, the Human Upper Body (HUB), and the hand position with respect to other HUB parts. The head, face, forehead, shoulders and chest are crucial parts that can carry a lot of positioning information about hand gestures in gesture classification. The main contribution of this paper is a fast and robust search algorithm for HUB parts, based on head size, intended for real-time implementations. Scaling of the extracted parts during body orientation was attained using partial estimation of face size. Tracking of the extracted parts in front and side views was achieved using CAMSHIFT [24]. The outcome makes the system applicable to real-time applications such as SLR systems. Keywords: Human upper body detectio