509 research outputs found

    Unmasking Communication Partners: A Low-Cost AI Solution for Digitally Removing Head-Mounted Displays in VR-Based Telepresence

    Full text link
    Face-to-face conversation in Virtual Reality (VR) is a challenge when participants wear head-mounted displays (HMDs). A significant portion of a participant's face is hidden and facial expressions are difficult to perceive. Past research has shown that high-fidelity face reconstruction with personal avatars in VR is possible under laboratory conditions with high-cost hardware. In this paper, we propose one of the first low-cost systems for this task, using only open-source, free software and affordable hardware. Our approach is to track the user's face underneath the HMD with a Convolutional Neural Network (CNN) and to generate corresponding expressions with Generative Adversarial Networks (GANs) that produce RGBD images of the person's face. We use commodity hardware with low-cost extensions such as 3D-printed mounts and miniature cameras. Our approach learns end-to-end without manual intervention, runs in real time, and can be trained and executed on an ordinary gaming computer. We report evaluation results showing that our low-cost system does not achieve the same fidelity as research prototypes using high-end hardware and closed-source software, but it is capable of creating individual facial avatars with person-specific characteristics in movements and expressions. Comment: 9 pages, IEEE 3rd International Conference on Artificial Intelligence & Virtual Reality
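    Below is a minimal sketch of the CNN-to-GAN pipeline outlined in this abstract, assuming PyTorch; the module names, layer sizes, input resolution, and the single grayscale under-HMD camera frame are illustrative assumptions, not the paper's actual architecture or training setup.

    import torch
    import torch.nn as nn

    class ExpressionTracker(nn.Module):
        """CNN: maps a partially occluded face image from a miniature camera
        under the HMD to a low-dimensional expression code (hypothetical)."""
        def __init__(self, code_dim=64):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1),
            )
            self.head = nn.Linear(64, code_dim)

        def forward(self, x):
            return self.head(self.features(x).flatten(1))

    class RGBDGenerator(nn.Module):
        """GAN generator: decodes an expression code into a 4-channel
        (RGB + depth) image of the unoccluded face (hypothetical sizes)."""
        def __init__(self, code_dim=64):
            super().__init__()
            self.fc = nn.Linear(code_dim, 128 * 8 * 8)
            self.deconv = nn.Sequential(
                nn.ConvTranspose2d(128, 64, 4, stride=2, padding=1), nn.ReLU(),
                nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
                nn.ConvTranspose2d(32, 4, 4, stride=2, padding=1), nn.Tanh(),
            )

        def forward(self, code):
            return self.deconv(self.fc(code).view(-1, 128, 8, 8))

    # Inference: one 64x64 grayscale frame in, one 64x64 RGBD face image out.
    tracker, generator = ExpressionTracker(), RGBDGenerator()
    frame = torch.rand(1, 1, 64, 64)      # camera frame captured under the HMD
    rgbd = generator(tracker(frame))      # shape: (1, 4, 64, 64)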

    RGB-D-based Action Recognition Datasets: A Survey

    Get PDF
    Human action recognition from RGB-D (Red, Green, Blue and Depth) data has attracted increasing attention since the first work reported in 2010. Over this period, many benchmark datasets have been created to facilitate the development and evaluation of new algorithms. This raises the question of which dataset to select and how to use it to provide a fair and objective comparative evaluation against state-of-the-art methods. To address this issue, this paper provides a comprehensive review of the most commonly used action recognition related RGB-D video datasets, including 27 single-view datasets, 10 multi-view datasets, and 7 multi-person datasets. The detailed information and analysis of these datasets provide a useful resource for guiding an insightful selection of datasets for future research. In addition, the issues with current algorithm evaluation vis-à-vis the limitations of the available datasets and evaluation protocols are also highlighted, resulting in a number of recommendations for the collection of new datasets and the use of evaluation protocols.
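    As a hedged illustration of the evaluation protocols discussed in this survey, the snippet below shows a cross-subject split, a protocol commonly used with RGB-D action datasets in which training and test sets contain disjoint performers. The subject IDs, file names, and split are invented for the example and are not taken from any dataset covered by the survey.

    def cross_subject_split(samples, train_subjects):
        """samples: list of (subject_id, clip_path, label) tuples."""
        train = [s for s in samples if s[0] in train_subjects]
        test = [s for s in samples if s[0] not in train_subjects]
        return train, test

    # Toy example: four clips from four performers, split 2/2 by subject.
    samples = [(1, "clip_001.avi", "wave"), (2, "clip_002.avi", "sit"),
               (3, "clip_003.avi", "wave"), (4, "clip_004.avi", "sit")]
    train, test = cross_subject_split(samples, train_subjects={1, 3})
    print(len(train), len(test))  # 2 2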

    Distributed human 3D pose estimation and action recognition.

    Get PDF
    In this paper, we propose a distributed solution for 3D human pose estimation using an RGBD camera network. The key feature of our method is that a dynamic hybrid consensus filter (DHCF) is introduced to fuse the multi-view information from the cameras. In contrast to a centralized fusion solution, the DHCF algorithm can be used in a distributed network, which requires no central information fusion center. Therefore, the DHCF-based fusion algorithm can benefit from the many advantages of a distributed network. We also show that the proposed fusion algorithm can handle occlusion problems effectively and achieves a higher action recognition rate compared to approaches using only single-view information.
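    A minimal sketch of distributed, consensus-style fusion of per-camera 3D joint estimates, assuming NumPy. This is a plain average-consensus iteration meant only to illustrate fusion without a central node; it is not the paper's DHCF, whose weights, dynamics, and occlusion handling are not specified in the abstract.

    import numpy as np

    def consensus_step(estimates, adjacency, epsilon=0.2):
        """One consensus update: each node moves toward its neighbors' estimates.

        estimates: (n_nodes, n_joints, 3) per-camera 3D joint positions.
        adjacency: (n_nodes, n_nodes) binary matrix of the camera network.
        epsilon:   step size; must be below 1/(max node degree) to converge.
        """
        updated = estimates.copy()
        for i in range(len(estimates)):
            for j in np.nonzero(adjacency[i])[0]:
                updated[i] += epsilon * (estimates[j] - estimates[i])
        return updated

    # Three cameras in a line topology, noisy estimates of a single joint;
    # repeated steps converge toward the network-wide average estimate
    # without any central fusion center.
    adjacency = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
    estimates = np.random.randn(3, 1, 3)
    for _ in range(20):
        estimates = consensus_step(estimates, adjacency)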