29,501 research outputs found

    An Immersive Telepresence System using RGB-D Sensors and Head Mounted Display

    Get PDF
    We present a tele-immersive system that enables people to interact with each other in a virtual world using body gestures in addition to verbal communication. Beyond the obvious applications, including general online conversations and gaming, we hypothesize that our proposed system would be particularly beneficial to education by offering rich visual contents and interactivity. One distinct feature is the integration of egocentric pose recognition that allows participants to use their gestures to demonstrate and manipulate virtual objects simultaneously. This functionality enables the instructor to ef- fectively and efficiently explain and illustrate complex concepts or sophisticated problems in an intuitive manner. The highly interactive and flexible environment can capture and sustain more student attention than the traditional classroom setting and, thus, delivers a compelling experience to the students. Our main focus here is to investigate possible solutions for the system design and implementation and devise strategies for fast, efficient computation suitable for visual data processing and network transmission. We describe the technique and experiments in details and provide quantitative performance results, demonstrating our system can be run comfortably and reliably for different application scenarios. Our preliminary results are promising and demonstrate the potential for more compelling directions in cyberlearning.Comment: IEEE International Symposium on Multimedia 201

    CNN-based Real-time Dense Face Reconstruction with Inverse-rendered Photo-realistic Face Images

    Full text link
    With the powerfulness of convolution neural networks (CNN), CNN based face reconstruction has recently shown promising performance in reconstructing detailed face shape from 2D face images. The success of CNN-based methods relies on a large number of labeled data. The state-of-the-art synthesizes such data using a coarse morphable face model, which however has difficulty to generate detailed photo-realistic images of faces (with wrinkles). This paper presents a novel face data generation method. Specifically, we render a large number of photo-realistic face images with different attributes based on inverse rendering. Furthermore, we construct a fine-detailed face image dataset by transferring different scales of details from one image to another. We also construct a large number of video-type adjacent frame pairs by simulating the distribution of real video data. With these nicely constructed datasets, we propose a coarse-to-fine learning framework consisting of three convolutional networks. The networks are trained for real-time detailed 3D face reconstruction from monocular video as well as from a single image. Extensive experimental results demonstrate that our framework can produce high-quality reconstruction but with much less computation time compared to the state-of-the-art. Moreover, our method is robust to pose, expression and lighting due to the diversity of data.Comment: Accepted by IEEE Transactions on Pattern Analysis and Machine Intelligence, 201

    Object-based 2D-to-3D video conversion for effective stereoscopic content generation in 3D-TV applications

    Get PDF
    Three-dimensional television (3D-TV) has gained increasing popularity in the broadcasting domain, as it enables enhanced viewing experiences in comparison to conventional two-dimensional (2D) TV. However, its application has been constrained due to the lack of essential contents, i.e., stereoscopic videos. To alleviate such content shortage, an economical and practical solution is to reuse the huge media resources that are available in monoscopic 2D and convert them to stereoscopic 3D. Although stereoscopic video can be generated from monoscopic sequences using depth measurements extracted from cues like focus blur, motion and size, the quality of the resulting video may be poor as such measurements are usually arbitrarily defined and appear inconsistent with the real scenes. To help solve this problem, a novel method for object-based stereoscopic video generation is proposed which features i) optical-flow based occlusion reasoning in determining depth ordinal, ii) object segmentation using improved region-growing from masks of determined depth layers, and iii) a hybrid depth estimation scheme using content-based matching (inside a small library of true stereo image pairs) and depth-ordinal based regularization. Comprehensive experiments have validated the effectiveness of our proposed 2D-to-3D conversion method in generating stereoscopic videos of consistent depth measurements for 3D-TV applications

    A framework for realistic 3D tele-immersion

    Get PDF
    Meeting, socializing and conversing online with a group of people using teleconferencing systems is still quite differ- ent from the experience of meeting face to face. We are abruptly aware that we are online and that the people we are engaging with are not in close proximity. Analogous to how talking on the telephone does not replicate the experi- ence of talking in person. Several causes for these differences have been identified and we propose inspiring and innova- tive solutions to these hurdles in attempt to provide a more realistic, believable and engaging online conversational expe- rience. We present the distributed and scalable framework REVERIE that provides a balanced mix of these solutions. Applications build on top of the REVERIE framework will be able to provide interactive, immersive, photo-realistic ex- periences to a multitude of users that for them will feel much more similar to having face to face meetings than the expe- rience offered by conventional teleconferencing systems
    corecore