
    Appearance-Based Gaze Estimation in the Wild

    Appearance-based gaze estimation is believed to work well in real-world settings, but existing datasets have been collected under controlled laboratory conditions and methods have not been evaluated across multiple datasets. In this work we study appearance-based gaze estimation in the wild. We present the MPIIGaze dataset that contains 213,659 images we collected from 15 participants during natural everyday laptop use over more than three months. Our dataset is significantly more variable than existing ones with respect to appearance and illumination. We also present a method for in-the-wild appearance-based gaze estimation using multimodal convolutional neural networks that significantly outperforms state-of-the-art methods in the most challenging cross-dataset evaluation. We present an extensive evaluation of several state-of-the-art image-based gaze estimation algorithms on three current datasets, including our own. This evaluation provides clear insights and allows us to identify key research challenges of gaze estimation in the wild.

    Depth map compression via 3D region-based representation

    In 3D video, view synthesis is used to create new virtual views between encoded camera views. Errors in the coding of the depth maps introduce geometry inconsistencies in synthesized views. In this paper, a new 3D plane representation of the scene is presented which improves the performance of current standard video codecs in the view synthesis domain. Two image segmentation algorithms are proposed for generating a color and depth segmentation. Using both partitions, depth maps are segmented into regions free of sharp discontinuities, without having to explicitly signal all depth edges. The resulting regions are represented using a planar model in the 3D world scene. This 3D representation allows an efficient encoding while preserving the 3D characteristics of the scene. The 3D planes open up the possibility to code multiview images with a unique representation. Postprint (author's final draft).

    Subjective assessment of super multiview video with coding artifacts

    The subjective assessment of super multiview (SMV) video considers two main perceptual factors: image quality and visual comfort at the viewpoint transition. While previous works only covered raw content with high levels of visual comfort, this work supersedes them by targeting the subjective assessment of SMV content with coding artifacts. The outcome of this analysis yields important conclusions regarding the relationship between these two factors, indicating that 1) the perceived image quality is independent of the viewpoint change speed, and 2) the perceived visual comfort at the viewpoint transition is independent of the image quality. These conclusions facilitate the extension of the scope of existing subjective perception models, designed for raw SMV content, to coded content.

    New visual coding exploration in MPEG: Super-MultiView and free navigation in free viewpoint TV

    ISO/IEC MPEG and ITU-T VCEG have recently jointly issued a new multiview video compression standard, called 3D-HEVC, which reaches unprecedented compression performance for linear, dense camera arrangements. To support future high-quality, auto-stereoscopic 3D displays and Free Navigation virtual/augmented reality applications with sparse, arbitrarily arranged camera setups, innovative depth estimation and virtual view synthesis techniques with global optimizations over all camera views should be developed. Preliminary studies in response to the MPEG-FTV (Free viewpoint TV) Call for Evidence suggest these targets are within reach, with at least 6% bitrate gains over 3D-HEVC technology.

    Providing 3D video services: the challenge from 2D to 3DTV quality of experience

    Recently, three-dimensional (3D) video has decisively burst onto the entertainment industry scene, and has arrived in households even before the standardization process has been completed. 3D television (3DTV) adoption and deployment can be seen as a major leap in television history, similar to previous transitions from black and white (B&W) to color, from analog to digital television (TV), and from standard definition to high definition. In this paper, we analyze current 3D video technology trends in order to define a taxonomy of the availability and possible introduction of 3D-based services. We also propose an audiovisual network services architecture which provides a smooth transition from two-dimensional (2D) to 3DTV in an Internet Protocol (IP)-based scenario. Based on subjective assessment tests, we also analyze the factors that will influence the quality of experience in those 3D video services, focusing on the effects of both coding and transmission errors. In addition, examples of the application of the architecture and results of assessment tests are provided.

    3D video coding and transmission

    The capture, transmission, and display of 3D content have gained a lot of attention in the last few years. 3D multimedia content is no longer confined to cinema theatres but is being transmitted using stereoscopic video over satellite, shared on Blu-ray discs, or sent over Internet technologies. Stereoscopic displays are needed at the receiving end and the viewer needs to wear special glasses to present the two versions of the video to the human vision system that then generates the 3D illusion. To be more effective and improve the immersive experience, more views are acquired from a larger number of cameras and presented on different displays, such as autostereoscopic and light field displays. These multiple views, combined with depth data, also allow enhanced user experiences and new forms of interaction with the 3D content from virtual viewpoints. This type of audiovisual information is represented by a huge amount of data that needs to be compressed and transmitted over bandwidth-limited channels. Part of the COST Action IC1105 "3D Content Creation, Coding and Transmission over Future Media Networks" (3DConTourNet) focuses on this research challenge. Peer-reviewed.

    Performance improvement of segmentation-based depth representation in 3D imagery by region merging

    The feasible implementation of immersive 3D video systems entails the need for a substantial reduction in the amount of image information necessary for representation. Multiview image rendering algorithms based on depth data have radically reduced the number of images required to reconstruct a 3D scene. Nonetheless, the compression of depth maps still poses several challenges due to the particular nature and characteristics of the data. To this end, this paper outlines a depth representation technique, developed in our earlier work, that exploits the correlation intrinsically present between color intensity and depth images capturing a natural scene. In this technique, a segmentation-based algorithm that is backwards compatible with conventional video coding systems is implemented. The effectiveness of our previous technique is enhanced in this contribution by a region merging process on the segmented regions, which results in a decrease in the amount of information necessary for transmission or storage of multiview image data by a factor of 20.5 with respect to the reference H.264/AVC coding methodology. This is furthermore achieved whilst maintaining a 3D image reconstruction and viewing quality which is quasi-identical to that of the reference approach. Peer-reviewed.