9 research outputs found

    Data compression and transmission aspects of panoramic videos

    Panoramic videos are an effective means of representing static or dynamic scenes along predefined paths. They allow users to change their viewpoints interactively at points in time or space defined by the paths. High-resolution panoramic videos, while desirable, consume a significant amount of storage and transmission bandwidth. They also make real-time decoding computationally very intensive. This paper proposes efficient data compression and transmission techniques for panoramic videos. A high-performance MPEG-2-like compression algorithm, which takes into account the random access requirements and the redundancies of panoramic videos, is proposed. The transmission aspects of panoramic videos over cable networks, local area networks (LANs), and the Internet are also discussed. In particular, an efficient advanced delivery sharing scheme (ADSS) for reducing repeated transmission and retrieval of frequently requested video segments is introduced. This protocol was verified by constructing an experimental VOD system consisting of a video server and eight Pentium 4 computers. Using the synthetic panoramic video Village at a rate of 197 kb/s and 7 f/s, nearly two-thirds of the memory access and transmission bandwidth of the video server were saved under normal network traffic.

    Bridge the Gap Between VQA and Human Behavior on Omnidirectional Video: A Large-Scale Dataset and a Deep Learning Model

    Omnidirectional video enables spherical stimuli with the 360×180° viewing range. Meanwhile, only the viewport region of omnidirectional video can be seen by the observer through head movement (HM), and an even smaller region within the viewport can be clearly perceived through eye movement (EM). Thus, the subjective quality of omnidirectional video may be correlated with HM and EM of human behavior. To fill in the gap between subjective quality and human behavior, this paper proposes a large-scale visual quality assessment (VQA) dataset of omnidirectional video, called VQA-OV, which collects 60 reference sequences and 540 impaired sequences. Our VQA-OV dataset provides not only the subjective quality scores of sequences but also the HM and EM data of subjects. By mining our dataset, we find that the subjective quality of omnidirectional video is indeed related to HM and EM. Hence, we develop a deep learning model, which embeds HM and EM, for objective VQA on omnidirectional video. Experimental results show that our model significantly improves the state-of-the-art performance of VQA on omnidirectional video. Comment: Accepted by ACM MM 201

    An object-based approach to image/video-based synthesis and processing for 3-D and multiview televisions

    This paper proposes an object-based approach to a class of dynamic image-based representations called "plenoptic videos," where the plenoptic video sequences are segmented into image-based rendering (IBR) objects, each with its image sequence, depth map, and other relevant information such as shape and alpha information. This allows desirable functionalities such as scalability of contents, error resilience, and interactivity with individual IBR objects to be supported. Moreover, the rendering quality in scenes with large depth variations can also be improved considerably. A portable capturing system consisting of two linear camera arrays was developed to verify the proposed approach. An important step in the object-based approach is to segment the objects in video streams into layers or IBR objects. To reduce the time for segmenting plenoptic videos under the semiautomatic technique, a new object tracking method based on the level-set method is proposed. Because of possible segmentation errors around object boundaries, natural matting with a Bayesian approach is also incorporated into our system. Furthermore, extensions of conventional image processing algorithms to these IBR objects are studied and illustrated with examples. Experimental results are given to illustrate the efficiency of the tracking, matting, rendering, and processing algorithms under the proposed object-based framework. © 2009 IEEE.

    A multi-camera approach to image-based rendering and 3-D/Multiview display of ancient Chinese artifacts


    Image-based rendering and synthesis

    Multiview imaging (MVI) is an active research focus because it has a wide range of applications and opens up research in other topics, including virtual view synthesis for three-dimensional (3D) television (3DTV) and entertainment. However, multiview systems need a large amount of storage and are difficult to construct. Image-based rendering (IBR) is the concept that allows 3D scenes and objects to be visualized realistically without full 3D model reconstruction. Using images as the primary substrate, IBR has many potential applications, including video games and virtual travel. The technique creates new views of scenes, which are reconstructed from a collection of densely sampled images or videos. IBR methods fall into different classes. In one, the 3D models and the lighting conditions are known, and the scene can be rendered using conventional graphics techniques. Another is light field or lumigraph rendering, which depends on dense sampling with little or no geometry and renders without recovering exact 3D models.

    360-Degree Panoramic Video Coding

    Virtual reality (VR) creates an immersive experience of the real world in a virtual environment through a computer interface. Owing to technological advances in recent years, VR technology is growing very fast, and its industrial use is now feasible. It is being used in many applications, for example gaming, education, and streaming live events. Since VR visualizes the real-world experience, the image or video content used must represent the characteristics of the whole 3D world. Omnidirectional images/videos have such characteristics and hence are used in VR applications. However, this content is not suitable for conventional video coding standards, which accept only 2D image/video formats. Accordingly, the omnidirectional content is projected onto a 2D image plane using cylindrical or pseudo-cylindrical projections. In this work, coding methods are studied for two types of projection formats that are popular for VR content: equirectangular panoramic projection and pseudo-cylindrical panoramic projection. The equirectangular projection is the most commonly used format in VR applications because of its rectangular image plane and its wide support in software development environments. However, this projection stretches the nadir and zenith areas of the panorama and as a result contains a relatively large portion of redundant data in these areas. The redundant information costs extra bitrate and increases encoding/decoding time. Regional down-sampling (RDS) methods are used in this work to reduce the extra bitrate caused by the over-stretched polar areas. These methods are categorized into persistent regional down-sampling (P-RDS) and temporal regional down-sampling (T-RDS) methods.
    In the P-RDS method, the down-sampling is applied to all frames of the video, whereas in the T-RDS method only inter frames are down-sampled and the intra frames are coded at full resolution in order to maintain the highest possible quality of these frames. The pseudo-cylindrical projections map the 3D spherical domain to a non-rectangular 2D image plane in which the polar areas do not contain redundant information. These projection formats therefore achieve a more realistic sample distribution of the 3D world. However, because of the non-rectangular image plane, pseudo-cylindrical panoramas are not well suited to image/video coding standards, and as a result their compression performance is poor. Therefore, two methods are investigated for improving the intra-frame and inter-frame compression of these panorama formats. In the intra-frame coding method, border edges are smoothed by modifying the content of the image in the non-effective picture area. In the inter-frame coding method, which exploits the 360-degree property of the content, the non-effective picture area of reference frames at the border is filled with content of the effective picture area from the opposite border to improve the performance of motion compensation. As a final contribution, quality assessment methods for VR applications are studied. Since VR content is mainly displayed in head-mounted displays (HMDs), which use a 3D coordinate system, measuring the quality of decoded images/video with conventional methods does not represent the quality fairly. In this work, spherical quality metrics are investigated for measuring the quality of the proposed coding methods for omnidirectional panoramas. Moreover, a novel spherical quality metric (USS-PSNR) is proposed for evaluating the quality of VR images/video
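To make the polar oversampling and the need for a spherical metric concrete, the following is a minimal sketch that weights each row of an equirectangular image by the cosine of its latitude, in the manner of the widely used WS-PSNR weighting. It is not the USS-PSNR metric proposed in the thesis, whose definition is not given in the abstract. The function names `erp_row_weights` and `weighted_psnr` are hypothetical, and a single-channel 8-bit image is assumed.

```python
import numpy as np


def erp_row_weights(height: int) -> np.ndarray:
    """Return cos(latitude) at the centre of each row of an equirectangular image.

    Rows near the poles cover little area on the sphere, so their weight
    approaches zero; rows near the equator have weight close to one.
    """
    rows = np.arange(height)
    latitude = (rows + 0.5 - height / 2.0) * np.pi / height
    return np.cos(latitude)


def weighted_psnr(ref, dec, peak: float = 255.0) -> float:
    """Latitude-weighted PSNR for 2-D (single-channel) equirectangular images.

    Illustrative assumption: errors are weighted by the row weights above,
    so distortion in the stretched polar rows counts for less.
    """
    ref = np.asarray(ref, dtype=np.float64)
    dec = np.asarray(dec, dtype=np.float64)
    # Broadcast the per-row weight across every column of the image.
    weights = np.broadcast_to(erp_row_weights(ref.shape[0])[:, None], ref.shape)
    wmse = np.sum(weights * (ref - dec) ** 2) / np.sum(weights)
    if wmse == 0.0:
        return float("inf")
    return float(10.0 * np.log10(peak ** 2 / wmse))
```

The mean of the row weights is close to 2/π, about 0.64, which is the ratio of the sphere's surface area to the area of the equirectangular plane. Under this weighting, roughly a third of the samples in an equirectangular frame are redundant, which is the redundancy the regional down-sampling methods target.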

    Algorithms, Protocols & Systems for Remote Observation Using Networked Robotic Cameras

    Emerging advances in robotic cameras, long-range wireless networking, and distributed sensors make feasible a new class of hybrid teleoperated/autonomous robotic remote "observatories" that allow groups of people, via the Internet, to observe, record, and index detailed activity occurring at a remote site. Equipped with robotic pan-tilt actuation mechanisms and a high-zoom lens, the camera can cover a large region with very high spatial resolution and allows for observation at a distance. The high-resolution motion panorama is the most natural data representation. We develop algorithms and protocols for high-resolution motion panoramas. We discover and prove the projection invariance and achieve real-time image alignment. We propose a minimum-variance-based incremental frame alignment algorithm to minimize the accumulation of alignment error in incremental image alignment and to ensure the quality of the panorama video over the long run. We propose a Frame Graph based panorama documentation algorithm to manage the large-scale data involved in online panorama video documentation. We propose an on-demand high-resolution panorama video-streaming system that allows on-demand sharing of a high-resolution motion panorama and efficiently deals with multiple concurrent spatial-temporal user requests. In conclusion, our research on high-resolution motion panoramas has significantly improved the efficiency and accuracy of image alignment, panorama video quality, data organization, and data storage and retrieval in remote observation using networked robotic cameras.