    Tracking-Optimized Quantization for H.264 Compression in Transportation Video Surveillance Applications

    We propose a tracking-aware system that removes video components of low tracking interest and optimizes the quantization of frequency coefficients during compression, particularly those coefficients that most influence trackers, significantly reducing bitrate while maintaining comparable tracking accuracy. We use tracking accuracy as our compression criterion in lieu of mean-squared-error metrics. The optimization of quantization tables suited to automated tracking can be executed online or offline. The online implementation initializes the encoding procedure for a specific scene but introduces delay, whereas the offline procedure produces globally optimal quantization tables by optimizing over a collection of video sequences. Our proposed system is designed with low processing power and memory requirements in mind, and as such can be deployed on remote nodes. Using H.264/AVC video coding and a commonly used state-of-the-art tracker, we show that our system achieves over 50% bitrate savings on top of the savings from previous work while maintaining comparable tracking accuracy.
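    As a minimal sketch of the offline table search described above, the loop below picks the candidate quantization table with the lowest total bitrate whose mean tracking accuracy stays within a tolerance of the baseline. The encode, bitrate, and accuracy callables are hypothetical placeholders, not the authors' implementation.

    def optimize_qtable(sequences, candidate_qtables, encode, bitrate, accuracy,
                        baseline_accuracy, max_drop=0.02):
        """Return the candidate table with the lowest total bitrate whose mean
        tracking accuracy stays within max_drop of the baseline accuracy."""
        best_table, best_bitrate = None, float("inf")
        for qtable in candidate_qtables:
            encoded = [encode(seq, qtable) for seq in sequences]   # H.264 encode (placeholder)
            total_bitrate = sum(bitrate(e) for e in encoded)
            mean_acc = sum(accuracy(e, s) for e, s in zip(encoded, sequences)) / len(sequences)
            if mean_acc >= baseline_accuracy - max_drop and total_bitrate < best_bitrate:
                best_table, best_bitrate = qtable, total_bitrate
        return best_table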

    Calipso: Physics-based Image and Video Editing through CAD Model Proxies

    We present Calipso, an interactive method for editing images and videos in a physically coherent manner. Our main idea is to realize physics-based manipulations by running a full physics simulation on proxy geometries given by non-rigidly aligned CAD models. Running these simulations allows us to apply new, unseen forces to move or deform selected objects, change physical parameters such as mass or elasticity, or even add entirely new objects that interact with the rest of the underlying scene. In Calipso, the user makes edits directly in 3D; these edits are processed by the simulation and then transferred to the target 2D content using shape-to-image correspondences in a photo-realistic rendering process. To align the CAD models, we introduce an efficient CAD-to-image alignment procedure that jointly optimizes rigid and non-rigid alignment while preserving the high-level structure of the input shape. Moreover, the user can choose to exploit image flow to estimate scene motion, producing coherent physical behavior consistent with ambient dynamics. We demonstrate Calipso's physics-based editing on a wide range of examples, producing varied physical behaviors while preserving geometric and visual consistency.
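    As a toy illustration of the kind of joint rigid plus non-rigid alignment objective such a procedure optimizes (the specific terms, weights, and array layout below are assumptions, not the paper's energy), a data-fitting term can be combined with a regularizer that keeps per-vertex offsets small so the shape's high-level structure is preserved.

    import numpy as np

    def alignment_energy(verts, targets, R, t, offsets, lam=1.0, mu=0.1):
        """verts, targets, offsets: (N, 3) arrays; R: (3, 3) rotation; t: (3,) translation."""
        deformed = (verts + offsets) @ R.T + t          # non-rigid offsets, then rigid pose
        data_term = np.sum((deformed - targets) ** 2)   # fit to image-derived 3D targets
        structure_term = np.sum(offsets ** 2)           # penalize large non-rigid deformation
        return lam * data_term + mu * structure_term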

    Quality of experience driven control of interactive media stream parameters

    In recent years, cloud computing has led to many new kinds of services. One of these popular services is cloud gaming, which delivers the entire game experience to users remotely from a server; other applications are provided in a similar manner. In this paper we focus on the option of rendering the application in the cloud, thereby delivering the graphical output of the application to the user as a video stream. In more general terms, an interactive media stream is set up over the network between the user's device and the cloud server. The main issue with this approach lies in the network, which currently gives few guarantees on quality of service in terms of parameters such as available bandwidth, latency, or packet loss. For interactive media streaming, however, the user is ultimately interested in the perceived quality, regardless of the underlying network situation. In this paper, we present an adaptive control mechanism that optimizes the quality of experience for the use case of a racing game by trading off visual quality against frame rate as a function of the available bandwidth. Practical experiments verify that QoE-driven adaptation leads to improved user experience compared to systems that take only network characteristics into account.
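    The bandwidth-driven trade-off can be sketched as a simple search over candidate settings; the bitrate and QoE models below are illustrative assumptions, not the paper's measured models.

    def select_stream_settings(available_kbps, qualities=(1, 2, 3, 4, 5),
                               frame_rates=(15, 20, 25, 30, 45, 60)):
        def bitrate(q, fps):       # assumed bitrate model: grows with quality and frame rate
            return 300 * q * (fps / 30.0)

        def qoe(q, fps):           # assumed QoE score favoring visual quality over frame rate
            return 0.6 * (q / 5.0) + 0.4 * (fps / 60.0)

        feasible = [(q, fps) for q in qualities for fps in frame_rates
                    if bitrate(q, fps) <= available_kbps]
        if not feasible:
            return min(qualities), min(frame_rates)     # fall back to the cheapest settings
        return max(feasible, key=lambda p: qoe(*p))

    # Example: on a 2 Mbps link the controller trades some frame rate for higher quality.
    print(select_stream_settings(2000))                  # -> (5, 30)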

    A framework for realistic 3D tele-immersion

    Meeting, socializing, and conversing online with a group of people using teleconferencing systems is still quite different from the experience of meeting face to face. We are abruptly aware that we are online and that the people we are engaging with are not in close proximity, analogous to how talking on the telephone does not replicate the experience of talking in person. Several causes for these differences have been identified, and we propose inspiring and innovative solutions to these hurdles in an attempt to provide a more realistic, believable, and engaging online conversational experience. We present the distributed and scalable framework REVERIE, which provides a balanced mix of these solutions. Applications built on top of the REVERIE framework will be able to provide interactive, immersive, photo-realistic experiences to a multitude of users that feel much closer to face-to-face meetings than the experience offered by conventional teleconferencing systems.

    Loss-resilient Coding of Texture and Depth for Free-viewpoint Video Conferencing

    Free-viewpoint video conferencing allows a participant to observe the remote 3D scene from any freely chosen viewpoint. An intermediate virtual viewpoint image is commonly synthesized using two pairs of transmitted texture and depth maps from two neighboring captured viewpoints via depth-image-based rendering (DIBR). To maintain high quality of synthesized images, it is imperative to contain the adverse effects of network packet losses that may arise during texture and depth video transmission. Towards this end, we develop an integrated approach that exploits the representation redundancy inherent in the multiple streamed videos: a voxel in the 3D scene visible to two captured views is sampled and coded twice, once in each view. In particular, at the receiver we first develop an error concealment strategy that adaptively blends corresponding pixels in the two captured views during DIBR, so that pixels from the more reliably transmitted view are weighted more heavily. We then couple it with a sender-side optimization of reference picture selection (RPS) during real-time video coding, so that blocks containing samples of voxels that are visible in both views are coded in a more error-resilient way in one view only, given that adaptive blending will erase errors in the other view. Further, synthesized view distortion sensitivities to texture versus depth errors are analyzed, so that the relative importance of texture and depth code blocks can be computed for system-wide RPS optimization. Experimental results show that the proposed scheme can outperform the use of a traditional feedback channel by up to 0.82 dB on average at an 8% packet loss rate, and by as much as 3 dB for particular frames.
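    The receiver-side concealment step can be illustrated with a minimal blending sketch: co-located samples from the two decoded views are averaged with weights reflecting each view's transmission reliability. The reliability scores and the simple linear weighting are assumptions for illustration, not the paper's exact distortion-driven weights.

    import numpy as np

    def blend_views(pixels_left, pixels_right, reliability_left, reliability_right):
        """Blend co-located pixel samples from two decoded views, weighting the
        more reliably received view more heavily."""
        w_left = reliability_left / (reliability_left + reliability_right)
        return w_left * pixels_left + (1.0 - w_left) * pixels_right

    # Example: the right view suffered more packet loss, so its samples get less weight.
    left = np.array([120.0, 118.0, 119.0])
    right = np.array([124.0, 90.0, 123.0])   # middle sample corrupted by a concealed error
    print(blend_views(left, right, reliability_left=0.9, reliability_right=0.3))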