605 research outputs found

    Generative Compression

    Full text link
    Traditional image and video compression algorithms rely on hand-crafted encoder/decoder pairs (codecs) that lack adaptability and are agnostic to the data being compressed. Here we describe the concept of generative compression, the compression of data using generative models, and suggest that it is a direction worth pursuing to produce more accurate and visually pleasing reconstructions at much deeper compression levels for both image and video data. We also demonstrate that generative compression is orders-of-magnitude more resilient to bit error rates (e.g. from noisy wireless channels) than traditional variable-length coding schemes

    Optimizing Network Coding Algorithms for Multiple Applications.

    Get PDF
    Deviating from the archaic communication approach of treating information as a fluid moving through pipes, the concepts of Network Coding (NC) suggest that optimal throughput of a multicast network can be achieved by processing information at individual network nodes. However, existing challenges to harness the advantages of NC concepts for practical applications have prevented the development of NC into an effective solution to increase the performance of practical communication networks. In response, the research work presented in this thesis proposes cross-layer NC solutions to increase the network throughput of data multicast as well as video quality of video multicast applications. First, three algorithms are presented to improve the throughput of NC enabled networks by minimizing the NC coefficient vector overhead, optimizing the NC redundancy allocation and improving the robustness of NC against bursty packet losses. Considering the fact that majority of network traffic occupies video, rest of the proposed NC algorithms are content-aware and are optimized for both data and video multicast applications. A set of content and network-aware optimization algorithms, which allocate redundancies for NC considering content properties as well as the network status, are proposed to efficiently multicast data and video across content delivery networks. Furthermore content and channel-aware joint channel and network coding algorithms are proposed to efficiently multicast data and video across wireless networks. Finally, the possibilities of performing joint source and network coding are explored to increase the robustness of high volume video multicast applications. Extensive simulation studies indicate significant improvements with the proposed algorithms to increase the network throughput and video quality over related state-of-the-art solutions. Hence, it is envisaged that the proposed algorithms will contribute to the advancement of data and video multicast protocols in the future communication networks

    Cross-layer Optimized Wireless Video Surveillance

    Get PDF
    A wireless video surveillance system contains three major components, the video capture and preprocessing, the video compression and transmission over wireless sensor networks (WSNs), and the video analysis at the receiving end. The coordination of different components is important for improving the end-to-end video quality, especially under the communication resource constraint. Cross-layer control proves to be an efficient measure for optimal system configuration. In this dissertation, we address the problem of implementing cross-layer optimization in the wireless video surveillance system. The thesis work is based on three research projects. In the first project, a single PTU (pan-tilt-unit) camera is used for video object tracking. The problem studied is how to improve the quality of the received video by jointly considering the coding and transmission process. The cross-layer controller determines the optimal coding and transmission parameters, according to the dynamic channel condition and the transmission delay. Multiple error concealment strategies are developed utilizing the special property of the PTU camera motion. In the second project, the binocular PTU camera is adopted for video object tracking. The presented work studied the fast disparity estimation algorithm and the 3D video transcoding over the WSN for real-time applications. The disparity/depth information is estimated in a coarse-to-fine manner using both local and global methods. The transcoding is coordinated by the cross-layer controller based on the channel condition and the data rate constraint, in order to achieve the best view synthesis quality. The third project is applied for multi-camera motion capture in remote healthcare monitoring. The challenge is the resource allocation for multiple video sequences. The presented cross-layer design incorporates the delay sensitive, content-aware video coding and transmission, and the adaptive video coding and transmission to ensure the optimal and balanced quality for the multi-view videos. In these projects, interdisciplinary study is conducted to synergize the surveillance system under the cross-layer optimization framework. Experimental results demonstrate the efficiency of the proposed schemes. The challenges of cross-layer design in existing wireless video surveillance systems are also analyzed to enlighten the future work. Adviser: Song C

    Cross-layer Optimized Wireless Video Surveillance

    Get PDF
    A wireless video surveillance system contains three major components, the video capture and preprocessing, the video compression and transmission over wireless sensor networks (WSNs), and the video analysis at the receiving end. The coordination of different components is important for improving the end-to-end video quality, especially under the communication resource constraint. Cross-layer control proves to be an efficient measure for optimal system configuration. In this dissertation, we address the problem of implementing cross-layer optimization in the wireless video surveillance system. The thesis work is based on three research projects. In the first project, a single PTU (pan-tilt-unit) camera is used for video object tracking. The problem studied is how to improve the quality of the received video by jointly considering the coding and transmission process. The cross-layer controller determines the optimal coding and transmission parameters, according to the dynamic channel condition and the transmission delay. Multiple error concealment strategies are developed utilizing the special property of the PTU camera motion. In the second project, the binocular PTU camera is adopted for video object tracking. The presented work studied the fast disparity estimation algorithm and the 3D video transcoding over the WSN for real-time applications. The disparity/depth information is estimated in a coarse-to-fine manner using both local and global methods. The transcoding is coordinated by the cross-layer controller based on the channel condition and the data rate constraint, in order to achieve the best view synthesis quality. The third project is applied for multi-camera motion capture in remote healthcare monitoring. The challenge is the resource allocation for multiple video sequences. The presented cross-layer design incorporates the delay sensitive, content-aware video coding and transmission, and the adaptive video coding and transmission to ensure the optimal and balanced quality for the multi-view videos. In these projects, interdisciplinary study is conducted to synergize the surveillance system under the cross-layer optimization framework. Experimental results demonstrate the efficiency of the proposed schemes. The challenges of cross-layer design in existing wireless video surveillance systems are also analyzed to enlighten the future work. Adviser: Song C

    Optimized Visual Internet of Things in Video Processing for Video Streaming

    Get PDF
    The global expansion of the Visual Internet of Things (VIoT) has enabled various new applications during the last decade through the interconnection of a wide range of devices and sensors.Frame freezing and buffering are the major artefacts in broad area of multimedia networking applications occurring due to significant packet loss and network congestion. Numerous studies have been carried out in order to understand the impact of packet loss on QoE for a wide range of applications. This paper improves the video streaming quality by using the proposed framework Lossy Video Transmission (LVT)  for simulating the effect of network congestion on the performance of  encrypted static images sent over wireless sensor networks.The simulations are intended for analysing video quality and determining packet drop resilience during video conversations.The assessment of emerging trends in quality measurement, including picture preference, visual attention, and audio visual quality is checked. To appropriately quantify the video quality loss caused by the encoding system, various encoders compress video sequences at various data rates.Simulation results for different QoE metrics with respect to user developed videos have been demonstrated which outperforms the existing metrics

    GRACE: Loss-Resilient Real-Time Video through Neural Codecs

    Full text link
    In real-time video communication, retransmitting lost packets over high-latency networks is not viable due to strict latency requirements. To counter packet losses without retransmission, two primary strategies are employed -- encoder-based forward error correction (FEC) and decoder-based error concealment. The former encodes data with redundancy before transmission, yet determining the optimal redundancy level in advance proves challenging. The latter reconstructs video from partially received frames, but dividing a frame into independently coded partitions inherently compromises compression efficiency, and the lost information cannot be effectively recovered by the decoder without adapting the encoder. We present a loss-resilient real-time video system called GRACE, which preserves the user's quality of experience (QoE) across a wide range of packet losses through a new neural video codec. Central to GRACE's enhanced loss resilience is its joint training of the neural encoder and decoder under a spectrum of simulated packet losses. In lossless scenarios, GRACE achieves video quality on par with conventional codecs (e.g., H.265). As the loss rate escalates, GRACE exhibits a more graceful, less pronounced decline in quality, consistently outperforming other loss-resilient schemes. Through extensive evaluation on various videos and real network traces, we demonstrate that GRACE reduces undecodable frames by 95% and stall duration by 90% compared with FEC, while markedly boosting video quality over error concealment methods. In a user study with 240 crowdsourced participants and 960 subjective ratings, GRACE registers a 38% higher mean opinion score (MOS) than other baselines

    Enabling geometry-based 3-D tele-immersion with fast mesh compression and linear rateless coding

    Get PDF
    3-D tele-immersion (3DTI) enables participants in remote locations to share, in real time, an activity. It offers users interactive and immersive experiences, but it challenges current media-streaming solutions. Work in the past has mainly focused on the efficient delivery of image-based 3-D videos and on realistic rendering and reconstruction of geometry-based 3-D objects. The contribution of this paper is a real-time streaming component for 3DTI with dynamic reconstructed geometry. This component includes both a novel fast compression method and a rateless packet protection scheme specifically designed towards the requirements imposed by real time transmission of live-reconstructed mesh geometry. Tests on a large dataset show an encoding speed-up up to ten times at comparable compression ratio and quality, when compared with the high-end MPEG-4 SC3DMC mesh encoders. The implemented rateless code ensures complete packet loss protection of the triangle mesh object and a delivery delay within interactive bounds. Contrary to most linear fountain codes, the designed codec enables real-time progressive decoding allowing partial decoding each time a packet is received. This approach is compared with transmission over TCP in packet loss rates and latencies, typical in managed WAN and MAN networks, and heavily outperforms it in terms of end-to-end delay. The streaming component has been integrated into a larger 3DTI environment that includes state of the art 3-D reconstruction and rendering modules. This resulted in a prototype that can capture, compress transmit, and render triangle mesh geometry in real-time in realistic internet conditions as shown in experiments. Compared with alternative methods, lower interactive end-to-end delay and frame rates over three times higher are achieved
    corecore