546 research outputs found

    Focal-Plane Change Triggered Video Compression for Low-Power Vision Sensor Systems

    Get PDF
    Video sensors with embedded compression offer significant energy savings in transmission but incur energy costs in the complexity of the encoder. Energy-efficient video compression architectures for CMOS image sensors with focal-plane change detection are presented and analyzed. The compression architectures use pixel-level computational circuits to minimize energy usage by selectively processing only pixels that generate significant temporal intensity changes. Using the temporal intensity change detection to gate the operation of a differential DCT-based encoder achieves nearly identical image quality to that of traditional systems (a 4 dB decrease in PSNR) while reducing the amount of data processed by 67% and reducing overall power consumption by 51%. These typical energy savings, resulting from the sparsity of motion activity in the visual scene, demonstrate the utility of focal-plane change-triggered compression for surveillance vision systems.
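    As a rough illustration of the gating idea described above, the sketch below applies a differential DCT only to blocks whose mean temporal intensity change crosses a threshold. The block size, threshold value, and function names are illustrative assumptions, not the paper's circuit-level design; the point is that unchanged blocks are skipped entirely, which is where the energy saving comes from.

```python
# Minimal sketch of change-triggered differential DCT coding (assumptions:
# an 8x8 block grid, a fixed change threshold, SciPy's dctn).
import numpy as np
from scipy.fft import dctn

BLOCK = 8
CHANGE_THRESHOLD = 12.0  # assumed intensity-change threshold (illustrative)

def encode_changed_blocks(prev_frame: np.ndarray, cur_frame: np.ndarray):
    """Apply a DCT only to blocks with significant temporal intensity change.

    Returns a dict mapping block coordinates to DCT coefficients of the
    frame difference; unchanged blocks are never touched by the encoder.
    """
    h, w = cur_frame.shape
    coded = {}
    for y in range(0, h - h % BLOCK, BLOCK):
        for x in range(0, w - w % BLOCK, BLOCK):
            cur = cur_frame[y:y + BLOCK, x:x + BLOCK].astype(np.float64)
            ref = prev_frame[y:y + BLOCK, x:x + BLOCK].astype(np.float64)
            diff = cur - ref
            # Change detection gates the encoder on a per-block basis.
            if np.mean(np.abs(diff)) >= CHANGE_THRESHOLD:
                coded[(y, x)] = dctn(diff, norm='ortho')  # differential DCT
    return coded
```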

    Context-Based Trit-Plane Coding for Progressive Image Compression

    Full text link
    Trit-plane coding enables deep progressive image compression, but it cannot use autoregressive context models. In this paper, we propose the context-based trit-plane coding (CTC) algorithm to achieve progressive compression more compactly. First, we develop the context-based rate reduction module to estimate trit probabilities of latent elements accurately and thus encode the trit-planes compactly. Second, we develop the context-based distortion reduction module to refine partial latent tensors from the trit-planes and improve the reconstructed image quality. Third, we propose a retraining scheme for the decoder to attain better rate-distortion tradeoffs. Extensive experiments show that CTC outperforms the baseline trit-plane codec significantly in BD-rate on the Kodak lossless dataset, while increasing the time complexity only marginally. Our code is available at https://github.com/seungminjeon-github/CTC. Comment: Accepted to CVPR 2023.
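    To make the trit-plane representation concrete, here is a minimal sketch of base-3 plane decomposition and progressive reconstruction for non-negative integer latents. The plane count and names are assumed for illustration; CTC's context-based probability and distortion reduction modules are not modeled here.

```python
# Minimal sketch of trit-plane decomposition: each quantized latent is
# expanded in base 3 so planes can be sent progressively, most significant
# first (assumption: non-negative integer latents, fixed plane count).
import numpy as np

NUM_PLANES = 6  # assumed number of trit planes (illustrative)

def to_trit_planes(latent: np.ndarray) -> np.ndarray:
    """Decompose latents into NUM_PLANES base-3 digits, MSB plane first."""
    planes = np.empty((NUM_PLANES,) + latent.shape, dtype=np.int8)
    rest = latent.astype(np.int64)
    for p in range(NUM_PLANES - 1, -1, -1):  # extract least significant first
        planes[p] = rest % 3
        rest //= 3
    return planes  # planes[0] is the most significant trit plane

def from_trit_planes(planes: np.ndarray, received: int) -> np.ndarray:
    """Reconstruct from the first `received` planes; missing planes are
    treated as zero, giving a coarse progressive reconstruction."""
    value = np.zeros(planes.shape[1:], dtype=np.int64)
    for p in range(received):
        value = value * 3 + planes[p]
    # Rescale for the planes that were not received yet.
    return value * 3 ** (NUM_PLANES - received)
```

    Decoding with more planes monotonically refines the latents, which is what makes the bitstream progressive; CTC's contribution is coding each plane's trits more compactly using context models.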

    Algorithms for compression of high dynamic range images and video

    Get PDF
    The recent advances in sensor and display technologies have brought about High Dynamic Range (HDR) imaging capability. Modern multiple-exposure HDR sensors can achieve a dynamic range of 100-120 dB, and LED and OLED display devices have contrast ratios of 10^5:1 to 10^6:1. Despite these advances, image/video compression algorithms and the associated hardware are still based on Standard Dynamic Range (SDR) technology, i.e. they operate within an effective dynamic range of up to 70 dB for 8-bit gamma-corrected images. Further, the existing infrastructure for content distribution is also designed for SDR, which creates interoperability problems with true HDR capture and display equipment. Current solutions to this problem include tone mapping the HDR content to fit SDR. However, this approach leads to image quality problems when strong dynamic range compression is applied. Even though some HDR-only solutions have been proposed in the literature, they are not interoperable with current SDR infrastructure and are thus typically used in closed systems. Given the above observations, a research gap was identified: the need for efficient algorithms for the compression of still images and video that are capable of storing the full dynamic range and colour gamut of HDR images while remaining backward compatible with existing SDR infrastructure. To improve the usability of the SDR content, it is vital that any such algorithms accommodate different tone mapping operators, including those that are spatially non-uniform. In the course of the research presented in this thesis, a novel two-layer CODEC architecture is introduced for both HDR image and video coding. Further, a universal and computationally efficient approximation of the tone mapping operator is developed and presented. It is shown that the use of perceptually uniform colourspaces for the internal representation of pixel data enables improved compression efficiency. Furthermore, the proposed novel approaches to the compression of metadata for the tone mapping operator are shown to improve compression performance for low-bitrate video content. Multiple compression algorithms are designed, implemented and compared, and quality-complexity trade-offs are identified. Finally, practical aspects of implementing the developed algorithms are explored by automating the design space exploration flow and integrating the high-level systems design framework with domain-specific tools for synthesis and simulation of multiprocessor systems. Directions for further work are also presented.
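    A minimal sketch of the two-layer idea follows, assuming a simple global tone-mapping operator and a log-luminance stand-in for a perceptually uniform encoding; the thesis's actual TMO approximation and colourspaces are more elaborate, so everything below is illustrative.

```python
# Two-layer HDR coding sketch: a backward-compatible tone-mapped SDR base
# layer plus a residual enhancement layer in a roughly perceptual log domain.
import numpy as np

def tone_map(hdr: np.ndarray) -> np.ndarray:
    """Simple global TMO (Reinhard-style) producing an 8-bit SDR base layer."""
    sdr = hdr / (1.0 + hdr)
    return np.round(sdr * 255.0).astype(np.uint8)

def inverse_tone_map(sdr: np.ndarray) -> np.ndarray:
    """Decoder-side inverse of the TMO above, predicting HDR from the base."""
    s = sdr.astype(np.float64) / 255.0
    return s / np.maximum(1.0 - s, 1e-4)

def encode_two_layer(hdr: np.ndarray):
    base = tone_map(hdr)                 # SDR layer, decodable by legacy gear
    prediction = inverse_tone_map(base)  # what an HDR-aware decoder can infer
    # Residual is taken in a (roughly) perceptually uniform log domain.
    residual = np.log1p(hdr) - np.log1p(prediction)
    return base, residual                # each layer is then compressed

def decode_two_layer(base: np.ndarray, residual: np.ndarray) -> np.ndarray:
    return np.expm1(np.log1p(inverse_tone_map(base)) + residual)
```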

    Adaptive sensing and optimal power allocation for wireless video sensors with sigma-delta imager

    Get PDF
    We consider optimal power allocation for wireless video sensors (WVSs), including the image sensor subsystem in the system analysis. By assigning a power-rate-distortion (P-R-D) characteristic to the image sensor, we build a comprehensive P-R-D optimization framework for WVSs. For a WVS node operating under a power budget, we propose power allocation among the image sensor, compression, and transmission modules in order to minimize the distortion of the video reconstructed at the receiver. To demonstrate the proposed optimization method, we establish a P-R-D model for an image sensor based upon a pixel-level sigma-delta (ΣΔ) image sensor design, which allows investigation of the tradeoff between the bit depth of the captured images and the spatio-temporal characteristics of the video sequence under the power constraint. The optimization results obtained in this setting confirm that including the image sensor in the system optimization procedure can improve the overall video quality under a power constraint and prolong the lifetime of the WVSs. In particular, when the available power budget for a WVS node falls below a threshold, adaptive sensing becomes necessary to ensure that the node communicates useful information about the video content while meeting its power budget.
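    The shape of this allocation problem can be sketched as a small constrained optimization, shown below with toy convex distortion terms for the sensing, compression, and transmission modules; the actual models in the paper are derived from the sigma-delta sensor design, so the functional forms here are assumptions.

```python
# Toy P-R-D allocation sketch: split a fixed power budget across three
# modules to minimize total distortion (illustrative models, not the paper's).
import numpy as np
from scipy.optimize import minimize

P_BUDGET = 1.0  # total node power budget (normalized, illustrative)

def distortion(p):
    """Toy model: each module's distortion contribution falls off with the
    power it receives; p = [sensing, compression, transmission]."""
    p = np.maximum(p, 1e-6)
    return 1.0 / p[0] + np.exp(-4.0 * p[1]) + 0.5 / p[2]

result = minimize(
    distortion,
    x0=np.array([0.3, 0.3, 0.4]),
    bounds=[(1e-3, P_BUDGET)] * 3,
    constraints=[{"type": "ineq", "fun": lambda p: P_BUDGET - p.sum()}],
)
p_sense, p_comp, p_tx = result.x  # optimal split under the budget
```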

    Advanced Television and Signal Processing Program

    Get PDF
    Contains an introduction and reports on two research projects. Advanced Television Research Program.

    Evaluate the Performance of Video Transmission Using H.264 (SVC) Over Long Term Evolution (LTE)

    Get PDF
    In recent years, mobile Internet usage has increased dramatically with the development of 3G and 4G technologies. In particular, the usage of mobile broadband Internet on devices such as cellular phones, tablets and laptops has skyrocketed. Among multimedia applications, video streaming is the most popular mobile application. However, making these services available to users in a cost-effective way without compromising quality is a big challenge. The development of Long Term Evolution (LTE) technology in the mobile world has made this task achievable. The features of LTE technology provide effective services in multimedia applications with high data rates and low latency. The aim of this paper is to evaluate the quality of service (QoS) performance of H.264 (SVC) video transmission over LTE.
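    As a sketch of the kind of QoS bookkeeping such an evaluation involves, the snippet below computes packet loss, mean delay, and a simple jitter estimate from per-packet timestamp traces; the trace format and names are illustrative assumptions, not taken from the paper.

```python
# Minimal QoS summary over send/receive timestamp traces (loss, delay, jitter).
def qos_summary(sent: dict, received: dict) -> dict:
    """sent/received map packet id -> timestamp in seconds."""
    delays = [received[i] - sent[i] for i in sent if i in received]
    loss_rate = 1.0 - len(delays) / len(sent)
    mean_delay = sum(delays) / len(delays) if delays else float("nan")
    # Simple jitter variant: mean absolute difference of consecutive delays
    # (RFC 3550 uses a smoothed estimator; this is a plain average).
    jitter = (
        sum(abs(a - b) for a, b in zip(delays, delays[1:])) / (len(delays) - 1)
        if len(delays) > 1 else 0.0
    )
    return {"loss_rate": loss_rate, "mean_delay_s": mean_delay, "jitter_s": jitter}
```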

    NERV++: An Enhanced Implicit Neural Video Representation

    Full text link
    Neural fields, also known as implicit neural representations (INRs), have shown a remarkable capability for representing, generating, and manipulating various data types, allowing for continuous data reconstruction at a low memory footprint. Though promising, INRs applied to video compression still need to improve their rate-distortion performance by a large margin, and they require a huge number of parameters and long training iterations to capture high-frequency details, limiting their wider applicability. Resolving this problem remains a challenging task, and doing so would make INRs more accessible for compression tasks. We take a step towards resolving these shortcomings by introducing NeRV++, an enhanced implicit neural video representation: a more straightforward yet effective enhancement over the original NeRV decoder architecture, featuring separable conv2d residual blocks (SCRBs) that sandwich the upsampling block (UB), and a bilinear interpolation skip layer for improved feature representation. NeRV++ allows videos to be directly represented as a function approximated by a neural network and significantly enhances the representation capacity beyond current INR-based video codecs. We evaluate our method on the UVG, MCL-JVC, and Bunny datasets, achieving competitive results for video compression with INRs. This achievement narrows the gap to autoencoder-based video coding, marking a significant stride in INR-based video compression research.
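    A minimal PyTorch sketch of the decoder stage described above follows: two separable-conv residual blocks sandwiching an upsampling block, plus a bilinear interpolation skip. Layer sizes and the PixelShuffle-based upsampler are assumptions in the spirit of NeRV-style designs, not the paper's exact code.

```python
# Sketch of a NeRV++-style decoder stage: SCRB -> upsampling block -> SCRB,
# with a bilinear interpolation skip connection (illustrative dimensions).
import torch
import torch.nn as nn
import torch.nn.functional as F

class SeparableConvResBlock(nn.Module):  # "SCRB" in the abstract
    def __init__(self, channels: int):
        super().__init__()
        self.depthwise = nn.Conv2d(channels, channels, 3, padding=1,
                                   groups=channels)
        self.pointwise = nn.Conv2d(channels, channels, 1)

    def forward(self, x):
        return x + F.gelu(self.pointwise(self.depthwise(x)))

class NeRVPPStage(nn.Module):
    """Two SCRBs sandwiching an upsampling block (UB), plus a bilinear skip."""
    def __init__(self, in_ch: int, out_ch: int, scale: int = 2):
        super().__init__()
        self.pre = SeparableConvResBlock(in_ch)
        self.up = nn.Sequential(  # assumed PixelShuffle upsampling block
            nn.Conv2d(in_ch, out_ch * scale * scale, 3, padding=1),
            nn.PixelShuffle(scale),
        )
        self.post = SeparableConvResBlock(out_ch)
        self.skip = nn.Conv2d(in_ch, out_ch, 1)  # channel match for the skip
        self.scale = scale

    def forward(self, x):
        y = self.post(self.up(self.pre(x)))
        skip = self.skip(F.interpolate(x, scale_factor=self.scale,
                                       mode="bilinear", align_corners=False))
        return y + skip
```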