
    Adaptive Content Frame Skipping for Wyner–Ziv-Based Light Field Image Compression

    Light field (LF) imaging introduces attractive possibilities for digital imaging, such as digital focusing, post-capture changing of the focal plane or viewpoint, and scene depth estimation, by capturing both the spatial and angular information of incident light rays. However, LF image compression remains a great challenge, not only because light field imagery requires a large amount of storage space and transmission bandwidth, but also because of the complexity requirements of various applications. In this paper, we propose a novel LF adaptive content frame skipping compression solution following a Wyner–Ziv (WZ) coding approach. In the proposed coding approach, the LF image is first converted into a four-dimensional LF (4D-LF) data format. To achieve good compression performance, we select an efficient scanning mechanism to generate a 4D-LF pseudo-sequence by analyzing the content of the LF image under different scanning methods. In addition, to further exploit the high inter-frame correlation of the 4D-LF pseudo-sequence, we introduce an adaptive frame skipping algorithm, based on decision tree techniques, that exploits LF characteristics, e.g., the depth of field and angular information. The experimental results show that the proposed WZ-LF coding solution achieves outstanding rate-distortion (RD) performance at lower computational complexity. Notably, a bit rate saving of 53% is achieved compared to the standard high-efficiency video coding (HEVC) Intra codec.
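    Below is a minimal Python sketch of the two steps the abstract names: ordering the sub-aperture views of a 4D light field into a pseudo-sequence (a serpentine scan is used here as one candidate scanning method) and flagging skippable frames. The mean-absolute-difference threshold stands in for the paper's decision-tree skip classifier; the array layout lf[u, v, y, x] and the threshold value are assumptions.

    import numpy as np

    def serpentine_scan(lf):
        """Order the (u, v) sub-aperture views of a 4D light field
        lf[u, v, y, x] into a pseudo-sequence along a serpentine path."""
        U, V = lf.shape[:2]
        frames = []
        for u in range(U):
            cols = range(V) if u % 2 == 0 else range(V - 1, -1, -1)
            for v in cols:
                frames.append(lf[u, v])
        return frames

    def skip_decision(frames, thresh=2.0):
        """Flag frames whose mean absolute difference to the last coded
        frame falls below `thresh` (a hypothetical skip criterion)."""
        coded, flags = frames[0], [False]
        for f in frames[1:]:
            if np.mean(np.abs(f.astype(np.float64) - coded)) < thresh:
                flags.append(True)    # skipped: decoder interpolates it
            else:
                coded = f             # coded as a key/WZ frame
                flags.append(False)
        return flags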

    Mixed-Resolution HEVC-based multiview video codec for low bitrate transmission


    Depth sequence coding with hierarchical partitioning and spatial-domain quantization

    Depth coding in 3D-HEVC deforms object shapes due to block-level edge approximation, and it lacks efficient techniques to exploit the statistical redundancy arising from the frame-level clustering tendency in depth data for higher coding gain at near-lossless quality. This paper presents a standalone mono-view depth sequence coder, which preserves edges implicitly by limiting quantization to the spatial domain and exploits the frame-level clustering tendency efficiently with a novel binary tree-based decomposition (BTBD) technique. BTBD can exploit the statistical redundancy in frame-level syntax, motion components, and residuals efficiently, with fewer block-level prediction/coding modes and simpler context modeling for context-adaptive arithmetic coding. Compared with the depth coder in 3D-HEVC, the proposed coder achieves a significantly lower bitrate in the lossless to near-lossless quality range for mono-view coding, and renders superior-quality synthetic views from depth maps compressed at the same bitrate together with the corresponding texture frames.
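    As a rough illustration of how a binary tree-based decomposition can exploit the clustering tendency of depth maps, the Python sketch below recursively bisects the value range of a frame until each leaf cluster is nearly flat. The stopping tolerance and the leaf representation (mask plus representative value) are assumptions, not the coder's actual syntax.

    import numpy as np

    def btbd(depth, tol=1):
        """Return a list of (mask, value) leaves covering the frame."""
        leaves = []
        def split(mask):
            vals = depth[mask]
            lo, hi = int(vals.min()), int(vals.max())
            if hi - lo <= tol:                 # near-lossless leaf
                leaves.append((mask, (lo + hi) // 2))
                return
            mid = (lo + hi) // 2
            split(mask & (depth <= mid))       # binary split of the range
            split(mask & (depth > mid))
        split(np.ones(depth.shape, dtype=bool))
        return leaves

    def reconstruct(shape, leaves):
        """Paint each leaf's representative value back into a frame."""
        out = np.zeros(shape, dtype=np.int32)
        for mask, v in leaves:
            out[mask] = v
        return out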

    A New Algorithm for the Compression of Seismic Data with High Amplitude Resolution

    Renewable sources cannot yet meet the energy demand of a growing global market; it is therefore expected that oil & gas will remain a substantial source of energy in the coming years. To find new oil & gas deposits that would satisfy growing global energy demands, significant effort is continually invested in increasing the efficiency of seismic surveys. It is commonly considered that, in the initial phase of exploration and production of new fields, high-resolution and high-quality images of the subsurface are of great importance. As one part of the seismic data processing chain, efficient management and delivery of the large data sets produced by the industry during seismic surveys become extremely important in order to facilitate further seismic data processing and interpretation. In this respect, efficiency relies to a large extent on the compression scheme, which is often required to enable faster transfer of and access to data, as well as efficient data storage. Motivated by the superior performance of High Efficiency Video Coding (HEVC), and driven by the rapid growth in data volume produced by seismic surveys, this work explores a 32 bits per pixel (b/p) extension of the HEVC codec for compression of seismic data. It is proposed to reassemble seismic slices in a format that corresponds to a video signal and benefit from the coding gain achieved by the HEVC inter mode, in addition to the possible advantages of the (still-image) HEVC intra mode. To this end, this work modifies almost all components of the original HEVC codec to cater for high bit-depth coding of seismic data: the Lagrange multiplier used in the optimization of the coding parameters has been adapted to the new data statistics, the core transform and quantization have been reimplemented to handle the increased bit-depth range, and a modified adaptive binary arithmetic coder has been employed for efficient entropy coding. In addition, optimized block selection, reduced intra prediction modes, and flexible motion estimation are tested to adapt to the structure of seismic data. Even though the new codec, after implementation of the proposed modifications, goes beyond the standardized HEVC, it still maintains a generic HEVC structure and is developed within the general HEVC framework. There is no similar work in the field of seismic data compression that uses HEVC as the base codec. Thus, a specific codec design has been tailored which, when compared to JPEG-XR and a commercial wavelet-based codec, significantly improves the peak signal-to-noise ratio (PSNR) vs. compression ratio performance for 32 b/p seismic data. Depending on the proposed configuration, the PSNR gain ranges from 3.39 dB up to 9.48 dB. Also, relying on the specific characteristics of seismic data, an optimized encoder is proposed in this work. It reduces encoding time by 67.17% for the All-I configuration on the trace image dataset, and by 67.39% for the All-I, 97.96% for the P2, and 98.64% for the B configuration on the 3D wavefield dataset, with negligible coding performance losses. As a side contribution of this work, HEVC is analyzed across all of its functional units, so that the presented work itself can serve as a specific overview of the methods incorporated into the standard.
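    For reference, the two figures of merit quoted above can be computed as in the short Python sketch below; treating the samples as unsigned 32-bit integers with peak value 2^32 - 1 is an assumption about the data format.

    import numpy as np

    def psnr_32bpp(orig, recon):
        """PSNR for 32 b/p data, assuming an unsigned 32-bit peak."""
        mse = np.mean((orig.astype(np.float64) - recon.astype(np.float64)) ** 2)
        peak = 2.0 ** 32 - 1.0
        return float('inf') if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

    def compression_ratio(raw_bytes, coded_bytes):
        """Ratio of uncompressed to compressed size."""
        return raw_bytes / coded_bytes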

    Adapting Computer Vision Models To Limitations On Input Dimensionality And Model Complexity

    When considering instances of distributed systems where visual sensors communicate with remote predictive models, data traffic is limited to the capacity of communication channels, and hardware limits the processing of collected data prior to transmission. We study novel methods of adapting visual inference to limitations on complexity and data availability at test time, wherever the aforementioned limitations exist. Our contributions detailed in this thesis consider both task-specific and task-generic approaches to reducing the data requirement for inference, and evaluate our proposed methods on a wide range of computer vision tasks. This thesis makes four distinct contributions: (i) We investigate multi-class action classification via two-stream convolutional neural networks that directly ingest information extracted from compressed video bitstreams. We show that selective access to macroblock motion vector information provides a good low-dimensional approximation of the underlying optical flow in visual sequences. (ii) We devise a bitstream cropping method by which AVC/H.264 and H.265 bitstreams are reduced to the minimum amount of elements necessary for optical flow extraction, while maintaining compliance with codec standards. We additionally study the effect of codec rate-quality control on the sparsity and noise incurred in optical flow derived from the resulting bitstreams, and do so for multiple coding standards. (iii) We demonstrate degrees of variability in the amount of data required for action classification, and leverage this to reduce the dimensionality of input volumes by inferring the required temporal extent for accurate classification prior to processing via learnable machines. (iv) We extend the mixtures-of-experts (MoE) paradigm to adapt the data cost of inference for any set of constituent experts. We postulate that the minimum acceptable data cost of inference varies for different input space partitions, and consider mixtures where each expert is designed to meet a different set of constraints on input dimensionality. To take advantage of the flexibility of such mixtures in processing different input representations and modalities, we train biased gating functions such that experts requiring less information to make their inferences are favoured over others. We finally note that our proposed data utility optimization solutions include a learnable component which considers specified priorities on the amount of information to be used prior to inference, and can be realized for any combination of tasks, modalities, and constraints on available data.
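    A hedged Python sketch of contribution (iv) follows: a gating function whose logits are biased against expert data cost, so that cheaper experts are favoured when predictions are otherwise comparable. The fixed bias weight alpha is a hand-set stand-in; the thesis trains the biased gate rather than fixing it by hand.

    import numpy as np

    def biased_gate(gate_logits, data_costs, alpha=0.5):
        """gate_logits: (n_experts,); data_costs: (n_experts,) in [0, 1].
        Returns expert weights with cheaper experts favoured."""
        biased = gate_logits - alpha * np.asarray(data_costs)
        z = np.exp(biased - biased.max())   # numerically stable softmax
        return z / z.sum()

    # Example: two experts with equal logits; the cheaper one wins.
    w = biased_gate(np.array([1.0, 1.0]), data_costs=[0.2, 0.9])
    assert w[0] > w[1]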

    Secure and Efficient Video Transmission in VANET

    Currently, vehicular communications have become a reality used by various applications, especially applications that broadcast video in real time. However, the received video quality is penalized by the poor characteristics of the transmission channel (availability, non-stationarity, signal-to-noise ratio, etc.). To improve and ensure a minimum video quality at reception, we propose in this work a mechanism entitled "Secure and Efficient Transmission of Videos in VANET (SETV)". It is based on Quality of Experience (QoE) and uses hierarchical packet management, which prioritizes packets according to the importance of the images in the video stream. To this end, transmission error correction with unequal error protection has proven effective in delivering high-quality video with low network overhead. This is done based on the specific details of the video encoding and the actual network conditions, such as the signal-to-noise ratio, network density, vehicle position, and current packet loss rate (PLR), as well as the prediction of the future DPP. Machine learning models were developed in our work to estimate perceived audio-visual quality. The protocol first gathers information about its neighbouring vehicles to perform distributed jump reinforcement learning. The simulation results obtained for several types of realistic vehicular scenarios show that our proposed mechanism offers significant improvements in terms of video quality at reception and end-to-end delay compared to conventional schemes. The results show that the proposed mechanism achieves an 11% to 18% improvement in video quality and a 9% load gain compared to ShieldHEVC.
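    The hierarchical packet management described above can be illustrated with a small Python sketch that assigns more forward-error-correction overhead to more important frames (I > P > B) and scales it with the measured PLR; the overhead table and scaling rule are illustrative assumptions, not the SETV mechanism itself.

    import math

    def fec_overhead(frame_type, plr):
        """Fraction of repair packets to add for one frame; the base
        table and the loss-rate scaling are hypothetical values."""
        base = {'I': 0.30, 'P': 0.15, 'B': 0.05}[frame_type]
        return min(1.0, base * (1.0 + 10.0 * plr))

    def repair_packets(n_data_packets, frame_type, plr):
        """Number of repair packets to generate for the frame."""
        return math.ceil(n_data_packets * fec_overhead(frame_type, plr))

    # Example: at 5% PLR, an I-frame of 40 packets gets far more
    # protection than a B-frame of the same size.
    print(repair_packets(40, 'I', 0.05))  # -> 18
    print(repair_packets(40, 'B', 0.05))  # -> 3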