Morphological operators for very low bit rate video coding
This paper deals with the use of some morphological tools for video coding at very low bit rates. Rather than describing a complete coding algorithm, the purpose of this paper is to focus on morphological connected operators and segmentation tools that have proved to be attractive for compression.
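To make the flavor of these operators concrete, the following minimal sketch (using scikit-image, not any code from the paper) applies an opening by reconstruction, a classic connected operator that removes small bright components while leaving the contours of the surviving structures intact:

```python
import numpy as np
from skimage import data
from skimage.morphology import erosion, reconstruction

img = data.camera().astype(float)

# Opening by reconstruction: erode to remove bright details smaller
# than the footprint, then geodesically dilate the result under the
# original image. Surviving structures get their contours back exactly,
# the property that makes connected operators attractive for coding.
seed = erosion(img, np.ones((15, 15)))
simplified = reconstruction(seed, img, method="dilation")
```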
Mitigation of H.264 and H.265 Video Compression for Reliable PRNU Estimation
The photo-response non-uniformity (PRNU) is a distinctive image sensor characteristic, and an imaging device inadvertently introduces its sensor's PRNU into all media it captures. The PRNU can therefore be regarded as a camera fingerprint and used for source attribution. The imaging pipeline in a camera, however, involves various processing steps that are detrimental to PRNU estimation. In the context of photographic images, these challenges have been successfully addressed and the method for estimating a sensor's PRNU pattern is well established. However, various additional challenges related to the generation of videos remain largely untackled. With this perspective, this work introduces methods to mitigate the disruptive effects of the widely deployed H.264 and H.265 video compression standards on PRNU estimation. Our approach involves an intervention in the decoding process to eliminate a filtering procedure applied at the decoder to reduce blockiness. It also utilizes decoding parameters to develop a weighting scheme that adjusts the contribution of video frames to the PRNU estimation process at the macroblock level. Results obtained on videos captured by 28 cameras show that our approach increases the PRNU matching metric up to more than fivefold over the conventional estimation method tailored for photos.
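For orientation, the baseline estimator that such mitigation feeds into can be sketched as below; this is the generic residual-averaging PRNU estimate, with a Gaussian filter standing in for the wavelet denoiser usual in this literature, and the paper's macroblock weighting and deblocking intervention are not reproduced:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def noise_residual(frame):
    # Content-suppressing filter; the PRNU literature typically uses a
    # wavelet-domain denoiser, for which this Gaussian is a stand-in.
    return frame - gaussian_filter(frame, sigma=1.5)

def estimate_prnu(frames):
    # Generic maximum-likelihood-style estimate K = sum(W_i*I_i) / sum(I_i^2)
    # accumulated over luminance frames.
    num = np.zeros_like(frames[0], dtype=float)
    den = np.zeros_like(frames[0], dtype=float)
    for f in frames:
        f = f.astype(float)
        num += noise_residual(f) * f
        den += f * f
    return num / np.maximum(den, 1e-8)
```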
Implementation of BMA based motion estimation hardware accelerator in HDL
Motion estimation in MPEG (Motion Pictures Experts Group) video is a temporal prediction technique. The basic principle is that, in most cases, consecutive video frames are similar except for changes induced by objects moving within the frames. Motion estimation performs a comprehensive two-dimensional spatial search for each luminance macroblock (16x16 pixel block). MPEG does not define how this search should be performed; this is a detail the system designer can choose to implement in one of many possible ways. It is well known that a full, exhaustive search over a wide two-dimensional area yields the best matching results in most cases, but this performance comes at an extreme computational cost to the encoder. Some lower-cost encoders limit the pixel search range or use other techniques, usually at some cost to video quality, which gives rise to a trade-off. Such image processing algorithms are generally computationally expensive. FPGAs are capable of running graphics algorithms at speeds comparable to dedicated graphics chips, while remaining configurable through hardware description languages such as Verilog and VHDL. The work presented here focuses entirely on a hardware accelerator for motion estimation based on a block matching algorithm. The SAD-based full-search motion estimation, coded in Verilog HDL, relies on a 32x32 pixel search area to find the best match for a single 16x16 macroblock. Keywords: Motion Estimation, MPEG, macroblock, FPGA, SAD, Verilog, VHDL.
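As a software reference for what the accelerator computes, here is a Python sketch (an assumed re-expression of the task, not the Verilog design) of SAD-based full-search block matching for one 16x16 macroblock over a 32x32 search area:

```python
import numpy as np

def sad_full_search(ref, cur, mb_y, mb_x, mb=16, rng=8):
    """Full-search block matching: exhaustively test every displacement
    in [-rng, +rng] and return the motion vector with the minimum sum
    of absolute differences (SAD) for one mb x mb macroblock. With
    rng=8 the candidates span a 32x32 search area around the block."""
    block = cur[mb_y:mb_y + mb, mb_x:mb_x + mb].astype(int)
    best_mv, best_sad = (0, 0), np.inf
    for dy in range(-rng, rng + 1):
        for dx in range(-rng, rng + 1):
            y, x = mb_y + dy, mb_x + dx
            if y < 0 or x < 0 or y + mb > ref.shape[0] or x + mb > ref.shape[1]:
                continue  # candidate block falls outside the reference frame
            sad = np.abs(block - ref[y:y + mb, x:x + mb].astype(int)).sum()
            if sad < best_sad:
                best_sad, best_mv = sad, (dy, dx)
    return best_mv, best_sad
```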
Study and simulation of low rate video coding schemes
The semiannual report is included. Topics covered include communication, information science, data compression, remote sensing, color-mapped images, a robust coding scheme for packet video, recursively indexed differential pulse code modulation, an image compression technique for use on token ring networks, and joint source/channel coder design.
Entropy coding and post-processing for image and video coding.
Fong, Yiu Leung. Thesis (M.Phil.), Chinese University of Hong Kong, 2010. Includes bibliographical references (leaves 83-87). Abstracts in English and Chinese. Contents: 1. Introduction; 2. Background and Motivation (context-based arithmetic coding; video post-processing); 3. Context-Based Arithmetic Coding for JPEG (Huffman coding: concept and drawbacks; context-based arithmetic coding; proposed method: redundancy in quantized DCT coefficients via zig-zag scanning position and magnitudes of previously coded coefficients; proposed scheme: preparation of coding, coding of non-zero coefficient flags and EOB decisions, coding of 'LEVEL', separate coding of color planes; experimental results: evaluation method, methods under evaluation, average file size reduction, file size reduction on individual images, performance of individual techniques; discussions); 4. Video Post-processing for H.264 (proposed method; experimental results: deblocking on compressed frames, deblocking on residue of compressed frames, performance investigation, investigation experiments 1-3; discussions); 5. Conclusions; References.
Segmentation-based video coding system allowing the manipulation of objects
This paper presents a generic video coding algorithm allowing the content-based manipulation of objects. This manipulation is possible thanks to the definition of a spatiotemporal segmentation of the sequences. The coding strategy relies on a joint optimization in the rate-distortion sense of the partition definition and of the coding techniques to be used within each region. This optimization creates the link between the analysis and synthesis parts of the coder. The analysis defines the time evolution of the partition, as well as the elimination or the appearance of regions that are homogeneous either spatially or in motion. The coding of the texture as well as of the partition relies on region-based motion compensation techniques. The algorithm offers a good compromise between the ability to track and manipulate objects and the coding efficiency.
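The joint rate-distortion optimization can be summarized by the standard Lagrangian selection rule sketched below; the `encode` interface is hypothetical and stands in for the paper's per-region coding techniques:

```python
# Lagrangian mode decision: for each region, pick the coding technique
# minimizing J = D + lambda * R. The (name, encode) interface below is
# hypothetical; encode(region) is assumed to return (distortion, bits).
def pick_technique(region, techniques, lam):
    best_name, best_j = None, float("inf")
    for name, encode in techniques:
        distortion, bits = encode(region)
        j = distortion + lam * bits  # rate-distortion cost of this choice
        if j < best_j:
            best_j, best_name = j, name
    return best_name
```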
A DWT based perceptual video coding framework: concepts, issues and techniques
The work in this thesis explores DWT-based video coding through the introduction of a novel DWT (Discrete Wavelet Transform) / MC (Motion Compensation) / DPCM (Differential Pulse Code Modulation) video coding framework, which adopts EBCOT as the coding engine for both the intra- and the inter-frame coder. An adaptive switching mechanism between frame and field coding modes is investigated for this framework. The Low-Band-Shift (LBS) is employed for MC in the DWT domain; LBS-based MC is shown to provide consistent improvement in the Peak Signal-to-Noise Ratio (PSNR) of the coded video over simple Wavelet Tree (WT) based MC. Adaptive Arithmetic Coding (AAC) is adopted to code the motion information. The context set of the Adaptive Binary Arithmetic Coding (ABAC) for the inter-frame data is redesigned based on statistical analysis. To further improve perceived picture quality, a Perceptual Distortion Measure (PDM) based on a human vision model is used in the EBCOT of the intra-frame coder. A visibility assessment of the quantization error of the various subbands in the DWT domain is performed through subjective tests. In summary, these findings address the issues arising from the proposed perceptual video coding framework: a working DWT/MC/DPCM video coding framework with superior coding efficiency on sequences with translational or head-and-shoulder motion; an adaptive switching mechanism between frame and field coding modes; an effective LBS-based MC scheme in the DWT domain; a methodology for context design for entropy coding of inter-frame data; a PDM that replaces the MSE inside the EBCOT coding engine of the intra-frame coder, improving the perceived quality of intra-frames; and a visibility assessment of the quantization errors in the DWT domain.
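The motivation for LBS-based MC is the shift variance of the critically sampled DWT, which the following sketch (using PyWavelets; an illustration, not the thesis implementation) demonstrates along with the LBS remedy of storing all integer-shift phases of the reference:

```python
import numpy as np
import pywt  # PyWavelets

rng = np.random.default_rng(0)
frame = rng.standard_normal((64, 64))

# The critically sampled DWT is shift-variant: a one-pixel shift of the
# input lands on a different polyphase component, so the low band of the
# shifted frame is not simply a shifted copy of the original low band.
cA, _ = pywt.dwt2(frame, "haar")
cA_shift, _ = pywt.dwt2(np.roll(frame, 1, axis=1), "haar")
print(np.abs(cA - cA_shift).mean())  # noticeably non-zero

# Low-Band-Shift remedy: keep the transform of every integer shift of
# the reference (two horizontal phases per level for a 1-level DWT),
# so any displacement maps onto one of the stored phases.
lbs_bank = {dx: pywt.dwt2(np.roll(frame, dx, axis=1), "haar")[0]
            for dx in range(2)}
```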
MAP Joint Source-Channel Arithmetic Decoding for Compressed Video
In order to achieve robust video transmission over error-prone telecommunication channels, several mechanisms have been introduced. These mechanisms try to detect, correct, or conceal the errors in the received video stream.
In this thesis, the performance of the video codec is improved in terms of error rates without increasing overhead in terms of data bit rate. This is done by exploiting the residual syntactic/semantic redundancy inside compressed video, optimizing the configuration of the state-of-the-art entropy coding, i.e., binary arithmetic coding, and optimizing the quantization of the channel output. The thesis is divided into four phases.
In the first phase, a breadth-first suboptimal sequential maximum a posteriori (MAP) decoder is employed for joint source-channel arithmetic decoding of H.264 symbols. The proposed decoder uses not only the intentional redundancy inserted via a forbidden symbol (FS) but also exploits residual redundancy through a syntax checker. In contrast to previous methods, this is done as each channel bit is decoded. Simulations using intra prediction modes show improvements in error rates; e.g., the syntax element error rate is reduced by an order of magnitude at a channel SNR of 7.33 dB. The cost of this improvement is additional computational complexity spent on syntax checking.
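To illustrate the forbidden-symbol idea, here is a toy float-precision binary arithmetic coder (a didactic sketch, not the H.264 entropy coder used in the thesis) that reserves a fraction eps of every coding interval, so a decoder that lands in the reserved region has detected a channel error:

```python
# Toy float-precision binary arithmetic coder with a forbidden symbol:
# a fraction eps of every interval is reserved and never produced by
# the encoder, so a decoder landing there has detected a channel error.
# (Real coders such as CABAC use integer renormalization instead.)
def encode(bits, p0=0.5, eps=0.05):
    lo, hi = 0.0, 1.0
    for b in bits:
        span = (hi - lo) * (1.0 - eps)      # usable part of the interval
        split = lo + span * p0
        lo, hi = (lo, split) if b == 0 else (split, lo + span)
    return (lo + hi) / 2                    # any point of the final interval

def decode(x, n, p0=0.5, eps=0.05):
    lo, hi = 0.0, 1.0
    out = []
    for _ in range(n):
        span = (hi - lo) * (1.0 - eps)
        split = lo + span * p0
        if x >= lo + span:                  # fell in the forbidden region
            raise ValueError("channel error detected")
        if x < split:
            out.append(0)
            hi = split
        else:
            out.append(1)
            lo, hi = split, lo + span
    return out

# decode(encode([1, 0, 1, 1]), 4) recovers [1, 0, 1, 1]
```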
In the second phase, the configuration of the FS in the symbol set is studied. The delay probability function, i.e., the probability distribution of the number of bits required to detect an error, is calculated for various FS configurations. The probability of missed error detection is calculated as a figure of merit for optimizing the FS configuration. The simulation results show the effectiveness of the proposed figure of merit and support the FS configuration in which the FS lies entirely between the other information-carrying symbols as the best.
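Under the common simplifying assumption that each bit decoded after an error independently falls into the forbidden region with probability eps, the delay distribution and the missed-detection figure of merit take a simple geometric form (an approximation for illustration, not the thesis' exact calculation):

```python
# Illustrative approximation: assume each bit decoded after an error
# independently hits the forbidden region with probability eps, so the
# detection delay is geometric.
eps = 0.05
p_detect_at = [eps * (1 - eps) ** (n - 1) for n in range(1, 11)]  # P(detect at bit n)
missed = lambda n_bits: (1 - eps) ** n_bits  # figure of merit: P(miss) after n bits
print(missed(64))                            # ~0.037: most errors flagged within 64 bits
```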
In the third phase, a new method for estimating the a priori probability of particular syntax elements is proposed. This estimation is based on the interdependency among previously decoded syntax elements, and each estimate is categorized as either reliable or unreliable. The decoder uses this prior information when it is reliable; otherwise, the MAP decoder considers the syntax elements equiprobable and in turn uses maximum likelihood (ML) decoding. The reliability detection is carried out using a threshold on the local entropy of syntax elements in the neighboring macroblocks.
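A minimal sketch of such a reliability test, with hypothetical names and a hypothetical threshold convention, could look like this: the empirical entropy of the syntax element over neighboring macroblocks is compared against a fraction of its maximum possible entropy:

```python
import numpy as np

def prior_is_reliable(neighbor_syms, alphabet_size, frac=0.8):
    # Empirical entropy of the syntax element over the neighboring
    # macroblocks; low entropy means the neighbors agree, so the
    # derived prior is trusted (names and threshold are hypothetical).
    counts = np.bincount(np.asarray(neighbor_syms), minlength=alphabet_size)
    p = counts / counts.sum()
    h = -np.sum(p[p > 0] * np.log2(p[p > 0]))
    return h < frac * np.log2(alphabet_size)
```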
In the last phase, a new measure to assess the performance of the channel quantizer is proposed. This measure is based on the statistics of the rank of the true candidate among the sorted list of candidates in the MAP decoder. Simulation results show that a quantizer designed with the proposed measure is superior to quantizers designed based on maximum mutual information and minimum mean square error.
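The rank-based measure can be sketched as follows (hypothetical interface; a lower mean rank of the true candidate indicates a better channel quantizer):

```python
import numpy as np

def mean_true_rank(score_lists, true_ids):
    # For each decoding instance, rank of the true candidate (1 = top)
    # in the list sorted by posterior score; a quantizer keeping the
    # true path near the top scores better. Interface is hypothetical.
    ranks = []
    for scores, true_id in zip(score_lists, true_ids):
        order = np.argsort(scores)[::-1]  # indices by descending score
        ranks.append(int(np.where(order == true_id)[0][0]) + 1)
    return float(np.mean(ranks))
```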