20 research outputs found

    Optimization of image coding algorithms and architectures using genetic algorithms


    Scalable video compression with optimized visual performance and random accessibility

    This thesis is concerned with maximizing the coding efficiency, random accessibility and visual performance of scalable compressed video. The unifying theme behind this work is the use of finely embedded localized coding structures, which govern the extent to which these goals may be jointly achieved. The first part focuses on scalable volumetric image compression. We investigate 3D transform and coding techniques which exploit inter-slice statistical redundancies without compromising slice accessibility. Our study shows that the motion-compensated temporal discrete wavelet transform (MC-TDWT) practically achieves an upper bound to the compression efficiency of slice transforms. From a video coding perspective, we find that most of the coding gain is attributed to offsetting the learning penalty in adaptive arithmetic coding through 3D code-block extension, rather than inter-frame context modelling. The second aspect of this thesis examines random accessibility. Accessibility refers to the ease with which a region of interest is accessed (subband samples needed for reconstruction are retrieved) from a compressed video bitstream, subject to spatiotemporal code-block constraints. We investigate the fundamental implications of motion compensation for random access efficiency and the compression performance of scalable interactive video. We demonstrate that inclusion of motion compensation operators within the lifting steps of a temporal subband transform incurs a random access penalty which depends on the characteristics of the motion field. The final aspect of this thesis aims to minimize the perceptual impact of visible distortion in scalable reconstructed video. We present a visual optimization strategy based on distortion scaling which raises the distortion-length slope of perceptually significant samples. 
This alters the codestream embedding order during post-compression rate-distortion optimization, thus allowing visually sensitive sites to be encoded with higher fidelity at a given bit-rate. For visual sensitivity analysis, we propose a contrast perception model that incorporates an adaptive masking slope. This versatile feature provides a context which models perceptual significance. It enables scene structures that otherwise suffer significant degradation to be preserved at lower bit-rates. The novelty in our approach derives from a set of "perceptual mappings" which account for quantization noise shaping effects induced by motion-compensated temporal synthesis. The proposed technique reduces wavelet compression artefacts and improves the perceptual quality of video
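
The effect of distortion scaling on embedding order can be illustrated with a small sketch. This is not the thesis's implementation: the code-block names, distortion and length figures, and the perceptual weight are hypothetical, and real post-compression rate-distortion optimization also involves convex-hull pruning of truncation points, which is omitted here. The sketch only shows that scaling a block's distortion reduction raises its distortion-length slope and moves it earlier in the embedding order.

```python
def embedding_order(passes, weights):
    """Order coding passes by visually weighted distortion-length slope.

    passes:  list of (block_name, delta_distortion, delta_length) tuples
    weights: dict mapping block_name -> perceptual weight (hypothetical values);
             blocks without an entry keep a weight of 1.0
    """
    def slope(coding_pass):
        name, delta_d, delta_l = coding_pass
        return weights.get(name, 1.0) * delta_d / delta_l

    # Steeper slopes are embedded earlier in the codestream.
    return [p[0] for p in sorted(passes, key=slope, reverse=True)]


# Hypothetical code blocks with equal lengths and different distortion reductions.
passes = [("flat_sky", 120.0, 40), ("face", 90.0, 40), ("texture", 60.0, 40)]

unweighted = embedding_order(passes, {})
weighted = embedding_order(passes, {"face": 2.0})  # "face" marked perceptually significant

print(unweighted)
print(weighted)
```

With the weight applied, the hypothetical "face" block overtakes "flat_sky" in the ordering, so it would receive bits earlier at any given truncation of the codestream.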

    Video post processing architectures


    Selected topics in video coding and computer vision

    Video applications ranging from multimedia communication to computer vision have been extensively studied in the past decades. However, the emergence of new applications continues to raise questions that are only partially answered by existing techniques. This thesis studies three selected topics related to video: intra prediction in block-based video coding, pedestrian detection and tracking in infrared imagery, and multi-view video alignment. In the state-of-the-art video coding standard H.264/AVC, intra prediction is defined on a hierarchical quad-tree-based block partitioning structure, which fails to exploit the geometric constraint of edges. We propose a geometry-adaptive block partitioning structure and a new intra prediction algorithm named geometry-adaptive intra prediction (GAIP). A new texture prediction algorithm named geometry-adaptive intra displacement prediction (GAIDP) is also developed by extending the original intra displacement prediction (IDP) algorithm with the geometry-adaptive block partitions. Simulations on various test sequences demonstrate that the intra coding performance of H.264/AVC can be significantly improved by incorporating the proposed geometry-adaptive algorithms. In recent years, due to the decreasing cost of thermal sensors, pedestrian detection and tracking in infrared imagery has become a topic of interest for night vision and all-weather surveillance applications. We propose a novel approach for detecting and tracking pedestrians in infrared imagery based on a layered representation of infrared images. Pedestrians are detected from the foreground layer by a Principal Component Analysis (PCA) based scheme using the appearance cue. To facilitate the task of pedestrian tracking, we formulate the problem of shot segmentation and present a graph matching-based tracking algorithm. 
Simulations with both the OSU Infrared Image Database and the WVU Infrared Video Database are reported to demonstrate the accuracy and robustness of our algorithms. Multi-view video alignment is a process that facilitates the fusion of non-synchronized multi-view video sequences for various applications, including automatic video-based surveillance and video metrology. In this thesis, we propose an accurate multi-view video alignment algorithm that iteratively aligns two sequences in space and time. To achieve accurate sub-frame temporal alignment, we generalize the existing phase-correlation algorithm to the 3-D case. We also present a novel method to obtain the ground truth of the temporal alignment by using supplementary audio signals sampled at a much higher rate. The accuracy of our algorithm is verified by simulations using real-world sequences
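
The phase-correlation principle mentioned above can be sketched in one dimension. This is only an illustration under simplifying assumptions, not the thesis's 3-D sub-frame algorithm: the signals are short hypothetical sequences, the shift is circular and an integer number of samples, and a naive DFT is used so the example needs nothing beyond the standard library.

```python
import cmath


def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (or its inverse)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
        out.append(s / n if inverse else s)
    return out


def phase_correlation_shift(a, b):
    """Estimate the circular shift d such that b[t] == a[(t - d) % n]."""
    fa, fb = dft(a), dft(b)
    cross = []
    for x, y in zip(fa, fb):
        c = y * x.conjugate()
        mag = abs(c)
        # Keep only the phase; bins with negligible energy carry no information.
        cross.append(c / mag if mag > 1e-12 else 0)
    corr = [v.real for v in dft(cross, inverse=True)]
    # The correlation surface peaks at the shift.
    return max(range(len(corr)), key=corr.__getitem__)


# Hypothetical 1-D signal and a copy delayed by three samples.
a = [0, 1, 4, 2, 0, 0, 3, 1]
b = [a[(t - 3) % len(a)] for t in range(len(a))]

estimated = phase_correlation_shift(a, b)
print(estimated)
```

Sub-sample accuracy would require interpolating around the correlation peak, which this sketch does not attempt.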

    A Novel Multi-Symbol Curve Fit based CABAC Framework for Hybrid Video Codec's with Improved Coding Efficiency and Throughput

    Video compression is an essential component of present-day applications and a decisive factor in the success or failure of a business model. There is an ever-increasing demand to transmit a larger number of superior-quality video channels within the available transmission bandwidth. Consumers are increasingly discerning about the quality and performance of video-based products, and there is therefore a strong incentive for continuous improvement in video coding technology so that companies can gain a market edge over their competitors. Even though processor speeds and network bandwidths continue to increase, better video compression results in a more competitive product. This drive to improve video compression technology has led to a revolution in the last decade. In this thesis we address some of these data compression problems in a practical multimedia system that employs hybrid video coding schemes. Real-life video signals typically show non-stationary statistical behavior. The statistics of these signals largely depend on the video content and the acquisition process. Hybrid video coding schemes like H264/AVC exploit some of these non-stationary characteristics, but certainly not all of them. Moreover, higher-order statistical dependencies at the syntax element level are mostly neglected in existing video coding schemes. Designing a video coding scheme that takes these typically observed statistical properties into consideration, however, offers room for significant improvements in coding efficiency. In this thesis a new frequency-domain curve-fitting compression framework is proposed as an extension to the H264 Context Adaptive Binary Arithmetic Coder (CABAC) that achieves better compression efficiency at reduced complexity. The proposed curve-fitting extension to H264 CABAC, henceforth called CF-CABAC, is modularly designed to fit conveniently into existing block-based H264 hybrid video entropy coding algorithms. 
Many proposals in the literature fuse surface/curve fitting with block-based, region-based and training-based (VQ, fractals) compression algorithms, primarily to exploit pixel-domain redundancies. Although their compression efficiency is expected to be better than that of DCT-based compression, their main drawback is a high computational demand, which makes them non-competitive with DCT-based techniques for real-time applications. The curve fitting techniques proposed so far have operated in the pixel domain. Video characteristics in the pixel domain are highly non-stationary, making curve fitting techniques not very efficient in terms of video quality, compression ratio and complexity. In this thesis, we explore applying curve fitting techniques to quantized frequency-domain coefficients, and we fuse this powerful technique with H264 CABAC entropy coding. Based on some predictable characteristics of quantized DCT coefficients, a computationally inexpensive curve fitting technique is explored that fits into the existing H264 CABAC framework. In addition, due to the lossy nature of video compression and the strong demand for bandwidth and computation resources in a multimedia system, one of the key design issues for video coding is to optimize the trade-off among quality (distortion), compression (rate) and complexity. This thesis also briefly studies the existing rate-distortion (RD) optimization approaches proposed for video coding to explore the best RD performance of a video codec. Further, we propose a graph-based algorithm for rate-distortion optimization of quantized coefficient indices for the proposed CF-CABAC entropy coding

    Audio/Video Transmission over IEEE 802.11e Networks: Retry Limit Adaptation and Distortion Estimation

    This thesis focuses on audio and video transmission over wireless networks adopting the IEEE 802.11x family of standards. In particular, it addresses four issues: adaptive retransmission, the comparison of video quality indexes for retry limit adaptation purposes, the estimation of distortion, and the joint adaptation of the maximum number of retransmissions of voice and video flows

    Energy efficient hardware acceleration of multimedia processing tools

    The world of mobile devices is experiencing an ongoing trend of feature enhancement and general-purpose multimedia platform convergence. This trend poses many grand challenges, the most pressing being their limited battery life as a consequence of delivering computationally demanding features. The envisaged mobile application features can be considered to be accelerated by a set of underpinning hardware blocks. Based on the survey that this thesis presents on modern video compression standards and their associated enabling technologies, it is concluded that tight energy and throughput constraints can still be effectively tackled at the algorithmic level in order to design re-usable optimised hardware acceleration cores. To prove these conclusions, the work in this thesis is focused on two of the basic enabling technologies that support mobile video applications, namely the Shape Adaptive Discrete Cosine Transform (SA-DCT) and its inverse, the SA-IDCT. The hardware architectures presented in this work have been designed with energy efficiency in mind. This goal is achieved by employing high-level techniques such as redundant computation elimination, parallelism and low-switching computation structures. Both architectures compare favourably against the relevant prior art in the literature. The SA-DCT/IDCT technologies are instances of a more general computation: both are Constant Matrix Multiplication (CMM) operations. Thus, this thesis also proposes an algorithm for the efficient hardware design of any general CMM-based enabling technology. The proposed algorithm leverages the effective solution search capability of genetic programming. A bonus feature of the proposed modelling approach is that it is further amenable to hardware acceleration. Another bonus feature is an early exit mechanism that achieves large search space reductions. Results show an improvement on state-of-the-art algorithms, with future potential for even greater savings
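
The observation that the SA-DCT reduces to constant matrix multiplication can be sketched in software. This is a minimal illustration, not the hardware architecture described in the thesis: it assumes an orthonormal DCT-II basis, shows only the vertical pass of the transform, and uses a hypothetical 3x3 block and shape mask. Each column's opaque pixels are packed together and multiplied by a constant N x N matrix, where N is the number of opaque pixels in that column.

```python
import math


def dct_matrix(n):
    """Orthonormal N-point DCT-II basis as a constant N x N matrix (assumed scaling)."""
    return [
        [
            math.sqrt((1 if k == 0 else 2) / n)
            * math.cos(math.pi * (2 * t + 1) * k / (2 * n))
            for t in range(n)
        ]
        for k in range(n)
    ]


def cmm(matrix, vector):
    """Constant matrix multiplication: one dot product per output coefficient."""
    return [sum(c * v for c, v in zip(row, vector)) for row in matrix]


def sa_dct_columns(block, mask):
    """Vertical SA-DCT pass: pack each column's opaque pixels, then apply an
    N-point DCT where N is the number of opaque pixels in that column."""
    out = []
    for col in range(len(block[0])):
        pixels = [block[r][col] for r in range(len(block)) if mask[r][col]]
        out.append(cmm(dct_matrix(len(pixels)), pixels) if pixels else [])
    return out


# Hypothetical 3x3 block; the mask marks which pixels belong to the object.
block = [[5, 7, 0],
         [5, 0, 0],
         [0, 0, 0]]
mask = [[1, 1, 0],
        [1, 0, 0],
        [0, 0, 0]]

cols = sa_dct_columns(block, mask)
print(cols)
```

A full SA-DCT would follow this with a horizontal pass over the packed rows of coefficients; the matrix size again depends on the shape, which is why a general method for designing CMM hardware is relevant.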

    Proceedings of the Third International Mobile Satellite Conference (IMSC 1993)

    Satellite-based mobile communications systems provide voice and data communications to users over a vast geographic area. The users may communicate via mobile or hand-held terminals, which may also provide access to terrestrial cellular communications services. While the first and second International Mobile Satellite Conferences (IMSC) mostly concentrated on technical advances, this Third IMSC also focuses on the increasing worldwide commercial activities in Mobile Satellite Services. Because of the large service areas provided by such systems, it is important to consider political and regulatory issues in addition to technical and user requirements issues. Topics covered include: the direct broadcast of audio programming from satellites; spacecraft technology; regulatory and policy considerations; advanced system concepts and analysis; propagation; and user requirements and applications