11 research outputs found
Image and Video Coding/Transcoding: A Rate Distortion Approach
Due to the lossy nature of image/video compression and the expensive bandwidth and computation resources in a multimedia system, one of the key design issues for image and video coding/transcoding is to optimize trade-off among distortion, rate, and/or complexity. This thesis studies the application of rate distortion (RD) optimization approaches to image and video coding/transcoding for exploring the best RD performance of a video codec compatible to the newest video coding standard H.264 and for designing computationally efficient down-sampling algorithms with high visual fidelity in the discrete Cosine transform (DCT) domain.
RD optimization for video coding in this thesis considers two objectives, i.e., to achieve the best encoding efficiency in terms of minimizing the actual RD cost and to maintain decoding compatibility with the newest video coding standard H.264. By the actual RD cost, we mean a cost based on the final reconstruction error and the entire coding rate. Specifically, an operational RD method is proposed based on a soft decision quantization (SDQ) mechanism, which has its root in a fundamental RD theoretic study on fixed-slope lossy data compression. Using SDQ instead of hard decision quantization, we establish a general framework in which motion prediction, quantization, and entropy coding in a hybrid video coding scheme such as H.264 are jointly designed to minimize the actual RD cost on a frame basis. The proposed framework is applicable to optimize any hybrid video coding scheme, provided that specific algorithms are designed corresponding to coding syntaxes of a given standard codec, so as to maintain compatibility with the standard.
Corresponding to the baseline profile syntaxes and the main profile syntaxes of H.264, respectively, we have proposed three RD algorithms---a graph-based algorithm for SDQ given motion prediction and quantization step sizes, an algorithm for residual coding optimization given motion prediction, and an iterative overall algorithm for jointly optimizing motion prediction, quantization, and entropy coding---with them embedded in the indicated order. Among the three algorithms, the SDQ design is the core, which is developed based on a given entropy coding method. Specifically, two SDQ algorithms have been developed based on the context adaptive variable length coding (CAVLC) in H.264 baseline profile and the context adaptive binary arithmetic coding (CABAC) in H.264 main profile, respectively.
Experimental results for the H.264 baseline codec optimization show that for a set of typical testing sequences, the proposed RD method for H.264 baseline coding achieves a better trade-off between rate and distortion, i.e., 12\% rate reduction on average at the same distortion (ranging from 30dB to 38dB by PSNR) when compared with the RD optimization method implemented in H.264 baseline reference codec. Experimental results for optimizing H.264 main profile coding with CABAC show 10\% rate reduction over a main profile reference codec using CABAC, which also suggests 20\% rate reduction over the RD optimization method implemented in H.264 baseline reference codec, leading to our claim of having developed the best codec in terms of RD performance, while maintaining the compatibility with H.264.
By investigating trade-off between distortion and complexity, we have also proposed a designing framework for image/video transcoding with spatial resolution reduction, i.e., to down-sample compressed images/video with an arbitrary ratio in the DCT domain. First, we derive a set of DCT-domain down-sampling methods, which can be represented by a linear transform with double-sided matrix multiplication (LTDS) in the DCT domain. Then, for a pre-selected pixel-domain down-sampling method, we formulate an optimization problem for finding an LTDS to approximate the given pixel-domain method to achieve the best trade-off between visual quality and computational complexity. The problem is then solved by modeling an LTDS with a multi-layer perceptron network and using a structural learning with forgetting algorithm for training the network. Finally, by selecting a pixel-domain reference method with the popular Butterworth lowpass filtering and cubic B-spline interpolation, the proposed framework discovers an LTDS with better visual quality and lower computational complexity when compared with state-of-the-art methods in the literature
Wavelets and Subband Coding
First published in 1995, Wavelets and Subband Coding offered a unified view of the exciting field of wavelets and their discrete-time cousins, filter banks, or subband coding. The book developed the theory in both continuous and discrete time, and presented important applications. During the past decade, it filled a useful need in explaining a new view of signal processing based on flexible time-frequency analysis and its applications. Since 2007, the authors now retain the copyright and allow open access to the book
CHORUS Deliverable 2.2: Second report - identification of multi-disciplinary key issues for gap analysis toward EU multimedia search engines roadmap
After addressing the state-of-the-art during the first year of Chorus and establishing the existing landscape in
multimedia search engines, we have identified and analyzed gaps within European research effort during our second year.
In this period we focused on three directions, notably technological issues, user-centred issues and use-cases and socio-
economic and legal aspects. These were assessed by two central studies: firstly, a concerted vision of functional breakdown
of generic multimedia search engine, and secondly, a representative use-cases descriptions with the related discussion on
requirement for technological challenges. Both studies have been carried out in cooperation and consultation with the
community at large through EC concertation meetings (multimedia search engines cluster), several meetings with our
Think-Tank, presentations in international conferences, and surveys addressed to EU projects coordinators as well as
National initiatives coordinators. Based on the obtained feedback we identified two types of gaps, namely core
technological gaps that involve research challenges, and “enablers”, which are not necessarily technical research
challenges, but have impact on innovation progress. New socio-economic trends are presented as well as emerging legal
challenges
Recommended from our members
Demodulator techniques in satellite communication systems for direct broadcast systems
This thesis is concerned with the FM demodulator techniques used in terrestrial TV receiver designs for Direct Broadcast Systems (DBS) from satellites. The various MAC/Packet schemes intended for DBS applications are described and the international standards that apply to them considered, with particular emphasis on the D2-MAC system. Noise in FM systems is discussed and a suitable threshold noise model is chosen for use in DBS TV demodulator systems. The characteristics of the various types of noise effects are considered in terms of their effect upon the TV picture. The threshold performance of a conventional FM demodulator for differing types of modulation is reviewed and it is shown how the threshold characteristic depends upon the nature of the modulation. The literature review carried out represents a significant component of the thesis and combines material from patent literature with more conventional source materials from professional journals, conferences, textbooks, etc.
Some ten existing demodulator concepts that exhibit threshold extension characteristics are examined, and where relevant their potential performance in D2-MAC format systems is assessed. The demodulator characteristics that limit their performance in TV systems are identified. It is concluded that designing a threshold extension demodulator, with reliable operation, for all picture contents and for a wide range of input carrier-to-noise ratios, is a formidable task using existing design techniques. On the basis of this examination an adaptive threshold extension demodulator concept is proposed, that utilises information contained within the signal structure to achieve an improved performance over a wide range of input carrier-to-noise ratios and picture content. It is shown how the relevant signal structures may be derived from conventional (PAL, SECAM and NTSC), MAC format and all-digital television systems. Illustrations are given that show how the adaptive demodulator concept can be applied to certain existing threshold extension demodulators, enhancing their performance for television picture reception. Future trends in all-digital DBS TV systems intended ultimately for DBS applications are briefly discussed together with their demodlilation requirements
Recent Advances in Signal Processing
The signal processing task is a very critical issue in the majority of new technological inventions and challenges in a variety of applications in both science and engineering fields. Classical signal processing techniques have largely worked with mathematical models that are linear, local, stationary, and Gaussian. They have always favored closed-form tractability over real-world accuracy. These constraints were imposed by the lack of powerful computing tools. During the last few decades, signal processing theories, developments, and applications have matured rapidly and now include tools from many areas of mathematics, computer science, physics, and engineering. This book is targeted primarily toward both students and researchers who want to be exposed to a wide variety of signal processing techniques and algorithms. It includes 27 chapters that can be categorized into five different areas depending on the application at hand. These five categories are ordered to address image processing, speech processing, communication systems, time-series analysis, and educational packages respectively. The book has the advantage of providing a collection of applications that are completely independent and self-contained; thus, the interested reader can choose any chapter and skip to another without losing continuity
Speech coding at medium bit rates using analysis by synthesis techniques
Speech coding at medium bit rates using analysis by synthesis technique
Social work with airports passengers
Social work at the airport is in to offer to passengers social services. The main
methodological position is that people are under stress, which characterized by a
particular set of characteristics in appearance and behavior. In such circumstances
passenger attracts in his actions some attention. Only person whom he trusts can help him
with the documents or psychologically
Proceedings of the Fifth International Mobile Satellite Conference 1997
Satellite-based mobile communications systems provide voice and data communications to users over a vast geographic area. The users may communicate via mobile or hand-held terminals, which may also provide access to terrestrial communications services. While previous International Mobile Satellite Conferences have concentrated on technical advances and the increasing worldwide commercial activities, this conference focuses on the next generation of mobile satellite services. The approximately 80 papers included here cover sessions in the following areas: networking and protocols; code division multiple access technologies; demand, economics and technology issues; current and planned systems; propagation; terminal technology; modulation and coding advances; spacecraft technology; advanced systems; and applications and experiments