
    Graph Spectral Image Processing

    The recent advent of graph signal processing (GSP) has spurred intensive study of signals that live naturally on irregular data kernels described by graphs (e.g., social networks, wireless sensor networks). Although a digital image contains pixels that reside on a regularly sampled 2D grid, if one can design an appropriate underlying graph connecting pixels with weights that reflect the image structure, then one can interpret the image (or image patch) as a signal on a graph and apply GSP tools to process and analyse the signal in the graph spectral domain. In this article, we overview recent graph spectral techniques in GSP specifically for image/video processing. The topics covered include image compression, image restoration, image filtering and image segmentation.
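
    As a minimal illustration of the idea above (an assumption-laden sketch, not code from the article), the following Python snippet builds a 4-connected pixel graph for a small patch with Gaussian intensity-difference edge weights, forms the combinatorial graph Laplacian, and projects the patch onto the Laplacian eigenbasis, i.e. computes its graph Fourier transform. The connectivity, weight kernel and sigma value are common GSP choices, not ones prescribed by the article.

        import numpy as np

        def patch_gft(patch, sigma=10.0):
            """Graph Fourier transform of a patch on a 4-connected pixel
            graph with Gaussian intensity-difference edge weights."""
            h, w = patch.shape
            n = h * w
            W = np.zeros((n, n))
            for y in range(h):
                for x in range(w):
                    i = y * w + x
                    for dy, dx in ((0, 1), (1, 0)):   # right and down neighbours
                        yy, xx = y + dy, x + dx
                        if yy < h and xx < w:
                            j = yy * w + xx
                            wt = np.exp(-(patch[y, x] - patch[yy, xx]) ** 2
                                        / (2.0 * sigma ** 2))
                            W[i, j] = W[j, i] = wt
            L = np.diag(W.sum(axis=1)) - W            # combinatorial Laplacian
            evals, evecs = np.linalg.eigh(L)          # graph frequencies / basis
            return evals, evecs.T @ patch.ravel()     # GFT coefficients

        patch = np.random.default_rng(0).integers(0, 256, (8, 8)).astype(float)
        freqs, spectrum = patch_gft(patch)
        print(freqs[:3], spectrum[:3])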

    Application of a Bi-Geometric Transparent Composite Model to HEVC: Residual Data Modelling and Rate Control

    Among various transforms, the discrete cosine transform (DCT) is the most widely used in multimedia compression technologies across different image and video coding standards. Throughout the development of image and video compression, much interest has been devoted to understanding the statistical distribution of DCT coefficients, which is useful for designing compression techniques such as quantization, entropy coding and rate control. Recently, a bi-geometric transparent composite model (BGTCM) has been developed to model the distribution of DCT coefficients with both simplicity and accuracy. It has been reported that for DCT coefficients obtained from original images, as in image coding, a transparent composite model (TCM) provides better modelling than the Laplacian distribution. In video compression, such as H.264/AVC, the DCT is performed on residual images obtained after prediction, with different transform sizes. Moreover, in High Efficiency Video Coding (HEVC), the newest video coding standard, besides the DCT as the main transform tool, the discrete sine transform (DST) and transform skip (TS) techniques may also be applied to residual data in small blocks. As such, the distribution of transformed residual data differs from that of transformed original image data. In this thesis, the distribution of coefficients, including those from all DCT, DST and TS blocks, is analysed based on the BGTCM. Specifically, the distribution of all the coefficients from a whole frame is examined first. Second, in HEVC, entropy coding is implemented based on a new encoding concept, the coefficient group (CG) of size 4x4, where quantized coefficients are encoded with context models based on their scan indices within each CG. To simulate the encoding process, coefficients at the same scan index among different CGs are grouped together to form a set, and the distribution of coefficients in each set is analysed. Based on our results, the BGTCM outperforms other widely used distributions, such as the Laplacian and Cauchy distributions, in both χ² and KL-divergence testing. Furthermore, unlike approaches based on the Laplacian and Cauchy distributions, the BGTCM can be used to model the rate-quantization (R-Q) and distortion-quantization (D-Q) relationships without approximate expressions; R-Q and D-Q models based on the BGTCM reflect the distribution of the coefficients, which is important in rate control. In video coding, rate control uses these two models to generate a suitable quantization parameter without multi-pass encoding, in order to maintain coding efficiency and to generate the rate needed to satisfy the rate requirement. In this thesis, rate control in HEVC is revised based on the BGTCM, yielding a considerable increase in coding efficiency and a decrease in rate fluctuation, in terms of rate variance among frames, under a constant-bit-rate requirement.
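
    A hedged sketch of the coefficient-grouping experiment described above: 4x4 residual blocks are transformed with an orthonormal DCT, quantized, and pooled by scan index across coefficient groups, and a single geometric branch P(k) = (1-p)p^k is then fitted by maximum likelihood to the magnitudes of one set. The raster scan, quantization step and synthetic Laplacian residual are illustrative stand-ins (HEVC uses diagonal scans, and the full BGTCM combines two geometric branches with a transparent separation point).

        import numpy as np
        from scipy.fft import dctn

        def coeffs_by_scan_index(residual, q=16.0):
            """Pool quantized 4x4 DCT coefficients by scan index across all
            coefficient groups (CGs) of a residual frame. A raster scan is
            used as a stand-in for HEVC's diagonal scan."""
            h, w = residual.shape
            sets = [[] for _ in range(16)]
            for y in range(0, h - h % 4, 4):
                for x in range(0, w - w % 4, 4):
                    block = dctn(residual[y:y + 4, x:x + 4], norm='ortho')
                    qcoef = np.round(block / q).astype(int)
                    for idx, c in enumerate(qcoef.ravel()):
                        sets[idx].append(c)
            return [np.asarray(s) for s in sets]

        def fit_geometric(mags):
            """Closed-form ML fit of one geometric branch P(k) = (1-p)*p**k,
            k >= 0, to coefficient magnitudes; E[X] = p/(1-p) gives p."""
            m = mags.mean()
            return m / (1.0 + m)

        residual = np.random.default_rng(1).laplace(0.0, 4.0, (64, 64))
        sets = coeffs_by_scan_index(residual)
        print("p for scan index 0:", round(fit_geometric(np.abs(sets[0])), 3))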

    Image representation and compression using steered Hermite transforms


    Three-dimensional DCT-based video compression

    by Chan Kwong Wing Raymond. Thesis (M.Phil.), Chinese University of Hong Kong, 1997. Includes bibliographical references (leaves 115-123). Contents: Chapter 1, Introduction (an introduction to video compression; overview of problems covering analog and digital video, low bit rate applications, real-time compression, source and channel coding, and bit rate versus quality). Chapter 2, Background and Related Work (analog and digital video, color theory; predictive coding, vector quantization, subband coding, transform coding and hybrid coding; the discrete cosine transform with 1-D, 2-D and multidimensional fast algorithms; quantization; entropy coding with Huffman and arithmetic coding). Chapter 3, Existing Compression Schemes (Motion JPEG, MPEG, H.261, fractals, wavelets; the proposed solution). Chapter 4, Fast 3D-DCT Algorithms (motivation and potential of the 3D DCT; forward and inverse 3D-DCT; the 3-D fast cosine transform with partitioning and rearrangement of the data cube; a worked 4x4x4 IFCT example; complexity comparison for multiplications and additions; implementation issues). Chapter 5, Quantization (dynamic ranges and distribution of 3D-DCT AC coefficients; a quantization volume based on a shifted complement hyperboloid; scan order for quantized coefficients; parameter selection; experimental results). Chapter 6, Entropy Coding (Huffman and arithmetic coding; zero run-length encoding and variable length coding in JPEG; run-level encoding of the quantized 3D-DCT coefficients; frequency analysis of the run-length patterns; Huffman table design; implementation of the encode, decode and bit I/O routines). Chapter 7, Contributions, Concluding Remarks and Future Work (advantages of the 3D DCT codec; experimental results; future work on integer DCT algorithms, adaptive quantization volumes and adaptive Huffman tables). Appendices cover the detailed simplification of Equation 4.29, program listings of the fast DCT algorithms, tables illustrating the reordering of the quantized coefficients, sample values of the quantization volume, and a 16-bit VLC table for AC run-level pairs; references follow.
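
    Since the thesis is built around the forward and inverse 3D DCT, a minimal sketch may help: the snippet below applies a separable, orthonormal 3D DCT to a toy 8x8x8 spatio-temporal cube via a library transform (not the thesis's fast algorithm), uniformly quantizes the coefficients, and inverts. The cube contents and quantization step are illustrative.

        import numpy as np
        from scipy.fft import dctn, idctn

        # Toy 8x8x8 spatio-temporal cube: 8 frames of a drifting 8x8 gradient.
        cube = np.stack([np.add.outer(np.arange(8.0), np.arange(8.0)) + t
                         for t in range(8)])

        C = dctn(cube, norm='ortho')       # separable forward 3D DCT (type II)
        Q = 8.0
        Cq = np.round(C / Q)               # uniform scalar quantization
        print("nonzero coefficients:", np.count_nonzero(Cq), "of", Cq.size)

        rec = idctn(Cq * Q, norm='ortho')  # dequantize, inverse 3D DCT
        print("max reconstruction error:", float(np.abs(rec - cube).max()))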

    Resource-Constrained Low-Complexity Video Coding for Wireless Transmission


    Selected topics on distributed video coding

    Distributed Video Coding (DVC) is a new paradigm for video compression based on the information-theoretic results of Slepian and Wolf (SW) and Wyner and Ziv (WZ). While conventional coding has a rigid complexity allocation, with most of the complex tasks performed at the encoder side, DVC enables a flexible complexity allocation between the encoder and the decoder. The most novel and interesting case is low complexity encoding and complex decoding, the opposite of conventional coding. While the latter is suitable for applications where the cost of the decoder is more critical than that of the encoder, DVC opens the door to a new range of applications where low complexity encoding is required and the decoder's complexity is not critical. This is particularly relevant given the widespread deployment of small, battery-powered multimedia mobile devices in daily life. Further, since DVC operates as a reversed-complexity scheme compared to conventional coding, it also enables the interesting scenario of low complexity encoding and decoding at both ends by transcoding between DVC and conventional coding: low complexity encoding is performed by DVC at one end, and the resulting stream is decoded and conventionally re-encoded to enable low complexity decoding at the other end. Multiview video is attractive for a wide range of applications such as free viewpoint television, a system that allows the scene to be viewed from a viewpoint chosen by the viewer. Moreover, multiview can be beneficial for monitoring purposes in video surveillance. The increased use of multiview video systems is mainly due to improvements in video technology and the reduced cost of cameras. While a conventional multiview codec tries to exploit the correlation among the different cameras at the encoder side, DVC allows correlated video sources to be encoded separately. DVC therefore requires no communication between the cameras in a multiview scenario, an advantage since such communication adds delay and requires complex networking. Another appealing feature of DVC is that it is based on a statistical framework. Moreover, DVC behaves as a natural joint source-channel coding solution, which results in improved error resilience compared to conventional coding. Further, DVC-based scalable codecs do not require deterministic knowledge of the lower layers; in other words, the enhancement layers are completely independent of the base layer codec. This is called codec-independent scalability, and it offers high flexibility in the way the various layers are distributed in a network. This thesis addresses the following topics. First, the theoretical foundations of DVC as well as the practical DVC scheme used in this research are presented, and the potential applications for DVC are outlined. DVC-based schemes use conventional coding to compress parts of the data while the rest is compressed in a distributed fashion; thus, different conventional codecs are studied in this research and compared in terms of compression efficiency for a rich set of sequences, including fine-tuning the compression parameters so that the best performance is achieved for each codec. Further, DVC tools for improved Side Information (SI) and Error Concealment (EC) are introduced for monoview DVC using a partially decoded frame. The improved SI yields a significant gain in reconstruction quality for video with high activity and motion: erroneous motion vectors are re-estimated using the partially decoded frame to improve the SI quality, which is then used to enhance the reconstruction of the finally decoded frame. Further, the introduced spatio-temporal EC improves the quality of decoded video in the case of erroneously received packets, outperforming both spatial and temporal EC, and also outperforming error-concealed conventional coding in different modes. Then, multiview DVC is studied in terms of SI generation, which differentiates it from the monoview case. More specifically, different multiview prediction techniques for SI generation are described and compared in terms of prediction quality, complexity and compression efficiency. Further, a technique for iterative multiview SI is introduced, where the final SI is used in an enhanced reconstruction process; the iterative SI outperforms the other SI generation techniques, especially for high-motion video content. Finally, techniques for fusing temporal and inter-view side information are introduced, which improve the performance of multiview DVC over monoview coding. DVC is also used to enable scalability for image and video coding. Since DVC is based on a statistical framework, the base and enhancement layers are completely independent, the codec-independent scalability property mentioned above. Moreover, the introduced scalable DVC schemes show good robustness to errors, as the quality of decoded video decreases steadily as the error rate increases, whereas conventional coding exhibits a cliff effect, with performance dropping dramatically beyond a certain error rate. Further, the issue of privacy protection is addressed for DVC by transform-domain scrambling, which alters regions of interest in video such that the scene is still understood while privacy is preserved. The proposed scrambling techniques are shown to provide a good level of security without impairing the performance of the DVC scheme compared to the one without scrambling. This is particularly attractive for video surveillance, one of the most promising applications for DVC. Finally, a practical DVC demonstrator built during this research is described, along with its main requirements and the observed limitations; it is deployed in a setup as close as possible to a complete, real application scenario, showing that it is actually possible to implement a complete end-to-end practical DVC system relying only on realistic assumptions. Even though DVC is for the moment inferior in compression efficiency to state-of-the-art conventional coding, its strengths reside in its good error resilience properties and the codec-independent scalability feature. Therefore, DVC offers promising possibilities for video compression when transmission over error-prone channels is required, as it significantly outperforms conventional coding in that case.
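
    As a toy illustration of the side information concept central to the scheme above (an assumption-laden sketch, not the thesis's method, which relies on motion-compensated techniques), the snippet below generates SI for a Wyner-Ziv frame by averaging the two surrounding key frames and estimates a Laplacian correlation-noise parameter from the SI residual. All names and the synthetic data are illustrative.

        import numpy as np

        def side_information(prev_key, next_key):
            """Temporal side information for a Wyner-Ziv frame: here simply
            the average of the surrounding key frames (real DVC schemes use
            motion-compensated interpolation instead)."""
            return 0.5 * (prev_key + next_key)

        def laplacian_alpha(wz_frame, si):
            """Estimate the Laplacian correlation-noise parameter alpha from
            the residual between the Wyner-Ziv frame and its side
            information (Laplacian variance is 2 / alpha**2)."""
            return np.sqrt(2.0) / (np.std(wz_frame - si) + 1e-9)

        rng = np.random.default_rng(2)
        key0 = rng.random((16, 16))
        key2 = key0 + 0.05                 # slow, global scene change
        wz = 0.5 * (key0 + key2) + rng.laplace(0, 0.01, (16, 16))
        si = side_information(key0, key2)
        print("alpha:", round(laplacian_alpha(wz, si), 2))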

    Wavelet-based image compression for mobile applications

    The transmission of digital colour images is rapidly becoming popular on mobile telephones, Personal Digital Assistant (PDA) technology and other wireless image services. However, transmitting digital colour images via mobile devices is hampered by limited air bandwidth. Advances in communication channels (for example, 3G networks) go some way towards addressing this problem, but the rapid increase in traffic and the demand for ever better image quality mean that effective data compression techniques are essential for transmitting and storing digital images. The main objective of this thesis is to offer a novel image compression technique that can help to overcome the bandwidth problem. This thesis investigates and implements three different wavelet-based compression schemes, with a focus on a method suitable for mobile applications. The first algorithm is a dual wavelet compression algorithm, a modified conventional wavelet compression method that uses different wavelet filters to decompose the luminance and chrominance components separately; in addition, different levels of decomposition can be applied to each component separately. The second algorithm is a segmented wavelet-based scheme, which segments an image into its smooth and non-smooth parts and then applies different wavelet filters to the segmented parts. Finally, the third algorithm is the Hybrid Wavelet-based Compression System (HWCS), where the subject of interest is cropped and then compressed using a wavelet-based method, while the background is reduced in detail by averaging and sent separately from the compressed subject of interest; the final image is reconstructed by replacing the averaged background pixels with the compressed cropped image. For each algorithm, the experimental results presented in this thesis clearly demonstrate that the encoder output can be effectively reduced while maintaining acceptable visual quality, particularly when compared to a conventional wavelet-based compression scheme.
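
    A minimal sketch of the dual wavelet idea described above, under assumed filter choices (the abstract does not name specific wavelets): the luminance channel is decomposed with a longer biorthogonal filter and more levels, the chrominance channels with a cheaper filter and fewer levels, and detail coefficients are hard-thresholded to create compressible sparsity.

        import numpy as np
        import pywt

        def dual_wavelet_encode(y, cb, cr, thresh=10.0):
            """Dual wavelet sketch: a longer filter and deeper decomposition
            for luminance, a cheaper filter and shallower decomposition for
            chrominance; detail coefficients are hard-thresholded."""
            coeffs = {
                'Y':  pywt.wavedec2(y,  'bior4.4', level=4),
                'Cb': pywt.wavedec2(cb, 'haar',    level=2),
                'Cr': pywt.wavedec2(cr, 'haar',    level=2),
            }
            for name, c in coeffs.items():
                coeffs[name] = [c[0]] + [
                    tuple(np.where(np.abs(d) > thresh, d, 0.0) for d in lvl)
                    for lvl in c[1:]
                ]
            return coeffs

        rng = np.random.default_rng(3)
        y, cb, cr = (rng.random((128, 128)) * 255.0 for _ in range(3))
        enc = dual_wavelet_encode(y, cb, cr)
        rec_y = pywt.waverec2(enc['Y'], 'bior4.4')
        print("max luminance error:", float(np.abs(rec_y - y).max()))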

    Recent Advances in Signal Processing

    Signal processing is a critical task in the majority of new technological inventions and challenges, across a variety of applications in both science and engineering. Classical signal processing techniques have largely worked with mathematical models that are linear, local, stationary and Gaussian, and have always favored closed-form tractability over real-world accuracy; these constraints were imposed by the lack of powerful computing tools. During the last few decades, signal processing theories, developments and applications have matured rapidly and now include tools from many areas of mathematics, computer science, physics and engineering. This book is targeted primarily toward students and researchers who want to be exposed to a wide variety of signal processing techniques and algorithms. It includes 27 chapters that can be categorized into five areas depending on the application at hand, ordered to address image processing, speech processing, communication systems, time-series analysis and educational packages respectively. The book has the advantage of providing a collection of applications that are completely independent and self-contained; thus, the interested reader can choose any chapter and skip to another without losing continuity.

    High throughput image compression and decompression on GPUs

    This work investigates a high-throughput, GPU-friendly, intra-only, wavelet-based video compression algorithm optimized for visually lossless applications. Addressing the key observation that JPEG 2000's entropy coder is a bottleneck and may be overly complex for high-bit-rate scenarios, various algorithmic alterations are proposed and evaluated. First, JPEG 2000's Selective Arithmetic Coding mode is realized on the GPU, but the resulting throughput gains are shown to be limited. Instead, two independent alterations that are not compliant with the standard are proposed: (1) a single-pass mode that processes each bit plane in a single pass, giving up the concept of intra-bit-plane truncation points, and (2) a true raw-coding mode that is parallelizable per sample and does not require any context modeling. Next, an alternative block coder from the literature, the Bitplane Coder with Parallel Coefficient Processing (BPC-PaCo), is evaluated. Since it trades signal adaptiveness for increased parallelism, it is shown here how a stationary probability model averaged over a set of test sequences yields competitive compression efficiency.
    A combination of BPC-PaCo with the single-pass mode is proposed and shown to increase the speedup with respect to the original JPEG 2000 entropy coder from 2.15x (BPC-PaCo with two passes) to 2.6x (proposed BPC-PaCo with single-pass mode), at the marginal cost of increasing the PSNR penalty by 0.3 dB, to at most 1 dB. Furthermore, a parallel algorithm is presented that determines the optimal code block bit stream truncation points for post-compression rate control (given an available bit rate budget) and builds the entire code stream on the GPU, reducing the amount of data that has to be transferred back into host memory to a minimum. A theoretical runtime model is formulated that allows, based on benchmarking results on one GPU, prediction of the runtime of a kernel on another GPU. Lastly, the first JPEG XS GPU decoder is presented and evaluated. JPEG XS was designed to be a low-complexity codec and, for the first time, explicitly demanded GPU-friendliness already in the call for proposals. At bit rates above 1 bpp, the decoder is around 2x faster than the original JPEG 2000 and 1.5x faster than JPEG 2000 with the fastest entropy coder evaluated here (BPC-PaCo with single-pass mode). With a GeForce GTX 1080, a decoding throughput of around 200 fps is achieved for a UHD 4:4:4 sequence.
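
    To illustrate why a raw-coding mode without context modeling is parallelizable per sample (a hedged sketch of the general bitplane idea, not the JPEG 2000 or JPEG XS codestream format), the snippet below splits a quantized code block into a sign plane and MSB-first magnitude bitplanes using only vectorized per-sample operations, the data-parallel structure a GPU exploits, and verifies the round trip.

        import numpy as np

        def to_bitplanes(block):
            """Split a quantized code block into a sign plane and magnitude
            bitplanes, MSB first. Each plane is produced by one vectorized
            (i.e. sample-parallel) operation."""
            signs = (block < 0).astype(np.uint8)
            mags = np.abs(block).astype(np.uint32)
            nplanes = int(mags.max()).bit_length() or 1
            planes = [((mags >> p) & 1).astype(np.uint8)
                      for p in range(nplanes - 1, -1, -1)]
            return signs, planes

        def from_bitplanes(signs, planes):
            """Inverse: reassemble magnitudes and reapply signs."""
            mags = np.zeros_like(planes[0], dtype=np.uint32)
            for p in planes:
                mags = (mags << 1) | p
            return np.where(signs == 1,
                            -mags.astype(np.int32), mags.astype(np.int32))

        rng = np.random.default_rng(4)
        cb = rng.integers(-31, 32, (32, 32), dtype=np.int32)
        s, ps = to_bitplanes(cb)
        assert np.array_equal(from_bitplanes(s, ps), cb)
        print(len(ps), "magnitude bitplanes, round trip OK")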