9 research outputs found

    Wavelet-Based Embedded Rate Scalable Still Image Coders: A review

    Embedded scalable image coding algorithms based on the wavelet transform have received considerable attention lately in academia and in industry, in terms of both coding algorithms and standards activity. In addition to providing very good coding performance, an embedded coder has the property that the bit stream can be truncated at any point and still be decoded into a reasonably good image. In this paper we present some state-of-the-art wavelet-based embedded rate scalable still image coders. In addition, the JPEG2000 still image compression standard is presented.
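
    To make the embedded (rate scalable) property concrete, here is a minimal numpy sketch, not any particular coder from the review (EZW, SPIHT, EBCOT): coefficient magnitudes are emitted most-significant bit-plane first, so truncating the stream after any number of planes still decodes to a coarser approximation. The Laplacian "coefficients" and the plane-by-plane layout are purely illustrative assumptions.

        import numpy as np

        rng = np.random.default_rng(0)
        coeffs = rng.laplace(scale=20.0, size=16).round()   # stand-in for wavelet coefficients
        signs = np.sign(coeffs)
        mags = np.abs(coeffs).astype(int)
        n_planes = int(mags.max()).bit_length()

        # "Encode": emit bit-planes from most to least significant.
        bitstream = [(mags >> p) & 1 for p in range(n_planes - 1, -1, -1)]

        # "Decode" a stream truncated to its first k bit-planes.
        def decode(planes):
            approx = np.zeros_like(mags)
            for bits in planes:
                approx = (approx << 1) | bits
            return signs * (approx << (n_planes - len(planes)))

        for k in range(1, n_planes + 1):
            err = np.abs(coeffs - decode(bitstream[:k])).mean()
            print(f"{k} bit-plane(s): mean abs error = {err:.2f}")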

    The JPEG2000 still image compression standard

    The development of standards (emerging and established) by the International Organization for Standardization (ISO), the International Telecommunication Union (ITU), and the International Electrotechnical Commission (IEC) for audio, image, and video, for both transmission and storage, has led to worldwide activity in developing hardware and software systems and products applicable to a number of diverse disciplines [7], [22], [23], [55], [56], [73]. Although the standards implicitly address the basic encoding operations, there is freedom and flexibility in the actual design and development of devices. This is because only the syntax and semantics of the bit stream for decoding are specified by standards, their main objective being compatibility and interoperability among the systems (hardware/software) manufactured by different companies. There is, thus, much room for innovation and ingenuity. Since the mid-1980s, members from both the ITU and the ISO have been working together to establish a joint international standard for the compression of grayscale and color still images. This effort has been known as JPEG, the Joint Photographic Experts Group.

    The JPEG 2000 still image compression standard

    With the increasing use of multimedia technologies, image compression requires higher performance as well as new features. To address this need in the specific area of still image encoding, a new standard is currently being developed: JPEG 2000. It is intended not only to provide rate-distortion and subjective image quality performance superior to existing standards, but also to provide features and functionalities that current standards either cannot address efficiently or, in many cases, cannot address at all. Lossless and lossy compression, embedded lossy-to-lossless coding, progressive transmission by pixel accuracy and by resolution, robustness to the presence of bit errors, and region-of-interest coding are some representative features. It is interesting to note that JPEG 2000 is being designed to address the requirements of a diversity of applications, e.g. Internet, color facsimile, printing, scanning, digital photography, remote sensing, mobile applications, medical imagery, digital libraries, and e-commerce.
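
    As a usage-level illustration (not part of the abstract), the sketch below exercises two of the listed features, lossless versus lossy coding and layered progressive quality, through the Pillow imaging library's OpenJPEG-backed JPEG 2000 plugin; the file names are hypothetical, and the irreversible/quality_mode/quality_layers keywords are Pillow-specific options assumed here rather than mandated by the standard.

        from PIL import Image  # Pillow's JPEG 2000 support is backed by OpenJPEG

        im = Image.open("photo.png")          # hypothetical input image

        # Reversible (lossless) JPEG 2000 encoding.
        im.save("photo_lossless.jp2", irreversible=False)

        # Lossy encoding with three quality layers at compression ratios
        # 40:1, 20:1 and 10:1; a decoder can stop after any layer, giving
        # coarse-to-fine progressive quality.
        im.save("photo_lossy.jp2", irreversible=True,
                quality_mode="rates", quality_layers=[40, 20, 10])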

    Centralized and distributed semi-parametric compression of piecewise smooth functions

    This thesis introduces novel wavelet-based semi-parametric centralized and distributed compression methods for a class of piecewise smooth functions. Our proposed compression schemes are based on a non-conventional transform coding structure with simple independent encoders and a complex joint decoder. Current state-of-the-art centralized compression schemes are based on the conventional structure, where the encoder is relatively complex and nonlinear. In addition, the setting usually allows the encoder to observe the entire source. Recently, there has been an increasing need for compression schemes where the encoder is lower in complexity and, instead, the decoder has to handle more computationally intensive tasks. Furthermore, the setup may involve multiple encoders, where each one can only partially observe the source. Such a scenario is often referred to as distributed source coding. In the first part, we focus on the dual situation of centralized compression, where the encoder is linear and the decoder is nonlinear. Our analysis is centered around a class of 1-D piecewise smooth functions. We show that, by incorporating parametric estimation into the decoding procedure, it is possible to achieve the same distortion-rate performance as that of a conventional wavelet-based compression scheme. We also present a new constructive approach to parametric estimation based on the sampling results of signals with finite rate of innovation. The second part of the thesis focuses on the distributed compression scenario, where each independent encoder partially observes the 1-D piecewise smooth function. We propose a new wavelet-based distributed compression scheme that uses parametric estimation to perform joint decoding. Our distortion-rate analysis shows that it is possible for the proposed scheme to achieve the same compression performance as a joint encoding scheme. Lastly, we apply the proposed theoretical framework in the context of distributed image and video compression. We start by considering a simplified model of the video signal and show that we can achieve distortion-rate performance close to that of a joint encoding scheme. We then present practical compression schemes for real-world signals. Our simulations confirm the improvement in performance over classical schemes, both in terms of the PSNR and the visual quality.
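
    For reference, the conventional wavelet-based baseline that the thesis compares against can be sketched in a few lines of Python with PyWavelets: transform a piecewise smooth 1-D signal and keep only its K largest coefficients (nonlinear approximation). The test signal, wavelet, and K below are illustrative assumptions, and this is the baseline, not the proposed semi-parametric scheme.

        import numpy as np
        import pywt  # PyWavelets

        # Piecewise smooth test signal: a smooth piece, a discontinuity, another smooth piece.
        t = np.linspace(0.0, 1.0, 1024)
        signal = np.where(t < 0.4, np.sin(6 * np.pi * t), 0.5 + 0.2 * t)

        # Conventional nonlinear approximation: transform, keep the K largest
        # coefficients, reconstruct.
        coeffs = pywt.wavedec(signal, "db4", level=6)
        flat, slices = pywt.coeffs_to_array(coeffs)
        K = 40                                   # number of retained coefficients
        thr = np.sort(np.abs(flat))[-K]
        kept = np.where(np.abs(flat) >= thr, flat, 0.0)
        approx = pywt.waverec(pywt.array_to_coeffs(kept, slices, output_format="wavedec"), "db4")

        mse = np.mean((signal - approx[:len(signal)]) ** 2)
        print(f"kept {K} of {flat.size} coefficients, MSE = {mse:.2e}")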

    Data-efficient neural network training with dataset condensation

    The state of the art in many data-driven fields, including computer vision and natural language processing, typically relies on training larger models on bigger data. OpenAI has reported that the computational cost of achieving the state of the art doubles every 3.4 months in the deep learning era, whereas GPU computing power doubles only every 21.4 months, which is significantly slower. Thus, advancing deep learning performance by consuming more hardware resources is not sustainable. How to reduce the training cost while preserving the generalization performance is a long-standing goal in machine learning. This thesis investigates a largely under-explored yet promising solution: dataset condensation, which aims to condense a large training set into a small set of informative synthetic samples such that deep models trained on the synthetic set achieve performance close to models trained on the original dataset. In this thesis, we investigate how to condense image datasets for classification tasks. We propose three methods for image dataset condensation. Our methods can be applied to condense other kinds of datasets for different learning tasks, such as text data, graph data and medical images, and we discuss this in Section 6.1. First, we propose a principled method that formulates the goal of learning a small synthetic set as a gradient matching problem with respect to the gradients of deep neural network weights that are trained on the original and synthetic data. A new gradient/weight matching loss is designed for robust matching across different neural architectures. We evaluate its performance on several image classification benchmarks and explore the use of our method in continual learning and neural architecture search. In the second work, we propose to further improve the data-efficiency of training neural networks with synthetic data by enabling effective data augmentation. Specifically, we propose Differentiable Siamese Augmentation and learn better synthetic data that can be used more effectively with data augmentation, and thus achieve better performance when training networks with augmentation. Experiments verify that the proposed method obtains substantial gains over the state of the art. While training deep models on the small set of condensed images can be extremely fast, their synthesis remains computationally expensive due to the complex bi-level optimization. Finally, we propose a simple yet effective method that synthesizes condensed images by matching the feature distributions of the synthetic and original training images when embedded by randomly sampled deep networks. Thanks to its efficiency, we apply our method to more realistic and larger datasets with sophisticated neural architectures and obtain a significant performance boost. In summary, this manuscript presents several important contributions that improve the data efficiency of training deep neural networks by condensing large datasets into significantly smaller synthetic ones. The innovations focus on principled methods based on gradient matching, higher data-efficiency with Differentiable Siamese Augmentation, and extremely simple and fast distribution matching without bi-level optimization. The proposed methods are evaluated on popular image classification datasets, namely MNIST, FashionMNIST, SVHN, CIFAR10/100 and TinyImageNet. The code is available at https://github.com/VICO-UoE/DatasetCondensation
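
    The gradient-matching idea in the first contribution can be sketched schematically in PyTorch as follows; this is a simplified single update (one randomly initialized network, plain layer-wise cosine distance), written for illustration only. The function and variable names are assumptions, and the authors' actual training loop, losses and hyperparameters are in the repository linked above.

        import torch
        import torch.nn.functional as F

        # Schematic single step of gradient matching for dataset condensation.
        # real_x, real_y: a batch of real images/labels; syn_x, syn_y: the learnable
        # synthetic set (syn_x has requires_grad=True); syn_opt optimizes syn_x;
        # net: a randomly initialized classification network.
        def gradient_match_step(net, real_x, real_y, syn_x, syn_y, syn_opt):
            loss_real = F.cross_entropy(net(real_x), real_y)
            g_real = torch.autograd.grad(loss_real, net.parameters())
            g_real = [g.detach() for g in g_real]          # real gradients act as targets

            loss_syn = F.cross_entropy(net(syn_x), syn_y)
            g_syn = torch.autograd.grad(loss_syn, net.parameters(), create_graph=True)

            # Layer-wise cosine distance between real and synthetic gradients.
            match = sum(1 - F.cosine_similarity(gr.flatten(), gs.flatten(), dim=0)
                        for gr, gs in zip(g_real, g_syn))

            syn_opt.zero_grad()
            match.backward()    # gradients flow into syn_x through g_syn
            syn_opt.step()      # update the synthetic images
            return match.item()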

    Depth-Map Image Compression Based on Region and Contour Modeling

    In this thesis, the problem of depth-map image compression is treated. The compilation of articles included in the thesis provides methodological contributions in the fields of lossless and lossy compression of depth-map images. The first group of methods addresses the lossless compression problem. The introduced methods use the approach of representing the depth-map image in terms of regions and contours. In the depth-map image, a segmentation defines the regions, by grouping pixels having similar properties, and separates them using (region) contours. The depth-map image is encoded by the contours and the auxiliary information needed to reconstruct the depth values in each region. One way of encoding the contours is to describe them using two matrices of horizontal and vertical contour edges. The matrices are encoded using template context coding, where each context tree is optimally pruned. In certain contexts, the contour edges are found deterministically using only the currently available information. Another way of encoding the contours is to describe them as a sequence of contour segments. Each such segment is defined by an anchor (starting) point and a string of contour edges, equivalent to a string of chain-code symbols. Here we propose efficient ways to select and encode the anchor points and to generate contour segments by using a contour crossing point analysis and by imposing rules that help in minimizing the number of anchor points. The regions are reconstructed at the decoder using predictive coding or the piecewise constant model representation. In the first approach, the large constant regions are found and one depth value is encoded for each such region. For the rest of the image, suitable regions are generated by constraining the local variation of the depth level from one pixel to another. The nonlinear predictors selected specifically for each region combine the results of several linear predictors, each fitting optimally a subset of pixels belonging to the local neighborhood. In the second approach, the depth value of a given region is encoded using the depth values of the neighboring regions already encoded. The natural smoothness of the depth variation and the mutual exclusiveness of the values in neighboring regions are exploited to efficiently predict and encode the current region's depth value. The second group of methods studies the lossy compression problem. In a first contribution, different segmentations are generated by varying the threshold for the depth local variability. A lossy depth-map image is obtained for each segmentation and is encoded based on predictive coding, quantization and context tree coding. In another contribution, the lossy versions of one image are created either by successively merging the constant regions of the original image, or by iteratively splitting the regions of a template image using horizontal or vertical line segments. Merging and splitting decisions are taken greedily, according to the best slope towards the next point in the rate-distortion curve. An entropy coding algorithm is used to encode each image. We also propose a progressive coding method for coding the sequence of lossy versions of a depth-map image. The bitstream is encoded so that any lossy version of the original image can be generated, starting from a very low resolution up to lossless reconstruction. The partitions of the lossy versions into regions are assumed to be nested, so that a higher resolution image is obtained by splitting some regions of a lower resolution image. A current image in the sequence is encoded using the a priori information from a previously encoded image: the anchor points are encoded relative to the already encoded contour points, and the depth information of the newly resulting regions is recovered using the depth value of the parent region. As a final contribution, the dissertation includes a study of the parameterization of planar models. The quantized heights at three pixel locations are used to compute the optimal plane for each region. The three pixel locations are selected so that the distortion due to the approximation of the plane over the region is minimized. The planar model and the piecewise constant model compete in the merging process, where the two regions to be merged are those ensuring the optimal slope in the rate-distortion curve.
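
    As a toy illustration of the region/contour representation used throughout (and only of that, not of the thesis codecs), the numpy sketch below derives the two binary matrices of vertical and horizontal contour edges from a small piecewise constant depth map; the example values are arbitrary.

        import numpy as np

        depth = np.array([[5, 5, 5, 9, 9],
                          [5, 5, 9, 9, 9],
                          [5, 5, 9, 9, 2],
                          [5, 5, 9, 2, 2]])

        # A vertical contour edge sits between horizontally adjacent pixels with
        # different depth; a horizontal edge between vertically adjacent ones.
        vertical_edges = (depth[:, 1:] != depth[:, :-1]).astype(int)
        horizontal_edges = (depth[1:, :] != depth[:-1, :]).astype(int)

        print(vertical_edges)
        print(horizontal_edges)
        # Given the contours, a decoder only needs one representative depth value
        # per region to rebuild this piecewise constant image (here: 5, 9 and 2).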

    Robust density modelling using the student's t-distribution for human action recognition

    The extraction of human features from videos is often inaccurate and prone to outliers. Such outliers can severely affect density modelling when the Gaussian distribution is used as the model, since it is highly sensitive to outliers. The Gaussian distribution is also often used as the base component of graphical models for recognising human actions in videos (hidden Markov models and others), and the presence of outliers can significantly affect the recognition accuracy. In contrast, the Student's t-distribution is more robust to outliers and can be exploited to improve the recognition rate in the presence of abnormal data. In this paper, we present an HMM which uses mixtures of t-distributions as observation probabilities and show, through experiments on two well-known datasets (Weizmann, MuHAVi), a remarkable improvement in classification accuracy. © 2011 IEEE
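
    A quick, hypothetical SciPy illustration of the robustness argument (not the paper's HMM): a single corrupted feature value barely changes the Student's t log-likelihood but makes the Gaussian log-likelihood collapse. The degrees of freedom (df=3) and the outlier value are arbitrary choices.

        import numpy as np
        from scipy import stats

        rng = np.random.default_rng(0)
        features = rng.normal(loc=0.0, scale=1.0, size=200)
        features_with_outlier = np.append(features, 25.0)   # one corrupted feature value

        for name, data in [("clean", features), ("with outlier", features_with_outlier)]:
            ll_gauss = stats.norm.logpdf(data, loc=0.0, scale=1.0).sum()
            ll_t = stats.t.logpdf(data, df=3, loc=0.0, scale=1.0).sum()
            print(f"{name:12s}  Gaussian logL = {ll_gauss:8.1f}   Student-t logL = {ll_t:8.1f}")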

    XXIII Congreso Argentino de Ciencias de la Computación - CACIC 2017 : Libro de actas

    Papers presented at the XXIII Congreso Argentino de Ciencias de la Computación (CACIC), held in La Plata from 9 to 13 October 2017, organized by the Red de Universidades con Carreras en Informática (RedUNCI) and the Facultad de Informática of the Universidad Nacional de La Plata (UNLP).