A new efficient predictor blending lossless image coder
This paper describes a highly efficient algorithm for lossless image coding. The algorithm is a predictor-blending one: each sample estimate is computed as a weighted sum of estimates given by subpredictors, here 27 of them, hence the name Blend-27. The data compaction performance of Blend-27 is compared to that of numerous other lossless image coding algorithms, including the best currently existing ones. The compared methods include both "classical" ones and those based on artificial neural networks. The performance of Blend-27 as a near-lossless coder is also evaluated. Its computational complexity is lower than that of the majority of its direct competitors. The new algorithm appears to be currently the most efficient technique for lossless coding of natural images.
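The blending step itself is simple to sketch: each subpredictor proposes an estimate from the already-decoded causal neighborhood, and the estimates are combined with weights reflecting how well each subpredictor has performed recently. Below is a minimal Python sketch with three toy subpredictors and an assumed inverse-error weighting rule; the paper's actual 27 subpredictors and its weighting scheme are not reproduced here.

```python
import numpy as np

def blend_predict(neighbors, recent_errors, eps=1e-6):
    """Estimate a sample as a weighted sum of subpredictor estimates.

    neighbors: already-decoded causal neighbors W, N, NW.
    recent_errors: each subpredictor's mean absolute error so far.
    """
    # Three toy subpredictors; Blend-27 blends 27 of them.
    estimates = np.array([
        neighbors["W"],                                     # horizontal
        neighbors["N"],                                     # vertical
        neighbors["W"] + neighbors["N"] - neighbors["NW"],  # planar
    ])
    # Assumed rule: weight each subpredictor inversely to its recent error.
    weights = 1.0 / (recent_errors + eps)
    weights /= weights.sum()
    return float(weights @ estimates)

print(blend_predict({"W": 100, "N": 104, "NW": 101},
                    np.array([2.0, 1.0, 0.5])))
```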
Visibility metrics and their applications in visually lossless image compression
Visibility metrics are image metrics that predict the probability that a human observer can detect differences between a pair of images. These metrics can provide localized information in the form of visibility maps, in which each value represents a probability of detection. An important application of visibility metrics is visually lossless image compression, which aims to compress a given image to the lowest possible number of bits per pixel while keeping the compression artifacts invisible.
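As a rough illustration of what a visibility map is, the sketch below turns a per-pixel contrast difference into a detection probability via a Weibull-type psychometric function; the threshold and slope values are placeholders, not the calibrated parameters used in this work.

```python
import numpy as np

def visibility_map(reference, distorted, threshold=0.01, beta=3.5):
    """Toy visibility map: per-pixel probability of detecting a difference.

    threshold and beta are illustrative psychometric-function parameters.
    """
    contrast_diff = np.abs(reference - distorted) / (np.abs(reference) + 1e-3)
    # Weibull psychometric function: contrast -> detection probability.
    return 1.0 - np.exp(-(contrast_diff / threshold) ** beta)

ref = np.random.rand(64, 64)
dist = ref + np.random.normal(0, 0.005, ref.shape)
pmap = visibility_map(ref, dist)
print(pmap.mean())  # average detection probability across the image
```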
In previous works, most visibility metrics were built on greatly simplified assumptions and mathematical models of the human visual system. This approach generally fits experimental data measured with simple stimuli, such as Gabor patches, well. However, it cannot predict complex non-linear effects, such as contrast masking in natural images, particularly well. To predict the visibility of image differences accurately, we collected the largest visibility dataset under fixed viewing conditions for calibrating existing visibility metrics, and we proposed a deep neural network-based visibility metric. We demonstrated in our experiments that the deep neural network-based visibility metric significantly outperformed existing visibility metrics.
However, the deep neural network-based visibility metric cannot predict visibility under varying viewing conditions, such as display brightness and viewing distance, both of which strongly affect the visibility of distortions. To extend the deep neural network-based visibility metric to varying viewing conditions, we collected the largest visibility dataset under varying display brightness and viewing distances. We proposed incorporating white-box modules, namely luminance masking and viewing distance adaptation, into the black-box deep neural network, and we found that this combination of white-box modules and black-box deep neural networks could generalize our proposed visibility metric to varying viewing conditions.
To demonstrate the application of our proposed deep neural network-based visibility metric to visually lossless image compression, we collected a visually lossless image compression dataset under fixed viewing conditions and significantly improved the metric's accuracy in predicting the visually lossless compression threshold by pre-training it with a synthetic dataset generated by the state-of-the-art white-box visibility metric HDR-VDP \cite{Mantiuk2011}. In a large-scale study of 1000 images, we found that with our improved visibility metric, we can save around 60\% to 70\% of the bits in visually lossless image compression compared to the default visually lossless quality level of 90.
Because predicting image visibility and predicting image quality are closely related research topics, we also proposed a trained perceptually uniform transform for quality assessment of high dynamic range images and videos, obtained by training a perceptual encoding function on a set of subjective quality assessment datasets. We have shown that combining the trained perceptual encoding function with standard dynamic range image quality metrics, such as the peak signal-to-noise ratio (PSNR), achieves better performance than the untrained version.
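The general recipe of pairing a perceptual encoding with a standard metric can be sketched as follows; the logarithmic encoding below is only a stand-in for the trained transform, and the luminance range is an assumption.

```python
import numpy as np

def pu_encode(luminance):
    """Placeholder perceptually uniform encoding of absolute luminance in
    cd/m^2; the thesis trains this function on subjective quality data."""
    return np.log10(np.clip(luminance, 0.005, 10000.0))

def pu_psnr(ref_lum, test_lum):
    """PSNR computed in the perceptually encoded domain."""
    r, t = pu_encode(ref_lum), pu_encode(test_lum)
    peak = pu_encode(10000.0) - pu_encode(0.005)  # encoded dynamic range
    mse = np.mean((r - t) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

ref = np.random.uniform(0.01, 4000.0, (128, 128))    # HDR luminance map
test = ref * np.random.normal(1.0, 0.02, ref.shape)  # mild distortion
print(pu_psnr(ref, test))
```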
An Introduction to Neural Data Compression
Neural compression is the application of neural networks and other machine learning methods to data compression. Recent advances in statistical machine learning have opened up new possibilities for data compression, allowing compression algorithms to be learned end-to-end from data using powerful generative models such as normalizing flows, variational autoencoders, diffusion probabilistic models, and generative adversarial networks. The present article aims to introduce this field of research to a broader machine learning audience by reviewing the necessary background in information theory (e.g., entropy coding, rate-distortion theory) and computer vision (e.g., image quality assessment, perceptual metrics), and providing a curated guide through the essential ideas and methods in the literature thus far.
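The central training objective surveyed in this line of work is the rate-distortion Lagrangian L = R + λD. A minimal PyTorch sketch of a learned transform coder, with additive uniform noise as the usual quantization proxy and a crude Gaussian rate term standing in for a learned entropy model, might look like this (illustrative only, not code from the article):

```python
import torch
import torch.nn as nn

class TinyCodec(nn.Module):
    """Minimal learned transform coder (illustrative, not from the article)."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 8, 5, stride=2, padding=2))
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(8, 32, 5, stride=2, padding=2, output_padding=1),
            nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 5, stride=2, padding=2, output_padding=1))

    def forward(self, x):
        y = self.encoder(x)
        y_hat = y + torch.rand_like(y) - 0.5  # uniform-noise quantization proxy
        return self.decoder(y_hat), y_hat

model = TinyCodec()
x = torch.rand(1, 3, 64, 64)
x_hat, y_hat = model(x)
# Rate proxy: negative log-likelihood under a unit Gaussian (a real codec
# learns an entropy model); distortion: mean squared error.
rate = 0.5 * (y_hat ** 2).mean()
distortion = ((x - x_hat) ** 2).mean()
loss = rate + 0.01 * distortion  # L = R + lambda * D
loss.backward()
print(float(loss))
```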
Sparse representation based hyperspectral image compression and classification
This thesis presents research on applying sparse representation to lossy hyperspectral image compression and hyperspectral image classification. The proposed lossy hyperspectral image compression framework introduces two types of dictionaries, termed the sparse representation spectral dictionary (SRSD) and the multi-scale spectral dictionary (MSSD). The former is learnt in the spectral domain to exploit spectral correlations, and the latter in the wavelet multi-scale spectral domain to exploit both spatial and spectral correlations in hyperspectral images. To alleviate the computational demand of dictionary learning, either a base dictionary trained offline or an update of the base dictionary is employed in the compression framework. The proposed compression method is evaluated in terms of different objective metrics, and compared to selected state-of-the-art hyperspectral image compression schemes, including JPEG 2000. The numerical results demonstrate the effectiveness and competitiveness of both the SRSD and MSSD approaches.
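The underlying sparse-coding pipeline can be illustrated generically: learn a dictionary from training spectra, then represent each pixel's spectrum with a few dictionary atoms. The scikit-learn sketch below uses random data and is not tied to the SRSD or MSSD constructions described in the thesis.

```python
import numpy as np
from sklearn.decomposition import MiniBatchDictionaryLearning

# Toy stand-in for hyperspectral pixels: 1000 spectra with 100 bands each.
rng = np.random.default_rng(0)
spectra = rng.random((1000, 100))

# Learn a spectral dictionary offline, then sparse-code each spectrum
# with orthogonal matching pursuit (OMP), keeping 5 atoms per pixel.
dico = MiniBatchDictionaryLearning(n_components=64,
                                   transform_algorithm="omp",
                                   transform_n_nonzero_coefs=5,
                                   random_state=0)
codes = dico.fit(spectra).transform(spectra)  # sparse coefficients
recon = codes @ dico.components_              # approximate spectra

# Only the few nonzero coefficients per pixel need to be stored.
print(codes.shape, float(np.mean((spectra - recon) ** 2)))
```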
For the proposed hyperspectral image classification method, we use the sparse coefficients to train support vector machine (SVM) and k-nearest neighbour (kNN) classifiers. In particular, the discriminative character of the sparse coefficients is enhanced by incorporating contextual information using local mean filters. The classification performance is evaluated and compared to a number of similar or representative methods. The results show that our approach can outperform other approaches based on SVM or sparse representation.
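A hedged sketch of this classification stage, assuming a hypothetical cube of per-pixel sparse coefficients and random labels purely for illustration: contextual information is injected with a 3x3 local mean filter before the coefficients are fed to an SVM.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)
# Hypothetical cube of sparse coefficients: 32x32 pixels, 64 atoms each.
coeffs = rng.random((32, 32, 64)) * (rng.random((32, 32, 64)) > 0.9)
labels = rng.integers(0, 4, (32, 32))  # random labels, illustration only

# Contextual information via a 3x3 local mean filter over the coefficients.
padded = np.pad(coeffs, ((1, 1), (1, 1), (0, 0)), mode="edge")
smoothed = np.zeros_like(coeffs)
for dy in range(3):
    for dx in range(3):
        smoothed += padded[dy:dy + 32, dx:dx + 32]
smoothed /= 9.0

# Train an SVM on the filtered sparse coefficients.
X, y = smoothed.reshape(-1, 64), labels.ravel()
clf = SVC(kernel="rbf").fit(X[:800], y[:800])
print(clf.score(X[800:], y[800:]))  # near chance here: labels are random
```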
This thesis makes the following contributions. It provides a relatively thorough investigation of applying sparse representation to lossy hyperspectral image compression. Specifically, it reveals the effectiveness of sparse representation for exploiting spectral correlations in hyperspectral images. In addition, we have shown that the discriminative character of sparse coefficients can lead to superior performance in hyperspectral image classification.
Active sampling, scaling and dataset merging for large-scale image quality assessment
The field of subjective assessment is concerned with eliciting human judgements about a set of stimuli. Collecting such data is costly and time-consuming, especially when the subjective study is to be conducted in a controlled environment and using specialized equipment. Thus, data from these studies are usually scarce. One area in which obtaining subjective measurements is difficult is image quality assessment. The results from these studies are used to develop and train automated, or objective, image quality metrics, which, with the advent of deep learning, require large amounts of versatile and heterogeneous data.
I present three main contributions in this dissertation.

First, I propose a new active sampling method for the efficient collection of pairwise comparisons in subjective assessment experiments. In these experiments, observers are asked to express a preference between two conditions. However, many pairwise comparison protocols require a large number of comparisons to infer accurate scores, which may be infeasible when each comparison is time-consuming (e.g. videos) or expensive (e.g. medical imaging). This motivates the use of an active sampling algorithm that chooses only the most informative pairs for comparison. I demonstrate, with real and synthetic data, that my algorithm offers the highest accuracy of inferred scores for a fixed number of measurements compared to existing methods.

Second, I propose a probabilistic framework to fuse the outcomes of different psychophysical experimental protocols, namely rating and pairwise comparison experiments. Such a method can be used for merging existing datasets of a subjective nature and for experiments in which both kinds of measurements are collected.

Third, with a new dataset merging technique and by collecting additional cross-dataset quality comparisons, I create a Unified Photometric Image Quality (UPIQ) dataset with over 4,000 images by realigning and merging existing high-dynamic-range (HDR) and standard-dynamic-range (SDR) datasets. The realigned quality scores share the same unified quality scale across all datasets. I then use the new dataset to retrain existing HDR metrics and show that the dataset is sufficiently large for training deep architectures. I show the utility of the dataset and metrics in an application to image compression that accounts for viewing conditions, including screen brightness and viewing distance.
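To make the pairwise-comparison setting concrete, the sketch below infers latent quality scores from a win-count matrix with a Bradley-Terry-style maximum-likelihood fit, then applies a naive active-sampling rule (compare the pair with the smallest current score difference); the dissertation's actual inference and information-gain criterion are more elaborate.

```python
import numpy as np

def bradley_terry_scores(wins, iters=1000, lr=0.02):
    """Maximum-likelihood quality scores from a pairwise win-count matrix,
    fitted by plain gradient ascent on the Bradley-Terry likelihood."""
    n = wins.shape[0]
    s = np.zeros(n)
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(s[None, :] - s[:, None]))  # P(i beats j)
        s += lr * (wins * (1.0 - p) - wins.T * p).sum(axis=1)
        s -= s.mean()  # remove the scale's free offset
    return s

# Toy data: wins[i, j] = how often condition i was preferred over j.
wins = np.array([[0, 2, 1],
                 [8, 0, 3],
                 [9, 7, 0]], dtype=float)
scores = bradley_terry_scores(wins)
# Naive active-sampling rule: measure the pair whose current scores are
# closest, as that comparison is expected to be the most informative.
gap = np.abs(scores[:, None] - scores[None, :]) + np.eye(3) * 1e9
i, j = np.unravel_index(np.argmin(gap), gap.shape)
print(scores, (int(i), int(j)))
```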
Lossy Compressive Sensing Based on Online Dictionary Learning
In this paper, lossy compression of hyperspectral images is realized using a novel online dictionary learning method that can compress three-dimensional datasets. This online dictionary learning method and the blind compressive sensing (BCS) algorithm are combined in a hybrid lossy compression framework for the first time in the literature. According to the experimental results, the BCS algorithm achieves the best compression performance when the compression bit rate is at least 0.5 bps. Beyond rate-distortion performance, anomaly detection performance is also tested on the reconstructed images to measure how well information is preserved.
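A generic compressed-sensing recovery step, which frameworks like BCS build on, can be sketched as follows; the dictionary here is random rather than learned online, and the dimensions are illustrative.

```python
import numpy as np
from sklearn.linear_model import OrthogonalMatchingPursuit

rng = np.random.default_rng(2)
n_bands, n_meas, k = 100, 40, 5  # illustrative dimensions

# A random dictionary stands in for the online-learned one.
D = rng.standard_normal((n_bands, 128))
D /= np.linalg.norm(D, axis=0)
x = D[:, rng.choice(128, k, replace=False)] @ rng.standard_normal(k)

# Compression: keep only n_meas random projections of the spectrum.
Phi = rng.standard_normal((n_meas, n_bands)) / np.sqrt(n_meas)
y = Phi @ x

# Reconstruction: sparse-code y against the projected dictionary Phi @ D.
omp = OrthogonalMatchingPursuit(n_nonzero_coefs=k,
                                fit_intercept=False).fit(Phi @ D, y)
x_hat = D @ omp.coef_
print(np.linalg.norm(x - x_hat) / np.linalg.norm(x))  # relative error
```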
Contributions to Medical Image Segmentation and Signal Analysis Utilizing Model Selection Methods
This thesis presents contributions to model selection techniques, especially those based on information theoretic criteria, with the goal of solving problems in signal analysis and in medical image representation, segmentation, and compression.

The field of medical image segmentation is wide and is developing quickly to make use of the greater computational power now available. This thesis concentrates on several applications that allow the use of parametric models for image and signal representation. One important application is cell nuclei segmentation from histological images. We model nuclei contours by ellipses, so that the complicated problem of separating overlapping nuclei can be rephrased as a model selection problem, where the number of nuclei, their shapes, and their locations define one segmentation. In this thesis, we present methods for model selection in this parametric setting, in which intuitive algorithms are combined with more principled ones, namely those based on the minimum description length (MDL) principle. The results of the introduced unsupervised segmentation algorithm are compared with human subject segmentations, and are also evaluated with the help of a pathology expert.

Another medical image application considered is lossless compression. The objective has been to add the task of image segmentation to that of image compression, so that image regions can be transmitted separately, depending on the region of interest for diagnosis. The experiments performed on retinal color images show that our modeling, in which the MDL criterion selects the structure of the linear predictive models, outperforms publicly available image compressors such as the lossless version of JPEG 2000.

For time series modeling, the thesis presents an algorithm that detects changes in time series signals. The algorithm is based on one of the most recent implementations of the MDL principle, the sequentially normalized maximum likelihood (SNML) models.

This thesis thus contributes new methods and algorithms in which the simplicity of information theoretic principles is combined with rather complex and problem-dependent modeling formulations, resulting in both heuristically motivated and principled algorithmic solutions.
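The flavor of MDL-based model selection used throughout the thesis can be conveyed with a crude two-part code: total description length = parameter cost + data cost, minimized over candidate models. The polynomial-order example below is illustrative only; the thesis applies more refined criteria such as SNML.

```python
import numpy as np

def mdl_order_selection(t, x, max_order=8):
    """Pick a polynomial order by a crude two-part MDL criterion:
    (n/2) log(RSS/n) for the data plus ((k+1)/2) log(n) for the
    parameters; the order minimizing the total wins."""
    n = len(x)
    best = None
    for k in range(1, max_order + 1):
        coeffs = np.polyfit(t, x, k)
        rss = np.sum((x - np.polyval(coeffs, t)) ** 2)
        desc_len = 0.5 * n * np.log(rss / n) + 0.5 * (k + 1) * np.log(n)
        if best is None or desc_len < best[1]:
            best = (k, desc_len)
    return best

t = np.linspace(0.0, 1.0, 200)
x = 2.0 * t ** 3 - t + np.random.default_rng(3).normal(0.0, 0.05, 200)
print(mdl_order_selection(t, x))  # typically selects an order near 3
```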