2,444 research outputs found
Semantic Perceptual Image Compression using Deep Convolution Networks
It has long been considered a significant problem to improve the visual
quality of lossy image and video compression. Recent advances in computing
power together with the availability of large training data sets has increased
interest in the application of deep learning cnns to address image recognition
and image processing tasks. Here, we present a powerful cnn tailored to the
specific task of semantic image understanding to achieve higher visual quality
in lossy compression. A modest increase in complexity is incorporated to the
encoder which allows a standard, off-the-shelf jpeg decoder to be used. While
jpeg encoding may be optimized for generic images, the process is ultimately
unaware of the specific content of the image to be compressed. Our technique
makes jpeg content-aware by designing and training a model to identify multiple
semantic regions in a given image. Unlike object detection techniques, our
model does not require labeling of object positions and is able to identify
objects in a single pass. We present a new cnn architecture directed
specifically to image compression, which generates a map that highlights
semantically-salient regions so that they can be encoded at higher quality as
compared to background regions. By adding a complete set of features for every
class, and then taking a threshold over the sum of all feature activations, we
generate a map that highlights semantically-salient regions so that they can be
encoded at a better quality compared to background regions. Experiments are
presented on the Kodak PhotoCD dataset and the MIT Saliency Benchmark dataset,
in which our algorithm achieves higher visual quality for the same compressed
size.Comment: Accepted to Data Compression Conference, 11 pages, 5 figure
Image quality assessment based on harmonics gain/loss information
We present an objective reduced-reference image quality assessment method based on harmonic gain/loss information through a discriminative analysis of local harmonic strength (LHS). The LHS is computed from the gradient of images, and its value represents a relative degree of the appearance of blockiness on images when it is related to energy gain within an image. Furthermore, comparison between local harmonic strength values from an original, distortion-free image and a degraded, processed, or compressed version of the image shows that the LHS can also be used to indicate other types of degradations, such as blurriness that corresponds with energy loss. Our simulations show that we can develop a single metric based on this gain/loss information and use it to rate the quality of images encoded by various encoders such as DCT-based JPEG, wavelet-based JPEG 2000, or various processed images. We show that our method can overcome some limitations of the traditional PSNR
Color image quality measures and retrieval
The focus of this dissertation is mainly on color image, especially on the images with lossy compression. Issues related to color quantization, color correction, color image retrieval and color image quality evaluation are addressed. A no-reference color image quality index is proposed. A novel color correction method applied to low bit-rate JPEG image is developed. A novel method for content-based image retrieval based upon combined feature vectors of shape, texture, and color similarities has been suggested. In addition, an image specific color reduction method has been introduced, which allows a 24-bit JPEG image to be shown in the 8-bit color monitor with 256-color display. The reduction in download and decode time mainly comes from the smart encoder incorporating with the proposed color reduction method after color space conversion stage. To summarize, the methods that have been developed can be divided into two categories: one is visual representation, and the other is image quality measure.
Three algorithms are designed for visual representation:
(1) An image-based visual representation for color correction on low bit-rate JPEG images. Previous studies on color correction are mainly on color image calibration among devices. Little attention was paid to the compressed image whose color distortion is evident in low bit-rate JPEG images. In this dissertation, a lookup table algorithm is designed based on the loss of PSNR in different compression ratio.
(2) A feature-based representation for content-based image retrieval. It is a concatenated vector of color, shape, and texture features from region of interest (ROI).
(3) An image-specific 256 colors (8 bits) reproduction for color reduction from 16 millions colors (24 bits). By inserting the proposed color reduction method into a JPEG encoder, the image size could be further reduced and the transmission time is also reduced. This smart encoder enables its decoder using less time in decoding.
Three algorithms are designed for image quality measure (IQM):
(1) A referenced IQM based upon image representation in very low-dimension. Previous studies on IQMs are based on high-dimensional domain including spatial and frequency domains. In this dissertation, a low-dimensional domain IQM based on random projection is designed, with preservation of the IQM accuracy in high-dimensional domain.
(2) A no-reference image blurring metric. Based on the edge gradient, the degree of image blur can be measured.
(3) A no-reference color IQM based upon colorfulness, contrast and sharpness
Metrics for Stereoscopic Image Compression
Metrics for automatically predicting the compression settings for stereoscopic images, to minimize file size, while still maintaining an acceptable level of image quality are investigated. This research evaluates whether symmetric or asymmetric compression produces a better quality of stereoscopic image.
Initially, how Peak Signal to Noise Ratio (PSNR) measures the quality of varyingly compressed stereoscopic image pairs was investigated. Two trials with human subjects, following the ITU-R BT.500-11 Double Stimulus Continuous Quality Scale (DSCQS) were undertaken to measure the quality of symmetric and asymmetric stereoscopic image compression. Computational models of the Human Visual System (HVS) were then investigated and a new stereoscopic image quality metric designed and implemented. The metric point matches regions of high spatial frequency between the left and right views of the stereo pair and accounts for HVS sensitivity to contrast and luminance changes in these regions.
The PSNR results show that symmetric, as opposed to asymmetric stereo image compression, produces significantly better results. The human factors trial suggested that in
general, symmetric compression of stereoscopic images should be used.
The new metric, Stereo Band Limited Contrast, has been demonstrated as a better predictor of human image quality preference than PSNR and can be used to predict
a perceptual threshold level for stereoscopic image compression. The threshold is the maximum compression that can be applied without the perceived image quality being
altered.
Overall, it is concluded that, symmetric, as opposed to asymmetric stereo image encoding, should be used for stereoscopic image compression. As PSNR measures of image
quality are correctly criticized for correlating poorly with perceived visual quality, the new HVS based metric was developed. This metric produces a useful threshold to provide a practical starting point to decide the level of compression to use
Map online system using internet-based image catalogue
Digital maps carry along its geodata information such as coordinate that is important in one particular topographic and thematic map. These geodatas are meaningful especially in military field. Since the maps carry along this information, its makes the size of the images is too big. The bigger size, the bigger storage is required to allocate the image file. It also can cause longer loading time. These conditions make it did not suitable to be applied in image catalogue approach via internet environment. With compression techniques, the image size can be reduced and the quality of the image is still guaranteed without much changes. This report is paying attention to one of the image compression technique using wavelet technology. Wavelet technology is much batter than any other image compression technique nowadays. As a result, the compressed images applied to a system called Map Online that used Internet-based Image Catalogue approach. This system allowed user to buy map online. User also can download the maps that had been bought besides using the searching the map. Map searching is based on several meaningful keywords. As a result, this system is expected to be used by Jabatan Ukur dan Pemetaan Malaysia (JUPEM) in order to make the organization vision is implemented
Recommended from our members
Visibility metrics and their applications in visually lossless image compression
Visibility metrics are image metrics that predict the probability that a human observer can detect differences between a pair of images. These metrics can provide localized information in the form of visibility maps, in which each value represents a probability of detection. An important application of the visibility metric is visually lossless image compression that aims at compressing a given image to the lowest fraction of bit per pixel while keeping the compression artifacts invisible at the same time.
In previous works, most visibility metrics were modeled based on largely simplified assumptions and mathematical models of human visual systems. This approach generally fits well into experimental data measured with simple stimuli, such as Gabor patches. However, it cannot predict complex non-linear effects, such as contrast masking in natural images, particularly well. To predict visibility of image differences accurately, we collected the largest visibility dataset under fixed viewing conditions for calibrating existing visibility metrics and proposed a deep neural network-based visibility metric. We demonstrated in our experiments that the deep neural network-based visibility metric significantly outperformed existing visibility metrics.
However, the deep neural network-based visibility metric cannot predict visibility under varying viewing conditions, such as display brightness and viewing distances that have great impacts on the visibility of distortions. To extend the deep neural network-based visibility metric to varying viewing conditions, we collected the largest visibility dataset under varying display brightness and viewing distances. We proposed incorporating white-box modules, in other words, luminance masking and viewing distance adaptation, into the black-box deep neural network, and we found that the combination of white-box modules and black-box deep neural networks could generalize our proposed visibility metric to varying viewing conditions.
To demonstrate the application of our proposed deep neural network-based visibility metric to visually lossless image compression, we collected the visually lossless image compression dataset under fixed viewing conditions and significantly improved the deep neural network-based visibility metric's accuracy of predicting visually lossless image compression threshold by pre-training the visibility metric with a synthetic dataset generated by the state-of-the-art white-box visibility metric---HDR-VDP \cite{Mantiuk2011}. In a large-scale study of 1000 images, we found that with our improved visibility metric, we can save around 60\% to 70\% bits for visually lossless image compression encoding as compared to the default visually lossless quality level of 90.
Because predicting image visibility and predicting image quality are closely related research topics, we also proposed a trained perceptually uniform transform for high dynamic range images and videos quality assessments by training a perceptual encoding function on a set of subjective quality assessment datasets. We have shown that when combining the trained perceptual encoding function with standard dynamic range image quality metrics, such as peak-signal-noise-ratio (PSNR), better performance was achieved compared to the untrained version
- ā¦