Two Decades of Colorization and Decolorization for Images and Videos
Colorization is a computer-aided process that aims to add color to a grayscale image or video. It can be used to enhance black-and-white media, including black-and-white photographs, old films, and scientific imaging results. Conversely, decolorization converts a color image or video into a grayscale one. A grayscale image or video carries only brightness information, without color information, and is the basis of several downstream image processing applications such as pattern recognition, image segmentation, and image enhancement. Unlike image decolorization, video decolorization must not only preserve the image contrast within each video frame but also respect the temporal and spatial consistency between frames. Researchers have devoted considerable effort to developing decolorization methods that balance spatial-temporal consistency and algorithmic efficiency. With the prevalence of digital cameras and mobile phones, image and video colorization and decolorization have received growing attention from researchers. This paper gives an overview of the progress of image and video colorization and decolorization methods over the last two decades. Comment: 12 pages, 19 figures
Preserving Perceptual Contrast in Decolorization with Optimized Color Orders
Converting a color image to a grayscale image, namely decolorization, is an important process for many real-world applications. Previous methods build contrast loss functions to minimize the contrast differences between the color images and the resultant grayscale images. In this paper, we improve upon a widely used decolorization method with two extensions. First, we relax the need for heuristics on color orders, which the baseline method relies on when computing the contrast differences. In our method, the color orders are incorporated into the loss function and are determined through optimization. Moreover, we apply a nonlinear function to the grayscale contrast to better model human perception of contrast. Both qualitative and quantitative results on the standard benchmark demonstrate the effectiveness of our two extensions.
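The contrast-loss idea described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the pixel pairs, target contrast, and Rec. 601 channel weights are placeholder choices, and in the paper the sign (color order) of each target contrast is itself a variable optimized inside the loss rather than fixed by heuristics.

```python
import numpy as np

def contrast_loss(weights, img, pairs, delta):
    """Baseline-style contrast loss: the grayscale difference of each
    pixel pair should match the (signed) color contrast delta of that pair.
    weights: (3,) channel weights; img: (N, 3) pixel colors in [0, 1];
    pairs: (P, 2) pixel indices; delta: (P,) target contrasts."""
    gray = img @ weights                       # per-pixel luminance
    d = gray[pairs[:, 0]] - gray[pairs[:, 1]]  # grayscale contrast per pair
    return np.mean((d - delta) ** 2)

# Toy example: two pixels, pure red vs. pure green, target contrast 0.3.
img = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
pairs = np.array([[0, 1]])
delta = np.array([0.3])
w = np.array([0.299, 0.587, 0.114])  # Rec. 601 weights, for illustration
loss = contrast_loss(w, img, pairs, delta)
```

In the actual method the weights `w` (and the color orders, i.e. the signs of `delta`) would be the optimization variables, driven to minimize this loss over many sampled pairs.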
Depth-aware Neural Style Transfer using Instance Normalization
Neural Style Transfer (NST) is concerned with the artistic stylization of
visual media. It can be described as the process of transferring the style of
an artistic image onto an ordinary photograph. Recently, a number of studies
have considered the enhancement of the depth-preserving capabilities of the NST
algorithms to address the undesired effects that occur when the input content
images include numerous objects at various depths. Our approach uses a deep
residual convolutional network with instance normalization layers that utilizes
an advanced depth prediction network to integrate depth preservation as an
additional loss function to content and style. We demonstrate results that are
effective in retaining the depth and global structure of content images. Three
different evaluation processes show that our system is capable of preserving
the structure of the stylized results while exhibiting style-capture
capabilities and aesthetic qualities comparable or superior to state-of-the-art
methods. Project page:
https://ioannoue.github.io/depth-aware-nst-using-in.html. Comment: 8 pages, 8 figures, Computer Graphics & Visual Computing (CGVC) 202
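The abstract describes depth preservation as an additional loss term alongside content and style. A minimal sketch of such a weighted objective follows; the mean-squared terms, placeholder arrays, and loss weights are illustrative stand-ins for encoder feature maps, Gram matrices, and depth-network outputs, not the authors' actual formulation.

```python
import numpy as np

def mse(a, b):
    return np.mean((a - b) ** 2)

def total_loss(feat_out, feat_content, gram_out, gram_style,
               depth_out, depth_content,
               w_content=1.0, w_style=10.0, w_depth=5.0):
    """Weighted sum of the three loss terms named in the abstract:
    content (feature match), style (Gram match), and depth (depth-map
    match). The weights here are hypothetical."""
    return (w_content * mse(feat_out, feat_content)
            + w_style * mse(gram_out, gram_style)
            + w_depth * mse(depth_out, depth_content))

# Toy call with placeholder 2x2 arrays standing in for network outputs.
val = total_loss(np.ones((2, 2)), np.zeros((2, 2)),
                 np.full((2, 2), 0.5), np.zeros((2, 2)),
                 np.zeros((2, 2)), np.zeros((2, 2)))
```

During training, gradients of this scalar with respect to the stylization network's parameters would drive the output to match content and depth while adopting the target style statistics.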
Efficient and effective objective image quality assessment metrics
Acquisition, transmission, and storage of images and videos have increased greatly in recent years. At the same time, there has been an increasing demand for high-quality images and videos that provide a satisfactory quality of experience for viewers. In this respect, high dynamic range (HDR) imaging with more than 8-bit depth has been an interesting approach for capturing more realistic images and videos. Objective image and video quality assessment plays a significant role in monitoring and enhancing image and video quality in applications such as image acquisition, compression, multimedia streaming, restoration, enhancement, and display. The main contributions of this work are efficient features and similarity maps that can be used to design perceptually consistent image quality assessment tools. In this thesis, perceptually consistent full-reference image quality assessment (FR-IQA) metrics are proposed to assess the quality of natural, synthetic, photo-retouched, and tone-mapped images. In addition, efficient no-reference image quality metrics are proposed to assess JPEG-compressed and contrast-distorted images. Finally, we propose a perceptually consistent color-to-gray conversion method, perform a subjective rating study, and evaluate existing color-to-gray assessment metrics.
Existing FR-IQA metrics may have the following limitations. First, their performance is not consistent for different distortions and datasets. Second, better performing metrics usually have high complexity. We propose in this thesis an efficient and reliable full-reference image quality evaluator based on new gradient and color similarities. We derive a general deviation pooling formulation and use it to compute a final quality score from the similarity maps. Extensive experimental results verify high accuracy and consistent performance of the proposed metric on natural, synthetic and photo retouched datasets as well as its low complexity.
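The two building blocks named above, a similarity map and deviation pooling, can be sketched as follows. This is a generic illustration of the concepts, not the thesis's metric: the similarity formula and the pooling exponent are common textbook choices, and the constant `c` is a hypothetical stabilizer.

```python
import numpy as np

def gradient_similarity(g1, g2, c=0.002):
    """Pointwise similarity of two gradient-magnitude maps; equals 1
    wherever the two maps agree exactly."""
    return (2 * g1 * g2 + c) / (g1 ** 2 + g2 ** 2 + c)

def deviation_pooling(sim_map, rho=1.0):
    """Pool a similarity map by its mean deviation from its own mean:
    a distortion-free image yields a constant map and a score of 0,
    while localized distortions raise the deviation."""
    m = np.mean(sim_map)
    return np.mean(np.abs(sim_map - m) ** rho)
```

The appeal of deviation pooling over mean pooling is that it rewards spatial uniformity of the similarity map, which tends to track perceived quality better when distortions are localized.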
To visualize HDR images on standard low dynamic range (LDR) displays, tone-mapping operators are used to convert HDR into LDR. Given the different bit depths of HDR and LDR, traditional FR-IQA metrics are not able to assess the quality of tone-mapped images. The existing full-reference metric for tone-mapped images, TMQI, converts both HDR and LDR to an intermediate color space and measures their similarity in the spatial domain. We propose in this thesis a feature-similarity full-reference metric in which the local phase of the HDR image is compared with the local phase of the LDR image. Phase carries important image information, and previous studies have shown that the human visual system responds strongly to points in an image where the phase information is ordered. Experimental results on two available datasets show the very promising performance of the proposed metric.
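The key property being exploited here is that phase encodes image structure independently of contrast or dynamic range. A toy demonstration using global Fourier phase follows; the thesis compares *local* phase (typically computed with localized band-pass filters), so this sketch only illustrates the invariance argument, not the proposed metric.

```python
import numpy as np

def phase_similarity(a, b):
    """Compare the global Fourier phase of two images. Rescaling an
    image changes coefficient magnitudes but not phases, so a and k*a
    score (near) 1 despite very different dynamic ranges."""
    pa = np.angle(np.fft.fft2(a))
    pb = np.angle(np.fft.fft2(b))
    return np.mean(np.cos(pa - pb))

rng = np.random.default_rng(1)
a = rng.standard_normal((8, 8))
score = phase_similarity(a, 2.0 * a)  # same structure, doubled range
```

This invariance is exactly what makes phase attractive for comparing an HDR image with its tone-mapped LDR version, where intensities differ but structure should be preserved.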
No-reference image quality assessment (NR-IQA) metrics are of high interest because, in most present and emerging real-world applications, reference signals are not available. In this thesis, we propose two perceptually consistent distortion-specific NR-IQA metrics for JPEG-compressed and contrast-distorted images. Based on the edge statistics of JPEG-compressed images, an efficient NR-IQA metric for the blockiness artifact is proposed that is robust to block size and misalignment. Then, we consider the quality assessment of contrast distortion, which is a common degradation. Higher orders of the Minkowski distance and a power transformation are used to train a low-complexity model that assesses contrast distortion with high accuracy. For the first time, the proposed model is also used to classify the type of contrast distortion, which is very useful additional information for image contrast enhancement.
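The "higher orders of the Minkowski distance" idea can be illustrated with generalized (power) means of pixel intensities. This is a hedged sketch of the feature family, not the thesis's trained model: the orders chosen and the use of raw intensities are assumptions for illustration.

```python
import numpy as np

def minkowski_features(img, orders=(1, 2, 3, 4)):
    """Generalized (Minkowski) means of pixel intensities at several
    orders. Higher orders weight bright pixels more strongly, so the
    spread across orders reflects the intensity distribution: for a
    constant image all features coincide, while higher contrast pulls
    them apart (power-mean inequality)."""
    x = img.astype(np.float64).ravel()
    return np.array([np.mean(x ** p) ** (1.0 / p) for p in orders])
```

In a distortion-specific NR-IQA setting, features like these would be fed to a small regression model trained against subjective contrast-quality scores.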
Beyond its traditional use in the assessment of distortions, objective IQA can serve other applications, for example the quality assessment of image fusion, color-to-gray image conversion, inpainting, and background subtraction. In the last part of this thesis, a real-time and perceptually consistent color-to-gray image conversion methodology is proposed. The proposed correlation-based method and state-of-the-art methods are compared by subjective and objective evaluation. A conclusion is then drawn on the choice of objective quality assessment metric for color-to-gray image conversion. The conducted subjective ratings can be used in the development of quality assessment metrics for color-to-gray image conversion and to test their performance.
Learning based image transformation using convolutional neural networks
We have developed a learning-based image transformation framework and successfully applied it to three common image transformation operations: downscaling, decolorization, and high dynamic range image tone mapping. We use a convolutional neural network (CNN) as a non-linear mapping function to transform an input image into a desired output. A separate CNN, trained on a very large image classification task, is used as a feature extractor to construct the training loss function of the image transformation CNN. Unlike similar applications in the related literature such as image super-resolution, none of the problems addressed in this paper has a known ground truth or target. For each problem, we reason about a suitable learning objective function and develop an effective solution. This is the first work that uses deep learning to solve and unify these three common image processing tasks. We present experimental results to demonstrate the effectiveness of the new technique and its state-of-the-art performance.
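The core training idea, comparing images in the feature space of a frozen, pretrained network rather than in pixel space, can be sketched minimally. The fixed random projection below is only a stand-in for a pretrained classification CNN's activations; the real framework would compare deep features of the transformation network's output against features derived from the input image.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 16))  # stand-in for a frozen feature extractor

def features(x):
    """Placeholder 'perceptual' features: a fixed linear map plus ReLU,
    imitating one layer of a pretrained CNN (weights never updated)."""
    return np.maximum(W @ x, 0.0)

def perceptual_loss(output_img, target_img):
    """Train the transformation CNN so its output matches the target
    in feature space rather than pixel space; this preserves perceived
    structure even when exact pixel values differ."""
    return np.mean((features(output_img) - features(target_img)) ** 2)
```

Because the extractor is fixed, gradients flow only into the transformation network, which is what lets one loss design unify downscaling, decolorization, and tone mapping.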
Nonlocal Co-occurrence for Image Downscaling
Image downscaling is one of the widely used operations in image processing
and computer graphics. It was recently demonstrated in the literature that
kernel-based convolutional filters could be modified to develop efficient image
downscaling algorithms. In this work, we present a new downscaling technique based on the kernel-based image filtering concept. We propose to use the pairwise co-occurrence similarity of pixel-pairs as the range kernel similarity in the filtering operation. The co-occurrence of each pixel-pair is learned directly from the input image, in a neighborhood-based fashion over the whole image. The proposed method preserves in the downscaled image the high-frequency structures present in the input image. The resulting images retain visually important details and do not suffer from edge-blurring artifacts. We demonstrate the effectiveness of our proposed approach with extensive experiments on a large number of images downscaled with various downscaling factors. Comment: 9 pages, 8 figures
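The mechanism described, a co-occurrence table learned from the input image serving as the range kernel of a filter, can be sketched for a 2x fixed-factor case. This is a simplified illustration under stated assumptions (uniform quantization into a few levels, a 2x2 block reduction, the block's quantized mean as the filter center), not the paper's algorithm.

```python
import numpy as np

def cooccurrence(img, levels=8, radius=1):
    """Count how often each quantized intensity pair appears within a
    (2r+1)^2 neighborhood anywhere in the image; frequently co-occurring
    pairs get high similarity, rare pairs (e.g. across edges) get low."""
    q = np.minimum((img * levels).astype(int), levels - 1)
    h, w = q.shape
    C = np.zeros((levels, levels))
    for y in range(h):
        for x in range(w):
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        C[q[y, x], q[yy, xx]] += 1
    return C / C.sum()

def downscale2x(img, C, levels=8):
    """Average each 2x2 block with weights taken from the co-occurrence
    of each pixel with the block's quantized mean (the range kernel)."""
    h, w = img.shape
    out = np.zeros((h // 2, w // 2))
    for y in range(0, h - 1, 2):
        for x in range(0, w - 1, 2):
            block = img[y:y + 2, x:x + 2]
            qc = min(int(block.mean() * levels), levels - 1)
            qb = np.minimum((block * levels).astype(int), levels - 1)
            wgt = C[qc, qb] + 1e-12  # range-kernel weights per pixel
            out[y // 2, x // 2] = np.sum(wgt * block) / wgt.sum()
    return out
```

Because the weights come from image-specific co-occurrence statistics rather than a fixed Gaussian, pixels on the "wrong" side of an edge contribute little, which is how such filters avoid edge blurring.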