A Perceptually Optimized and Self-Calibrated Tone Mapping Operator
With the increasing popularity and accessibility of high dynamic range (HDR)
photography, tone mapping operators (TMOs) for dynamic range compression are
practically demanding. In this paper, we develop a two-stage neural
network-based TMO that is self-calibrated and perceptually optimized. In Stage
one, motivated by the physiology of the early stages of the human visual
system, we first decompose an HDR image into a normalized Laplacian pyramid. We
then use two lightweight deep neural networks (DNNs), taking the normalized
representation as input and estimating the Laplacian pyramid of the
corresponding LDR image. We optimize the tone mapping network by minimizing the
normalized Laplacian pyramid distance (NLPD), a perceptual metric aligning with
human judgments of tone-mapped image quality. In Stage two, the input HDR image
is self-calibrated to compute the final LDR image. We feed the same HDR image
but rescaled with different maximum luminances to the learned tone mapping
network, and generate a pseudo-multi-exposure image stack with different detail
visibility and color saturation. We then train another lightweight DNN to fuse
the LDR image stack into a desired LDR image by maximizing a variant of the
structural similarity index for multi-exposure image fusion (MEF-SSIM), which
has been proven perceptually relevant to fused image quality. The proposed
self-calibration mechanism through MEF enables our TMO to accept uncalibrated
HDR images, while being physiology-driven. Extensive experiments show that our
method produces images with consistently better visual quality. Additionally,
since our method builds upon three lightweight DNNs, it is among the fastest
local TMOs.
Comment: 20 pages, 18 figures
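The divisive normalization at the heart of Stage one can be sketched as follows. This is an illustrative reconstruction, not the authors' code; the blur kernel, level count, and stabilizing constant `eps` are all assumptions:

```python
import numpy as np

def gaussian_blur(img, sigma=1.0):
    """Separable Gaussian blur via 1-D convolutions (illustrative only)."""
    radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    k /= k.sum()
    img = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 0, img)
    img = np.apply_along_axis(lambda r: np.convolve(r, k, mode="same"), 1, img)
    return img

def normalized_laplacian_pyramid(img, levels=3, eps=1e-2):
    """Each band-pass level is divided by a local luminance estimate,
    mimicking the divisive normalization of early human vision."""
    pyramid = []
    current = img.astype(np.float64)
    for _ in range(levels):
        low = gaussian_blur(current)
        bandpass = current - low
        pyramid.append(bandpass / (np.abs(low) + eps))  # divisive normalization
        current = low[::2, ::2]                         # downsample for next octave
    pyramid.append(current)                             # residual low-pass
    return pyramid
```

The NLPD metric the paper optimizes is then a distance computed between such pyramids of the HDR input and the tone-mapped output.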
Multimodal enhancement-fusion technique for natural images.
Masters Degree. University of KwaZulu-Natal, Durban. This dissertation presents a multimodal enhancement-fusion (MEF) technique for natural images. The MEF is expected to contribute value to machine vision applications and personal image collections for the human user. Image enhancement techniques and the metrics that are used to assess their performance are prolific, and each is usually optimised for a specific objective. The MEF proposes a framework that adaptively fuses multiple enhancement objectives into a seamless pipeline. Given a segmented input image and a set of enhancement methods, the MEF applies all the enhancers to the image in parallel. The most appropriate enhancement in each image segment is identified, and finally, the differentially enhanced segments are seamlessly fused. To begin with, this dissertation studies targeted contrast enhancement methods and performance metrics that can be utilised in the proposed MEF. It addresses a selection of objective assessment metrics for contrast-enhanced images and determines their relationship with the subjective assessment of human visual systems. This is to identify which objective metrics best approximate human assessment and may therefore be used as an effective replacement for tedious human assessment surveys. A subsequent human visual assessment survey is conducted on the same dataset to ascertain image quality as perceived by a human observer. The interrelated concepts of naturalness and detail were found to be key motivators of human visual assessment. Findings show that when assessing the quality or accuracy of these methods, no single quantitative metric correlates well with human perception of naturalness and detail; however, a combination of two or more metrics may be used to approximate the complex human visual response.
Thereafter, this dissertation proposes the multimodal enhancer that adaptively selects the optimal enhancer for each image segment. MEF focusses on improving chromatic irregularities such as poor contrast distribution. It deploys a concurrent enhancement pathway that subjects an image to multiple image enhancers in parallel, followed by a fusion algorithm that creates a composite image combining the strengths of each enhancement path. The study develops a framework for parallel image enhancement, followed by parallel image assessment and selection, leading to final merging of selected regions from the enhanced set. The output combines desirable attributes from each enhancement pathway to produce a result that is superior to each path taken alone. The study showed that the proposed MEF technique performs well for most image types. MEF is subjectively favourable to a human panel and achieves better performance for objective image quality assessment compared to other enhancement methods.
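The enhance-in-parallel, select-per-segment, then fuse pipeline described above can be sketched minimally as follows. The segment map, the toy enhancer list, and the variance-based quality score are illustrative assumptions, not the dissertation's actual components:

```python
import numpy as np

def parallel_enhance_fuse(img, segments, enhancers, score):
    """Run every enhancer on the whole image, then, per labeled segment,
    keep the candidate whose quality score is highest."""
    candidates = [e(img) for e in enhancers]     # parallel enhancement paths
    out = np.zeros_like(img, dtype=np.float64)
    for label in np.unique(segments):
        mask = segments == label
        best = max(candidates, key=lambda c: score(c[mask]))
        out[mask] = best[mask]                   # fuse the winning segment
    return out

# Toy usage: two hypothetical enhancers and variance as a contrast proxy.
rng = np.random.default_rng(0)
image = rng.random((8, 8))
labels = np.zeros((8, 8), dtype=int)
labels[:, 4:] = 1                                # two segments: left and right
fused = parallel_enhance_fuse(
    image, labels,
    enhancers=[lambda x: x, lambda x: x ** 0.5],
    score=np.var,
)
```

A real implementation would replace the hard per-segment copy with seamless blending at segment boundaries, as the abstract emphasizes.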
High-Brightness Image Enhancement Algorithm
In this paper, we introduce a tone mapping algorithm for processing high-brightness video images. This method can maximally recover the information of high-brightness areas and preserve detailed information. In addition to benchmark data, real-life images from practical applications were used to test the proposed method. The experimental objects were license plates. We reconstructed the image in the RGB channels, and gamma correction was carried out. After that, local linear adjustment was completed through a tone mapping window to restore the detailed information of the high-brightness region. The experimental results showed that our algorithm could clearly restore the details of high-brightness local areas. The processed image conformed to the visual effect observed by human eyes but with higher definition. Compared with other algorithms, the proposed algorithm has advantages in terms of both subjective and objective evaluation. It can fully satisfy the needs of various practical applications.
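The two steps named in the abstract, per-channel gamma correction followed by a windowed linear adjustment, can be sketched as below. This is a toy stand-in under assumed parameters (gamma value, window size, per-window min-max rescaling), not the paper's actual algorithm:

```python
import numpy as np

def gamma_correct(img, gamma=0.45):
    """Per-channel gamma correction for a float image in [0, 1]."""
    return np.clip(img, 0.0, 1.0) ** gamma

def local_tone_map(img, window=8):
    """Linearly rescale each window independently, a crude stand-in for a
    tone mapping window that recovers detail in bright regions."""
    out = np.empty_like(img, dtype=np.float64)
    h, w = img.shape[:2]
    for y in range(0, h, window):
        for x in range(0, w, window):
            block = img[y:y+window, x:x+window]
            lo, hi = block.min(), block.max()
            out[y:y+window, x:x+window] = (block - lo) / (hi - lo + 1e-6)
    return out
```

A real tone mapping window would overlap blocks and smooth the transitions to avoid visible seams.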
High Dynamic Range Imaging by Perceptual Logarithmic Exposure Merging
In this paper we emphasize a similarity between the Logarithmic-Type Image
Processing (LTIP) model and the Naka-Rushton model of the Human Visual System
(HVS). LTIP is a derivation of the Logarithmic Image Processing (LIP), which
further replaces the logarithmic function with a ratio of polynomial functions.
Based on this similarity, we show that it is possible to present a unifying
framework for the High Dynamic Range (HDR) imaging problem, namely that
performing exposure merging under the LTIP model is equivalent to standard
irradiance map fusion. The resulting HDR algorithm is shown to provide high
quality in both subjective and objective evaluations.
Comment: 14 pages, 8 figures. Accepted at the AMCS journal
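The Naka-Rushton response the abstract relates to LTIP, and the idea of merging exposures in the compressed domain, can be sketched as follows. The half-saturation constant and the weighting scheme are illustrative assumptions:

```python
import numpy as np

def naka_rushton(I, sigma=0.5):
    """Naka-Rushton response: a compressive, saturating function of
    luminance, I / (I + sigma), modeling early HVS adaptation."""
    return I / (I + sigma)

def merge_exposures(exposures, weights):
    """Weighted merging performed in the compressed domain; under the LTIP
    view this parallels standard irradiance-map fusion."""
    total = sum(weights)
    return sum(w / total * naka_rushton(e) for w, e in zip(weights, exposures))
```

Note that LTIP proper replaces the logarithm with a ratio of polynomial functions; the Naka-Rushton form above is the simplest member of that family.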
Subjective and Objective Evaluation of Tone-Mapping and De-Ghosting Algorithms
With the increasing importance of high dynamic range (HDR) imaging and the limited availability of
HDR displays and cameras, efficient tone-mapping and de-ghosting techniques are crucial.
However, tone-mapping operators and de-ghosting algorithms tend to introduce distortions in HDR images,
thus making them visually unpleasant on standard displays.
thus making it visually unpleasant in normal displays. Subjective evaluation of images is important
for rating these algorithms as the users should be able to visualize the complete details present in both
the brightly and poorly illuminated regions of the scene. To facilitate a systematic subjective study,
we have created a database of HDR images tone-mapped and de-ghosted using popular algorithms.
We conducted a subjective study of the tone mapped images, computed objective scores by using
some of the state-of-the-art no-reference low dynamic range image quality assessment algorithms
and evaluated their performance. We show that the moderate-to-low correlation between objective
and subjective scores indicates the need to account for human perception when rating tone-mapping
operators and de-ghosting algorithms.
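The correlation between objective and subjective scores reported in such studies is typically a rank correlation. A minimal Spearman implementation (assuming no tied scores, for brevity; a production study would use a library routine that handles ties):

```python
import numpy as np

def spearman(x, y):
    """Spearman rank-order correlation between two score vectors,
    e.g. objective metric scores vs. mean opinion scores (no ties assumed)."""
    rx = np.argsort(np.argsort(x)).astype(float)   # ranks of x
    ry = np.argsort(np.argsort(y)).astype(float)   # ranks of y
    rx -= rx.mean()
    ry -= ry.mean()
    return float((rx @ ry) / np.sqrt((rx @ rx) * (ry @ ry)))
```

Values near +1 or -1 indicate strong monotonic agreement; the moderate-to-low values the study reports are what motivate perception-aware ratings.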
Real-World Image Restoration Using Degradation Adaptive Transformer-Based Adversarial Network
Most existing learning-based image restoration methods heavily rely on paired degraded/non-degraded training datasets that are based on simplistic handcrafted degradation assumptions. These assumptions often involve a limited set of degradations, such as Gaussian blurs, noises, and bicubic downsampling. However, when these methods are applied to real-world images, there is a significant decrease in performance due to the discrepancy between synthetic and realistic degradation. Additionally, they lack the flexibility to adapt to unknown degradations in practical scenarios, which limits their generalizability to complex and unconstrained scenes.
To address the absence of image pairs, recent studies have proposed Generative Adversarial Network (GAN)-based unpaired methods. Nevertheless, unpaired learning models based on convolution operations encounter challenges in capturing long-range pixel dependencies in real-world images. This limitation stems from their reliance on convolution operations, which offer local connectivity and translation equivariance but struggle to capture global dependencies due to their limited receptive field.
To address these challenges, this dissertation proposes an innovative unpaired image restoration basic model along with an advanced model. The proposed basic model is the DA-CycleGAN model, which is based on the CycleGAN [1] neural network and specifically designed for blind real-world Single Image Super-Resolution (SISR). DA-CycleGAN incorporates a degradation adaptive (DA) module to learn various real-world degradations (such as noise and blur patterns) in an unpaired manner, enabling flexible adaptation. Additionally, an advanced model called Trans-CycleGAN was designed, which integrates the Transformer architecture into CycleGAN to leverage its global connectivity. This combination allows for image-to-image translation using CycleGAN [1] while enabling the Transformer to model global connectivity across long-range pixels. Extensive experiments conducted on realistic images demonstrate the superior performance of the proposed method in solving real-world image restoration problems, resulting in clearer and finer details.
Overall, this dissertation presents a novel unpaired image restoration basic model and an advanced model that effectively address the limitations of existing approaches. The proposed approach achieves significant advancements in handling real-world degradations and modeling long-range pixel dependencies, thereby offering substantial improvements in image restoration tasks.
Index Terms— Cross-domain translation, generative adversarial network, image restoration, super-resolution, transformer, unpaired training
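The unpaired training that CycleGAN-style models rely on hinges on a cycle-consistency loss. A minimal sketch (the generator arguments `G` and `F` here are placeholders for the two translation networks, not the dissertation's actual models):

```python
import numpy as np

def cycle_consistency_loss(x, y, G, F):
    """L1 cycle loss from CycleGAN-style unpaired training: F(G(x)) should
    reconstruct x (forward cycle) and G(F(y)) should reconstruct y (backward)."""
    forward = np.mean(np.abs(F(G(x)) - x))
    backward = np.mean(np.abs(G(F(y)) - y))
    return float(forward + backward)
```

This constraint is what lets the model learn a degraded-to-clean mapping without any paired degraded/non-degraded examples.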
Contrast enhancement and exposure correction using a structure-aware distribution fitting
Contrast enhancement and exposure correction are useful in domestic and technical applications, the latter as a preprocessing step for other techniques or for aiding human observation. Often, a locally adaptive transformation is more suitable for the task than a global transformation. For example, objects and regions may have very different levels of illumination, physical phenomena may compromise the contrast in some regions but not in others, or it may be desirable to have high visibility of detail in all parts of the image. For such cases, local image enhancement methods are preferable. Although there are many contrast enhancement and exposure correction methods available in the literature, there is no definitive solution that provides a satisfactory result in all situations, and new methods emerge each year. In particular, traditional methods based on adaptive histogram equalization suffer from checkerboard and staircase effects and from over-enhancement. This dissertation proposes a method for contrast enhancement and exposure correction in images named Structure-Aware Distribution Stretching (SADS). The method fits a parametric probability distribution model to the image regionally while respecting the image structure and the edges between regions. This is done using regional versions of the classical expressions for estimating the parameters of the distribution, obtained by replacing the sample mean in the original expressions with an edge-preserving smoothing filter.
After fitting the distribution, the cumulative distribution function (CDF) of the adjusted model and the inverse of the CDF of the desired distribution are applied. A structure-aware heuristic that detects smooth regions is proposed and used to attenuate the transformations in flat regions. SADS was compared with other methods from the literature using objective no-reference and full-reference image quality assessment (IQA) metrics in the tasks of simultaneous contrast enhancement and exposure correction and in the task of defogging/dehazing. The experiments indicate a superior overall performance of SADS with respect to the compared methods for the image sets used, according to the IQA metrics adopted.
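The fit-CDF-then-inverse-CDF step can be sketched in its global special case. The exponential model is an assumption for illustration (SADS estimates parameters regionally with an edge-preserving filter, which this sketch omits):

```python
import numpy as np

def distribution_stretch(values, target_mean=0.5):
    """Global special case of SADS-style stretching: fit an exponential model
    (rate = 1/mean), map samples through its CDF, then through the inverse CDF
    of the target exponential distribution."""
    rate_src = 1.0 / (values.mean() + 1e-9)      # fitted source rate parameter
    u = 1.0 - np.exp(-rate_src * values)         # source CDF -> approx. uniform
    u = np.clip(u, 0.0, 1.0 - 1e-9)              # guard the log below
    rate_dst = 1.0 / target_mean                 # desired distribution's rate
    return -np.log1p(-u) / rate_dst              # inverse CDF of the target
```

Replacing `values.mean()` with an edge-preserving local mean turns this global mapping into a regionally adaptive one, which is the core idea of SADS.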
Implicit Neural Representation for Cooperative Low-light Image Enhancement
The following three factors restrict the application of existing low-light
image enhancement methods: unpredictable brightness degradation and noise,
inherent gap between metric-favorable and visual-friendly versions, and the
limited paired training data. To address these limitations, we propose an
implicit Neural Representation method for Cooperative low-light image
enhancement, dubbed NeRCo. It robustly recovers perceptual-friendly results in
an unsupervised manner. Concretely, NeRCo unifies the diverse degradation
factors of real-world scenes with a controllable fitting function, leading to
better robustness. In addition, for the output results, we introduce
semantically oriented supervision with priors from a pre-trained
vision-language model. Instead of merely following reference images, it
encourages results to meet subjective expectations, finding more
visually friendly solutions. Further, to ease the reliance on paired data and
reduce solution space, we develop a dual-closed-loop constrained enhancement
module. It is trained cooperatively with other affiliated modules in a
self-supervised manner. Finally, extensive experiments demonstrate the
robustness and superior effectiveness of our proposed NeRCo. Our code is
available at https://github.com/Ysz2022/NeRCo