FastLLVE: Real-Time Low-Light Video Enhancement with Intensity-Aware Lookup Table
Low-Light Video Enhancement (LLVE) has received considerable attention in
recent years. One of the critical requirements of LLVE is inter-frame
brightness consistency, which is essential for maintaining the temporal
coherence of the enhanced video. However, most existing single-image-based
methods fail to address this issue, resulting in a flickering effect that
degrades the overall quality after enhancement. Moreover, 3D Convolutional
Neural Network (CNN)-based methods, which are designed for video to maintain
inter-frame consistency, are computationally expensive, making them impractical
for real-time applications. To address these issues, we propose an efficient
pipeline named FastLLVE that leverages the Look-Up-Table (LUT) technique to
maintain inter-frame brightness consistency effectively. Specifically, we
design a learnable Intensity-Aware LUT (IA-LUT) module for adaptive
enhancement, which addresses the low-dynamic-range problem in low-light scenarios.
This enables FastLLVE to perform low-latency and low-complexity enhancement
operations while maintaining high-quality results. Experimental results on
benchmark datasets demonstrate that our method achieves the State-Of-The-Art
(SOTA) performance in terms of both image quality and inter-frame brightness
consistency. More importantly, our FastLLVE can process 1,080p videos faster
than SOTA CNN-based methods in inference time, making it a promising solution
for real-time applications. The code is available at
https://github.com/Wenhao-Li-777/FastLLVE.
Comment: 11 pages, 9 figures, and 6 tables. Accepted by ACMMM 2023.
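The speed of the pipeline comes from the LUT formulation: once the IA-LUT is
predicted, enhancement reduces to a per-pixel table lookup rather than
per-pixel network inference. The NumPy sketch below illustrates only that
lookup step; it is not the authors' implementation, and it assumes the IA-LUT
is a dense 4D table indexed by quantized (R, G, B, intensity) values with
nearest-neighbor lookup, whereas the learned module is trained end-to-end and
interpolated smoothly.

```python
import numpy as np

def apply_ia_lut(frame, lut):
    """Enhance one frame with an intensity-aware lookup table.

    frame: float array of shape (H, W, 3), values in [0, 1].
    lut:   float array of shape (S, S, S, S, 3), mapping quantized
           (R, G, B, intensity) bins to enhanced RGB values.
    Nearest-neighbor lookup is used here for brevity; a learned
    IA-LUT would be interpolated smoothly between bins.
    """
    size = lut.shape[0]
    # Per-pixel intensity serves as the fourth lookup dimension.
    intensity = frame.mean(axis=-1)
    # Quantize each channel and the intensity to LUT bin indices.
    idx = np.clip((frame * (size - 1)).round().astype(int), 0, size - 1)
    i_idx = np.clip((intensity * (size - 1)).round().astype(int), 0, size - 1)
    return lut[idx[..., 0], idx[..., 1], idx[..., 2], i_idx]

# Identity LUT: every (r, g, b, i) bin maps back to (r, g, b).
S = 9
grid = np.linspace(0.0, 1.0, S)
r, g, b, _ = np.meshgrid(grid, grid, grid, grid, indexing="ij")
identity_lut = np.stack([r, g, b], axis=-1)

frame = np.random.rand(64, 64, 3).astype(np.float32)
enhanced = apply_ia_lut(frame, identity_lut)  # shape (64, 64, 3)
```

Because the lookup costs O(HW) per frame regardless of how the LUT was
produced, the same table can be applied cheaply to every pixel of a 1,080p
frame, which is what makes real-time throughput attainable.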
Image Aesthetics Assessment via Learnable Queries
Image aesthetics assessment (IAA) aims to estimate the aesthetics of images.
Depending on the content of an image, diverse criteria need to be selected to
assess its aesthetics. Existing works utilize pre-trained vision backbones
based on content knowledge to learn image aesthetics. However, training those
backbones is time-consuming and suffers from attention dispersion. Inspired by
learnable queries in vision-language alignment, we propose the Image Aesthetics
Assessment via Learnable Queries (IAA-LQ) approach. It adapts learnable queries
to extract aesthetic features from pre-trained image features obtained from a
frozen image encoder. Extensive experiments on real-world data demonstrate the
advantages of IAA-LQ, beating the best state-of-the-art method by 2.2% and 2.1%
in terms of SRCC and PLCC, respectively.
Comment: This work has been submitted to the IEEE for possible publication.
Copyright may be transferred without notice, after which this version may no
longer be accessible.
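The core mechanism can be sketched compactly: a small set of learnable query
vectors cross-attends to patch features produced by a frozen image encoder,
and only the queries and a lightweight head are trained. The PyTorch module
below is an illustrative sketch under those assumptions; all names,
dimensions, and the pooling and regression head are placeholders rather than
the paper's exact configuration.

```python
import torch
import torch.nn as nn

class LearnableQueryHead(nn.Module):
    """Sketch: aesthetic scoring with learnable queries.

    A fixed set of query vectors cross-attends to patch features
    from a frozen image encoder; the attended queries are pooled
    and regressed to a scalar aesthetic score. Sizes and names
    are illustrative, not the paper's configuration.
    """

    def __init__(self, dim=768, num_queries=16, num_heads=8):
        super().__init__()
        self.queries = nn.Parameter(torch.randn(num_queries, dim) * 0.02)
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.score_head = nn.Sequential(nn.LayerNorm(dim), nn.Linear(dim, 1))

    def forward(self, patch_feats):
        # patch_feats: (B, N, dim) tokens from the frozen encoder.
        q = self.queries.unsqueeze(0).expand(patch_feats.size(0), -1, -1)
        attended, _ = self.cross_attn(q, patch_feats, patch_feats)
        # Pool the attended queries, then regress one score per image.
        return self.score_head(attended.mean(dim=1)).squeeze(-1)

# Usage with stand-in features (e.g., 196 ViT patch tokens of width 768):
feats = torch.randn(2, 196, 768)
scores = LearnableQueryHead()(feats)  # shape: (2,)
```

Keeping the encoder frozen is what avoids the costly backbone training the
abstract criticizes: gradients flow only through the queries, the
cross-attention, and the head.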
DDColor: Towards Photo-Realistic Image Colorization via Dual Decoders
Image colorization is a challenging problem due to multi-modal uncertainty
and high ill-posedness. Directly training a deep neural network usually leads
to incorrect semantic colors and low color richness. While transformer-based
methods can deliver better results, they often rely on manually designed
priors, suffer from poor generalization ability, and introduce color bleeding
effects. To address these issues, we propose DDColor, an end-to-end method with
dual decoders for image colorization. Our approach includes a pixel decoder and
a query-based color decoder. The former restores the spatial resolution of the
image, while the latter utilizes rich visual features to refine color queries,
thus avoiding hand-crafted priors. Our two decoders work together to establish
correlations between color and multi-scale semantic representations via
cross-attention, significantly alleviating the color bleeding effect.
Additionally, a simple yet effective colorfulness loss is introduced to enhance
the color richness. Extensive experiments demonstrate that DDColor achieves
superior performance to existing state-of-the-art works both quantitatively and
qualitatively. The codes and models are publicly available at
https://github.com/piddnad/DDColor.
Comment: ICCV 2023; Code: https://github.com/piddnad/DDColor
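The dual-decoder design can be illustrated with a compact sketch. The module
below is heavily simplified relative to the released DDColor code (one
feature scale, one attention layer, placeholder names and sizes): learnable
color queries are refined by cross-attending to image tokens, a stand-in
pixel decoder produces per-pixel embeddings, and the final ab channels come
from softly assigning each pixel to the query colors.

```python
import torch
import torch.nn as nn

class DualDecoderSketch(nn.Module):
    """Simplified dual-decoder colorization head (illustrative only)."""

    def __init__(self, dim=256, num_queries=100, num_heads=8):
        super().__init__()
        self.color_queries = nn.Parameter(torch.randn(num_queries, dim) * 0.02)
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.to_ab = nn.Linear(dim, 2)            # each query proposes an (a, b) color
        self.pixel_proj = nn.Conv2d(dim, dim, 1)  # stand-in for the pixel decoder

    def forward(self, feats):
        # feats: (B, dim, H, W) features extracted from a grayscale image.
        B, C, H, W = feats.shape
        tokens = feats.flatten(2).transpose(1, 2)                   # (B, H*W, C)
        q = self.color_queries.unsqueeze(0).expand(B, -1, -1)
        queries, _ = self.cross_attn(q, tokens, tokens)             # refined color queries
        pixels = self.pixel_proj(feats).flatten(2).transpose(1, 2)  # (B, H*W, C)
        # Soft-assign every pixel to the query colors via dot-product affinity.
        affinity = torch.softmax(pixels @ queries.transpose(1, 2), dim=-1)
        ab = affinity @ self.to_ab(queries)                         # (B, H*W, 2)
        return ab.transpose(1, 2).reshape(B, 2, H, W)               # predicted ab channels

feats = torch.randn(1, 256, 16, 16)
ab = DualDecoderSketch()(feats)  # (1, 2, 16, 16)
```

Because the queries are refined against the image features rather than fixed
in advance, no hand-crafted color priors are needed, matching the motivation
given in the abstract.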
RSFNet: A White-Box Image Retouching Approach using Region-Specific Color Filters
Retouching images is an essential aspect of enhancing the visual appeal of
photos. Although users often share common aesthetic preferences, their
retouching methods may vary based on their individual preferences. Therefore,
there is a need for white-box approaches that both produce satisfying results
and enable users to conveniently edit their images. Recent white-box
retouching methods rely on cascaded global filters that provide image-level
filter arguments but cannot perform fine-grained retouching. In contrast,
colorists typically employ a divide-and-conquer approach, performing a series
of region-specific fine-grained enhancements when using traditional tools like
DaVinci Resolve. We draw on this insight to develop a white-box framework for
photo retouching using parallel region-specific filters, called RSFNet. Our
model generates filter arguments (e.g., saturation, contrast, hue) and
attention maps of regions for each filter simultaneously. Instead of cascading
filters, RSFNet employs linear summations of filters, allowing for a more
diverse range of filter classes that can be trained more easily. Our
experiments demonstrate that RSFNet achieves state-of-the-art results, offering
satisfying aesthetic appeal and increased user convenience for editable
white-box retouching.
Comment: Accepted by ICCV 2023.
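The difference from cascaded global filters is easiest to see in code. The
sketch below is not the RSFNet implementation: a single hand-written
saturation filter stands in for the learned filter classes, and the filter
arguments and attention masks, which RSFNet predicts with a network, are
passed in directly for illustration.

```python
import torch

def saturation_filter(img, strength):
    """Adjust saturation by blending each pixel with its gray value.

    img: (B, 3, H, W) in [0, 1]; strength: (B, 1, 1, 1) filter argument.
    """
    gray = img.mean(dim=1, keepdim=True)
    return gray + (img - gray) * strength

def region_specific_retouch(img, strengths, masks):
    """Sketch of parallel region-specific filtering.

    Each of K filters edits the whole image with its own argument;
    spatial attention masks then combine the results by linear
    summation instead of cascading the filters sequentially.

    strengths: (B, K, 1, 1, 1) per-filter arguments.
    masks:     (B, K, 1, H, W) region maps summing to 1 over K.
    """
    filtered = torch.stack(
        [saturation_filter(img, strengths[:, k]) for k in range(masks.size(1))],
        dim=1,
    )                                     # (B, K, 3, H, W)
    return (masks * filtered).sum(dim=1)  # linear summation over filters

B, K, H, W = 1, 3, 8, 8
img = torch.rand(B, 3, H, W)
strengths = torch.tensor([0.5, 1.0, 1.5]).view(1, K, 1, 1, 1)
masks = torch.softmax(torch.randn(B, K, 1, H, W), dim=1)  # normalized regions
out = region_specific_retouch(img, strengths, masks)      # (1, 3, 8, 8)
```

Since the filters are summed rather than composed, each one receives
gradients independently of the others, which is consistent with the
abstract's claim that a more diverse set of filter classes becomes easier to
train.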