High Dynamic Range Imaging with Context-aware Transformer
Avoiding the introduction of ghosts when synthesising low dynamic range (LDR) images into a high dynamic range (HDR) image is a challenging task. Convolutional neural networks (CNNs) are generally effective for HDR ghost removal, but they struggle with LDR images that contain large motions or over-/under-saturation. Existing dual-branch methods that combine a CNN and a Transformer omit part of the information from non-reference images, and the features extracted by the CNN-based branch are bound to the kernel size and its small receptive field, which hinders deblurring and the recovery of over-/under-saturated regions. In this paper, we propose a novel hierarchical dual Transformer method for ghost-free HDR image generation (HDT-HDR), which extracts global and local features simultaneously.
First, we use a CNN-based head with spatial attention mechanisms to extract
features from all the LDR images. Second, the LDR features are delivered to the
Hierarchical Dual Transformer (HDT). In each Dual Transformer (DT), the global
features are extracted by the window-based Transformer, while the local details
are extracted using the channel attention mechanism with deformable CNNs.
Finally, the ghost-free HDR image is obtained by dimensional mapping of the HDT output. Extensive experiments demonstrate that HDT-HDR achieves state-of-the-art performance among existing HDR ghost-removal methods.
Comment: 8 pages, 5 figures
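As a rough illustration of what the window-based Transformer branch does, the sketch below (hypothetical names, NumPy only, no learned query/key/value projections) partitions a feature map into non-overlapping windows and applies plain scaled dot-product attention inside each window:

```python
import numpy as np

def window_partition(x, win):
    """Split an (H, W, C) feature map into non-overlapping (win*win, C) token sets."""
    H, W, C = x.shape
    assert H % win == 0 and W % win == 0
    x = x.reshape(H // win, win, W // win, win, C)
    # (num_windows, win*win, C): each window becomes a short token sequence
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, win * win, C)

def self_attention(tokens):
    """Scaled dot-product attention within each window (no projections)."""
    d = tokens.shape[-1]
    scores = tokens @ tokens.transpose(0, 2, 1) / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ tokens                        # convex mix of window tokens

feat = np.random.rand(8, 8, 4)              # toy feature map
windows = window_partition(feat, win=4)     # -> (4, 16, 4)
out = self_attention(windows)
```

A real block would add projections, multiple heads, and a reverse of the partition; the point here is only that attention is restricted to each window, which keeps the cost linear in image size.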
Mutual-Guided Dynamic Network for Image Fusion
Image fusion aims to generate a high-quality image from multiple images
captured under varying conditions. The key problem of this task is to preserve
complementary information while filtering out irrelevant information for the
fused result. However, existing methods address this problem with static convolutional neural networks (CNNs), which suffer from two inherent limitations during feature extraction: they cannot handle spatially variant content, and they lack guidance from the multiple inputs. In this paper, we propose a
novel mutual-guided dynamic network (MGDN) for image fusion, which allows for
effective information utilization across different locations and inputs.
Specifically, we design a mutual-guided dynamic filter (MGDF) for adaptive
feature extraction, composed of a mutual-guided cross-attention (MGCA) module
and a dynamic filter predictor, where the former incorporates additional
guidance from different inputs and the latter generates spatial-variant kernels
for different locations. In addition, we introduce a parallel feature fusion
(PFF) module to effectively fuse local and global information of the extracted
features. To further reduce the redundancy among the extracted features while
simultaneously preserving their shared structural information, we devise a
novel loss function that combines the minimization of normalized mutual
information (NMI) with an estimated gradient mask. Experimental results on five
benchmark datasets demonstrate that our proposed method outperforms existing
methods on four image fusion tasks. The code and model are publicly available at: https://github.com/Guanys-dar/MGDN.
Comment: ACMMM 2023 accepted
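The output of a dynamic filter predictor can be thought of as one kernel per pixel. Below is a minimal, hypothetical sketch (NumPy, 3x3 kernels assumed; the learned predictor network itself is omitted) of applying such spatially variant kernels:

```python
import numpy as np

def dynamic_filter(image, kernels):
    """Apply a different 3x3 kernel at every pixel (spatially variant convolution).

    image:   (H, W) input
    kernels: (H, W, 9) per-pixel kernels, e.g. produced by a filter predictor
    """
    H, W = image.shape
    padded = np.pad(image, 1, mode="edge")
    out = np.empty_like(image)
    for i in range(H):
        for j in range(W):
            patch = padded[i:i + 3, j:j + 3].ravel()
            out[i, j] = patch @ kernels[i, j]
    return out

# Identity kernels (1 at the centre tap) must reproduce the input exactly.
img = np.random.rand(6, 6)
ident = np.zeros((6, 6, 9))
ident[:, :, 4] = 1.0
filtered = dynamic_filter(img, ident)
```

Unlike a static CNN layer, the kernel here can vary with location (and, in MGDN, with guidance from the other input), which is what allows spatially variant content to be handled.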
Alignment-free HDR Deghosting with Semantics Consistent Transformer
High dynamic range (HDR) imaging aims to retrieve information from multiple
low-dynamic range inputs to generate realistic output. The essence is to
leverage the contextual information, including both dynamic and static
semantics, for better image generation. Existing methods often focus on the
spatial misalignment across input frames caused by the foreground and/or camera
motion. However, no existing research jointly leverages the dynamic and static context. To address this problem, we propose a novel alignment-free network built on a Semantics Consistent Transformer (SCTNet), which incorporates both spatial and channel attention modules. The spatial
attention aims to deal with the intra-image correlation to model the dynamic
motion, while the channel attention enables the inter-image intertwining to
enhance the semantic consistency across frames. Aside from this, we introduce a
novel realistic HDR dataset with more variations in foreground objects,
environmental factors, and larger motions. Extensive comparisons on both
conventional datasets and ours validate the effectiveness of our method,
achieving the best trade-off between performance and computational cost.
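The channel-attention side of such a design can be sketched in squeeze-and-excitation style: pool each channel to a scalar, pass the result through a small bottleneck, and reweight the channels (here over features from all frames stacked along the channel axis). This is a hypothetical NumPy illustration with random stand-in weights, not the SCTNet implementation:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def channel_attention(feats, w1, w2):
    """Squeeze-and-excitation-style channel attention.

    feats: (C, H, W) feature maps (channels from all frames concatenated)
    w1, w2: bottleneck projection matrices (random stand-ins here)
    """
    squeeze = feats.mean(axis=(1, 2))                     # global average pool -> (C,)
    excite = sigmoid(w2 @ np.maximum(w1 @ squeeze, 0.0))  # bottleneck MLP -> (C,) in (0, 1)
    return feats * excite[:, None, None]                  # reweight each channel

rng = np.random.default_rng(0)
feats = rng.random((8, 4, 4))
w1, w2 = rng.standard_normal((2, 8)), rng.standard_normal((8, 2))
out = channel_attention(feats, w1, w2)
```

Because the gate is computed from all channels jointly, information from one frame's features can suppress or amplify another's, which is the "inter-image intertwining" the abstract refers to.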
Locally Non-rigid Registration for Mobile HDR Photography
Image registration for stack-based HDR photography is challenging. If not
properly accounted for, camera motion and scene changes result in artifacts in
the composite image. Unfortunately, existing methods to address this problem
are either accurate, but too slow for mobile devices, or fast, but prone to
failing. We propose a method that fills this void: our approach is extremely fast (under 700 ms on a commercial tablet for a pair of 5 MP images) and prevents the artifacts that arise from insufficient registration quality.
A robust patch-based synthesis framework for combining inconsistent images
Current methods for combining different images produce visible artifacts when the sources have very different textures and structures, come from distant viewpoints, or capture dynamic scenes with motion. In this thesis, we propose a patch-based synthesis algorithm that plausibly combines images with color, texture, structural, and geometric inconsistencies. For applications such as cloning and stitching, where a gradual blend is required, we present a new method for synthesizing a transition region between two source images so that inconsistent properties change gradually from one source to the other. We call this process image melding.

For gradual blending, we generalize the patch-based optimization framework with three key extensions. First, we enrich the patch search space with additional geometric and photometric transformations. Second, we integrate image gradients into the patch representation and replace the usual color averaging with a screened Poisson equation solver. Third, we propose a new energy based on mixed L2/L0 norms for colors and gradients that produces a gradual transition between sources without sacrificing texture sharpness. Together, these generalizations enable patch-based solutions to a broad class of image melding problems involving inconsistent sources: object cloning, stitching challenging panoramas, hole filling from multiple photos, and image harmonization.

We also demonstrate another application that requires addressing inconsistencies across images: high dynamic range (HDR) reconstruction from sequential exposures. Here, results suffer from objectionable artifacts in dynamic scenes if the inconsistencies caused by significant scene motion are not handled properly. We propose a new approach to HDR reconstruction that uses information in all exposures while being more robust to motion than previous techniques.

Our algorithm is based on a novel patch-based energy-minimization formulation that integrates alignment and reconstruction in a joint optimization through an equation we call the HDR image synthesis equation. This allows us to produce an HDR result that is aligned to one of the exposures yet contains information from all of them. These two applications (image melding and HDR reconstruction) show that patch-based methods like the one proposed in this dissertation can address inconsistent images and could open the door to many new image editing applications.
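The screened Poisson step mentioned above trades fidelity to target colors against fidelity to target gradients. A minimal 1D sketch (an illustration of the idea, not the thesis implementation) solves (lam*I + D^T D) x = lam*c + D^T g, where D is the forward-difference operator:

```python
import numpy as np

def screened_poisson_1d(colors, grads, lam):
    """Blend target colours c with target gradients g by minimizing
    lam*||x - c||^2 + ||Dx - g||^2, i.e. solving (lam*I + D^T D) x = lam*c + D^T g."""
    n = len(colors)
    D = np.zeros((n - 1, n))
    for i in range(n - 1):
        D[i, i], D[i, i + 1] = -1.0, 1.0   # forward difference
    A = lam * np.eye(n) + D.T @ D
    b = lam * colors + D.T @ grads
    return np.linalg.solve(A, b)

# Large lam -> solution follows the colours; small lam -> follows the gradients.
c = np.array([0.0, 1.0, 2.0, 3.0])
g = np.zeros(3)                      # ask for a flat (zero-gradient) result
x = screened_poisson_1d(c, g, lam=1e6)
```

In 2D the same normal equations arise with the image Laplacian in place of D^T D, and the screening weight lam plays the same role of balancing color data against gradient data.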
The state of the art in HDR image deghosting and an objective deghosting quality metric for HDR images
Despite the emergence of new HDR acquisition methods, the multiple exposure technique (MET) is still the most popular one. Applying MET to dynamic scenes is challenging due to the diversity of motion patterns and uncontrollable factors such as sensor noise, scene occlusion, and performance constraints on platforms with limited computational capability. More than 50 deghosting algorithms have already been proposed for artifact-free HDR imaging of dynamic scenes, and this number is expected to grow. Given the large number of algorithms, conducting subjective experiments to benchmark newly proposed ones is difficult and time-consuming. In this thesis, first, a taxonomy of HDR deghosting methods and the key characteristics of each group of algorithms are introduced. Next, the artifacts frequently observed in the outputs of HDR deghosting algorithms are defined, and an objective HDR image deghosting quality metric is presented. The proposed metric is found to correlate well with human preferences and may be used as a reference for benchmarking current and future HDR image deghosting algorithms.
Ph.D. - Doctoral Program
Assessment of multi-exposure HDR image deghosting methods
© 2017 Elsevier Ltd. To avoid motion artefacts when merging multiple exposures into a high dynamic range image, a number of HDR deghosting algorithms have been proposed. However, these algorithms do not work equally well on all types of scenes, and some may even introduce additional artefacts. As the number of proposed deghosting methods grows rapidly, there is an immediate need to evaluate and compare them. Even though subjective evaluation provides a reliable means of testing, it is cumbersome and must be repeated for each newly proposed method, or even a slight modification of one. There is therefore a need for objective quality metrics that provide automatic evaluation of HDR deghosting algorithms. In this work, we explore several computational approaches to the quantitative evaluation of multi-exposure HDR deghosting algorithms and demonstrate their results on five state-of-the-art algorithms. To perform a comprehensive evaluation, a new dataset of 36 scenes has been created, each scene posing a different challenge for a deghosting algorithm. The quality of the HDR images produced by each deghosting method is measured in a subjective experiment and then evaluated using objective metrics. As this paper extends our conference paper, we add one more objective quality metric, UDQM, to the evaluation, and analyse the objective and subjective experiments more extensively. Testing the correlation between objective metrics and subjective scores shows that, among the tested metrics, HDR-VDP-2 is the most reliable for evaluating HDR deghosting algorithms. The results also show that, for most of the tested scenes, Sen et al.'s deghosting method outperforms the other evaluated methods.
These observations can serve as a vital guide in the development of new HDR deghosting algorithms that are robust to a variety of scenes and produce high-quality results.
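Correlation between an objective metric and subjective scores is typically measured on ranks. A small self-contained sketch (toy numbers, assuming no tied scores; a full evaluation would use a library routine such as SciPy's `spearmanr`):

```python
import numpy as np

def spearman_rho(a, b):
    """Spearman rank correlation between two score lists (assumes no ties)."""
    ra = np.argsort(np.argsort(a)).astype(float)   # ranks of a
    rb = np.argsort(np.argsort(b)).astype(float)   # ranks of b
    ra -= ra.mean()
    rb -= rb.mean()
    return float((ra @ rb) / np.sqrt((ra @ ra) * (rb @ rb)))

subjective = np.array([2.1, 3.4, 1.2, 4.8, 3.9])   # toy mean opinion scores
metric     = np.array([0.30, 0.55, 0.10, 0.90, 0.70])  # toy objective scores
rho = spearman_rho(metric, subjective)             # 1.0 here: identical ranking
```

A metric whose rho over many scenes stays close to 1 ranks algorithms the same way human observers do, which is the sense in which HDR-VDP-2 is reported as the most reliable of the tested metrics.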
Robust estimation of exposure ratios in multi-exposure image stacks
Merging multi-exposure image stacks into a high dynamic range (HDR) image
requires knowledge of accurate exposure times. When exposure times are
inaccurate, for example, when they are extracted from a camera's EXIF metadata,
the reconstructed HDR images reveal banding artifacts at smooth gradients. To
remedy this, we propose to estimate exposure ratios directly from the input
images. We derive the exposure time estimation as an optimization problem, in
which pixels are selected from pairs of exposures to minimize estimation error
caused by camera noise. When pixel values are represented in the logarithmic
domain, the problem can be solved efficiently using a linear solver. We
demonstrate that the estimation can be easily made robust to pixel misalignment
caused by camera or object motion by collecting pixels from multiple spatial
tiles. The proposed automatic exposure estimation and alignment eliminates
banding artifacts in popular datasets and is essential for applications that
require physically accurate reconstructions, such as measuring the modulation
transfer function of a display. The code for the method is available.
Comment: 11 pages, 11 figures, journal
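The core observation can be sketched compactly: for static, well-exposed pixels in two linear images, log(long) - log(short) is constant and equals the log of the exposure ratio. The toy sketch below (synthetic data; function and parameter names are hypothetical) recovers the ratio with a robust average of log differences, whereas the paper formulates it as a least-squares problem over exposure pairs solved with a linear solver:

```python
import numpy as np

def estimate_exposure_ratio(img_short, img_long, lo=0.05, hi=0.95):
    """Estimate the exposure-time ratio between two linear images.

    In the log domain, static well-exposed pixels satisfy
    log(img_long) - log(img_short) = log(ratio)."""
    a, b = img_short.ravel(), img_long.ravel()
    mask = (a > lo) & (a < hi) & (b > lo) & (b < hi)  # reject clipped/noisy pixels
    diff = np.log(b[mask]) - np.log(a[mask])
    return float(np.exp(np.median(diff)))             # median resists misaligned pixels

rng = np.random.default_rng(1)
scene = rng.uniform(0.01, 0.25, size=(64, 64))        # linear radiance, toy scene
short, long_ = scene, np.clip(scene * 4.0, 0.0, 1.0)  # true exposure ratio: 4
ratio = estimate_exposure_ratio(short, long_)
```

Using the estimated ratio instead of the EXIF exposure times when merging the stack is what removes the banding described in the abstract; the paper additionally samples pixels from multiple spatial tiles so the estimate stays robust under camera or object motion.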