458 research outputs found
Multi-scale Sampling and Aggregation Network For High Dynamic Range Imaging
High dynamic range (HDR) imaging is a fundamental problem in image
processing, which aims to generate well-exposed images, even in the presence of
varying illumination in the scenes. In recent years, multi-exposure fusion
methods have achieved remarkable results, which merge multiple low dynamic
range (LDR) images, captured with different exposures, to generate
corresponding HDR images. However, synthesizing HDR images in dynamic scenes is
still challenging and in high demand. There are two challenges in producing HDR
images: (1) object motion between LDR images can easily cause undesirable
ghosting artifacts in the generated results; (2) under- and overexposed regions
often contain distorted image content, because of insufficient compensation for
these regions in the merging stage. In this paper, we propose a multi-scale
sampling and aggregation network for HDR imaging in dynamic scenes. To
effectively alleviate the problems caused by small and large motions, our
method implicitly aligns LDR images by sampling and aggregating
high-correspondence features in a coarse-to-fine manner. Furthermore, we
propose a densely connected network based on discrete wavelet transform for
performance improvement, which decomposes the input into several
non-overlapping frequency subbands and adaptively performs compensation in the
wavelet domain. Experiments show that our proposed method can achieve
state-of-the-art performance across diverse scenes, compared to other promising
HDR imaging methods. In addition, the HDR images generated by our method
contain cleaner and more detailed content, with fewer distortions, leading to
better visual quality.
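The wavelet decomposition mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a single-level 2-D Haar transform on a plain list-of-lists image; the abstract does not say which wavelet the authors use, and the function names `haar_dwt2` and `haar_idwt2` are hypothetical. The point is only that the four subbands do not overlap and together reconstruct the input exactly.

```python
def haar_dwt2(image):
    """One level of a 2-D Haar transform (illustrative; not the paper's network).

    `image` is a list of rows with even height and width. Returns four
    subbands, each half the input size in both directions.
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("height and width must be even")
    ll, lh, hl, hh = [], [], [], []
    for r in range(0, rows, 2):
        ll_row, lh_row, hl_row, hh_row = [], [], [], []
        for c in range(0, cols, 2):
            a, b = image[r][c], image[r][c + 1]          # top-left, top-right
            d, e = image[r + 1][c], image[r + 1][c + 1]  # bottom-left, bottom-right
            ll_row.append((a + b + d + e) / 2.0)  # local average (low frequencies)
            lh_row.append((a - b + d - e) / 2.0)  # left-minus-right differences
            hl_row.append((a + b - d - e) / 2.0)  # top-minus-bottom differences
            hh_row.append((a - b - d + e) / 2.0)  # diagonal differences
        ll.append(ll_row)
        lh.append(lh_row)
        hl.append(hl_row)
        hh.append(hh_row)
    return ll, lh, hl, hh


def haar_idwt2(ll, lh, hl, hh):
    """Invert haar_dwt2: rebuild the full-size image from the four subbands."""
    rows, cols = len(ll), len(ll[0])
    out = [[0.0] * (2 * cols) for _ in range(2 * rows)]
    for r in range(rows):
        for c in range(cols):
            s, h, v, g = ll[r][c], lh[r][c], hl[r][c], hh[r][c]
            out[2 * r][2 * c] = (s + h + v + g) / 2.0
            out[2 * r][2 * c + 1] = (s - h + v - g) / 2.0
            out[2 * r + 1][2 * c] = (s + h - v - g) / 2.0
            out[2 * r + 1][2 * c + 1] = (s - h - v + g) / 2.0
    return out
```

Any per-subband compensation would be applied between the two calls; here the subbands are left untouched, so `haar_idwt2(*haar_dwt2(image))` returns the original values.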
Deep Learning That Enables Accurate Imaging Beyond Device Limitations
Tohoku University, doctoral (Information Sciences) thesis
Exposure Fusion for Hand-held Camera Inputs with Optical Flow and PatchMatch
This paper proposes a hybrid synthesis method for fusing multi-exposure images
taken by hand-held cameras. Motion caused either by camera shake or by
dynamic scenes should be compensated before any content fusion. Any
misalignment can easily cause blurring/ghosting artifacts in the fused result.
Our hybrid method can deal with such motions and maintain the exposure
information of each input effectively. In particular, the proposed method first
applies optical flow for a coarse registration, which performs well with
complex non-rigid motion but produces deformations at regions with missing
correspondences. Correspondences go missing because of occlusions caused by
scene parallax or moving content. To correct such registration errors, we
segment the images into superpixels and identify problematic alignments per
superpixel; each problematic superpixel is then realigned by PatchMatch. The method combines
the efficiency of optical flow and the accuracy of PatchMatch. After PatchMatch
correction, we obtain a fully aligned image stack that facilitates a
high-quality fusion that is free from blurring/ghosting artifacts. We compare
our method with existing fusion algorithms on various challenging examples,
including static/dynamic, indoor/outdoor, and daytime/nighttime
scenes. Experimental results demonstrate the effectiveness and robustness of our
method.
Novel image processing algorithms and methods for improving their robustness and operational performance
Image processing algorithms have developed rapidly in recent years. Imaging functions are becoming more common in electronic devices, demanding better image quality and more robust image capture in challenging conditions. Increasingly complicated algorithms are being developed to achieve better signal-to-noise characteristics, more accurate colours, and wider dynamic range, with the aim of approaching the performance of the human visual system. [Continues.]
Use of Coherent Point Drift in computer vision applications
This thesis presents the novel use of Coherent Point Drift (CPD) in improving the robustness of a number of computer vision applications. The CPD approach includes two methods for registering two images, rigid and non-rigid point set registration, which differ in the transformation model used. The key characteristic of a rigid transformation is that the distances between points are preserved up to a uniform scale factor, which means it can be used in the presence of translation, rotation, and scaling. Non-rigid transformations, such as affine transforms, provide the opportunity of registering under non-uniform scaling and skew. The idea is to move one point set coherently to align with the second point set. The CPD method finds both the non-rigid transformation and the correspondence distance between two point sets at the same time, without requiring an a priori declaration of the transformation model used.
The first part of this thesis is focused on speaker identification in video conferencing. A real-time, audio-coupled, video-based approach is presented, which focuses more on the video analysis side than on the audio analysis, which is known to be prone to errors. CPD is effectively utilised for lip movement detection, and a temporal face detection approach is used to minimise false positives if the face detection algorithm fails to perform.
The second part of the thesis is focused on multi-exposure and multi-focus image fusion with compensation for camera shake. The Scale-Invariant Feature Transform (SIFT) is first used to detect keypoints in the images being fused. Subsequently, this point set is reduced to remove outliers using RANSAC (RANdom SAmple Consensus), and finally the point sets are registered using CPD with non-rigid transformations. The registered images are then fused with a Contourlet-based image fusion algorithm that makes use of a novel alpha blending and filtering technique to minimise artefacts. The thesis evaluates the performance of the algorithm in comparison to a number of state-of-the-art approaches, including the key commercial products available in the market at present, showing significantly improved subjective quality in the fused images.
The final part of the thesis presents a novel approach to Vehicle Make & Model Recognition (VMMR) in CCTV video footage. CPD is used to effectively remove the skew of detected vehicles, as CCTV cameras are not specifically configured for the VMMR task and may capture vehicles at different approach angles. A LESH (Local Energy Shape Histogram) feature-based approach is used for vehicle make and model recognition, with the novelty that temporal processing is used to improve reliability. A number of further algorithms are used to maximise the reliability of the final outcome. Experimental results show that the proposed system achieves an accuracy in excess of 95% when tested on real CCTV footage with no prior camera calibration.
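The distinction the abstract above draws between rigid and affine transformation models can be checked numerically. The following is a small illustrative sketch, not CPD itself and not code from the thesis; all function names are hypothetical. It shows that rotation plus translation leaves every pairwise distance unchanged, that a uniform scale multiplies every distance by the same factor, and that a shear (an affine map) changes distances unevenly.

```python
import math


def apply_transform(points, matrix, translation):
    """Apply p -> M p + t to a list of 2-D points given as (x, y) tuples."""
    (a, b), (c, d) = matrix
    tx, ty = translation
    return [(a * x + b * y + tx, c * x + d * y + ty) for x, y in points]


def rotation(theta):
    """2x2 rotation matrix for an angle in radians."""
    return ((math.cos(theta), -math.sin(theta)),
            (math.sin(theta), math.cos(theta)))


def pairwise_distances(points):
    """Euclidean distances between all unordered pairs of points."""
    return [math.hypot(p[0] - q[0], p[1] - q[1])
            for i, p in enumerate(points) for q in points[i + 1:]]


def distance_ratios(original, transformed):
    """Ratio of each pairwise distance after and before a transformation."""
    before = pairwise_distances(original)
    after = pairwise_distances(transformed)
    return [t / o for o, t in zip(before, after)]


triangle = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
rot = rotation(math.radians(30))

# Rotation + translation: every ratio is 1 (distances preserved).
rigid = apply_transform(triangle, rot, (2.0, -1.0))
# Rotation + uniform scale of 2: every ratio is 2 (shape preserved, size not).
scaled_matrix = tuple(tuple(2.0 * v for v in row) for row in rot)
scaled = apply_transform(triangle, scaled_matrix, (2.0, -1.0))
# Shear: ratios differ from pair to pair, so the shape itself changes.
sheared = apply_transform(triangle, ((1.0, 0.5), (0.0, 1.0)), (0.0, 0.0))

print(distance_ratios(triangle, rigid))
print(distance_ratios(triangle, scaled))
print(distance_ratios(triangle, sheared))
```

Comparing the three printed lists makes the difference between the models concrete: only the shear produces ratios that are not all equal.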
DeepFuse: A Deep Unsupervised Approach for Exposure Fusion with Extreme Exposure Image Pairs
We present a novel deep learning architecture for fusing static
multi-exposure images. Current multi-exposure fusion (MEF) approaches use
hand-crafted features to fuse the input sequence. However, the weak hand-crafted
representations are not robust to varying input conditions. Moreover, they
perform poorly for extreme exposure image pairs. Thus, it is highly desirable
to have a method that is robust to varying input conditions and capable of
handling extreme exposure without artifacts. Deep representations have known to
be robust to input conditions and have shown phenomenal performance in a
supervised setting. However, the stumbling block in using deep learning for MEF
was the lack of sufficient training data and an oracle to provide the
ground truth for supervision. To address the above issues, we have gathered a
large dataset of multi-exposure image stacks for training. To circumvent the
need for ground-truth images, we propose an unsupervised deep learning
framework for MEF that uses a no-reference quality metric as the loss function. The
proposed approach uses a novel CNN architecture trained to learn the fusion
operation without a reference ground-truth image. The model fuses a set of common
low level features extracted from each image to generate artifact-free
perceptually pleasing results. We perform extensive quantitative and
qualitative evaluation and show that the proposed technique outperforms
existing state-of-the-art approaches for a variety of natural images. Comment: ICCV 201
Holistic Dynamic Frequency Transformer for Image Fusion and Exposure Correction
The correction of exposure-related issues is a pivotal component in enhancing
the quality of images, offering substantial implications for various computer
vision tasks. Most existing methods rely predominantly on spatial-domain
recovery and give limited consideration to the potential
of the frequency domain. Additionally, there has been a lack of a unified
perspective towards low-light enhancement, exposure correction, and
multi-exposure fusion, complicating and impeding the optimization of image
processing. In response to these challenges, this paper proposes a novel
methodology that leverages the frequency domain to improve and unify the
handling of exposure correction tasks. Our method introduces Holistic Frequency
Attention and Dynamic Frequency Feed-Forward Network, which replace
conventional correlation computation in the spatial domain. They form a
foundational building block that facilitates a U-shaped Holistic Dynamic
Frequency Transformer as a filter to extract global information and dynamically
select important frequency bands for image restoration. Complementing this, we
employ a Laplacian pyramid to decompose images into distinct frequency bands,
followed by multiple restorers, each tuned to recover specific frequency-band
information. The pyramid fusion allows a more detailed and nuanced image
restoration process. Ultimately, our structure unifies the three tasks of
low-light enhancement, exposure correction, and multi-exposure fusion, enabling
comprehensive treatment of all classical exposure errors. Benchmarked on
mainstream datasets for these tasks, our proposed method achieves
state-of-the-art results, paving the way for more sophisticated and unified
solutions in exposure correction.
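The Laplacian pyramid decomposition mentioned in the abstract above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the paper's pipeline: it uses a 2x2 box average for downsampling and pixel replication for upsampling, whereas a Gaussian kernel is the usual choice, and the function names are hypothetical. Image height and width are assumed to be divisible by 2 for every level.

```python
def downsample(img):
    """Halve resolution by averaging each 2x2 block (box filter, for brevity)."""
    return [[(img[r][c] + img[r][c + 1] + img[r + 1][c] + img[r + 1][c + 1]) / 4.0
             for c in range(0, len(img[0]), 2)]
            for r in range(0, len(img), 2)]


def upsample(img):
    """Double resolution by repeating every pixel in both directions."""
    out = []
    for row in img:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def build_laplacian_pyramid(img, levels):
    """Return `levels` detail bands (fine to coarse) plus a low-frequency residual."""
    bands = []
    current = [[float(v) for v in row] for row in img]
    for _ in range(levels):
        smaller = downsample(current)
        predicted = upsample(smaller)
        # Detail band: what the coarser level fails to predict at this scale.
        bands.append([[x - p for x, p in zip(row, prow)]
                      for row, prow in zip(current, predicted)])
        current = smaller
    bands.append(current)  # coarsest level holds the remaining low frequencies
    return bands


def reconstruct(bands):
    """Collapse the pyramid: upsample the coarse level and add back each band."""
    current = bands[-1]
    for band in reversed(bands[:-1]):
        up = upsample(current)
        current = [[b + u for b, u in zip(brow, urow)]
                   for brow, urow in zip(band, up)]
    return current
```

In a band-wise restoration scheme, each element of `bands` would be processed separately before `reconstruct` is called; with unmodified bands the reconstruction returns the input.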