1,791 research outputs found

    High-ISO long-exposure image denoising based on quantitative blob characterization

    Get PDF
    Blob detection and image denoising are fundamental, sometimes related tasks in computer vision. In this paper, we present a computational method to quantitatively measure blob characteristics using normalized unilateral second-order Gaussian kernels. This method suppresses non-blob structures while yielding a quantitative measurement of the position, prominence and scale of blobs, which can facilitate the tasks of blob reconstruction and blob reduction. Subsequently, we propose a denoising scheme to address high-ISO long-exposure noise, which sometimes spatially shows a blob appearance, employing a blob reduction procedure as a cheap preprocessing for conventional denoising methods. We apply the proposed denoising methods to real-world noisy images as well as standard images that are corrupted by real noise. The experimental results demonstrate the superiority of the proposed methods over state-of-the-art denoising methods

    FastLLVE: Real-Time Low-Light Video Enhancement with Intensity-Aware Lookup Table

    Full text link
    Low-Light Video Enhancement (LLVE) has received considerable attention in recent years. One of the critical requirements of LLVE is inter-frame brightness consistency, which is essential for maintaining the temporal coherence of the enhanced video. However, most existing single-image-based methods fail to address this issue, resulting in flickering effect that degrades the overall quality after enhancement. Moreover, 3D Convolution Neural Network (CNN)-based methods, which are designed for video to maintain inter-frame consistency, are computationally expensive, making them impractical for real-time applications. To address these issues, we propose an efficient pipeline named FastLLVE that leverages the Look-Up-Table (LUT) technique to maintain inter-frame brightness consistency effectively. Specifically, we design a learnable Intensity-Aware LUT (IA-LUT) module for adaptive enhancement, which addresses the low-dynamic problem in low-light scenarios. This enables FastLLVE to perform low-latency and low-complexity enhancement operations while maintaining high-quality results. Experimental results on benchmark datasets demonstrate that our method achieves the State-Of-The-Art (SOTA) performance in terms of both image quality and inter-frame brightness consistency. More importantly, our FastLLVE can process 1,080p videos at 50+\mathit{50+} Frames Per Second (FPS), which is 2×\mathit{2 \times} faster than SOTA CNN-based methods in inference time, making it a promising solution for real-time applications. The code is available at https://github.com/Wenhao-Li-777/FastLLVE.Comment: 11pages, 9 Figures, and 6 Tables. Accepted by ACMMM 202

    Computational Multimedia for Video Self Modeling

    Get PDF
    Video self modeling (VSM) is a behavioral intervention technique in which a learner models a target behavior by watching a video of oneself. This is the idea behind the psychological theory of self-efficacy - you can learn or model to perform certain tasks because you see yourself doing it, which provides the most ideal form of behavior modeling. The effectiveness of VSM has been demonstrated for many different types of disabilities and behavioral problems ranging from stuttering, inappropriate social behaviors, autism, selective mutism to sports training. However, there is an inherent difficulty associated with the production of VSM material. Prolonged and persistent video recording is required to capture the rare, if not existed at all, snippets that can be used to string together in forming novel video sequences of the target skill. To solve this problem, in this dissertation, we use computational multimedia techniques to facilitate the creation of synthetic visual content for self-modeling that can be used by a learner and his/her therapist with a minimum amount of training data. There are three major technical contributions in my research. First, I developed an Adaptive Video Re-sampling algorithm to synthesize realistic lip-synchronized video with minimal motion jitter. Second, to denoise and complete the depth map captured by structure-light sensing systems, I introduced a layer based probabilistic model to account for various types of uncertainties in the depth measurement. Third, I developed a simple and robust bundle-adjustment based framework for calibrating a network of multiple wide baseline RGB and depth cameras
    corecore