80 research outputs found

    Real-World Blur Dataset for Learning and Benchmarking Deblurring Algorithms

    Keywords: Computational Photography, Deblurring, Low-level Vision, Datasets and Evaluation
    Numerous learning-based approaches to single image deblurring for camera and object motion blurs have recently been proposed. To generalize such approaches to real-world blurs, large datasets of real blurred images and their ground truth sharp images are essential. However, no such datasets exist yet, so all existing approaches resort to synthetic ones, which leads to failures when deblurring real-world images. In this work, we present a large-scale dataset of real-world blurred images and their corresponding sharp images captured in low-light environments for learning and benchmarking single image deblurring methods. To collect our dataset, we build an image acquisition system to simultaneously capture a geometrically aligned pair of blurred and sharp images, and develop a post-processing method to further align images geometrically and photometrically. We analyze the effect of our post-processing step, and the performance of existing learning-based deblurring methods. Our analysis shows that our dataset significantly improves deblurring quality for real-world low-light images.
    Contents: 1. Introduction; 2. Related Work; 3. Image Acquisition System and Process (3.1 Image Acquisition System; 3.2 Image Acquisition Process); 4. Post-Processing (4.1 Downsampling & Denoising; 4.2 Geometric Alignment; 4.3 Photometric Alignment); 5. Experiments (5.1 Analysis of RealBlur Dataset; 5.2 Benchmark); 6. Conclusion; 7. Appendix; 8. References; 9. Abstract (in Korean)
    Degree: Master

    Depth and IMU aided image deblurring based on deep learning

    Abstract. With the wide usage and spread of camera phones, it becomes necessary to tackle the problem of image blur. Embedding a camera in such small devices obviously implies a small sensor size compared to the sensors in professional cameras such as full-frame Digital Single-Lens Reflex (DSLR) cameras. As a result, the number of photons collected on the image sensor can drop dramatically. To compensate, a long exposure time is needed, but with the slight motions that often happen in handheld devices, image blur is inevitable. Our interest in this thesis is the motion blur that can be caused by camera motion, scene motion (objects moving in the scene), or, more generally, the relative motion between the camera and the scene. We use deep neural network (DNN) models, in contrast to conventional (non-DNN-based) methods, which are computationally expensive and time-consuming. The deblurring process is guided by the scene depth and the records of the camera’s inertial measurement unit (IMU). One of the challenges of adopting DNN solutions is that a relatively large amount of data is needed to train the neural network. Moreover, several hyperparameters need to be tuned, including the network architecture itself. To train our network, we propose a novel method of synthesizing spatially-variant motion blur that considers the depth variations in the scene, which improved results compared with other methods. In addition to the synthetic dataset generation algorithm, a setup for collecting a real dataset of blurry and sharp images is designed. This setup can provide thousands of real blurry and sharp images, which can be of great benefit in DNN training or fine-tuning.

    Selected Topics in Bayesian Image/Video Processing

    In this dissertation, three problems in image deblurring, inpainting and virtual content insertion are solved in a Bayesian framework. Camera shake, motion or defocus during exposure leads to image blur. Single image deblurring has achieved remarkable results by solving a MAP problem, but there is no perfect solution due to inaccurate image priors and estimators. In the first part, a new non-blind deconvolution algorithm is proposed. The image prior is represented by a Gaussian Scale Mixture (GSM) model, which is estimated from non-blurry images as training data. Our experimental results on a total of twelve natural images show that more details are restored than with previous deblurring algorithms. In augmented reality, it is a challenging problem to insert virtual content in video streams by blending it with spatial and temporal information. A generic virtual content insertion (VCI) system is introduced in the second part. To the best of my knowledge, it is the first successful system to insert content on building facades in street view video streams. Without knowing camera positions, the geometry model of a building facade is established using a combined detection and tracking strategy. Moreover, motion stabilization, dynamic registration and color harmonization contribute to the excellent augmented performance of this automatic VCI system. Coding efficiency is an important objective in video coding. In recent years, video coding standards have evolved by adding new tools. However, this requires numerous modifications to complex coding systems. Therefore, it is desirable to consider alternative standard-compliant approaches that do not modify the codec structures. In the third part, an exemplar-based data pruning video compression scheme for intra frames is introduced. Data pruning is used as a pre-processing tool to remove part of the video data before it is encoded. At the decoder, missing data is reconstructed by a sparse linear combination of similar patches. The novelty is to create a patch library to exploit the similarity of patches. The scheme achieves an average 4% bit rate reduction on some high definition videos.

    Neural Global Shutter: Learn to Restore Video from a Rolling Shutter Camera with Global Reset Feature

    Most computer vision systems assume distortion-free images as inputs. The widely used rolling-shutter (RS) image sensors, however, suffer from geometric distortion when the camera and object undergo motion during capture. Extensive research has been conducted on correcting RS distortions. However, most existing work relies heavily on prior assumptions about scenes or motions. Besides, the motion estimation steps are either oversimplified or computationally inefficient due to heavy flow warping, limiting their applicability. In this paper, we investigate using rolling shutter with a global reset feature (RSGR) to restore clean global shutter (GS) videos. This feature enables us to turn the rectification problem into a deblur-like one, removing the need for inaccurate and costly explicit motion estimation. First, we build an optical system that captures paired RSGR/GS videos. Second, we develop a novel algorithm incorporating spatial and temporal designs to correct the spatially-varying RSGR distortion. Third, we demonstrate that existing image-to-image translation algorithms can recover clean GS videos from distorted RSGR inputs, yet our algorithm achieves the best performance thanks to its specific designs. Our rendered results are not only visually appealing but also beneficial to downstream tasks. Compared to the state-of-the-art RS solution, our RSGR solution is superior in both effectiveness and efficiency. Considering that it is easy to realize without changing the hardware, we believe our RSGR solution can potentially replace the RS solution for taking distortion-free videos with low noise and on a low budget.
    Comment: CVPR2022, https://github.com/lightChaserX/neural-global-shutte

    Artificial Intelligence in the Creative Industries: A Review

    This paper reviews the current state of the art in Artificial Intelligence (AI) technologies and applications in the context of the creative industries. A brief background of AI, and specifically Machine Learning (ML) algorithms, is provided, including Convolutional Neural Networks (CNNs), Generative Adversarial Networks (GANs), Recurrent Neural Networks (RNNs) and Deep Reinforcement Learning (DRL). We categorise creative applications into five groups related to how AI technologies are used: i) content creation, ii) information analysis, iii) content enhancement and post production workflows, iv) information extraction and enhancement, and v) data compression. We critically examine the successes and limitations of this rapidly advancing technology in each of these areas. We further differentiate between the use of AI as a creative tool and its potential as a creator in its own right. We foresee that, in the near future, machine learning-based AI will be adopted widely as a tool or collaborative assistant for creativity. In contrast, we observe that the successes of machine learning in domains with fewer constraints, where AI is the 'creator', remain modest. The potential of AI (or its developers) to win awards for its original creations in competition with human creatives is also limited, based on contemporary technologies. We therefore conclude that, in the context of creative industries, maximum benefit from AI will be derived where its focus is human centric -- where it is designed to augment, rather than replace, human creativity.

    Computational Multispectral Endoscopy

    Minimal Access Surgery (MAS) is increasingly regarded as the de-facto approach in interventional medicine for conducting many procedures, owing to the reduced patient trauma and consequently reduced recovery times, complications and costs. However, there are many challenges in MAS that come as a result of viewing the surgical site through an endoscope and interacting with tissue remotely via tools, such as lack of haptic feedback, limited field of view, and variation in imaging hardware. As such, it is important to best utilise the imaging data available to provide a clinician with rich data corresponding to the surgical site. Measuring tissue haemoglobin concentrations can give vital information, such as perfusion assessment after transplantation, visualisation of the health of the blood supply to an organ, and detection of ischaemia. In the area of transplant and bypass procedures, measurements of the tissue perfusion/total haemoglobin (THb) and oxygen saturation (SO2) are used as indicators of organ viability. These measurements are often acquired at multiple discrete points across the tissue using a specialist probe. To acquire measurements across the whole surface of an organ, one can use a specialist camera to perform multispectral imaging (MSI), which optically acquires sequential spectrally band-limited images of the same scene. This data can be processed to provide maps of the THb and SO2 variation across the tissue surface, which could be useful for intra-operative evaluation. When capturing MSI data, a trade-off often has to be made between spectral sensitivity and capture speed. The work in this thesis first explores post-processing blurry MSI data from long-exposure imaging devices. These MSI data are of interest because of the large number of spectral bands that can be captured; the long capture times, however, limit the potential real-time uses for clinicians. Recognising the importance to clinicians of real-time data, the main body of this thesis develops methods for estimating oxy- and deoxy-haemoglobin concentrations in tissue using only monocular and stereo RGB imaging data.

    The state of the art in HDR image deghosting and an objective deghosting quality metric for HDR images

    Despite the emergence of new HDR acquisition methods, the multiple exposure technique (MET) is still the most popular one. Applying MET to dynamic scenes is a challenging task due to the diversity of motion patterns and uncontrollable factors such as sensor noise, scene occlusion and performance concerns on platforms with limited computational capability. More than 50 deghosting algorithms have already been proposed for artifact-free HDR imaging of dynamic scenes, and this number is expected to grow. Due to the large number of algorithms, conducting subjective experiments to benchmark recently proposed algorithms is difficult and time-consuming. In this thesis, first, a taxonomy of HDR deghosting methods and the key characteristics of each group of algorithms are introduced. Next, the artifacts that are frequently observed in the outputs of HDR deghosting algorithms are defined and an objective HDR image deghosting quality metric is presented. The proposed metric is found to correlate well with human preferences, and it may be used as a reference for benchmarking current and future HDR image deghosting algorithms.
    Ph.D. - Doctoral Progra

    Bringing Blurry Images Alive: High-Quality Image Restoration and Video Reconstruction

    Consumer-level cameras are affordable for customers. While they are handy and easy to use, their images and videos are likely to suffer from motion blur, especially under low-light conditions. Moreover, it is rather difficult to take high frame-rate videos due to the hardware limitations of conventional RGB sensors. Therefore, this thesis mainly focuses on restoring high-quality (sharp and high frame-rate) images and videos from low-quality (blurry and low frame-rate) ones for better practical applications. In this thesis, we mainly address the problem of how to restore a sharp image from a blurred stereo video sequence, a blurred RGB-D image, or a single blurred image. Then, by utilizing the faithful information about the motion provided by the blur in the image, we reconstruct high frame-rate and sharp videos based on an event camera, bringing blurry frames alive. Stereo camera systems can provide motion information that helps to remove complex spatially-varying motion blur in dynamic scenes. Given consecutive blurred stereo video frames, we recover the latent images, estimate the 3D scene flow, and segment the multiple moving objects simultaneously. We represent the dynamic scenes with a piece-wise planar model, which exploits the local structure of the scene and can express various dynamic scenes. These three tasks are naturally connected under our model and expressed as the parameter estimation of 3D scene structure and camera motion (structure and motion for the dynamic scenes). To tackle the challenging minimal case of image deblurring, namely single-image deblurring, we first focus on blur caused by camera shake during the exposure time. We propose to jointly estimate the 6 DoF camera motion and remove the non-uniform blur by exploiting their underlying geometric relationships, with a single blurred RGB-D image as input. We formulate our joint deblurring and 6 DoF camera motion estimation as an energy minimization problem solved in an alternating manner. In general cases, we solve the single-image deblurring task by studying the problem in the frequency domain. We show that the auto-correlation of the absolute phase-only image (an image reconstructed only from the phase information of the blurry image) can provide faithful information about the motion (e.g., the motion direction and magnitude) that caused the blur, leading to a new and efficient blur kernel estimation approach. Event cameras are gaining attention because they measure intensity changes (called 'events') with microsecond accuracy. The event camera also allows the simultaneous output of intensity frames. However, these images are captured at a relatively low frame-rate and often suffer from motion blur. A blurred image can be regarded as the integral of a sequence of latent images, while the events indicate the changes between the latent images. Therefore, we model the blur-generation process by associating event data with a latent image. We propose a simple and effective approach, the EDI model, to reconstruct a high frame-rate, sharp video (>1000 fps) from a single blurry frame and its event data. The video generation is based on solving a simple non-convex optimization problem in a single scalar variable. Then, we improve the EDI model by using multiple images and their events to handle flickering effects and noise in the generated video. We also provide a more efficient solver to minimize the proposed energy model. Last, the blurred image and events also contribute to optical flow estimation. We propose an optical flow estimation approach based on a single image and events to unlock their potential applications. In summary, this thesis addresses how to recover sharp images from blurred ones and reconstruct a high temporal resolution video from a single image and events. Our extensive experimental results demonstrate that our proposed methods outperform the state of the art.
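The relation this abstract describes between a blurry frame, the latent sharp frames and the events can be illustrated with a small numerical sketch. It is not the thesis implementation: it assumes a discretized exposure split into N time bins, log-intensity changes proportional to the signed event count, and a contrast threshold `c` that is already known (the abstract instead finds a single scalar by optimization). The function name `edi_reconstruct` and the array layout are hypothetical.

```python
import numpy as np


def edi_reconstruct(blurry, events, c):
    """Recover latent frames from one blurry frame and binned events.

    blurry: (H, W) array of positive intensities (the time-averaged frame).
    events: (N, H, W) array of signed event counts per time bin (assumed layout).
    c:      contrast threshold, assumed known here.
    Returns an (N, H, W) array of latent frames whose mean is `blurry`.
    """
    # Cumulative event count from the start of the exposure to each bin.
    cumulative = np.cumsum(events, axis=0)
    # Each latent frame relative to the reference frame: L_k = L_ref * exp(c * E_k).
    ratio = np.exp(c * cumulative)
    # The blurry frame is the mean of the latent frames, so
    # blurry = L_ref * mean_k(ratio_k); solve for the reference frame.
    latent_ref = blurry / ratio.mean(axis=0)
    return latent_ref[None, :, :] * ratio


# Usage: synthesize a blurry frame from known latent frames, then recover them.
rng = np.random.default_rng(0)
true_ref = rng.uniform(0.2, 1.0, size=(4, 5))
events = rng.integers(-1, 2, size=(8, 4, 5))
c = 0.2
true_frames = true_ref[None] * np.exp(c * np.cumsum(events, axis=0))
blurry = true_frames.mean(axis=0)
frames = edi_reconstruct(blurry, events, c)
```

Under these assumptions the recovery is exact; with real data, noise in the events and an unknown threshold are what make the scalar optimization mentioned in the abstract necessary.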