
    Implementation of a distributed real-time video panorama pipeline for creating high quality virtual views

    Today, we are continuously looking for more immersive video systems. Such systems, however, require more content, which can be costly to produce. A full panorama, covering the regions of interest, can contain all the required information, but can be difficult to view in its entirety. In this thesis, we discuss a method for creating virtual views from a cylindrical panorama, allowing multiple users to create individual virtual cameras from the same panorama video. We discuss how this method can be used for video delivery, but emphasize the creation of the initial panorama. The panorama must be created in real-time and with very high quality. We design and implement a prototype recording pipeline, installed at a soccer stadium, as part of the Bagadus project. We describe a pipeline capable of producing 4K panorama videos from five HD cameras in real-time, with possibilities for further upscaling. We explain how the cylindrical panorama can be created with minimal computational cost and without visible seams. The cameras of our prototype system record video in the Bayer format, in which each pixel captures only one color component, and we also investigate which debayering algorithms are best suited for recording multiple high-resolution video streams in real-time.
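The debayering step the abstract refers to can be illustrated with the simplest classical approach, bilinear interpolation over an RGGB mosaic. This is a minimal sketch, not the pipeline's actual algorithm; the kernel weights and the hand-rolled convolution helper are standard textbook choices:

```python
import numpy as np

def convolve2d_same(img, k):
    """Tiny 'same'-size 2-D convolution with zero padding (no SciPy needed)."""
    ph, pw = k.shape[0] // 2, k.shape[1] // 2
    padded = np.pad(img, ((ph, ph), (pw, pw)))
    out = np.zeros_like(img)
    for i in range(k.shape[0]):
        for j in range(k.shape[1]):
            out += k[i, j] * padded[i:i + img.shape[0], j:j + img.shape[1]]
    return out

def demosaic_bilinear(raw):
    """Bilinear demosaicing of an RGGB Bayer mosaic (H x W) into RGB.

    Each color plane keeps its sampled pixels and fills the missing ones
    by averaging the available neighbors via a small convolution kernel.
    """
    h, w = raw.shape
    rgb = np.zeros((h, w, 3), dtype=np.float64)
    masks = np.zeros((h, w, 3), dtype=bool)
    masks[0::2, 0::2, 0] = True              # R samples
    masks[0::2, 1::2, 1] = True              # G samples on red rows
    masks[1::2, 0::2, 1] = True              # G samples on blue rows
    masks[1::2, 1::2, 2] = True              # B samples
    # Green is sampled twice as densely, so a plus-shaped kernel suffices;
    # red/blue need the full 3x3 bilinear kernel.
    k_g  = np.array([[0, 1, 0], [1, 4, 1], [0, 1, 0]], dtype=np.float64) / 4.0
    k_rb = np.array([[1, 2, 1], [2, 4, 2], [1, 2, 1]], dtype=np.float64) / 4.0
    for c, k in ((0, k_rb), (1, k_g), (2, k_rb)):
        plane = np.where(masks[..., c], raw, 0.0)
        rgb[..., c] = convolve2d_same(plane, k)
    return rgb
```

Bilinear interpolation is cheap but blurs edges; the real-time constraint discussed in the thesis is exactly the trade-off between such cheap filters and higher-quality, costlier ones.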

    Video and Image Super-Resolution via Deep Learning with Attention Mechanism

    Image demosaicing, image super-resolution and video super-resolution are three important tasks in the color imaging pipeline. Demosaicing deals with the recovery of missing color information and the generation of full-resolution color images from a so-called Color Filter Array (CFA) such as the Bayer pattern. Image super-resolution aims at increasing the spatial resolution and enhancing important structures (e.g., edges and textures) in super-resolved images. Both spatial and temporal dependencies are important to the task of video super-resolution, which has received increasing attention in recent years. Traditional solutions to these three low-level vision tasks lack generalization capability, especially for real-world data. Recently, deep learning methods have achieved great success in vision problems including image demosaicing and image/video super-resolution. Conceptually similar to adaptation in model-based approaches, attention has seen increasing use in deep learning. As a tool to reallocate limited computational resources based on the importance of informative components, the attention mechanism, which includes channel attention, spatial attention, non-local attention, etc., has found successful applications in both high-level and low-level vision tasks. However, to the best of our knowledge, 1) most approaches have studied super-resolution and demosaicing independently; little is known about the potential benefit of formulating a joint demosaicing and super-resolution (JDSR) problem; 2) the attention mechanism has not been studied for spectral channels of color images in the open literature; 3) current approaches for video super-resolution implement deformable-convolution-based frame alignment methods and a naive spatial attention mechanism. How to exploit the attention mechanism in the spectral and temporal domains sets the stage for the research in this dissertation.
In this dissertation, we conduct a systematic study of those issues and make the following contributions: 1) we propose a spatial color attention network (SCAN) designed to jointly exploit the spatial and spectral dependency within color images for the single image super-resolution (SISR) problem. We present a spatial color attention module that calibrates important color information for individual color components from the output feature maps of residual groups. Experimental results have shown that SCAN achieves superior performance in terms of both subjective and objective quality on the NTIRE2019 dataset; 2) we propose two competing end-to-end joint optimization solutions to the JDSR problem: Densely-Connected Squeeze-and-Excitation Residual Network (DSERN) vs. Residual-Dense Squeeze-and-Excitation Network (RDSEN). Experimental results have shown that the enhanced design, RDSEN, can significantly improve both subjective and objective performance over DSERN; 3) we propose a novel deep-learning-based framework, Deformable Kernel Spatial Attention Network (DKSAN), to super-resolve videos with a scale factor as large as 16 (the extreme SR situation). Thanks to the newly designed Deformable Kernel Convolution Alignment (DKC Align) and Deformable Kernel Spatial Attention (DKSA) modules, DKSAN achieves better subjective and objective results when compared with the existing state-of-the-art approach, the enhanced deformable convolutional network (EDVR).
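The channel (squeeze-and-excitation) attention that the DSERN/RDSEN designs build on can be sketched in a few lines. This is a generic illustration of the mechanism, not the dissertation's architecture; the weight matrices `w1` and `w2` are hypothetical fully-connected layer weights with reduction ratio r:

```python
import numpy as np

def se_channel_attention(x, w1, w2):
    """Squeeze-and-Excitation style channel attention on a (C, H, W) map.

    squeeze    : global average pool per channel          -> (C,)
    excitation : FC + ReLU, then FC + sigmoid             -> (C,)
    output     : input rescaled channel-wise by attention -> (C, H, W)
    w1 has shape (C, C//r), w2 has shape (C//r, C).
    """
    s = x.mean(axis=(1, 2))                    # squeeze
    z = np.maximum(s @ w1, 0.0)                # FC + ReLU
    a = 1.0 / (1.0 + np.exp(-(z @ w2)))        # FC + sigmoid, values in (0, 1)
    return x * a[:, None, None]                # recalibrate each channel
```

The dissertation's spatial color attention extends this idea from feature channels to the spectral (R, G, B) components themselves.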

    Identification of image sensors by exploiting photo-response non-uniformity

    This thesis presents a method to identify a camera source by examining the noise inherent to the imaging process of the camera. The noise is caused by the imaging hardware, e.g. the physical properties of the charge-coupled device (CCD), the lens, and the Bayer pattern filter, and is then altered by the algorithms of the imaging pipeline. After the imaging pipeline, the noise can be isolated from an image by calculating the difference between the noisy and denoised images. This noise can be used to form a camera fingerprint by calculating, pixel by pixel, the mean noise of a number of training images from the same camera. The fingerprint can then be used to identify the camera by calculating the correlation coefficient between each camera's fingerprint and the noise of a test image; the image is assigned to the camera with the highest correlation. The key factors affecting recognition accuracy and stability are the denoising algorithm and the number of training images. It was shown that the best results are achieved with 60 training images and a wavelet filter. This thesis evaluates the identification process in four cases. Firstly, between cameras chosen so that each is from a different model. Secondly, between different individual cameras of the same model. Thirdly, between all individual cameras without considering the camera model. Finally, forming a fingerprint from one camera of each model, and then using it to identify the remaining cameras of that model. It was shown that in the first two cases the identification process is feasible, accurate and reasonably stable. In the latter two cases, the identification process failed to achieve sufficient accuracy to be feasible.
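The fingerprint pipeline described above (residual extraction, per-pixel averaging, correlation-based matching) can be sketched as follows. A simple Gaussian blur stands in for the wavelet filter the thesis found to work best; everything else follows the described procedure:

```python
import numpy as np

def noise_residual(img, sigma=1.0):
    """Noise residual = image minus a denoised version (separable Gaussian
    blur here as a stand-in for the thesis's wavelet denoiser)."""
    r = int(3 * sigma)
    t = np.arange(-r, r + 1)
    k = np.exp(-t**2 / (2 * sigma**2)); k /= k.sum()
    blurred = np.apply_along_axis(lambda v: np.convolve(v, k, mode='same'), 0, img)
    blurred = np.apply_along_axis(lambda v: np.convolve(v, k, mode='same'), 1, blurred)
    return img - blurred

def fingerprint(train_imgs):
    """Camera fingerprint = per-pixel mean of the training-image residuals."""
    return np.mean([noise_residual(im) for im in train_imgs], axis=0)

def identify(test_img, fingerprints):
    """Assign the test image to the camera whose fingerprint correlates best
    with the test image's noise residual."""
    res = noise_residual(test_img).ravel()
    scores = {cam: np.corrcoef(res, fp.ravel())[0, 1]
              for cam, fp in fingerprints.items()}
    return max(scores, key=scores.get)
```

Averaging over many training images suppresses image content and per-shot noise while the fixed sensor pattern accumulates, which is why the thesis's accuracy depends on the number of training images.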

    HDR Denoising and Deblurring by Learning Spatio-temporal Distortion Model

    We seek to reconstruct sharp and noise-free high-dynamic range (HDR) video from a dual-exposure sensor that records different low-dynamic range (LDR) information in different pixel columns: odd columns provide low-exposure, sharp, but noisy information; even columns complement this with less noisy, high-exposure, but motion-blurred data. Previous LDR work learns to deblur and denoise (DISTORTED->CLEAN) supervised by pairs of CLEAN and DISTORTED images. Regrettably, capturing DISTORTED sensor readings is time-consuming; moreover, there is a lack of CLEAN HDR videos. We suggest a method to overcome those two limitations. First, we learn a different function instead: CLEAN->DISTORTED, which generates samples containing correlated pixel noise, row and column noise, as well as motion blur, from a small number of CLEAN sensor readings. Second, as there is not enough CLEAN HDR video available, we devise a method to learn from LDR video instead. Our approach compares favorably to several strong baselines, and can boost existing methods when they are re-trained on our data. Combined with spatial and temporal super-resolution, it enables applications such as re-lighting with low noise or blur.
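The CLEAN->DISTORTED idea can be sketched for the dual-exposure layout the abstract describes. This is an illustrative toy model only: the exposure ratio, the Gaussian noise (standing in for the paper's learned correlated pixel/row/column noise), and frame-stack averaging as motion blur are all assumptions, not the paper's learned generator:

```python
import numpy as np

def clean_to_distorted(frames, exposure_ratio=8.0, noise_std=0.02, rng=None):
    """Toy CLEAN -> DISTORTED generator for a dual-exposure column sensor.

    frames : (T, H, W) stack of CLEAN frames in [0, 1]; the middle frame is
             the reference. Odd columns (1st, 3rd, ... -> index 0::2) get a
             short, noisy exposure; even columns get a long exposure
             simulated by averaging the stack (motion blur).
    """
    rng = rng or np.random.default_rng()
    mid = frames[len(frames) // 2]             # reference CLEAN frame
    distorted = np.empty_like(mid)
    # Odd columns: low exposure -> darker, sharp, but noisy.
    low = mid / exposure_ratio
    distorted[:, 0::2] = low[:, 0::2] + rng.normal(0, noise_std,
                                                   low[:, 0::2].shape)
    # Even columns: high exposure -> cleaner, but blurred across frames.
    blurred = np.mean(frames, axis=0)
    distorted[:, 1::2] = blurred[:, 1::2]
    return distorted
```

A network trained on (CLEAN, clean_to_distorted(CLEAN)) pairs then only needs CLEAN footage, which is the point of inverting the usual supervision direction.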

    Digital forensic techniques for the reverse engineering of image acquisition chains

    In recent years a number of new methods have been developed to detect image forgery. Most forensic techniques use footprints left on images to infer the history of those images. Images, however, may have gone through a series of processing and modification steps over their lifetime. It is therefore difficult to detect tampering, as the footprints can be distorted or removed over a complex chain of operations. In this research we propose digital forensic techniques that allow us to reverse engineer and determine the history of images that have gone through chains of image acquisition and reproduction. This thesis presents two approaches to the problem. In the first part we propose a novel theoretical framework for the reverse engineering of signal acquisition chains. Based on a simplified chain model, we describe how signals evolve through the chain at different stages using the theory of sampling signals with finite rate of innovation. Under particular conditions, our technique allows us to detect whether a given signal has been reacquired through the chain. It also makes it possible to estimate important parameters of the chain using the acquisition-reconstruction artefacts left on the signal. The second part of the thesis presents our new algorithm for image recapture detection based on edge blurriness. Two overcomplete dictionaries are trained using the K-SVD approach to learn distinctive blurring patterns from sets of single captured and recaptured images. An SVM classifier is then built using dictionary approximation errors and the mean edge spread width from the training images. The algorithm, which requires no user intervention, was tested on a database that included more than 2500 high quality recaptured images. Our results show that our method achieves a performance rate that exceeds 99% for recaptured images and 94% for single captured images.
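The dictionary-approximation-error feature at the heart of the second approach can be illustrated in miniature. This sketch substitutes a plain least-squares fit for the sparse coding step that K-SVD-trained dictionaries would normally use, and decides by raw error comparison rather than the thesis's SVM, so it is an assumption-laden simplification:

```python
import numpy as np

def approx_error(patch, D):
    """Reconstruction error of a flattened edge patch under dictionary D
    (columns = atoms); least squares stands in for sparse coding."""
    coef, *_ = np.linalg.lstsq(D, patch, rcond=None)
    return np.linalg.norm(patch - D @ coef)

def classify_recaptured(patch, D_single, D_recap):
    """Label a patch by whichever trained dictionary reconstructs it better.
    (The thesis instead feeds both errors, plus the mean edge-spread width,
    into an SVM classifier.)"""
    if approx_error(patch, D_recap) < approx_error(patch, D_single):
        return 'recaptured'
    return 'single'
```

The intuition matches the abstract: recapture adds a characteristic extra blur to edges, so a dictionary trained on recaptured edge patches approximates them with lower error than one trained on single-capture patches.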