
    Light Weight Color Image Warping with Inter-Channel Information

    Image warping is a necessary step in many multimedia applications, such as texture mapping, image-based rendering, panorama stitching, image resizing, and optical flow computation. Traditionally, color image warping interpolation is performed in each color channel independently. In this paper, we show that warping quality can be significantly enhanced by exploiting cross-channel correlation. We design a warping scheme that integrates intra-channel interpolation with cross-channel variation at very low computational cost, as required by interactive multimedia applications on mobile devices. The effectiveness and efficiency of our method are validated by extensive experiments.
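
    The paper's exact scheme is in the full text; as a rough, hypothetical numpy sketch of the general idea (per-channel interpolation plus a shared cross-channel detail term), something like the following could be used. The backward-warp grid `coords` and the blend weight `alpha` are assumptions for illustration, not the paper's formulation.

```python
# A minimal sketch (not the paper's exact scheme): warp each channel with
# bilinear interpolation, then share the green channel's interpolation
# residual with red and blue as a cross-channel detail correction.
import numpy as np
from scipy.ndimage import map_coordinates

def warp_channel(ch, coords, order=1):
    # Intra-channel interpolation at the backward-warp coordinates
    # (coords has shape (2, H_out, W_out): row and column sample positions).
    return map_coordinates(ch, coords, order=order, mode='nearest')

def warp_color_cross_channel(img, coords, alpha=0.5):
    """img: (H, W, 3) float image; coords: (2, H_out, W_out) sample grid."""
    r, g, b = [warp_channel(img[..., c], coords) for c in range(3)]
    # Cross-channel variation term: difference between bilinear and
    # nearest-neighbor green approximates the local interpolation detail.
    g_nearest = warp_channel(img[..., 1], coords, order=0)
    detail = g - g_nearest
    return np.stack([r + alpha * detail, g, b + alpha * detail], axis=-1)
```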

    A Unified Framework for Multi-Sensor HDR Video Reconstruction

    One of the most successful approaches to modern high-quality HDR video capture is to use camera setups with multiple sensors imaging the scene through a common optical system. However, such systems pose several challenges for HDR reconstruction algorithms. Previous reconstruction techniques have treated debayering, denoising, resampling (alignment), and exposure fusion as separate problems. In contrast, in this paper we present a unifying approach that performs HDR assembly directly from raw sensor data. Our framework includes a camera noise model adapted to HDR video and an algorithm for spatially adaptive HDR reconstruction based on fitting local polynomial approximations to observed sensor data. The method is easy to implement and allows reconstruction at an arbitrary resolution and output mapping. We present an implementation in CUDA and show real-time performance for an experimental 4-Mpixel multi-sensor HDR video system. We further show that our algorithm has clear advantages over existing methods, both in terms of flexibility and reconstruction quality.
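
    To make the radiometric side of HDR assembly concrete, here is a minimal Python sketch of noise-weighted fusion of multiple raw exposures, i.e. an order-0 local polynomial fit per pixel (the paper additionally fits polynomials spatially and uses its own calibrated noise model; the exposure-time array, saturation threshold, and Poisson-plus-read-noise model below are assumptions).

```python
# Minimal sketch of HDR assembly from raw exposures: each sample is scaled
# to radiance by its exposure time and combined with inverse-variance
# weights; saturated samples are rejected.
import numpy as np

def assemble_hdr(X, t, sat=0.95, read_noise=2.0, gain=1.0):
    """X: (n_exposures, H, W) raw frames in [0, 1]; t: exposure times (s)."""
    X = np.asarray(X, dtype=np.float64)
    t = np.asarray(t, dtype=np.float64)[:, None, None]
    radiance = X / t                          # per-exposure radiance estimate
    # Simple Poisson + Gaussian read-noise model, propagated through 1/t.
    var = (gain * X + read_noise**2) / t**2
    w = 1.0 / np.maximum(var, 1e-12)
    w[X >= sat] = 0.0                         # reject saturated samples
    return (w * radiance).sum(0) / np.maximum(w.sum(0), 1e-12)
```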

    Pyramid Attention Networks for Image Restoration

    Self-similarity is an image prior widely used in image restoration algorithms: small but similar patterns tend to recur at different locations and scales. However, recent deep convolutional neural network based methods for image restoration do not take full advantage of self-similarity, relying on self-attention modules that only process information at a single scale. To address this problem, we present a novel Pyramid Attention module for image restoration, which captures long-range feature correspondences from a multi-scale feature pyramid. Inspired by the fact that corruptions such as noise or compression artifacts drop drastically at coarser image scales, our attention module is designed to borrow clean signals from "clean" correspondences at the coarser levels. The proposed pyramid attention module is a generic building block that can be flexibly integrated into various neural architectures. Its effectiveness is validated through extensive experiments on multiple image restoration tasks: image denoising, demosaicing, compression artifact reduction, and super-resolution. Without any bells and whistles, our PANet (the pyramid attention module with simple network backbones) produces state-of-the-art results with superior accuracy and visual quality. Our code will be available at https://github.com/SHI-Labs/Pyramid-Attention-Network.
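
    The official module is in the repository linked above; the following simplified PyTorch sketch only conveys the core idea, with full-resolution queries attending over keys and values pooled from downscaled copies of the feature map. The scale set, 1x1 projections, and residual connection are assumptions, not the authors' exact design.

```python
# Simplified pyramid attention sketch: attention is computed between
# full-resolution queries and keys/values gathered from a multi-scale
# pyramid, so coarser (cleaner) levels can be borrowed from.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SimplePyramidAttention(nn.Module):
    def __init__(self, channels, scales=(1.0, 0.75, 0.5)):
        super().__init__()
        self.scales = scales
        self.q = nn.Conv2d(channels, channels, 1)
        self.k = nn.Conv2d(channels, channels, 1)
        self.v = nn.Conv2d(channels, channels, 1)

    def forward(self, x):
        b, c, h, w = x.shape
        q = self.q(x).flatten(2).transpose(1, 2)           # (b, hw, c)
        keys, vals = [], []
        for s in self.scales:
            xs = x if s == 1.0 else F.interpolate(
                x, scale_factor=s, mode='bilinear', align_corners=False)
            keys.append(self.k(xs).flatten(2))             # (b, c, hw_s)
            vals.append(self.v(xs).flatten(2))
        k = torch.cat(keys, dim=2)                         # (b, c, sum hw_s)
        v = torch.cat(vals, dim=2)
        attn = torch.softmax(q @ k / c**0.5, dim=-1)       # (b, hw, sum hw_s)
        out = (attn @ v.transpose(1, 2)).transpose(1, 2).reshape(b, c, h, w)
        return x + out                                     # residual output
```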

    Three-dimensional virtual refocusing of fluorescence microscopy images using deep learning

    Three-dimensional (3D) fluorescence microscopy generally requires axial scanning to capture images of a sample at different planes. Here we demonstrate that a deep convolutional neural network can be trained to virtually refocus a 2D fluorescence image onto user-defined 3D surfaces within the sample volume. With this data-driven computational microscopy framework, we imaged the neuronal activity of a Caenorhabditis elegans worm in 3D using a time sequence of fluorescence images acquired at a single focal plane, digitally increasing the depth of field of the microscope 20-fold without any axial scanning, additional hardware, or a trade-off in imaging resolution or speed. Furthermore, we demonstrate that this learning-based approach can correct for sample drift, tilt, and other image aberrations, all performed digitally after the acquisition of a single fluorescence image. This unique framework also cross-connects different imaging modalities, enabling 3D refocusing of a single wide-field fluorescence image to match confocal microscopy images acquired at different sample planes. This deep learning-based 3D image refocusing method could be transformative for imaging and tracking 3D biological samples, especially over extended periods of time, mitigating the photo-toxicity, sample drift, aberration, and defocusing challenges associated with standard 3D fluorescence microscopy techniques.
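
    At its core, the interface being learned is a mapping from a single 2D image plus a user-defined target focal surface to a refocused image. The tiny PyTorch sketch below only illustrates that input/output contract; the layer stack, width, and the per-pixel depth-map encoding of the target surface are assumptions, not the paper's architecture.

```python
# Hypothetical sketch of the virtual-refocusing interface: the network sees
# the acquired image concatenated with a per-pixel target-depth map (the
# user-defined 3D surface) and predicts the refocused image.
import torch
import torch.nn as nn

class VirtualRefocusNet(nn.Module):
    def __init__(self, width=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(2, width, 3, padding=1), nn.ReLU(),   # image + z-map
            nn.Conv2d(width, width, 3, padding=1), nn.ReLU(),
            nn.Conv2d(width, 1, 3, padding=1),              # refocused image
        )

    def forward(self, image, z_map):
        # image, z_map: (b, 1, H, W); z_map encodes the target focal surface.
        return self.net(torch.cat([image, z_map], dim=1))
```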

    NTIRE 2020 Challenge on Spectral Reconstruction from an RGB Image

    This paper reviews the second challenge on spectral reconstruction from RGB images, i.e., the recovery of whole-scene hyperspectral (HS) information from a 3-channel RGB image. As in the previous challenge, two tracks were provided: (i) a "Clean" track, where HS images are estimated from noise-free RGB images that are themselves computed numerically from the ground-truth HS images and supplied spectral sensitivity functions; and (ii) a "Real World" track, simulating capture by an uncalibrated and unknown camera, where the HS images are recovered from noisy, JPEG-compressed RGB images. A new, larger-than-ever natural hyperspectral image data set is presented, containing a total of 510 HS images. The Clean and Real World tracks had 103 and 78 registered participants, respectively, with 14 teams competing in the final testing phase. A description of the proposed methods, alongside their challenge scores and an extensive evaluation of the top-performing methods, is also provided. Together, these gauge the state of the art in spectral reconstruction from an RGB image.
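
    The Clean-track synthesis step, computing an RGB image from a ground-truth HS cube and spectral sensitivity functions, amounts to a per-pixel projection of the spectrum onto three response curves. A short numpy sketch of that projection follows; the normalization is an assumption, and the challenge's actual sensitivity functions are not reproduced here.

```python
# Sketch of numerically rendering an RGB image from a hyperspectral cube
# using camera spectral sensitivity functions.
import numpy as np

def hs_to_rgb(hs_cube, sensitivities):
    """hs_cube: (H, W, B) radiance over B bands (e.g. 31 bands, 400-700 nm);
    sensitivities: (B, 3) per-band R, G, B responses."""
    rgb = np.tensordot(hs_cube, sensitivities, axes=([2], [0]))  # (H, W, 3)
    return rgb / max(rgb.max(), 1e-12)   # simple normalization for display
```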

    Sub-Pixel Registration of Wavelet-Encoded Images

    Sub-pixel registration is a crucial step for applications such as super-resolution in remote sensing, motion compensation in magnetic resonance imaging, and non-destructive testing in manufacturing, to name a few. Recently, these technologies have been trending towards wavelet-encoded imaging and sparse/compressive sensing. The former plays a crucial role in reducing imaging artifacts, while the latter significantly increases acquisition speed. In view of these emerging needs in applications of wavelet-encoded imaging, we propose a sub-pixel registration method that can achieve direct wavelet-domain registration from a sparse set of coefficients. We make the following contributions: (i) we devise a method of decoupling scale, rotation, and translation parameters in the Haar wavelet domain; (ii) we derive explicit mathematical expressions that define in-band sub-pixel registration in terms of wavelet coefficients; (iii) using the derived expressions, we propose an approach to achieve in-band sub-pixel registration, avoiding back-and-forth transformations; and (iv) our solution remains highly accurate even when a sparse set of coefficients is used, owing to the localization of signals in a sparse set of wavelet coefficients. We demonstrate the accuracy of our method and show that it outperforms the state-of-the-art on simulated and real data, even when the data is sparse.
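
    The paper's in-band registration expressions operate on Haar wavelet coefficients; for reference, here is a minimal single-level 2D Haar analysis in numpy producing the four subbands those expressions would act on (the averaging normalization is one common convention, and the registration equations themselves are only in the paper).

```python
# Single-level 2D Haar analysis: averages/differences over row pairs, then
# over column pairs, yielding the LL, LH, HL, HH subbands.
import numpy as np

def haar2d(x):
    """x: (H, W) array with even H and W. Returns (LL, LH, HL, HH)."""
    a = (x[0::2, :] + x[1::2, :]) / 2.0    # row averages
    d = (x[0::2, :] - x[1::2, :]) / 2.0    # row differences
    ll = (a[:, 0::2] + a[:, 1::2]) / 2.0
    lh = (a[:, 0::2] - a[:, 1::2]) / 2.0
    hl = (d[:, 0::2] + d[:, 1::2]) / 2.0
    hh = (d[:, 0::2] - d[:, 1::2]) / 2.0
    return ll, lh, hl, hh
```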

    Finding Correspondences for Optical Flow and Disparity Estimations using a Sub-pixel Convolution-based Encoder-Decoder Network

    Deep convolutional neural networks (DCNNs) have recently shown promising results in low-level computer vision problems such as optical flow and disparity estimation, but still have much room to improve. In this paper, we propose a novel sub-pixel convolution-based encoder-decoder network for optical flow and disparity estimation, which extends FlowNetS and DispNet by replacing the deconvolution layers with sub-pixel convolution blocks. By using sub-pixel refinement and estimation in the decoder stages instead of deconvolution, we significantly improve estimation accuracy for optical flow and disparity, even with a reduced number of parameters. We show supervised end-to-end training of our proposed networks for optical flow and disparity estimation, and unsupervised end-to-end training for monocular depth and pose estimation. To verify the effectiveness of our proposed networks, we perform extensive experiments on (i) optical flow and disparity estimation, and (ii) monocular depth and pose estimation. Throughout these experiments, our proposed networks outperform baselines such as FlowNetS and DispNet in terms of estimation accuracy and training time.
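
    Sub-pixel convolution itself is a standard operation: a convolution produces r^2 times the target channel count, and a pixel shuffle rearranges those channels into an r-times-larger spatial grid. A minimal PyTorch block of the kind that could stand in for a decoder deconvolution layer is sketched below; the kernel size and activation are assumptions, not the paper's exact configuration.

```python
# Sub-pixel upsampling block: conv to C*r^2 channels, then PixelShuffle
# (depth-to-space) to upscale spatially by a factor of r.
import torch.nn as nn

def subpixel_upsample(in_ch, out_ch, r=2):
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch * r * r, kernel_size=3, padding=1),
        nn.PixelShuffle(r),   # (b, C*r^2, H, W) -> (b, C, r*H, r*W)
        nn.LeakyReLU(0.1, inplace=True),
    )
```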

    Image quality assessment for determining efficacy and limitations of Super-Resolution Convolutional Neural Network (SRCNN)

    Traditional metrics for evaluating the efficacy of image processing techniques do not lend themselves to understanding the capabilities and limitations of modern image processing methods, particularly those enabled by deep learning. When applying image processing in engineering solutions, a scientist or engineer needs to justify design decisions with clear metrics. By applying the Blind/Referenceless Image Spatial Quality Evaluator (BRISQUE), Structural SIMilarity (SSIM) index scores, and peak signal-to-noise ratio (PSNR) to images before and after processing, we can quantify quality improvements in a meaningful way and determine the lowest recoverable image quality for a given method.
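
    As a concrete illustration of this before/after measurement, the Python sketch below computes the two full-reference metrics with scikit-image; BRISQUE is a no-reference metric requiring a separate package (e.g. pyiqa) and is omitted. Grayscale float images in [0, 1] are assumed.

```python
# Compare a processed image against its reference with PSNR and SSIM.
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def quality_report(reference, processed):
    """reference, processed: (H, W) float arrays in [0, 1]."""
    psnr = peak_signal_noise_ratio(reference, processed, data_range=1.0)
    ssim = structural_similarity(reference, processed, data_range=1.0)
    return {"psnr_db": psnr, "ssim": ssim}
```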

    Neural Imaging Pipelines - the Scourge or Hope of Forensics?

    Forensic analysis of digital photographs relies on intrinsic statistical traces introduced at the time of their acquisition or during subsequent editing. Such traces are often removed by post-processing (e.g., down-sampling and re-compression applied upon distribution on the Web), which inhibits reliable provenance analysis. Increasing adoption of computational methods within digital cameras further complicates the process and renders explicit mathematical modeling infeasible. While this trend challenges forensic analysis even in near-acquisition conditions, it also creates new opportunities. This paper explores end-to-end optimization of the entire image acquisition and distribution workflow to facilitate reliable forensic analysis at the end of the distribution channel, where state-of-the-art forensic techniques fail. We demonstrate that a neural network can be trained to replace the entire photo development pipeline and jointly optimized for high-fidelity photo rendering and reliable provenance analysis. Such an optimized neural imaging pipeline allowed us to increase image manipulation detection accuracy from approximately 45% to over 90%. The network learns to introduce carefully crafted artifacts, akin to digital watermarks, which facilitate subsequent manipulation detection. Analysis of performance trade-offs indicates that most of the gains can be obtained with only minor distortion. The findings encourage further research towards building more reliable imaging pipelines with explicit provenance-guaranteeing properties.
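
    One way to picture the joint optimization is as a two-term objective over a neural development pipeline and a downstream forensic classifier, with the distribution-channel degradations applied in between. The PyTorch sketch below is an assumed form for illustration; the loss weights, L1 fidelity term, and `degrade` operator are hypothetical, not the paper's training code.

```python
# Assumed joint objective: photo-rendering fidelity plus manipulation
# detection accuracy measured after distribution-channel degradation.
import torch.nn.functional as F

def joint_loss(pipeline, detector, raw, reference_rgb, labels, degrade,
               lambda_forensic=0.1):
    developed = pipeline(raw)                        # neural "development"
    fidelity = F.l1_loss(developed, reference_rgb)   # rendering quality
    distributed = degrade(developed)                 # e.g. resample + JPEG
    forensic = F.cross_entropy(detector(distributed), labels)
    return fidelity + lambda_forensic * forensic
```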

    Aerial Spectral Super-Resolution using Conditional Adversarial Networks

    Inferring spectral signatures from ground-based natural images has attracted considerable interest in applied deep learning. In contrast to the spectra of ground-based images, aerial spectral images have low spatial resolution and suffer from higher noise interference. In this paper, we train a conditional adversarial network to learn an inverse mapping from a trichromatic space to 31 spectral bands within 400 to 700 nm. The network is trained on AeroCampus, a first-of-its-kind aerial hyperspectral dataset. AeroCampus consists of high-spatial-resolution color images and low-spatial-resolution hyperspectral images (HSI). Color images synthesized from the 31 spectral bands are used to train our network. With a baseline root mean square error of 2.48 on the synthesized RGB test data, we show that it is possible to generate spectral signatures in aerial imagery.
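
    The mapping being learned is from a 3-channel RGB image to a 31-channel spectral cube. The PyTorch skeleton below only shows that channel contract for the conditional generator; the layer stack and width are assumptions, not the paper's architecture, and the adversarial discriminator is omitted.

```python
# Hypothetical generator skeleton for RGB (3 channels) -> 31 spectral bands
# spanning 400-700 nm; a cGAN would pair this with a discriminator.
import torch.nn as nn

def spectral_generator(base=64):
    return nn.Sequential(
        nn.Conv2d(3, base, 3, padding=1), nn.ReLU(),
        nn.Conv2d(base, base, 3, padding=1), nn.ReLU(),
        nn.Conv2d(base, 31, 3, padding=1),   # one output channel per band
    )
```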