
    A Comprehensive Review of Deep Learning-based Single Image Super-resolution

    Image super-resolution (SR) is one of the vital image processing methods that improve the resolution of an image in the field of computer vision. In the last two decades, significant progress has been made in the field of super-resolution, especially by utilizing deep learning methods. This survey provides a detailed review of recent progress in single-image super-resolution from the perspective of deep learning, while also covering the initial classical methods used for image super-resolution. The survey classifies image SR methods into four categories, i.e., classical methods, supervised learning-based methods, unsupervised learning-based methods, and domain-specific SR methods. We also introduce the SR problem to provide intuition about image quality metrics, available reference datasets, and SR challenges. Deep learning-based SR approaches are evaluated using a reference dataset. Some of the reviewed state-of-the-art image SR methods include the enhanced deep SR network (EDSR), cycle-in-cycle GAN (CinCGAN), multiscale residual network (MSRN), meta residual dense network (Meta-RDN), recurrent back-projection network (RBPN), second-order attention network (SAN), SR feedback network (SRFBN) and the wavelet-based residual attention network (WRAN). Finally, this survey concludes with future directions and trends in SR and open problems in SR to be addressed by researchers. Comment: 56 pages, 11 figures, 5 tables
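The survey mentions image quality metrics used to evaluate SR methods. As a minimal illustrative sketch (not code from the survey), the most common such metric, PSNR, can be computed as follows; the function name and signature are assumptions for illustration:

```python
import numpy as np

def psnr(reference, reconstruction, max_val=255.0):
    """Peak signal-to-noise ratio between two images, in dB (higher is better)."""
    diff = reference.astype(np.float64) - reconstruction.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)
```

SSIM, the other metric commonly reported alongside PSNR, additionally compares local luminance, contrast, and structure rather than raw pixel error.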

    Optimized highway deep learning network for fast single image super-resolution reconstruction

    With the success of the deep residual network for image recognition tasks, the residual connection, or skip connection, has been widely used in deep learning models for various vision tasks, including single image super-resolution (SISR). Most existing SISR approaches pay particular attention to residual learning, while few studies investigate highway connections for SISR. Although skip connections can help alleviate the vanishing gradient problem and enable fast training of deep networks, they still provide only a coarse level of approximation in both the forward and backward propagation paths, making it challenging to recover high-frequency details. To address this issue, we propose a novel model for SISR using highway connections (HNSR), which incorporates a nonlinear gating mechanism to further regulate the information flow. By using global residual learning and replacing all local residual learning with the designed gate units of the highway connection, HNSR can efficiently learn different hierarchical features and recover much more detail in image reconstruction. The experimental results validate that HNSR not only provides improved quality but is also less prone to a few common problems during training. Besides, the more robust and efficient model is suitable for implementation in real-time and mobile systems.
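The standard highway unit the abstract builds on blends a learned transform with the identity path through a sigmoid gate. A minimal numpy sketch of one such unit (weight shapes and names are illustrative assumptions, not the paper's HNSR architecture):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def highway_layer(x, w_h, b_h, w_t, b_t):
    """One highway unit: y = H(x) * T(x) + x * (1 - T(x)).

    The transform gate T(x) in (0, 1) decides, per feature, how much of the
    nonlinear transform H(x) to pass through versus how much of the input
    to carry unchanged; a plain residual connection has no such gate.
    """
    h = np.tanh(x @ w_h + b_h)   # candidate transform H(x)
    t = sigmoid(x @ w_t + b_t)   # transform gate T(x)
    return h * t + x * (1.0 - t)
```

With the gate bias pushed strongly negative the unit reduces to the identity (pure carry path), which is what makes very deep stacks of such units trainable.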

    Deep Learning approaches for Robotic Grasp Detection and Image Super-Resolution

    Department of Electrical Engineering. In recent years, many papers have reported that applying deep learning to object detection and robotic grasp detection improves accuracy with higher image resolutions. We use deep learning for robot grasp detection and image super-resolution in the two papers described below. 0.0.1 Real-Time, Highly Accurate Robotic Grasp Detection using Fully Convolutional Neural Networks with High-Resolution Images Robotic grasp detection for novel objects is a challenging task, but over the last few years, deep learning based approaches have achieved remarkable performance improvements, up to 96.1% accuracy, with RGB-D data. In this paper, we propose fully convolutional neural network (FCNN) based methods for robotic grasp detection. Our methods also achieved state-of-the-art detection accuracy (up to 96.6%) with state-of-the-art real-time computation time for high-resolution images (6-20 ms per 360×360 image) on the Cornell dataset. Owing to the FCNN, our proposed method can be applied to images of any size for detecting multiple grasps on multiple objects. The proposed methods were evaluated using a 4-axis robot arm with a small parallel gripper and an RGB-D camera for grasping challenging small, novel objects. With accurate vision-robot coordinate calibration through our proposed learning-based, fully automatic approach, our proposed method yielded a 90% success rate. 0.0.2 Efficient Module Based Single Image Super Resolution for Multiple Problems Example based single image super resolution (SR) is a fundamental task in computer vision. It is challenging, but recently, there have been significant performance improvements using deep learning approaches. In this article, we propose efficient module based single image SR networks (EMBSR) and tackle multiple SR problems in the NTIRE 2018 challenge by recycling trained networks.
Our proposed EMBSR allowed us to reduce training time with effectively deeper networks, to use modular ensembles for improved performance, and to separate subproblems for better performance. We also propose EDSR-PP, an improved version of the previous EDSR, incorporating pyramid pooling so that global as well as local context information can be utilized. Lastly, we propose a novel denoising/deblurring residual convolutional network (DnResNet) using residual blocks and batch normalization. Our proposed EMBSR with DnResNet demonstrated that multiple SR problems can be tackled efficiently and effectively by winning 2nd place for Track 2 and 3rd place for Track 3. Our proposed method with EDSR-PP also achieved ninth place for Track 1 with the fastest run time among the top nine teams.
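DnResNet, as described above, combines residual blocks with batch normalization. A minimal numpy sketch of one such block, with dense matrix products standing in for the convolutions (all names and shapes are illustrative assumptions, not the paper's exact architecture):

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Normalize each feature over the batch axis, then scale and shift."""
    mean = x.mean(axis=0, keepdims=True)
    var = x.var(axis=0, keepdims=True)
    return gamma * (x - mean) / np.sqrt(var + eps) + beta

def residual_block(x, w1, w2, gamma, beta):
    """transform -> batch norm -> ReLU -> transform, then add the input back."""
    out = x @ w1                        # stand-in for the first convolution
    out = batch_norm(out, gamma, beta)  # stabilizes training of deep stacks
    out = np.maximum(out, 0.0)          # ReLU
    out = out @ w2                      # stand-in for the second convolution
    return x + out                      # residual (skip) connection
```

The skip connection means the block only has to learn a correction to its input, which is what makes residual-style denoising/deblurring networks easy to train at depth.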

    Robust prior-based single image super resolution under multiple Gaussian degradations

    Although the SISR (single image super resolution) problem can be effectively solved by deep learning based methods, the training phase often considers a single degradation type, such as bicubic interpolation or Gaussian blur with fixed variance. These prior hypotheses often fail and lead to reconstruction errors in real scenarios. In this paper, we propose an end-to-end CNN model, RPSRMD, to handle the SR problem under multiple Gaussian degradations by extracting, and using as side information, a shared image prior that is consistent across different Gaussian degradations. The shared image prior is generated by an AED network, RPGen, with a rationally designed loss function that contains two parts: a consistency loss and a validity loss. These losses supervise the training of the AED to guarantee that the image priors of one image under different Gaussian blurs are very similar. Afterwards, we carefully designed an SR network, termed PResNet (Prior based Residual Network) in this paper, to efficiently use the image priors and generate high-quality, robust SR images when an unknown Gaussian blur is present. When we applied various Gaussian blurs to the low resolution images, the experiments showed that our proposed RPSRMD, which includes RPGen and PResNet as its two core components, is superior to many state-of-the-art SR methods that were designed and trained to handle multiple degradations.
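The consistency loss described above penalizes disagreement between the priors extracted from differently-blurred versions of the same image. A minimal sketch of one plausible form of such a loss, the mean squared deviation from the priors' average (the exact formulation in the paper may differ; this is an illustrative assumption):

```python
import numpy as np

def consistency_loss(priors):
    """Mean squared deviation of each prior from their average.

    `priors` is a list of equally-shaped arrays, each extracted from the same
    image under a different Gaussian blur; the loss is zero exactly when all
    priors agree, which is the property the RPGen training is meant to enforce.
    """
    stacked = np.stack(priors)                      # (num_blurs, ...)
    center = stacked.mean(axis=0, keepdims=True)    # consensus prior
    return float(np.mean((stacked - center) ** 2))
```

A validity loss would then be needed alongside it, since consistency alone is trivially minimized by a constant prior that carries no information about the image.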

    Toward Efficient Rendering: A Neural Network Approach

    Physically-based image synthesis has attracted considerable attention due to its wide applications in visual effects, video games, design visualization, and simulation. However, obtaining visually satisfactory renderings with ray tracing algorithms often requires casting a large number of rays and thus takes a vast amount of computation. The extensive computational and memory requirements of ray tracing methods pose a challenge, especially when running these rendering algorithms on resource-constrained platforms, and impede their applications that require high resolutions and refresh rates. This thesis presents three methods to address the challenge of efficient rendering. First, we present a hybrid rendering method to speed up Monte Carlo rendering algorithms. Our method first generates two versions of a rendering: one at a low resolution with a high sample rate (LRHS) and the other at a high resolution with a low sample rate (HRLS). We then develop a deep convolutional neural network to fuse these two renderings into a high-quality image as if it were rendered at a high resolution with a high sample rate. Specifically, we formulate this fusion task as a super-resolution problem that generates a high-resolution rendering from a low-resolution input (LRHS), assisted with the HRLS rendering. The HRLS rendering provides critical high-frequency details which are difficult to recover from the LRHS for any super-resolution methods. Our experiments show that our hybrid rendering algorithm is significantly faster than the state-of-the-art Monte Carlo denoising methods while rendering high-quality images when tested on both our own BCR dataset and the Gharbi dataset. Second, we investigate super-resolution to reduce the number of pixels to render and thus speed up Monte Carlo rendering algorithms. While great progress has been made in super-resolution technologies, it is essentially an ill-posed problem and cannot recover high-frequency details in renderings. 
To address this problem, we exploit high-resolution auxiliary features to guide the super-resolution of low-resolution renderings. These high-resolution auxiliary features can be quickly rendered by a rendering engine and, at the same time, provide valuable high-frequency details to assist super-resolution. To this end, we develop a cross-modality Transformer network that consists of an auxiliary feature branch and a low-resolution rendering branch. These two branches are designed to fuse high-resolution auxiliary features with the corresponding low-resolution rendering. Furthermore, we design residual densely-connected Swin Transformer groups for learning to extract representative features to enable high-quality super-resolution. Our experiments show that our auxiliary features-guided super-resolution method outperforms both state-of-the-art super-resolution methods and Monte Carlo denoising methods in producing high-quality renderings. Third, we present a deep-learning-based Monte Carlo denoising method for stereoscopic images. Research on deep-learning-based Monte Carlo denoising has made significant progress in recent years. However, existing methods are mostly designed for single-image Monte Carlo denoising, and stereoscopic image Monte Carlo denoising is less explored. Traditional methods require first rendering a noiseless image for one view, which is time-consuming. Recent deep-learning-based methods achieve promising results on single-image Monte Carlo denoising, but their performance on stereoscopic images is compromised as they do not consider the spatial correspondence between the left image and the right image. In this thesis, we present a deep-learning-based Monte Carlo denoising method for stereoscopic images. It takes low samples-per-pixel (spp) stereoscopic images as inputs and estimates a high-quality result.
Specifically, we extract features from the two stereoscopic images and warp the features from one image to the other using a disparity map fine-tuned from the disparity computed from scene geometry. To train our network, we collected a large-scale Blender Cycles Stereo Ray-tracing dataset. Our experiments show that our method outperforms state-of-the-art methods when the sampling rates are low.
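The stereo denoiser above warps features from one view to the other using per-pixel disparity. A minimal sketch of the underlying operation with integer disparities and nearest-neighbor gathering (the thesis uses learned, fine-tuned disparities and differentiable warping; this simplified version is an illustrative assumption):

```python
import numpy as np

def warp_with_disparity(features, disparity):
    """Warp one view's feature map toward the other along the horizontal axis.

    Each output pixel (y, x) is filled from source pixel (y, x - d(y, x)),
    with source columns clamped to the image border; this is the spatial
    correspondence between left and right views that single-image denoisers
    ignore.
    """
    h, w = features.shape[:2]
    xs = np.arange(w)[None, :] - disparity.astype(int)  # source columns
    xs = np.clip(xs, 0, w - 1)                          # clamp at the border
    ys = np.arange(h)[:, None].repeat(w, axis=1)        # source rows
    return features[ys, xs]
```

In a trainable network the integer gather would be replaced by bilinear sampling so that gradients can flow back into the disparity estimate.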