Learning a Hierarchical Latent-Variable Model of 3D Shapes
We propose the Variational Shape Learner (VSL), a generative model that
learns the underlying structure of voxelized 3D shapes in an unsupervised
fashion. Through the use of skip-connections, our model can successfully learn
and infer a latent, hierarchical representation of objects. Furthermore,
realistic 3D objects can be easily generated by sampling the VSL's latent
probabilistic manifold. We show that our generative model can be trained
end-to-end from 2D images to perform single image 3D model retrieval.
Experiments show, both quantitatively and qualitatively, the improved
generalization of our proposed model over a range of tasks, performing better
or comparable to various state-of-the-art alternatives.
Comment: Accepted as oral presentation at the International Conference on 3D Vision (3DV), 201
Synthetic Low-Field MRI Super-Resolution Via Nested U-Net Architecture
Low-field (LF) MRI scanners have the power to revolutionize medical imaging
by providing a portable and cheaper alternative to high-field MRI scanners.
However, such scanners are usually significantly noisier and lower quality than
their high-field counterparts. The aim of this paper is to improve the SNR and
overall image quality of low-field MRI scans to improve diagnostic capability.
To address this issue, we propose a Nested U-Net neural network architecture
super-resolution algorithm that outperforms previously suggested deep learning
methods with an average PSNR of 78.83 and SSIM of 0.9551. We tested our network
on artificially noised and downsampled synthetic data from a major T1-weighted MRI
image dataset called the T1-mix dataset. One board-certified radiologist scored
25 images on the Likert scale (1-5) assessing overall image quality, anatomical
structure, and diagnostic confidence across our architecture and other
published works (SR DenseNet, Generator Block, SRCNN, etc.). We also introduce
a new type of loss function called natural log mean squared error (NLMSE). In
conclusion, we present a more accurate deep learning method for single image
super-resolution applied to synthetic low-field MRI via a Nested U-Net
architecture.
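The abstract introduces a "natural log mean squared error (NLMSE)" loss but does not define it. A plausible reading is the natural logarithm of the mean squared error; the sketch below is a minimal standalone formulation under that assumption, not the authors' implementation.

```python
import numpy as np

def nlmse(pred, target, eps=1e-8):
    """Natural log mean squared error: log of the MSE between a
    prediction and its target. The exact expression is assumed here,
    since the abstract only names the loss. eps avoids log(0) when
    the prediction matches the target exactly."""
    mse = np.mean((pred - target) ** 2)
    return np.log(mse + eps)
```

Taking the log compresses the dynamic range of the error, which would down-weight the influence of a few large residuals relative to plain MSE.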
Learning spatial and spectral features via 2D-1D generative adversarial network for hyperspectral image super-resolution
Three-dimensional (3D) convolutional networks have been proven able to explore spatial context and spectral information simultaneously for super-resolution (SR). However, such networks cannot in practice be made very deep, owing to the long training time and GPU memory limitations of 3D convolution. Instead, in this paper, spatial context and spectral information in hyperspectral images (HSIs) are explored separately, using two-dimensional (2D) and one-dimensional (1D) convolutions. To this end, a novel 2D-1D generative adversarial network architecture (2D-1D-HSRGAN) is proposed for SR of HSIs. Specifically, the generator consists of a spatial network and a spectral network: the spatial network is trained with the least absolute deviations loss to explore spatial context via 2D convolution, while the spectral network is trained with the spectral angle mapper (SAM) loss to extract spectral information via 1D convolution. Experimental results on two real HSIs demonstrate that the proposed 2D-1D-HSRGAN clearly outperforms several state-of-the-art algorithms
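The spectral angle mapper (SAM) loss mentioned above measures the angle between per-pixel spectral vectors, so it penalizes spectral distortion independently of overall intensity. A minimal standalone sketch (not the paper's code; array layout `(H, W, bands)` is an assumption):

```python
import numpy as np

def sam_loss(pred, target, eps=1e-8):
    """Mean spectral angle (radians) between per-pixel spectra of a
    predicted and a reference hyperspectral image.
    pred, target: arrays of shape (H, W, bands)."""
    dot = np.sum(pred * target, axis=-1)
    norms = np.linalg.norm(pred, axis=-1) * np.linalg.norm(target, axis=-1)
    cos = np.clip(dot / (norms + eps), -1.0, 1.0)  # clip guards arccos domain
    return np.mean(np.arccos(cos))
```

Because the angle is scale-invariant, SAM is a natural complement to a pixel-wise loss such as the least absolute deviations term used by the spatial network.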
Single Image Super-Resolution Using Multi-Scale Deep Encoder-Decoder with Phase Congruency Edge Map Guidance
This paper presents an end-to-end multi-scale deep encoder (convolution) and decoder (deconvolution) network for single image super-resolution (SISR) guided by a phase congruency (PC) edge map. Our system starts with a single-scale symmetrical encoder-decoder structure for SISR, which is extended to a multi-scale model by integrating wavelet multi-resolution analysis into our network. The new multi-scale deep learning system allows the low-resolution (LR) input and its PC edge map to be combined so as to precisely predict multi-scale super-resolved edge details under the guidance of the high-resolution (HR) PC edge map. In this way, the proposed deep model takes both the reconstruction of image pixel intensities and the recovery of multi-scale edge details into consideration within the same framework. We evaluate the proposed model on benchmark datasets covering different data scenarios: Set14 and BSD100 (natural images), and Middlebury and New Tsukuba (depth images). Evaluations based on both PSNR and visual perception reveal that the proposed model is superior to state-of-the-art methods
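The wavelet multi-resolution analysis that the abstract integrates into the network can be illustrated with a single level of the 2D Haar transform, which splits an image into a low-frequency approximation and three directional detail bands. This is a standalone averaging-form sketch for intuition, not the authors' implementation:

```python
import numpy as np

def haar_decompose(img):
    """One level of 2D Haar wavelet analysis (averaging form).
    img: 2D array with even height and width.
    Returns (ll, lh, hl, hh): approximation plus horizontal,
    vertical, and diagonal detail bands, each at half resolution."""
    a = img[0::2, 0::2]
    b = img[0::2, 1::2]
    c = img[1::2, 0::2]
    d = img[1::2, 1::2]
    ll = (a + b + c + d) / 4.0  # low-frequency approximation
    lh = (a - b + c - d) / 4.0  # horizontal detail
    hl = (a + b - c - d) / 4.0  # vertical detail
    hh = (a - b - c + d) / 4.0  # diagonal detail
    return ll, lh, hl, hh
```

Applying the decomposition recursively to the `ll` band yields the multi-scale pyramid over which edge details can be predicted scale by scale.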
EraseNet: A Recurrent Residual Network for Supervised Document Cleaning
Document denoising is considered one of the most challenging tasks in
computer vision. There exist millions of documents that are still to be
digitized, but problems like document degradation due to natural and man-made
factors make this task very difficult. This paper introduces a supervised
approach for cleaning dirty documents using a new fully convolutional
auto-encoder architecture. This paper focuses on restoring documents with
discrepancies like deformities caused due to aging of a document, creases left
on the pages that were xeroxed, random black patches, lightly visible text,
etc., while also improving image quality for better optical character
recognition (OCR) performance. Removing noise from scanned documents is a
crucial preprocessing step, as this noise can severely affect the
performance of an OCR system. The experiments in this paper have shown
promising results as the model is able to learn a variety of ordinary as well
as unusual noises and rectify them efficiently.
Comment: 10 pages, 5 figures, submitted for publication in the International Journal on Document Analysis and Recognition (IJDAR