2,554 research outputs found

    Improved Image Coding Autoencoder With Deep Learning

    In this paper, we build autoencoder-based pipelines for extreme end-to-end image compression based on Ballé's approach, which is the state-of-the-art open-source implementation of image compression using deep learning. We deepened the network by adding one more hidden layer before each strided convolutional layer, keeping exactly the same number of down-samplings and up-samplings. Our approach outperformed Ballé's, achieving around a 4.0% reduction in bits per pixel (bpp) and a 0.03% increase in multi-scale structural similarity (MS-SSIM), with only a 0.47% decrease in peak signal-to-noise ratio (PSNR). It also outperforms all traditional image compression methods, including JPEG 2000 and HEIC, by at least 20% in compression efficiency at similar reconstruction quality. Regarding encoding and decoding time, our approach takes a similar amount of time to traditional methods with GPU support, which means it is almost ready for industrial applications.
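
    A minimal PyTorch sketch of the architectural change described above, not the authors' code: one extra stride-1 hidden layer is inserted before each strided (downsampling) convolution, so the number of down-samplings stays the same. Channel counts, kernel sizes, and the use of ReLU in place of Ballé's GDN nonlinearity are assumptions for illustration.

    import torch
    import torch.nn as nn

    class DeepenedEncoder(nn.Module):
        """Encoder with one extra hidden (stride-1) conv before each strided conv."""
        def __init__(self, channels=128, num_downsamplings=4):
            super().__init__()
            layers, in_ch = [], 3
            for _ in range(num_downsamplings):
                layers += [
                    nn.Conv2d(in_ch, channels, kernel_size=3, stride=1, padding=1),   # added hidden layer
                    nn.ReLU(inplace=True),
                    nn.Conv2d(channels, channels, kernel_size=5, stride=2, padding=2),  # original strided conv
                    nn.ReLU(inplace=True),
                ]
                in_ch = channels
            self.body = nn.Sequential(*layers)

        def forward(self, x):
            return self.body(x)

    latents = DeepenedEncoder()(torch.randn(1, 3, 256, 256))  # -> (1, 128, 16, 16)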

    Extreme Image Compression with Deep Learning Autoencoder

    Image compression can save billions of dollars in industry by reducing the bits needed to store and transfer an image without significantly losing visual quality. Traditional image compression methods use transforms, quantization, predictive coding, and entropy coding to tackle the problem, represented by international standards such as JPEG (Joint Photographic Experts Group), JPEG 2000, BPG (Better Portable Graphics), and HEIC (High Efficiency Image File Format). Recently, deep learning based image compression approaches have achieved similar or better performance than traditional methods, represented by autoencoder, GAN (generative adversarial network), and super-resolution based approaches. In this paper, we built autoencoder-based pipelines for extreme end-to-end image compression based on Ballé's 2017 and 2018 approaches and improved the cost function and network structure. We replaced MSE (mean square error) with RMSE (root mean square error) in the cost function and deepened the network by adding one more hidden layer before each strided convolutional layer. The source code is available at bit.ly/deepimagecompressiongithub. Our 2018 approach outperformed Ballé's 2018 approach, the state-of-the-art open-source implementation of image compression using deep learning, in terms of PSNR (peak signal-to-noise ratio) and MS-SSIM (multi-scale structural similarity) at similar bpp (bits per pixel). It also outperformed all traditional image compression methods, including JPEG and HEIC, in terms of reconstructed image quality. Regarding encoding and decoding time, our 2018 approach takes significantly longer than traditional methods even with GPU support; this needs to be measured and improved in the future. Experimental results show that deepening the network of an autoencoder can effectively increase model fitting without losing generalization when applied to image compression, provided the network is designed appropriately. In the future, this image compression method could be applied to video compression if encoding and decoding time can be reduced to an acceptable level. Automatic neural architecture search might also be applied to find an optimal network structure for the autoencoder. The optimizer could also be replaced with a trainable one, such as an LSTM (long short-term memory) based optimizer. Last but not least, the cost function could also include encoding and decoding time, so that these two metrics are optimized during training as well.
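
    A hedged sketch of the cost-function change described above (the rate estimate, lambda, and tensor shapes are placeholders, not the paper's values): the distortion term switches from MSE to RMSE inside a standard rate-distortion objective L = R + lambda * D.

    import torch

    def rate_distortion_loss(x, x_hat, bits, lam=0.01, use_rmse=True):
        """x, x_hat: (B, C, H, W) original and reconstruction; bits: estimated code length."""
        mse = torch.mean((x - x_hat) ** 2)
        distortion = torch.sqrt(mse + 1e-12) if use_rmse else mse  # RMSE replaces MSE
        num_pixels = x.size(0) * x.size(2) * x.size(3)
        bpp = bits / num_pixels                                    # rate term in bits per pixel
        return bpp + lam * distortion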

    Multiparametric Deep Learning Tissue Signatures for a Radiological Biomarker of Breast Cancer: Preliminary Results

    A new paradigm is beginning to emerge in radiology with the advent of increased computational capabilities and algorithms. This has led to the ability of computer systems to learn different lesion types in real time to help the radiologist define disease. For example, using a deep learning network, we developed and tested a multiparametric deep learning (MPDL) network for segmentation and classification using multiparametric magnetic resonance imaging (mpMRI) radiological images. The MPDL network was constructed from stacked sparse autoencoders with inputs from mpMRI. Evaluation of MPDL consisted of cross-validation, sensitivity, and specificity. Dice similarity between MPDL and post-DCE lesions was evaluated. We demonstrate high sensitivity and specificity for differentiation of malignant from benign lesions of 90% and 85%, respectively, with an AUC of 0.93. The integrated MPDL method accurately segmented and classified different breast tissue from multiparametric breast MRI using deep learning tissue signatures. Comment: Deep Learning, Machine learning, Magnetic resonance imaging, multiparametric MRI, Breast, Cancer, Diffusion, tissue biomarker
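
    A minimal sketch of a stacked sparse autoencoder with a classification head, in the spirit of the MPDL description above; it is not the authors' implementation, and the per-voxel input dimension, layer sizes, and the L1 sparsity penalty are assumptions.

    import torch
    import torch.nn as nn

    class StackedSparseAE(nn.Module):
        """Two stacked sparse encoders, a decoder, and a tissue classifier on the code."""
        def __init__(self, in_dim=5, hidden=(32, 16), n_classes=2):
            super().__init__()
            self.enc1 = nn.Sequential(nn.Linear(in_dim, hidden[0]), nn.Sigmoid())
            self.enc2 = nn.Sequential(nn.Linear(hidden[0], hidden[1]), nn.Sigmoid())
            self.dec = nn.Sequential(nn.Linear(hidden[1], hidden[0]), nn.Sigmoid(),
                                     nn.Linear(hidden[0], in_dim))
            self.classifier = nn.Linear(hidden[1], n_classes)

        def forward(self, x):
            h1 = self.enc1(x)
            h2 = self.enc2(h1)
            sparsity = h1.abs().mean() + h2.abs().mean()   # L1 sparsity penalty on activations
            return self.dec(h2), self.classifier(h2), sparsity

    x = torch.rand(8, 5)                                   # 8 voxels, 5 mpMRI parameters each
    recon, logits, sparsity = StackedSparseAE()(x)
    loss = nn.functional.mse_loss(recon, x) + 1e-3 * sparsity   # plus a classification loss on logits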

    Deep Learning Representation using Autoencoder for 3D Shape Retrieval

    We study the problem of how to build a deep learning representation for 3D shapes. Deep learning has been shown to be very effective in a variety of visual applications, such as image classification and object detection. However, it has not been successfully applied to 3D shape recognition. This is because 3D shapes have complex structure in 3D space and there are a limited number of 3D shapes for feature learning. To address these problems, we project 3D shapes into 2D space and use an autoencoder for feature learning on the 2D images. High-accuracy 3D shape retrieval performance is obtained by aggregating the features learned on the 2D images. In addition, we show that the proposed deep learning feature is complementary to conventional local image descriptors. By combining the global deep learning representation and the local descriptor representation, our method obtains state-of-the-art performance on 3D shape retrieval benchmarks. Comment: 6 pages, 7 figures, 2014ICSPA
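
    A minimal sketch of the view-based pipeline described above (assumed details, not the authors' code): each 2D projection of a shape is encoded and the per-view codes are aggregated into one global descriptor for retrieval. The toy encoder stands in for the trained autoencoder's encoder; the number of views, the depth-image rendering, and mean pooling are assumptions.

    import torch
    import torch.nn as nn

    encoder = nn.Sequential(                      # stand-in for the trained autoencoder's encoder
        nn.Conv2d(1, 16, 3, stride=2, padding=1), nn.ReLU(),
        nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten())

    views = torch.randn(12, 1, 64, 64)            # 12 2D projections of one 3D shape
    codes = encoder(views)                        # (12, 32) per-view features
    descriptor = codes.mean(dim=0)                # aggregated global shape descriptor
    descriptor = descriptor / descriptor.norm()   # L2-normalise before nearest-neighbour retrieval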

    Boltzmann Machines and Denoising Autoencoders for Image Denoising

    Image denoising based on a probabilistic model of local image patches has been employed by various researchers, and recently a deep (denoising) autoencoder was proposed by Burger et al. [2012] and Xie et al. [2012] as a good model for this. In this paper, we propose that another popular family of models in the field of deep learning, Boltzmann machines, can perform image denoising as well as, or in certain cases with high levels of noise, better than denoising autoencoders. We empirically evaluate the two models on three different sets of images with different types and levels of noise. Throughout the experiments we also examine the effect of the depth of the models. The experiments confirm our claim and reveal that performance can be improved by adding more hidden layers, especially when the level of noise is high.
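
    A minimal sketch of one denoising-autoencoder training step on image patches; the Gaussian noise model, the small fully connected architecture, and the patch size are assumptions, not the exact models compared in the paper.

    import torch
    import torch.nn as nn

    patch_dim = 17 * 17
    dae = nn.Sequential(nn.Linear(patch_dim, 512), nn.ReLU(),
                        nn.Linear(512, patch_dim))
    opt = torch.optim.Adam(dae.parameters(), lr=1e-3)

    clean = torch.rand(256, patch_dim)                # a batch of clean patches in [0, 1]
    noisy = clean + 0.1 * torch.randn_like(clean)     # corrupt with Gaussian noise
    loss = nn.functional.mse_loss(dae(noisy), clean)  # reconstruct the clean patch
    opt.zero_grad()
    loss.backward()
    opt.step()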

    Deep Learning for Wireless Communications

    Existing communication systems exhibit inherent limitations in translating theory to practice when handling the complexity of optimization for emerging wireless applications with high degrees of freedom. Deep learning has strong potential to overcome this challenge via data-driven solutions and to improve the performance of wireless systems in utilizing limited spectrum resources. In this chapter, we first describe how deep learning is used to design an end-to-end communication system using autoencoders. This flexible design effectively captures channel impairments and optimizes transmitter and receiver operations jointly in single-antenna, multiple-antenna, and multiuser communications. Next, we present the benefits of deep learning in spectrum situation awareness, ranging from channel modeling and estimation to signal detection and classification tasks. Deep learning improves performance where model-based methods fail. Finally, we discuss how deep learning applies to wireless communication security. In this context, adversarial machine learning provides novel means to launch and defend against wireless attacks. These applications demonstrate the power of deep learning in providing novel means to design, optimize, adapt, and secure wireless communications.
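
    A minimal sketch of the autoencoder-based end-to-end link described above, under an assumed single-antenna AWGN channel: the encoder acts as the transmitter, the decoder as the receiver, and both are trained jointly through a noisy channel layer. The block length, SNR, and layer sizes are illustrative assumptions, not the chapter's exact model.

    import torch
    import torch.nn as nn

    k, n = 4, 7                      # k information bits mapped to n channel uses (assumed sizes)
    M = 2 ** k                       # number of messages

    transmitter = nn.Sequential(nn.Linear(M, 64), nn.ReLU(), nn.Linear(64, n))
    receiver = nn.Sequential(nn.Linear(n, 64), nn.ReLU(), nn.Linear(64, M))

    def channel(x, snr_db=7.0):
        x = x / x.norm(dim=1, keepdim=True) * n ** 0.5   # average power constraint
        sigma = (10 ** (-snr_db / 10)) ** 0.5
        return x + sigma * torch.randn_like(x)           # additive white Gaussian noise

    msgs = torch.randint(0, M, (128,))
    onehot = nn.functional.one_hot(msgs, M).float()
    logits = receiver(channel(transmitter(onehot)))
    loss = nn.functional.cross_entropy(logits, msgs)     # trained end to end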

    Simultaneous Feature Aggregating and Hashing for Large-scale Image Search

    In most state-of-the-art hashing-based visual search systems, the local image descriptors of an image are first aggregated into a single feature vector. This feature vector is then subjected to a hashing function that produces a binary hash code. In previous work, the aggregating and hashing processes were designed independently. In this paper, we propose a novel framework in which feature aggregating and hashing are designed simultaneously and optimized jointly. Specifically, our joint optimization produces aggregated representations that can be better reconstructed by binary codes. This leads to more discriminative binary hash codes and improved retrieval accuracy. In addition, we propose a fast version of the recently proposed Binary Autoencoder to be used in our framework. We perform extensive retrieval experiments on several benchmark datasets with both SIFT and convolutional features. Our results suggest that the proposed framework achieves significant improvements over the state of the art. Comment: Accepted to CVPR 201
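
    A minimal sketch of the binary-autoencoder hashing component only (the joint optimization with feature aggregation proposed in the paper is omitted): a tanh relaxation of the binary code is trained to reconstruct the aggregated feature, and sign() produces test-time hash codes. Dimensions and loss weights are assumptions.

    import torch
    import torch.nn as nn

    feat_dim, code_bits = 512, 32
    enc = nn.Linear(feat_dim, code_bits)
    dec = nn.Linear(code_bits, feat_dim)

    feats = torch.randn(64, feat_dim)                    # aggregated descriptors (e.g. pooled SIFT/CNN)
    relaxed = torch.tanh(enc(feats))                     # training-time relaxation of the binary code
    recon_loss = nn.functional.mse_loss(dec(relaxed), feats)
    binarize_loss = ((relaxed.abs() - 1.0) ** 2).mean()  # push activations toward +/-1
    loss = recon_loss + 0.1 * binarize_loss

    hash_codes = enc(feats).sign() > 0                   # test-time binary codes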

    Quantization-Based Regularization for Autoencoders

    Autoencoders and their variants provide unsupervised models for learning low-dimensional representations for downstream tasks. Without proper regularization, autoencoder models are susceptible to overfitting and to the so-called posterior collapse phenomenon. In this paper, we introduce a quantization-based regularizer in the bottleneck stage of autoencoder models to learn meaningful latent representations. We combine the perspectives of Vector Quantized-Variational AutoEncoders (VQ-VAE) and classical denoising regularization methods for neural networks. We interpret quantizers as regularizers that constrain latent representations while fostering a similarity-preserving mapping at the encoder. Before quantization, we impose noise on the latent codes and use a Bayesian estimator to optimize the quantizer-based representation. The introduced bottleneck Bayesian estimator outputs the posterior mean of the centroids to the decoder and thus performs soft quantization of the noisy latent codes. We show that our proposed regularization method results in improved latent representations for both supervised learning and clustering downstream tasks when compared to autoencoders using other bottleneck structures. Comment: AAAI 202
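
    A minimal sketch of soft quantization via a bottleneck posterior mean, under an assumed Gaussian noise model and a uniform prior over centroids (not the paper's exact estimator): noisy latent codes are softly assigned to codebook centroids and the decoder receives the weighted average.

    import torch

    def soft_quantize(z, codebook, noise_std=0.1):
        z_noisy = z + noise_std * torch.randn_like(z)            # impose noise on the latent codes
        d2 = torch.cdist(z_noisy, codebook) ** 2                 # squared distances to centroids
        w = torch.softmax(-d2 / (2 * noise_std ** 2), dim=1)     # posterior over centroids
        return w @ codebook                                      # posterior mean (soft quantization)

    codebook = torch.randn(16, 8)         # 16 centroids in an 8-dim latent space
    z = torch.randn(4, 8)
    z_soft = soft_quantize(z, codebook)   # (4, 8), fed to the decoder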

    Deep Likelihood Network for Image Restoration with Multiple Degradation Levels

    Convolutional neural networks have been proven effective in a variety of image restoration tasks. Most state-of-the-art solutions, however, are trained using images with a single particular degradation level, and their performance deteriorates drastically when applied to other degradation settings. In this paper, we propose the deep likelihood network (DL-Net), aiming at generalizing off-the-shelf image restoration networks to succeed over a spectrum of degradation levels. We slightly modify an off-the-shelf network by appending a simple recursive module, which is derived from a fidelity term, for disentangling the computation for multiple degradation levels. Extensive experimental results on image inpainting, interpolation, and super-resolution show the effectiveness of our DL-Net. Comment: Accepted by IEEE Transactions on Image Processing; 13 pages, 6 figure
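
    A minimal sketch of a fidelity-derived recursion in the spirit of the description above, not the released DL-Net code: an off-the-shelf restoration network f is interleaved with gradient steps on a data-fidelity term ||A(x) - y||^2, with the degradation operator A and its adjoint supplied by the caller. The step size, iteration count, and the toy denoising usage are assumptions.

    import torch
    import torch.nn as nn

    def dl_net_restore(f, y, A, At, steps=3, alpha=0.5):
        """f: restoration network, y: degraded image, A/At: degradation operator and its adjoint."""
        x = At(y)                            # simple initialisation from the degraded input
        for _ in range(steps):
            x = f(x)                         # network refinement
            x = x - alpha * At(A(x) - y)     # recursive fidelity (gradient) step
        return x

    net = nn.Sequential(nn.Conv2d(1, 1, 3, padding=1))     # toy stand-in for an off-the-shelf network
    y = torch.randn(1, 1, 32, 32)
    x_hat = dl_net_restore(net, y, A=lambda v: v, At=lambda v: v)   # denoising: A is the identity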

    A Classification Supervised Auto-Encoder Based on Predefined Evenly-Distributed Class Centroids

    Classic variational autoencoders, which are built on standard function approximators, are used to learn complex data distributions. In particular, VAEs have shown promise on many complex tasks. In this paper, a new autoencoder model, the classification supervised autoencoder (CSAE) based on predefined evenly-distributed class centroids (PEDCC), is proposed. Our method uses the PEDCC of the latent variables to train the network so as to maximize the inter-class distance and minimize the inner-class distance. Instead of learning the mean/variance of the latent variable distribution and applying the reparameterization trick of the VAE, the latent variables of CSAE are used directly for classification and as input to the decoder. In addition, a new loss function is proposed that incorporates the classification loss. Based on the basic structure of a universal autoencoder, we achieve good encoding, decoding, and classification results together with good model generalization at the same time. The theoretical advantages are reflected in the experimental results. Comment: 16 pages, 12 figures, 4 table
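
    A minimal sketch of the CSAE idea above, with orthonormal basis vectors standing in for the paper's PEDCC centroids (the actual centroids are generated to be evenly distributed on a hypersphere; dimensions and loss weighting are also assumptions): latent codes are pulled toward fixed class centroids, used directly for classification, and fed to the decoder.

    import torch
    import torch.nn as nn

    n_classes, latent_dim = 10, 64
    centroids = torch.eye(latent_dim)[:n_classes]      # fixed, not trained (equidistant stand-in)

    encoder = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, latent_dim))
    decoder = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, 784))

    x = torch.rand(32, 784)
    labels = torch.randint(0, n_classes, (32,))
    z = encoder(x)
    centroid_loss = ((z - centroids[labels]) ** 2).mean()   # pull latents to their class centroid
    recon_loss = nn.functional.mse_loss(decoder(z), x)
    loss = recon_loss + centroid_loss

    pred = torch.cdist(z, centroids).argmin(dim=1)          # classify by nearest centroid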