2,554 research outputs found
Improved Image Coding Autoencoder With Deep Learning
In this paper, we build autoencoder based pipelines for extreme end-to-end
image compression based on Ballé's approach, which is the state-of-the-art
open source implementation in image compression using deep learning. We
deepened the network by adding one more hidden layer before each strided
convolutional layer, with exactly the same number of down-samplings and
up-samplings. Our approach outperformed Ballé's approach, achieving around a
4.0% reduction in bits per pixel (bpp), a 0.03% increase in multi-scale
structural similarity (MS-SSIM), and only a 0.47% decrease in peak
signal-to-noise ratio (PSNR). It also outperforms all traditional image
compression methods, including JPEG 2000 and HEIC, by at least 20% in terms of
compression efficiency at similar reconstruction image quality. Regarding
encoding and decoding time, our approach takes a similar amount of time to
traditional methods with the support of a GPU, which means it is almost ready
for industrial applications.
Extreme Image Compression with Deep Learning Autoencoder
Image compression can save billions of dollars in industry by reducing the bits needed to store and transfer an image without significantly losing visual quality. Traditional image compression methods use transforms, quantization, predictive coding and entropy coding to tackle the problem, represented by international standards like JPEG (Joint Photographic Experts Group), JPEG 2000, BPG (Better Portable Graphics), and HEIC (High Efficiency Image File Format). Recently, deep learning based image compression approaches have achieved similar or better performance compared with traditional methods, represented by autoencoder, GAN (generative adversarial network) and super-resolution based approaches.
In this paper, we built autoencoder based pipelines for extreme end-to-end image compression based on Ballé’s approach in 2017 and 2018 and improved the cost function and network structure. We replaced MSE (mean square error) with RMSE (root mean square error) in the cost function and deepened the network by adding one more hidden layer before each strided convolutional layer. The source code is available in bit.ly/deepimagecompressiongithub.
Our 2018 approach outperformed Ballé's 2018 approach, which is the state-of-the-art open source implementation in image compression using deep learning, in terms of PSNR (peak signal-to-noise ratio) and MS-SSIM (multi-scale structural similarity) at similar bpp (bits per pixel). It also outperformed all traditional image compression methods, including JPEG and HEIC, in terms of reconstruction image quality. Regarding encoding and decoding time, our 2018 approach takes significantly longer than traditional methods even with the support of a GPU; this needs to be measured and improved in the future.
Experimental results proved that deepening the network of an autoencoder can effectively increase model fitting without losing generalization when applied to image compression, if the network is designed appropriately.
In the future, this image compression method can be applied to video compression if encoding and decoding time can be reduced to an acceptable level. Automatic neural architecture search might also be applied to help find an optimal network structure for the autoencoder in image compression. The optimizer can also be replaced with a trainable one, such as an LSTM (long short-term memory) based optimizer. Last but not least, the cost function can also include encoding and decoding time, so that these two metrics can be optimized during training as well.
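The two quantities changed or reported above, the RMSE cost that replaces MSE and the PSNR evaluation metric, can be sketched in a few lines. This is a minimal NumPy illustration; the image size and noise level are made-up examples, not the paper's data:

```python
import numpy as np

def mse(x, x_hat):
    """Mean squared error between an image and its reconstruction."""
    return np.mean((x - x_hat) ** 2)

def rmse(x, x_hat):
    """Root mean squared error, substituted for MSE in the cost function."""
    return np.sqrt(mse(x, x_hat))

def psnr(x, x_hat, max_val=255.0):
    """Peak signal-to-noise ratio in dB for images in [0, max_val]."""
    return 20.0 * np.log10(max_val) - 10.0 * np.log10(mse(x, x_hat))

rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(64, 64)).astype(np.float64)
# stand-in for a decoded image: the original plus mild Gaussian error
recon = np.clip(img + rng.normal(0.0, 5.0, size=img.shape), 0.0, 255.0)

quality_db = psnr(img, recon)
```

Because RMSE is a monotone function of MSE, swapping it in changes gradient scaling during training rather than the location of the optimum.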
Multiparametric Deep Learning Tissue Signatures for a Radiological Biomarker of Breast Cancer: Preliminary Results
A new paradigm is beginning to emerge in radiology with the advent of
increased computational capabilities and algorithms. This has led to the
ability of computer systems to learn different lesion types in real time to
help the radiologist in defining disease. For example, using a deep learning
network, we developed and tested a multiparametric deep learning (MPDL) network
for segmentation and classification using multiparametric magnetic resonance
imaging (mpMRI) radiological images. The MPDL network was constructed from
stacked sparse autoencoders with inputs from mpMRI. Evaluation of MPDL
consisted of cross-validation, sensitivity, and specificity. Dice similarity
between MPDL and post-DCE lesions was evaluated. We demonstrate high
sensitivity and specificity for differentiation of malignant from benign
lesions of 90% and 85% respectively, with an AUC of 0.93. The integrated MPDL
method accurately segmented and classified different breast tissue from
multiparametric breast MRI using deep learning tissue signatures.
Comment: Deep Learning, Machine learning, Magnetic resonance imaging,
multiparametric MRI, Breast, Cancer, Diffusion, tissue biomarker
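The stacked sparse autoencoders mentioned above are ordinary autoencoders trained with an extra sparsity penalty on the hidden units. A minimal sketch of the standard KL-divergence sparsity penalty follows; the activation data and target sparsity here are illustrative assumptions, not the authors' configuration:

```python
import numpy as np

def kl_sparsity_penalty(hidden_activations, rho=0.05):
    """KL-divergence sparsity penalty used in sparse autoencoders:
    pushes the mean activation rho_hat of every hidden unit toward a
    small target value rho."""
    rho_hat = np.clip(hidden_activations.mean(axis=0), 1e-8, 1 - 1e-8)
    return float(np.sum(rho * np.log(rho / rho_hat)
                        + (1 - rho) * np.log((1 - rho) / (1 - rho_hat))))

rng = np.random.default_rng(0)
# synthetic hidden activations: ~5% of units active vs ~50% active
sparse_h = (rng.random((100, 32)) < 0.05).astype(float)
dense_h = (rng.random((100, 32)) < 0.5).astype(float)
```

Adding this penalty to the reconstruction loss is what makes each layer "sparse"; the layers are then stacked by feeding one encoder's hidden codes to the next.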
Deep Learning Representation using Autoencoder for 3D Shape Retrieval
We study the problem of how to build a deep learning representation for 3D
shape. Deep learning has been shown to be very effective in a variety of visual
applications, such as image classification and object detection. However, it
has not been successfully applied to 3D shape recognition. This is because 3D
shape has a complex structure in 3D space and there are a limited number of 3D
shapes for feature learning. To address these problems, we project 3D shapes
into 2D space and use an autoencoder for feature learning on the 2D images.
High-accuracy 3D shape retrieval performance is obtained by aggregating the
features learned on 2D images. In addition, we show the proposed deep learning
feature is complementary to conventional local image descriptors. By combining
the global deep learning representation and the local descriptor
representation, our method can obtain state-of-the-art performance on 3D shape
retrieval benchmarks.
Comment: 6 pages, 7 figures, 2014ICSPA
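The projection step described above, turning a 3D shape into 2D images that an autoencoder can consume, can be sketched as a simple multi-view depth rendering. The rotation axis, resolution, and random point cloud below are illustrative assumptions, not the paper's rendering pipeline:

```python
import numpy as np

def rotation_z(theta):
    """Rotation matrix about the z axis."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def depth_image(points, theta, res=32):
    """Rotate a 3D point cloud and rasterize a coarse depth map on the
    xy plane, keeping the nearest z value per pixel."""
    p = points @ rotation_z(theta).T
    lo = p[:, :2].min(axis=0)
    span = p[:, :2].max(axis=0) - lo
    xy = (p[:, :2] - lo) / (span + 1e-9)          # normalize to [0, 1)
    ij = np.minimum((xy * res).astype(int), res - 1)
    img = np.full((res, res), np.inf)
    for (i, j), z in zip(ij, p[:, 2]):
        img[i, j] = min(img[i, j], z)             # nearest depth wins
    img[np.isinf(img)] = 0.0                      # empty pixels -> background
    return img

rng = np.random.default_rng(1)
cloud = rng.normal(size=(500, 3))                 # toy stand-in for a 3D shape
angles = np.linspace(0.0, np.pi, 4, endpoint=False)
views = [depth_image(cloud, t) for t in angles]   # one 2D image per view
feature = np.concatenate([v.ravel() for v in views])  # autoencoder input
```

The per-view images are what the autoencoder learns features from; aggregating across views gives the global shape descriptor.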
Boltzmann Machines and Denoising Autoencoders for Image Denoising
Image denoising based on a probabilistic model of local image patches has
been employed by various researchers, and recently a deep (denoising)
autoencoder was proposed by Burger et al. [2012] and Xie et al. [2012] as
a good model for this. In this paper, we propose that another popular family of
models in the field of deep learning, called Boltzmann machines, can perform
image denoising as well as, or in certain cases of high levels of noise, better
than denoising autoencoders. We empirically evaluate the two models on three
different sets of images with different types and levels of noise. Throughout
the experiments we also examine the effect of the depth of the models. The
experiments confirmed our claim and revealed that the performance can be
improved by adding more hidden layers, especially when the level of noise is
high.
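The denoising-autoencoder training scheme compared above, corrupt the input, then reconstruct the clean target, can be sketched with a tiny tied-weight linear model. Everything here (data, noise level, architecture, learning rate) is a made-up minimal example, not the paper's setup:

```python
import numpy as np

rng = np.random.default_rng(0)

# toy data: 200 samples of 16-d signals lying near a 4-d subspace
basis = rng.normal(size=(4, 16))
X = rng.normal(size=(200, 4)) @ basis

def corrupt(x, sigma=0.5):
    """Gaussian corruption: the model sees noisy inputs but is trained
    to reconstruct the clean signal."""
    return x + sigma * rng.normal(size=x.shape)

def recon_loss(W):
    """Reconstruction error of the clean data through the bottleneck."""
    return np.mean((X @ W @ W.T - X) ** 2)

# single hidden layer with tied weights and linear units, for simplicity
W = 0.1 * rng.normal(size=(16, 8))
loss_before = recon_loss(W)

lr = 1e-3
for _ in range(300):
    Xn = corrupt(X)
    E = Xn @ W @ W.T - X                            # error vs the *clean* target
    grad = (2.0 / len(X)) * (Xn.T @ E @ W + E.T @ Xn @ W)
    W -= lr * grad

loss_after = recon_loss(W)
```

The key point is the asymmetry in `E`: the encoder input is corrupted while the reconstruction target is clean, which is what distinguishes a denoising autoencoder from a plain one.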
Deep Learning for Wireless Communications
Existing communication systems exhibit inherent limitations in translating
theory to practice when handling the complexity of optimization for emerging
wireless applications with high degrees of freedom. Deep learning has a strong
potential to overcome this challenge via data-driven solutions and improve the
performance of wireless systems in utilizing limited spectrum resources. In
this chapter, we first describe how deep learning is used to design an
end-to-end communication system using autoencoders. This flexible design
effectively captures channel impairments and optimizes transmitter and receiver
operations jointly in single-antenna, multiple-antenna, and multiuser
communications. Next, we present the benefits of deep learning in spectrum
situation awareness ranging from channel modeling and estimation to signal
detection and classification tasks. Deep learning improves the performance when
the model-based methods fail. Finally, we discuss how deep learning applies to
wireless communication security. In this context, adversarial machine learning
provides novel means to launch and defend against wireless attacks. These
applications demonstrate the power of deep learning in providing novel means to
design, optimize, adapt, and secure wireless communications.
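The end-to-end autoencoder view of a communication system described above treats the transmitter as an encoder, the channel as a fixed noisy layer, and the receiver as a decoder. The sketch below illustrates that pipeline with a fixed constellation standing in for a trained encoder (a real system would learn the transmit points jointly with the decoder); the message count, channel uses, and SNR are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

M, n = 4, 2   # M messages sent over n real channel uses

# encoder output: a fixed constellation standing in for a trained network
constellation = np.array([[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]])
constellation /= np.sqrt((constellation ** 2).mean())   # unit average power

def channel(x, snr_db):
    """AWGN channel impairment between transmitter and receiver."""
    sigma = np.sqrt(10.0 ** (-snr_db / 10.0))
    return x + sigma * rng.normal(size=x.shape)

def decode(y):
    """Receiver: nearest constellation point (ML detection for AWGN)."""
    d2 = ((y[:, None, :] - constellation[None, :, :]) ** 2).sum(axis=-1)
    return d2.argmin(axis=1)

msgs = rng.integers(0, M, size=2000)
y = channel(constellation[msgs], snr_db=15.0)
bler = float(np.mean(decode(y) != msgs))   # block error rate
```

Training replaces the hand-picked constellation with one optimized end-to-end, which is where the autoencoder formulation pays off under realistic impairments.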
Simultaneous Feature Aggregating and Hashing for Large-scale Image Search
In most state-of-the-art hashing-based visual search systems, local image
descriptors of an image are first aggregated as a single feature vector. This
feature vector is then subjected to a hashing function that produces a binary
hash code. In previous work, the aggregating and the hashing processes are
designed independently. In this paper, we propose a novel framework where
feature aggregating and hashing are designed simultaneously and optimized
jointly. Specifically, our joint optimization produces aggregated
representations that can be better reconstructed by some binary codes. This
leads to more discriminative binary hash codes and improved retrieval accuracy.
In addition, we also propose a fast version of the recently-proposed Binary
Autoencoder to be used in our proposed framework. We perform extensive
retrieval experiments on several benchmark datasets with both SIFT and
convolutional features. Our results suggest that the proposed framework
achieves significant improvements over the state of the art.
Comment: Accepted to CVPR 201
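The aggregate-then-hash pipeline described above can be sketched end to end: local descriptors are pooled into one vector, a sign-of-projection encoder produces binary codes, and retrieval compares Hamming distances. The mean-pooling aggregator and random projection below are simple stand-ins for the paper's jointly optimized components:

```python
import numpy as np

rng = np.random.default_rng(0)

def aggregate(local_descriptors):
    """Toy aggregation: average local descriptors into one global vector
    (a stand-in for learned aggregation over SIFT/CNN features)."""
    return local_descriptors.mean(axis=0)

def hash_codes(features, proj):
    """Binary-autoencoder-style encoder: linear projection + sign."""
    return (features @ proj > 0).astype(np.uint8)

def hamming(a, b):
    """Hamming distance between two binary codes."""
    return int(np.count_nonzero(a != b))

# database of 100 images, each with 30 local 64-d descriptors
db = np.stack([aggregate(rng.normal(size=(30, 64))) for _ in range(100)])
proj = rng.normal(size=(64, 32))            # 32-bit codes
codes = hash_codes(db, proj)

query = db[7] + 0.01 * rng.normal(size=64)  # near-duplicate of item 7
qcode = hash_codes(query, proj)
dists = [hamming(qcode, c) for c in codes]
```

Optimizing aggregation and hashing jointly, as the paper proposes, replaces the two independent stand-ins above with one objective, so the pooled vector is shaped to survive binarization.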
Quantization-Based Regularization for Autoencoders
Autoencoders and their variations provide unsupervised models for learning
low-dimensional representations for downstream tasks. Without proper
regularization, autoencoder models are susceptible to the overfitting problem
and the so-called posterior collapse phenomenon. In this paper, we introduce a
quantization-based regularizer in the bottleneck stage of autoencoder models to
learn meaningful latent representations. We combine both perspectives of Vector
Quantized-Variational AutoEncoders (VQ-VAE) and classical denoising
regularization methods of neural networks. We interpret quantizers as
regularizers that constrain latent representations while fostering a
similarity-preserving mapping at the encoder. Before quantization, we impose
noise on the latent codes and use a Bayesian estimator to optimize the
quantizer-based representation. The introduced bottleneck Bayesian estimator
outputs the posterior mean of the centroids to the decoder, and thus performs
soft quantization of the noisy latent codes. We show that our proposed
regularization method results in improved latent representations for both
supervised learning and clustering downstream tasks when compared to
autoencoders using other bottleneck structures.
Comment: AAAI 202
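The soft quantization step described above, noise on the latent codes followed by a posterior mean over centroids, can be sketched directly. The centroid locations, noise scale, and uniform prior below are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def soft_quantize(z_noisy, centroids, sigma):
    """Bayesian soft quantization: posterior mean of the centroids given
    noisy latent codes, assuming Gaussian noise with std sigma and a
    uniform prior over centroids."""
    # squared distances, shape (batch, num_centroids)
    d2 = ((z_noisy[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)
    logits = -d2 / (2.0 * sigma ** 2)
    logits -= logits.max(axis=1, keepdims=True)   # numerically stable softmax
    w = np.exp(logits)
    w /= w.sum(axis=1, keepdims=True)             # posterior over centroids
    return w @ centroids                          # posterior mean -> decoder

centroids = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0], [4.0, 4.0]])
z = np.array([[0.1, -0.2], [3.9, 4.1]])           # clean latent codes
z_noisy = z + 0.1 * rng.normal(size=z.shape)      # imposed bottleneck noise
z_soft = soft_quantize(z_noisy, centroids, sigma=0.5)
```

With small sigma this approaches hard VQ-VAE-style assignment; with large sigma the posterior flattens and the output contracts toward the centroid mean, which is the regularizing effect.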
Deep Likelihood Network for Image Restoration with Multiple Degradation Levels
Convolutional neural networks have been proven effective in a variety of
image restoration tasks. Most state-of-the-art solutions, however, are trained
using images with a single particular degradation level, and their performance
deteriorates drastically when applied to other degradation settings. In this
paper, we propose deep likelihood network (DL-Net), aiming at generalizing
off-the-shelf image restoration networks to succeed over a spectrum of
degradation levels. We slightly modify an off-the-shelf network by appending a
simple recursive module, which is derived from a fidelity term, for
disentangling the computation for multiple degradation levels. Extensive
experimental results on image inpainting, interpolation, and super-resolution
show the effectiveness of our DL-Net.
Comment: Accepted by IEEE Transactions on Image Processing; 13 pages, 6 figures
A Classification Supervised Auto-Encoder Based on Predefined Evenly-Distributed Class Centroids
Classic variational autoencoders, built on standard function approximators,
are used to learn complex data distributions. In particular, the VAE has shown
promise on many complex tasks. In this paper, a new autoencoder model, the
classification supervised autoencoder (CSAE) based on predefined
evenly-distributed class centroids (PEDCC), is proposed. Our method uses PEDCC
of latent variables to train the network to ensure the maximization of
inter-class distance and the minimization of inner-class distance. Instead of
learning the mean/variance of the latent variable distribution and taking the
reparameterization of the VAE, the latent variables of CSAE are used directly
for classification and as the input of the decoder. In addition, a new loss
function is proposed to incorporate the classification loss. Based on the basic
structure of the universal autoencoder, we achieve comprehensively optimal
results for encoding, decoding, and classification, together with good model
generalization performance. The theoretical advantages are reflected in the
experimental results.
Comment: 16 pages, 12 figures, 4 tables
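Predefined evenly-distributed class centroids can be illustrated with a standard regular-simplex construction on the unit hypersphere, which gives equal pairwise distances between all class centers. This is one simple way to realize "evenly distributed" (valid when the number of classes is at most dim + 1), not necessarily the paper's exact construction:

```python
import numpy as np

def pedcc(num_classes, dim):
    """Predefined evenly-distributed class centroids via a regular
    simplex: one unit-norm centroid per class, with all pairwise
    distances equal (requires num_classes <= dim + 1)."""
    assert num_classes <= dim + 1
    E = np.eye(num_classes, dim)                     # k basis vectors in R^dim
    E -= E.mean(axis=0)                              # center the simplex
    E /= np.linalg.norm(E, axis=1, keepdims=True)    # project to unit sphere
    return E

C = pedcc(5, 16)   # 5 classes embedded in a 16-d latent space
```

Every pair of centroids then has cosine similarity exactly -1/(k-1), so inter-class distance is maximized symmetrically before training even begins, which is what lets the latent codes be used directly for classification.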
- …