Multi-Scale Deep Compressive Sensing Network
With joint learning of sampling and recovery, deep learning-based
compressive sensing (DCS) has shown significant improvements in reconstruction
performance and running time. Its reconstructed images, however, lose
high-frequency content, especially at low subrates. The same happens in
multi-scale sampling schemes, which also sample more low-frequency components.
In this paper, we propose a multi-scale DCS convolutional neural network
(MS-DCSNet) in which we convert the image signal using a multi-scale wavelet
transform, then capture it through convolutions block by block across scales.
The initial reconstructed image is recovered directly from the multi-scale
measurements, and multi-scale wavelet convolution is used to enhance the final
reconstruction quality. The network learns both multi-scale sampling and
multi-scale reconstruction, and thus achieves better reconstruction quality.
Comment: 4 pages, 4 figures, 2 tables, IEEE International Conference on Visual
Communication and Image Processing (VCIP)
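The multi-scale sampling idea, transforming the image into wavelet subbands and then spending more measurements on the low-frequency band, can be sketched in a few lines of numpy. This is an illustrative stand-in, not MS-DCSNet itself: the one-level Haar transform, the per-band subrates, and the dense Gaussian operator (in place of the learned block-wise convolutional sampler) are all assumptions.

```python
import numpy as np

def haar2d(x):
    """One level of the 2-D Haar wavelet transform: returns LL, LH, HL, HH subbands."""
    a = (x[0::2, :] + x[1::2, :]) / 2.0   # vertical averages
    d = (x[0::2, :] - x[1::2, :]) / 2.0   # vertical details
    ll = (a[:, 0::2] + a[:, 1::2]) / 2.0
    lh = (a[:, 0::2] - a[:, 1::2]) / 2.0
    hl = (d[:, 0::2] + d[:, 1::2]) / 2.0
    hh = (d[:, 0::2] - d[:, 1::2]) / 2.0
    return ll, lh, hl, hh

def sample_subband(band, subrate, rng):
    """Compressively sample one subband (treated as a single block here)."""
    n = band.size
    m = max(1, int(subrate * n))          # number of measurements for this band
    phi = rng.standard_normal((m, n)) / np.sqrt(m)
    return phi @ band.ravel()

rng = np.random.default_rng(0)
img = rng.standard_normal((8, 8))
ll, lh, hl, hh = haar2d(img)
# Spend more measurements on the low-frequency LL band, fewer on the detail bands
measurements = [sample_subband(ll, 0.8, rng)] + \
               [sample_subband(b, 0.2, rng) for b in (lh, hl, hh)]
```

Each subband is sampled as one block here for brevity; the paper's block-by-block convolutional sampling would apply the same idea per spatial block.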
Deep Sparse Coding Using Optimized Linear Expansion of Thresholds
We address the problem of reconstructing sparse signals from noisy and
compressive measurements using a feed-forward deep neural network (DNN) with an
architecture motivated by the iterative shrinkage-thresholding algorithm
(ISTA). We maintain the weights and biases of the network links as prescribed
by ISTA and model the nonlinear activation function using a linear expansion of
thresholds (LET), which has been very successful in image denoising and
deconvolution. The optimal set of coefficients of the parametrized activation
is learned over a training dataset containing measurement-sparse signal pairs,
corresponding to a fixed sensing matrix. For training, we develop an efficient
second-order algorithm, which requires only matrix-vector product computations
in every training epoch (Hessian-free optimization) and offers superior
convergence performance compared with gradient-descent optimization. Subsequently, we
derive an improved network architecture inspired by FISTA, a faster version of
ISTA, to achieve similar signal estimation performance with about 50% of the
number of layers. The resulting architecture turns out to be a deep residual
network, which has recently been shown to exhibit superior performance in
several visual recognition tasks. Numerical experiments demonstrate that the
proposed DNN architectures lead to 3 to 4 dB improvement in the reconstruction
signal-to-noise ratio (SNR), compared with the state-of-the-art sparse coding
algorithms.
Comment: Submission date: November 11, 2016. 19 pages; 9 figures
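The ISTA iterations that the network above unrolls can be written down directly; making the weights and the threshold learnable per layer, and replacing the soft threshold with a linear expansion of thresholds (LET), yields the DNN the abstract describes. A minimal numpy sketch, with problem sizes and the regularization weight chosen for illustration:

```python
import numpy as np

def soft(v, t):
    """Soft-thresholding, the nonlinear activation ISTA applies at every layer;
    the paper replaces it with a learned linear expansion of thresholds (LET)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def ista(A, y, lam, n_iters=300):
    """ISTA for min_x 0.5*||y - A x||^2 + lam*||x||_1. One iteration is one
    'layer' of the unrolled network: x <- soft(W2 @ x + W1 @ y, threshold)."""
    L = np.linalg.norm(A, 2) ** 2           # Lipschitz constant of the gradient
    W1 = A.T / L                            # input weights
    W2 = np.eye(A.shape[1]) - A.T @ A / L   # recurrent weights
    x = np.zeros(A.shape[1])
    for _ in range(n_iters):
        x = soft(W2 @ x + W1 @ y, lam / L)
    return x

rng = np.random.default_rng(1)
A = rng.standard_normal((30, 60)) / np.sqrt(30)   # fixed sensing matrix
x_true = np.zeros(60)
x_true[[3, 17, 42]] = [1.5, -2.0, 1.0]            # sparse ground truth
y = A @ x_true                                    # noiseless measurements
x_hat = ista(A, y, lam=0.05)
```

The FISTA variant mentioned in the abstract adds a momentum step between iterations, which is what allows a comparable network with roughly half the layers.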
Hardware Implementation of Compressed Sensing based Low Complex Video Encoder
This paper presents a memory-efficient VLSI architecture for a low-complexity
video encoder based on the three-dimensional (3-D) wavelet transform and
Compressed Sensing (CS), targeting space and low-power video applications. The
majority of conventional video coding schemes are based on a hybrid model that
requires complex operations, such as transform coding (DCT), motion estimation,
and deblocking filtering, at the encoder. The complexity of the proposed encoder
is reduced by replacing those operations with a 3-D DWT and CS at the encoder.
The proposed architecture uses the 3-D DWT to enable scalability with the number
of levels of wavelet decomposition and to exploit spatial and temporal
redundancies, while CS provides good error resilience and coding efficiency. In
the first stage of the proposed encoder architecture, the 3-D DWT is applied
(lifting-based 2-D DWT in the spatial domain and the Haar wavelet in the
temporal domain) to each frame of the group of frames (GOF); in the second
stage, a CS module exploits the sparsity of the wavelet coefficients. A small
set of linear measurements is extracted at the encoder by projecting the sparse
3-D wavelet coefficients onto a random Bernoulli matrix. Compared with the best
existing 3-D DWT architectures, the proposed 3-D DWT architecture requires less
memory and provides higher throughput. For an N×N image, it consumes a total of
only 2×(3N + 40P) words of on-chip memory for one level of decomposition. The
proposed encoder architecture is the first of its kind and, to the best of our
knowledge, no architecture is available for comparison. The proposed VLSI
architecture of the encoder has been synthesized in a 90-nm CMOS process
technology; results show that it consumes 90.08 mW of power and occupies an
area equivalent to 416.799K gates at a frequency of 158 MHz.
Comment: Submitted to IEEE Transactions on VLSI
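The CS stage described above, projecting sparse wavelet coefficients onto a random Bernoulli matrix, amounts to a single matrix-vector product. A minimal sketch with assumed sizes (the block length, sparsity level, and subrate are placeholders, not values from the paper):

```python
import numpy as np

rng = np.random.default_rng(2)

# Stand-in for the sparse 3-D wavelet coefficients of one block of a GOF
n = 256                                    # coefficients per block (assumed)
coeffs = np.zeros(n)
support = rng.choice(n, size=10, replace=False)
coeffs[support] = rng.standard_normal(10)  # only a few significant coefficients

# Random Bernoulli sensing matrix with +/-1 entries, as used at the encoder
m = 64                                     # measurements per block (subrate 0.25, assumed)
phi = rng.choice([-1.0, 1.0], size=(m, n))

y = phi @ coeffs                           # the small set of linear measurements to transmit
```

A Bernoulli matrix is attractive in hardware because the projection reduces to additions and subtractions, with no multipliers needed.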
Applications of Compressed Sensing in Communications Networks
This paper presents a tutorial on CS applications in communications
networks. Shannon's sampling theorem states that to recover a signal, the
sampling rate must be at least the Nyquist rate. Compressed sensing (CS) is
based on the surprising fact that, to recover a signal that is sparse in
certain representations, one can sample at a rate far below the Nyquist rate.
Since its inception in 2006, CS has attracted much interest in the research
community and found wide-ranging applications, from astronomy, biology,
communications, image and video processing, and medicine, to radar. CS has
also found successful applications in communications networks: it has been
applied to the detection and estimation of wireless signals, source coding,
multi-access channels, data collection in sensor networks, network monitoring,
and more. In many cases, CS was shown to bring performance gains on the order
of 10x. We believe this is just the beginning of CS applications in
communications networks, and the future will see even more fruitful
applications of CS in our field.
Comment: 18 pages
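The "surprising fact" the tutorial builds on can be demonstrated in a few lines: a k-sparse signal can be recovered exactly from far fewer random measurements than its ambient dimension. This sketch uses orthogonal matching pursuit as the recovery algorithm (one of many CS solvers; all sizes are illustrative):

```python
import numpy as np

def omp(A, y, k):
    """Orthogonal matching pursuit: greedily pick the column most correlated
    with the residual, then re-fit by least squares on the chosen support."""
    resid, support = y.copy(), []
    for _ in range(k):
        support.append(int(np.argmax(np.abs(A.T @ resid))))
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        resid = y - A[:, support] @ coef
    x = np.zeros(A.shape[1])
    x[support] = coef
    return x

rng = np.random.default_rng(3)
n, m, k = 128, 32, 4                 # 128-dim signal, only 32 measurements
A = rng.standard_normal((m, n)) / np.sqrt(m)
x_true = np.zeros(n)
x_true[rng.choice(n, k, replace=False)] = [1.0, -1.0, 2.0, 1.5]
y = A @ x_true                       # far fewer samples than the signal dimension
x_hat = omp(A, y, k)                 # exact recovery, with high probability
```

Here 32 random projections suffice to recover a 128-dimensional signal because only 4 of its entries are nonzero; a Nyquist-style argument would demand all 128 samples.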
Deep Compressive Autoencoder for Action Potential Compression in Large-Scale Neural Recording
Understanding the coordinated activity underlying brain computations requires
large-scale, simultaneous recordings from distributed neuronal structures at a
cellular-level resolution. One major hurdle to design high-bandwidth,
high-precision, large-scale neural interfaces lies in the formidable data
streams that are generated by the recorder chip and need to be transferred
online to a remote computer. The data rates can require hundreds to
thousands of I/O pads on the recorder chip and power consumption on the order
of Watts for data streaming alone. We developed a deep learning-based
compression model to reduce the data rate of multichannel action potentials.
The proposed model is built upon a deep compressive autoencoder (CAE) with
discrete latent embeddings. The encoder is equipped with residual
transformations to extract representative features from spikes, which are
mapped into the latent embedding space and updated via vector quantization
(VQ). The decoder network reconstructs spike waveforms from the quantized
latent embeddings. Experimental results show that the proposed model
consistently outperforms conventional methods by achieving much higher
compression ratios (20-500x) and better or comparable reconstruction
accuracies. Testing results also indicate that the CAE is robust against a
diverse range of imperfections, such as waveform variation and spike
misalignment, and that compression has only a minor influence on spike-sorting
accuracy. Furthermore, we have estimated
the hardware cost and real-time performance of CAE and shown that it could
support thousands of recording channels simultaneously without excessive
power/heat dissipation. The proposed model can reduce the required data
transmission bandwidth in large-scale recording experiments while maintaining
good signal quality. The code of this work has been made available at
https://github.com/tong-wu-umn/spike-compression-autoencoder
Comment: 19 pages, 13 figures
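The vector-quantization step at the heart of the CAE's discrete latent embedding is a nearest-neighbor assignment: each encoder output is replaced by its closest codebook entry, and only the code index needs to be streamed. A minimal numpy sketch with assumed codebook sizes:

```python
import numpy as np

def vq_assign(z, codebook):
    """Vector quantization: map each latent vector in z (batch, d) to the
    nearest codebook entry by squared Euclidean distance."""
    d2 = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
    idx = d2.argmin(axis=1)                  # discrete codes: all that is streamed
    return codebook[idx], idx

rng = np.random.default_rng(4)
codebook = rng.standard_normal((16, 8))      # 16 codes of dimension 8 (sizes assumed)
z = codebook[[2, 5, 5]] + 0.01 * rng.standard_normal((3, 8))  # perturbed latents
z_q, idx = vq_assign(z, codebook)            # decoder reconstructs from codebook[idx]
```

Transmitting the index of each latent vector costs only log2(16) = 4 bits here, which is where the compression comes from; the decoder then reconstructs the spike waveform from the quantized embedding.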
Recurrent Generative Adversarial Networks for Proximal Learning and Automated Compressive Image Recovery
Recovering images from undersampled linear measurements typically leads to an
ill-posed linear inverse problem, which calls for proper statistical priors.
Building effective priors is, however, challenged by the low training and
testing overhead dictated by real-time tasks, and by the need to retrieve
visually "plausible" and physically "feasible" images with minimal
hallucination. To cope with these challenges, we design a cascaded network
architecture that unrolls the proximal gradient iterations, leveraging
generative residual networks (ResNets) to model the proximal operator. A
mixture of pixel-wise and perceptual costs is then deployed to train the
proximals. The overall architecture resembles a back-and-forth projection onto
the intersection of feasible and plausible images. Extensive computational
experiments are carried out for a global task, reconstructing MR images of
pediatric patients, and a more local task, super-resolving CelebA faces, which
offer insight into designing efficient architectures. Our observations indicate
that for MRI reconstruction, a recurrent ResNet with a single residual block
effectively learns the proximal. This simple architecture appears to
significantly outperform the alternative deep ResNet architecture by 2 dB in
SNR, and conventional compressed-sensing MRI by 4 dB in SNR with 100x faster
inference. For image super-resolution, our preliminary results indicate that
modeling the denoising proximal demands deep ResNets.
Comment: 11 pages, 11 figures
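The proximal gradient iterations that the cascaded network unrolls have a compact form. In this sketch the proximal operator is a simple soft-thresholding step standing in for the paper's learned ResNet proximal; the problem sizes and the threshold are illustrative assumptions:

```python
import numpy as np

def proximal_gradient(A, y, prox, n_iters=300):
    """x <- prox(x - step * A^T (A x - y)). Unrolling a fixed number of these
    iterations, with `prox` implemented by a small residual network, gives a
    cascaded architecture of the kind described above."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2   # gradient step size
    x = np.zeros(A.shape[1])
    for _ in range(n_iters):
        x = prox(x - step * A.T @ (A @ x - y))
    return x

rng = np.random.default_rng(5)
A = rng.standard_normal((40, 80)) / np.sqrt(40)   # undersampled linear operator
x_true = np.zeros(80)
x_true[[5, 30, 66]] = [2.0, -1.5, 1.0]
y = A @ x_true

# Stand-in proximal operator: soft-thresholding (a hand-crafted sparsity prior),
# where the paper would plug in a learned ResNet block instead.
soft = lambda v: np.sign(v) * np.maximum(np.abs(v) - 0.01, 0.0)
x_hat = proximal_gradient(A, y, soft)
```

The gradient step enforces consistency with the measurements ("feasible"), while the proximal pushes the iterate toward the image prior ("plausible"), matching the back-and-forth projection picture in the abstract.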
Learning Multi-Layer Transform Models
Learned data models based on sparsity are widely used in signal processing
and imaging applications. A variety of methods for learning synthesis
dictionaries, sparsifying transforms, etc., have been proposed in recent years,
often imposing useful structures or properties on the models. In this work, we
focus on sparsifying transform learning, which enjoys a number of advantages.
We consider multi-layer or nested extensions of the transform model, and
propose efficient learning algorithms. Numerical experiments with image data
illustrate the behavior of the multi-layer transform learning algorithm and its
usefulness for image denoising. Multi-layer models provide better denoising
quality than single-layer schemes.
Comment: In Proceedings of the Annual Allerton Conference on Communication,
Control, and Computing, 201
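A distinguishing advantage of the sparsifying transform model mentioned above is that its sparse coding step is exact and non-iterative: apply the transform and keep the largest coefficients. A minimal single-layer sketch (the orthonormal transform and the sizes are assumptions; a multi-layer model would apply a second learned transform to the sparse code z):

```python
import numpy as np

def transform_sparse_code(W, x, s):
    """Sparse coding in the transform model: z = H_s(W x), keeping the s
    largest-magnitude transform coefficients. This step is exact and cheap,
    unlike the iterative sparse coding needed for synthesis dictionaries."""
    z = W @ x
    out = np.zeros_like(z)
    keep = np.argsort(np.abs(z))[-s:]
    out[keep] = z[keep]
    return out

rng = np.random.default_rng(6)
W = np.linalg.qr(rng.standard_normal((16, 16)))[0]  # orthonormal transform (assumed)
# Build a signal that is exactly 3-sparse under W
x = W.T @ np.concatenate([rng.standard_normal(3), np.zeros(13)])
z = transform_sparse_code(W, x, s=3)
x_hat = W.T @ z                                     # inverse of an orthonormal transform
```

In transform learning, W itself is optimized so that training signals admit such sparse codes; the learning algorithms alternate this closed-form sparse coding step with a transform update.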
Multiscale Shrinkage and L\'evy Processes
A new shrinkage-based construction is developed for a compressible vector
$\xv$, for cases in which the components of $\xv$ are
naturally associated with a tree structure. Important examples are when $\xv$
corresponds to the coefficients of a wavelet or block-DCT representation of
data. The method we consider in detail, and for which numerical results are
presented, is based on increments of a gamma process. However, we demonstrate
that the general framework is appropriate for many other types of shrinkage
priors, all within the L\'{e}vy process family, with the gamma process a
special case. Bayesian inference is carried out by approximating the posterior
with samples from an MCMC algorithm, as well as by constructing a heuristic
variational approximation to the posterior. We also consider
expectation-maximization (EM) for a MAP (point) solution. State-of-the-art
results are manifested for compressive sensing and denoising applications, the
latter with spiky (non-Gaussian) noise.
Comment: 11 pages, 5 figures
Efficient B-mode Ultrasound Image Reconstruction from Sub-sampled RF Data using Deep Learning
In portable, three dimensional, and ultra-fast ultrasound imaging systems,
there is an increasing demand for the reconstruction of high quality images
from a limited number of radio-frequency (RF) measurements due to receiver (Rx)
or transmit (Xmit) event sub-sampling. However, due to the presence of side
lobe artifacts from RF sub-sampling, the standard beamformer often produces
blurry images with less contrast, which are unsuitable for diagnostic purposes.
Existing compressed sensing approaches often require either hardware changes or
computationally expensive algorithms, but their quality improvements are
limited. To address this problem, here we propose a novel deep learning
approach that directly interpolates the missing RF data by utilizing redundancy
in the Rx-Xmit plane. Our extensive experimental results using sub-sampled RF
data from a multi-line acquisition B-mode system confirm that the proposed
method can effectively reduce the data rate without sacrificing image quality.
Comment: The title has been changed. This version will appear in IEEE Trans.
on Medical Imaging
Robust flow field reconstruction from limited measurements via sparse representation
In many applications it is important to estimate a fluid flow field from
limited and possibly corrupt measurements. Current methods in flow estimation
often use least squares regression to reconstruct the flow field, finding the
minimum-energy solution that is consistent with the measured data. However,
this approach may be prone to overfitting and sensitive to noise. To address
these challenges we instead seek a sparse representation of the data in a
library of examples. Sparse representation has been widely used for image
recognition and reconstruction, and it is well-suited to structured data with
limited, corrupt measurements. We explore sparse representation for flow
reconstruction on a variety of fluid data sets with a wide range of complexity,
including vortex shedding past a cylinder at low Reynolds number, a mixing
layer, and two geophysical flows. In addition, we compare several measurement
strategies and consider various types of noise and corruption over a range of
intensities. We find that sparse representation yields considerably improved
estimation accuracy and robustness to noise and corruption compared with
least-squares methods. We also introduce a sparse estimation procedure on local
spatial patches for complex multiscale flows that preclude a global sparse
representation. Based on these results, sparse representation is a promising
framework for extracting useful information from complex flow fields with
realistic measurements.
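The core estimation step, seeking a sparse combination of library examples that matches limited point measurements, can be sketched with an l1-regularized solver. Here ISTA plays the role of the sparse solver and the "library" is random stand-in data; the sizes, regularization weight, and iteration count are assumptions:

```python
import numpy as np

def sparse_flow_estimate(library, mask, y, lam=0.05, n_iters=3000):
    """Find sparse coefficients s such that (library @ s)[mask] ~ y, via ISTA
    on the l1-regularized least-squares problem, then return the full field."""
    C = library[mask]                         # library rows at the measured points
    step = 1.0 / np.linalg.norm(C, 2) ** 2
    s = np.zeros(library.shape[1])
    for _ in range(n_iters):
        g = s - step * C.T @ (C @ s - y)      # gradient step on the data fit
        s = np.sign(g) * np.maximum(np.abs(g) - step * lam, 0.0)  # shrinkage
    return library @ s                        # reconstruct the whole flow field

rng = np.random.default_rng(7)
library = rng.standard_normal((200, 20))      # 20 example "flow fields" (stand-ins)
true_field = 2.0 * library[:, 4] - 1.0 * library[:, 11]   # sparse in the library
mask = rng.choice(200, size=12, replace=False)            # only 12 point sensors
y = true_field[mask]
field_hat = sparse_flow_estimate(library, mask, y)
```

With only 12 of 200 field values measured, least squares on the library would be underdetermined; the sparsity penalty selects the few library modes actually present, which is what gives the robustness reported above.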