290 research outputs found
Deep Component Analysis via Alternating Direction Neural Networks
Despite a lack of theoretical understanding, deep neural networks have
achieved unparalleled performance in a wide range of applications. On the other
hand, shallow representation learning with component analysis is associated
with rich intuition and theory, but smaller capacity often limits its
usefulness. To bridge this gap, we introduce Deep Component Analysis (DeepCA),
an expressive multilayer model formulation that enforces hierarchical structure
through constraints on latent variables in each layer. For inference, we
propose a differentiable optimization algorithm implemented using recurrent
Alternating Direction Neural Networks (ADNNs) that enable parameter learning
using standard backpropagation. By interpreting feed-forward networks as
single-iteration approximations of inference in our model, we provide both a
novel theoretical perspective for understanding them and a practical technique
for constraining predictions with prior knowledge. Experimentally, we
demonstrate performance improvements on a variety of tasks, including
single-image depth prediction with sparse output constraints
Multi-dimensional imaging data recovery via minimizing the partial sum of tubal nuclear norm
In this paper, we investigate tensor recovery problems within the tensor
singular value decomposition (t-SVD) framework. We propose the partial sum of
the tubal nuclear norm (PSTNN) of a tensor. The PSTNN is a surrogate of the
tensor tubal multi-rank. We build two PSTNN-based minimization models for two
typical tensor recovery problems, i.e., the tensor completion and the tensor
principal component analysis. We give two algorithms based on the alternating
direction method of multipliers (ADMM) to solve the proposed PSTNN-based tensor
recovery models. Experimental results on synthetic and real-world data
reveal the superiority of the proposed PSTNN
MoDL: Model Based Deep Learning Architecture for Inverse Problems
We introduce a model-based image reconstruction framework with a convolutional
neural network (CNN) based regularization prior. The proposed formulation
provides a systematic approach for deriving deep architectures for inverse
problems with arbitrary structure. Since the forward model is explicitly
accounted for, a smaller network with fewer parameters is sufficient to capture
the image information compared to black-box deep learning approaches, thus
reducing the demand for training data and training time. Since we rely on
end-to-end training, the CNN weights are customized to the forward model, thus
offering improved performance over approaches that rely on pre-trained
denoisers. The main difference of the framework from existing end-to-end
training strategies is the sharing of the network weights across iterations and
channels. Our experiments show that the decoupling of the number of iterations
from the network complexity offered by this approach provides benefits
including lower demand for training data, reduced risk of overfitting, and
implementations with significantly reduced memory footprint. We propose to
enforce data consistency by using numerical optimization blocks such as the
conjugate gradient algorithm within the network; this approach offers faster
convergence per iteration compared to methods that rely on proximal gradient
steps to enforce data consistency. Our experiments show that the faster
convergence translates to improved performance, especially when the available
GPU memory restricts the number of iterations.
Comment: published in IEEE Transaction on Medical Imagin
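The data-consistency step described in this abstract can be sketched as a small conjugate gradient solve. The snippet below is only an illustration under stated assumptions: the quadratic objective ||Ax - b||^2 + lam * ||x - z||^2, the dense matrix `A`, the names `data_consistency` and `conjugate_gradient`, and the random vector `z` standing in for a CNN output are all hypothetical and not taken from the paper's code.

```python
import numpy as np

def conjugate_gradient(apply_op, rhs, num_iters=50, tol=1e-20):
    """Solve apply_op(x) = rhs for a Hermitian positive-definite operator."""
    x = np.zeros_like(rhs)
    r = rhs - apply_op(x)          # initial residual
    p = r.copy()                   # initial search direction
    rs_old = np.vdot(r, r).real
    for _ in range(num_iters):
        if rs_old < tol:           # squared residual norm small enough
            break
        Ap = apply_op(p)
        alpha = rs_old / np.vdot(p, Ap).real
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = np.vdot(r, r).real
        p = r + (rs_new / rs_old) * p
        rs_old = rs_new
    return x

def data_consistency(A, b, z, lam):
    """Minimize ||A x - b||^2 + lam * ||x - z||^2 through its normal equations
    (A^H A + lam I) x = A^H b + lam z, solved with conjugate gradients."""
    normal_op = lambda v: A.conj().T @ (A @ v) + lam * v
    rhs = A.conj().T @ b + lam * z
    return conjugate_gradient(normal_op, rhs)

rng = np.random.default_rng(0)
A = rng.standard_normal((20, 10))   # hypothetical forward model
b = rng.standard_normal(20)         # hypothetical measurements
z = rng.standard_normal(10)         # stand-in for a denoiser output
x_cg = data_consistency(A, b, z, lam=0.5)
x_ref = np.linalg.solve(A.T @ A + 0.5 * np.eye(10), A.T @ b + 0.5 * z)
```

In this toy setting `x_cg` can be compared against the direct solve `x_ref`; in an imaging problem `A` would normally be applied as an operator rather than stored as a matrix.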
Tensor-based formulation and nuclear norm regularization for multi-energy computed tomography
The development of energy selective, photon counting X-ray detectors allows
for a wide range of new possibilities in the area of computed tomographic image
formation. Under the assumption of perfect energy resolution, here we propose a
tensor-based iterative algorithm that simultaneously reconstructs the X-ray
attenuation distribution for each energy. We use a multi-linear image model
rather than a more standard "stacked vector" representation in order to develop
novel tensor-based regularizers. Specifically, we model the multi-spectral
unknown as a 3-way tensor where the first two dimensions are space and the
third dimension is energy. This approach allows for the design of tensor
nuclear norm regularizers, which, like their two-dimensional counterpart, are
convex functions of the multi-spectral unknown. The solution to the resulting
convex optimization problem is obtained using an alternating direction method
of multipliers (ADMM) approach. Simulation results show that the generalized
tensor nuclear norm can be used as a standalone regularization technique for
the energy selective (spectral) computed tomography (CT) problem and that, when
combined with total variation regularization, it enhances the regularization
capabilities, especially for low-energy images where the effects of noise are
most prominent
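Inside an ADMM solver, a nuclear norm term is typically handled by singular value thresholding, the proximal operator of the matrix nuclear norm. The sketch below applies it to one unfolding of a space x space x energy tensor. This is an assumption made for illustration: the abstract does not say which unfoldings or weights the generalized tensor nuclear norm uses, and the names `svt` and `unfold_energy` are hypothetical.

```python
import numpy as np

def svt(M, tau):
    """Singular value thresholding: prox of tau * (nuclear norm) at M."""
    U, s, Vh = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vh   # shrink each singular value

def unfold_energy(X):
    """Reshape (rows, cols, energies) into a (rows*cols, energies) matrix."""
    return X.reshape(-1, X.shape[2])

rng = np.random.default_rng(0)
base = rng.standard_normal((8, 8))               # hypothetical spatial image
spectrum = np.array([1.0, 0.8, 0.5, 0.3])        # hypothetical energy profile
X = base[:, :, None] * spectrum[None, None, :]   # rank 1 along the energy mode
noisy = X + 0.01 * rng.standard_normal(X.shape)

denoised = svt(unfold_energy(noisy), tau=0.5).reshape(X.shape)
```

Because the energy channels of this toy tensor are scaled copies of one image, thresholding removes the small noise-driven singular values and leaves a low-rank unfolding.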
Decentralized learning for wireless communications and networking
This chapter deals with decentralized learning algorithms for in-network
processing of graph-valued data. A generic learning problem is formulated and
recast into a separable form, which is iteratively minimized using the
alternating-direction method of multipliers (ADMM) so as to gain the desired
degree of parallelization. Without exchanging elements from the distributed
training sets and keeping inter-node communications at affordable levels, the
local (per-node) learners consent to the desired quantity inferred globally,
meaning the one obtained if the entire training data set were centrally
available. The impact of the decentralized learning framework on contemporary
wireless communications and networking tasks is illustrated through case
studies including target tracking using wireless sensor networks, unveiling
Internet traffic anomalies, power system state estimation, as well as spectrum
cartography for wireless cognitive radio networks.
Comment: Contributed chapter to appear in Splitting Methods in Communication
and Imaging, Science and Engineering, R. Glowinski, S. Osher, and W. Yin,
Editors, New York, Springer, 201
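The idea of per-node learners agreeing on the centralized solution without sharing raw data can be illustrated with a global-variable consensus ADMM for least squares. This is a simplified sketch, not the chapter's algorithm: it assumes a shared consensus variable `z` averaged over all nodes, whereas a fully decentralized scheme would exchange messages only between neighbors. The function name `consensus_admm` and the data are hypothetical.

```python
import numpy as np

def consensus_admm(A_list, b_list, rho=10.0, num_iters=300):
    """Each node i holds (A_i, b_i) and minimizes 0.5 * ||A_i x_i - b_i||^2
    subject to x_i = z for all i. Only x_i and u_i are exchanged, never data."""
    n = A_list[0].shape[1]
    num_nodes = len(A_list)
    x = np.zeros((num_nodes, n))   # local estimates
    u = np.zeros((num_nodes, n))   # scaled dual variables
    z = np.zeros(n)                # consensus variable
    for _ in range(num_iters):
        for i, (A, b) in enumerate(zip(A_list, b_list)):
            # Local update: regularized least squares pulled toward z - u_i.
            lhs = A.T @ A + rho * np.eye(n)
            rhs = A.T @ b + rho * (z - u[i])
            x[i] = np.linalg.solve(lhs, rhs)
        z = (x + u).mean(axis=0)   # averaging step
        u += x - z                 # dual ascent on the consensus constraint
    return z, x

rng = np.random.default_rng(1)
x_true = np.array([1.0, -2.0, 0.5])
A_list = [rng.standard_normal((15, 3)) for _ in range(4)]
b_list = [A @ x_true + 0.1 * rng.standard_normal(15) for A in A_list]

z, x_local = consensus_admm(A_list, b_list)
x_central, *_ = np.linalg.lstsq(np.vstack(A_list), np.concatenate(b_list), rcond=None)
```

Here `x_central` is the estimate that would be obtained if all training data were pooled at one location, which is the quantity the local estimates are meant to agree on.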
Sparse Optimization Problem with s-difference Regularization
In this paper, an s-difference type regularization for sparse recovery problems
is proposed, which is the difference of the normal penalty function R(x) and
its corresponding s-truncated function R(x^s). First, we show the equivalence
conditions between the L0 constrained problem and the unconstrained
s-difference penalty regularized problem. Next, we choose the forward-backward
splitting (FBS) method to solve the nonconvex regularized problem and further
derive some closed-form solutions for the proximal mapping of the s-difference
regularization with some commonly used R(x), which makes the FBS easy and fast.
We also show that any cluster point of the sequence generated by the proposed
algorithm is a stationary point. Numerical experiments demonstrate
the efficiency of the proposed s-difference regularization in comparison with
some other existing penalty functions.
Comment: 20 pages, 5 figure
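As an illustration of forward-backward splitting with an s-difference penalty, the sketch below takes R(x) = ||x||_1, so the penalty is the L1 norm minus the sum of the s largest magnitudes. The closed-form proximal map used here (leave the s largest-magnitude entries untouched, soft-threshold the rest) is an assumption for this particular choice of R and is not quoted from the paper. The names `prox_truncated_l1` and `forward_backward`, the least-squares data term, and the test problem are hypothetical.

```python
import numpy as np

def prox_truncated_l1(y, lam, s):
    """Prox of lam * (||x||_1 - sum of the s largest |x_i|) at y."""
    x = np.sign(y) * np.maximum(np.abs(y) - lam, 0.0)   # soft-threshold all
    if s > 0:
        keep = np.argsort(np.abs(y))[-s:]               # s largest magnitudes
        x[keep] = y[keep]                               # these stay unpenalized
    return x

def forward_backward(A, b, lam, s, num_iters=500):
    """FBS for 0.5 * ||A x - b||^2 + lam * (||x||_1 - ||x^s||_1)."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2              # 1 / Lipschitz constant
    x = np.zeros(A.shape[1])
    for _ in range(num_iters):
        grad = A.T @ (A @ x - b)                        # forward (gradient) step
        x = prox_truncated_l1(x - step * grad, step * lam, s)  # backward step
    return x

rng = np.random.default_rng(2)
A = rng.standard_normal((30, 60))
x_true = np.zeros(60)
x_true[[3, 17, 42]] = [2.0, -1.5, 1.0]                  # 3-sparse signal
b = A @ x_true
x_hat = forward_backward(A, b, lam=0.1, s=3)
```

Because the penalty is nonconvex, the iterates are only expected to approach a stationary point, so the quality of `x_hat` depends on the problem and on the initialization.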
On the Convergence of ADMM with Task Adaption and Beyond
Along with the development of learning and vision, the Alternating Direction
Method of Multipliers (ADMM) has become a popular algorithm for separable
optimization models with linear constraints. However, ADMM and its numerical
variants (e.g., inexact, proximal or linearized) struggle to achieve
state-of-the-art performance when dealing with complex learning and vision
tasks due to their weak task-adaption ability. Recently, there has been an
increasing interest in incorporating task-specific computational modules (e.g.,
designed filters or learned architectures) into ADMM iterations. Unfortunately,
these task-related modules introduce uncontrolled and unstable iterative flows,
and they also break the structure of the original optimization model. Therefore,
existing theoretical investigations do not apply to the resulting
task-specific iterations. In this paper, we develop a simple and generic
proximal ADMM framework to incorporate flexible task-specific modules for
learning and vision problems. We rigorously prove the convergence both in
objective function values and the constraint violation and provide the
worst-case convergence rate measured by the iteration complexity. Our
investigations not only develop new perspectives for analyzing task-adaptive
ADMM but also supply meaningful guidelines on designing practical optimization
methods for real-world applications. Numerical experiments are conducted to
verify the theoretical results and demonstrate the efficiency of our
algorithmic framework
Convolutional Recurrent Neural Networks for Dynamic MR Image Reconstruction
Accelerating the data acquisition of dynamic magnetic resonance imaging (MRI)
leads to a challenging ill-posed inverse problem, which has received great
interest from both the signal processing and machine learning communities over
the last decades. The key to the problem is how to exploit the
temporal correlation of the MR sequence to resolve the aliasing artefact.
Traditionally, such observation led to a formulation of a non-convex
optimisation problem, which was solved using iterative algorithms. Recently,
however, deep learning-based approaches have gained significant popularity due
to their ability to solve general inversion problems. In this work, we propose a
unique, novel convolutional recurrent neural network (CRNN) architecture which
reconstructs high quality cardiac MR images from highly undersampled k-space
data by jointly exploiting the dependencies of the temporal sequences as well
as the iterative nature of the traditional optimisation algorithms. In
particular, the proposed architecture embeds the structure of the traditional
iterative algorithms, efficiently modelling the recurrence of the iterative
reconstruction stages by using recurrent hidden connections over such
iterations. In addition, spatiotemporal dependencies are simultaneously learnt
by exploiting bidirectional recurrent hidden connections across time sequences.
The proposed algorithm is able to learn both the temporal dependency and the
iterative reconstruction process effectively with only a very small number of
parameters, while outperforming current MR reconstruction methods in terms of
computational complexity, reconstruction accuracy and speed.
Comment: Published in IEEE Transactions on Medical Imagin
Harnessing Sparsity over the Continuum: Atomic Norm Minimization for Super Resolution
Convex optimization has recently emerged as a compelling framework for performing
super resolution, garnering significant attention from multiple communities
spanning signal processing, applied mathematics, and optimization. This article
offers a friendly exposition of atomic norm minimization as a canonical convex
approach to solving super resolution problems. The mathematical foundations and
performance guarantees of this approach are presented, and its application in
super resolution image reconstruction for single-molecule fluorescence
microscopy is highlighted
Proximal Alternating Direction Network: A Globally Converged Deep Unrolling Framework
Deep learning models have achieved great success in many real-world
applications. However, most existing networks are designed in a
heuristic manner and thus lack rigorous mathematical principles and
derivations. Several recent studies build deep structures by unrolling a
particular optimization model that involves task information. Unfortunately,
due to the dynamic nature of network parameters, their resultant deep
propagation networks do not possess the convergence properties of the
original optimization scheme. This paper provides a novel proximal
unrolling framework to establish deep models by integrating experimentally
verified network architectures and rich cues of the tasks. More importantly, we
prove in theory that 1) the propagation generated by our unrolled deep
model globally converges to a critical point of a given variational energy, and
2) the proposed framework is still able to learn priors from training data to
generate a convergent propagation even when task information is only partially
available. Indeed, these theoretical results are the best we can ask for,
unless stronger assumptions are enforced. Extensive experiments on various
real-world applications verify the theoretical convergence and demonstrate the
effectiveness of the designed deep models