Fast L1-Minimization Algorithm for Sparse Approximation Based on an Improved LPNN-LCA framework
The aim of sparse approximation is to estimate a sparse signal from a
measurement matrix and an observation vector. It is widely used in data
analytics, image processing, and communications. A great deal of research has
been done in this area and many off-the-shelf algorithms have been proposed,
but most of them cannot offer a real-time solution, which limits their
application prospects. To address this issue, we devise a novel sparse
approximation algorithm based on the Lagrange programming neural network
(LPNN), the locally competitive algorithm (LCA), and the projection theorem.
LPNN and LCA are both analog neural networks, which allows a real-time
solution to be obtained. The non-differentiable objective function is handled
using the concept of the LCA. Utilizing the projection theorem, we further
modify the dynamics and propose a new system with global asymptotic stability.
Simulation results show that the proposed sparse approximation method delivers
real-time solutions with satisfactory MSEs.
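For concreteness, a common form of the underlying problem and of the LCA dynamics that the abstract refers to is sketched below; the basis-pursuit constraint, the thresholding output, and the symbols ($\Phi$, $b$, $\lambda$, $\tau$) are illustrative assumptions, not necessarily the paper's exact formulation.

```latex
% Sparse approximation in basis-pursuit form (the paper may instead use a
% noise-tolerant constraint such as \|\Phi x - b\|_2 \le \epsilon):
\[
  \min_{x \in \mathbb{R}^n} \ \|x\|_1
  \quad \text{s.t.} \quad \Phi x = b .
\]
% One common form of the LCA dynamics, with internal state u and output
% a = T_\lambda(u) given by a (soft-)thresholding function:
\[
  \tau \,\dot{u}(t) \;=\; -u(t) \;+\; \Phi^{\top} b
  \;-\; \bigl(\Phi^{\top}\Phi - I\bigr)\, a(t),
  \qquad a(t) = T_\lambda\bigl(u(t)\bigr).
\]
```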
Machine learning approach to chance-constrained problems: An algorithm based on the stochastic gradient descent
We consider chance-constrained problems with a discrete random distribution
and aim at problems with a large number of scenarios. We propose a novel
method based on stochastic gradient descent which updates the decision
variable using only a few scenarios at a time, and we modify it to handle
non-separable objectives. A complexity analysis and a comparison with the
standard (batch) gradient descent method are provided. We give three examples
with non-convex data and show that our method quickly provides a good solution
even when the number of scenarios is large.
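To illustrate the kind of update the abstract describes, the sketch below takes stochastic gradient steps that look at only a few sampled scenarios per iteration. The smooth sigmoid surrogate for the violation indicator, the penalty weight rho, and all constants are assumptions for a toy linear example, not the paper's algorithm.

```python
import numpy as np

# Toy chance-constrained problem: minimize c @ x while keeping the fraction of
# violated scenarios A[i] @ x > b[i] below a tolerance eps. The 0/1 violation
# indicator is replaced by a smooth sigmoid so a stochastic gradient over a
# small batch of scenarios can be taken (illustrative sketch only).
rng = np.random.default_rng(0)
n, n_scen = 5, 10_000
c = rng.normal(size=n)
A = rng.normal(size=(n_scen, n))
b = rng.normal(size=n_scen) + 1.0
eps, rho, lr, batch = 0.05, 10.0, 1e-2, 32   # tolerance, penalty weight, step, scenarios/step

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

x = np.zeros(n)
for t in range(2000):
    idx = rng.choice(n_scen, size=batch, replace=False)  # a few scenarios per update
    s = A[idx] @ x - b[idx]                              # positive -> scenario violated
    viol = sigmoid(s / 0.1)                              # smooth surrogate of the indicator
    g = c.copy()                                         # gradient of the objective term
    if viol.mean() > eps:                                # penalty active: add its gradient
        g += rho * ((viol * (1.0 - viol) / 0.1) @ A[idx]) / batch
    x -= lr * g

print("objective:", c @ x, "violation rate:", np.mean(A @ x > b))
```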
OptNet: Differentiable Optimization as a Layer in Neural Networks
This paper presents OptNet, a network architecture that integrates
optimization problems (here, specifically in the form of quadratic programs) as
individual layers in larger end-to-end trainable deep networks. These layers
encode constraints and complex dependencies between the hidden states that
traditional convolutional and fully-connected layers often cannot capture. In
this paper, we explore the foundations for such an architecture: we show how
techniques from sensitivity analysis, bilevel optimization, and implicit
differentiation can be used to exactly differentiate through these layers and
with respect to layer parameters; we develop a highly efficient solver for
these layers that exploits fast GPU-based batch solves within a primal-dual
interior point method, and which provides backpropagation gradients with
virtually no additional cost on top of the solve; and we highlight the
application of these approaches in several problems. In one notable example, we
show that the method is capable of learning to play mini-Sudoku (4x4) given
just input and output games, with no a priori information about the rules of
the game; this highlights the ability of our architecture to learn hard
constraints better than other neural architectures. Comment: ICML 2017
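The PyTorch sketch below illustrates the core idea for the simplest case only: an equality-constrained QP whose KKT conditions reduce to a linear system, so autograd through the differentiable linear solve yields gradients with respect to all layer parameters. It is not the paper's qpth interior-point solver and omits inequality constraints.

```python
import torch

# Equality-constrained QP "layer":
#   z*(Q, p, A, b) = argmin_z 0.5 z'Qz + p'z   s.t.  A z = b
# The optimum solves the KKT linear system, so the output is a differentiable
# function of (Q, p, A, b) via autograd through torch.linalg.solve.
def eq_qp_layer(Q, p, A, b):
    n, m = Q.shape[0], A.shape[0]
    top = torch.cat([Q, A.T], dim=1)
    bot = torch.cat([A, torch.zeros(m, m)], dim=1)
    K = torch.cat([top, bot], dim=0)       # KKT matrix
    rhs = torch.cat([-p, b])
    sol = torch.linalg.solve(K, rhs)       # differentiable linear solve
    return sol[:n]                         # primal solution z*

torch.manual_seed(0)
n, m = 4, 2
L = torch.randn(n, n)
Q = (L @ L.T + torch.eye(n)).requires_grad_(True)   # positive definite
p = torch.randn(n, requires_grad=True)
A = torch.randn(m, n, requires_grad=True)
b = torch.randn(m, requires_grad=True)

z = eq_qp_layer(Q, p, A, b)
loss = z.pow(2).sum()          # any downstream loss
loss.backward()                # gradients w.r.t. all QP parameters
print(z, p.grad)
```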
Solving the L1 regularized least square problem via a box-constrained smooth minimization
In this paper, an equivalent smooth minimization for the L1 regularized least
square problem is proposed. The proposed problem is a convex box-constrained
smooth minimization which allows applying fast optimization methods to find its
solution. Further, it is shown that the property "the dual of the dual is the
primal" holds for the L1 regularized least square problem. A solver for the
smooth problem is proposed, and its affinity to the proximal gradient method
is shown. Finally, experiments on L1 and total variation regularized problems
are performed, and the corresponding results are reported. Comment: 5 pages
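As a point of reference, one classical way to turn the L1 regularized least square problem into a bound-constrained smooth problem is the positive/negative split x = u - v with u, v >= 0; the numpy sketch below solves that reformulation by projected gradient. This standard reformulation is given for illustration only and is not claimed to be the specific box-constrained formulation or solver proposed in the paper.

```python
import numpy as np

# min_x 0.5*||A x - b||^2 + lam*||x||_1  rewritten with x = u - v, u, v >= 0,
# which makes the objective smooth and the constraints simple bounds.
rng = np.random.default_rng(0)
m, n, lam = 50, 100, 0.1
A = rng.normal(size=(m, n))
x_true = np.zeros(n); x_true[:5] = rng.normal(size=5)
b = A @ x_true + 0.01 * rng.normal(size=m)

u = np.zeros(n); v = np.zeros(n)
step = 1.0 / (2.0 * np.linalg.norm(A, 2) ** 2)   # 1/L for the joint (u, v) problem
for _ in range(2000):
    r = A @ (u - v) - b
    g = A.T @ r                                  # gradient of the quadratic term
    u = np.maximum(u - step * (g + lam), 0.0)    # projected gradient step on u
    v = np.maximum(v - step * (-g + lam), 0.0)   # projected gradient step on v

x = u - v
print("nonzeros:", np.sum(np.abs(x) > 1e-4), "residual:", np.linalg.norm(A @ x - b))
```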
Projection Neural Network for a Class of Sparse Regression Problems with Cardinality Penalty
In this paper, we consider a class of sparse regression problems whose
objective function is the sum of a convex loss function and a cardinality
penalty. By constructing a smoothing function for the cardinality function, we
propose a projection neural network and design a correction method for solving
this problem. The solution of the proposed neural network is unique, globally
existent, bounded, and globally Lipschitz continuous. Moreover, we prove that
all accumulation points of the proposed neural network have a common support
set and a unified lower bound on their nonzero entries. Combining the proposed
neural network with the correction method, any corrected accumulation point is
a local minimizer of the considered sparse regression problem. We also analyze
the equivalence between the local minimizers of the considered sparse
regression problem and those of another sparse regression problem. Finally,
numerical experiments are provided to show the efficiency of the proposed
neural network in solving sparse regression problems in practice.
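In LaTeX, the problem class and the generic smoothing idea the abstract describes look as follows; the particular smoothing function $\phi_\sigma$ used in the paper is not specified here, so the surrogate below is only schematic.

```latex
% Sparse regression with a cardinality penalty (\|x\|_0 counts nonzeros):
\[
  \min_{x \in \mathbb{R}^n} \; f(x) \;+\; \lambda\,\|x\|_0 ,
  \qquad \|x\|_0 = \#\{\, i : x_i \neq 0 \,\} .
\]
% Schematic smoothed surrogate addressed by the projection neural network,
% where \phi_\sigma is a smooth approximation of the 0/1 indicator:
\[
  \min_{x \in \mathbb{R}^n} \; f(x) \;+\; \lambda \sum_{i=1}^{n} \phi_\sigma(x_i),
  \qquad \phi_\sigma(t) \;\longrightarrow\; \mathbf{1}\{t \neq 0\}
  \ \text{ as } \sigma \downarrow 0 .
\]
```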
Solving a Class of Non-Convex Min-Max Games Using Iterative First Order Methods
Recent applications arising in machine learning have spurred significant
interest in solving min-max saddle point games. This problem has been
extensively studied in the convex-concave regime, for which a global
equilibrium solution can be computed efficiently. In this paper, we study the
problem in the non-convex regime and show that an $\varepsilon$-first-order
stationary point of the game can be computed when one player's objective can
be optimized to global optimality efficiently. In particular, we first
consider the case where the objective of one of the players satisfies the
Polyak-{\L}ojasiewicz (PL) condition. For such a game, we show that a simple
multi-step gradient descent-ascent algorithm finds an
$\varepsilon$-first-order stationary point of the problem in
$\widetilde{\mathcal{O}}(\varepsilon^{-2})$ iterations. We then show that our
framework can also be applied to the case where the objective of the
"max-player" is concave. In this case, we propose a multi-step gradient
descent-ascent algorithm that finds an $\varepsilon$-first-order stationary
point of the game in $\widetilde{\mathcal{O}}(\varepsilon^{-3.5})$ iterations,
which is the best known rate in the literature. We applied our algorithm to a
fair classification problem on the Fashion-MNIST dataset and observed that the
proposed algorithm results in smoother training and better generalization.
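A minimal sketch of a multi-step gradient descent-ascent loop is given below on the toy objective $f(x,y)=\sin(x)+xy-\tfrac{1}{2}y^2$ (nonconvex in $x$, strongly concave in $y$); the step sizes and the number of inner ascent steps are illustrative choices, not the values analyzed in the paper.

```python
import numpy as np

# Multi-step GDA: several ascent steps on the max-player y, then one descent
# step on the min-player x (toy example, untuned constants).
def f_grad_x(x, y):
    return np.cos(x) + y      # df/dx for f(x, y) = sin(x) + x*y - 0.5*y**2

def f_grad_y(x, y):
    return x - y              # df/dy (strongly concave in y)

x, y = 2.0, 0.0
eta_x, eta_y, inner_steps = 0.05, 0.2, 10
for _ in range(500):
    for _ in range(inner_steps):      # multi-step ascent on y
        y += eta_y * f_grad_y(x, y)
    x -= eta_x * f_grad_x(x, y)       # single descent step on x

print("approx. stationary point:", x, y,
      "grad norms:", abs(f_grad_x(x, y)), abs(f_grad_y(x, y)))
```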
The proximal point method revisited
In this short survey, I revisit the role of the proximal point method in
large scale optimization. I focus on three recent examples: a proximally guided
subgradient method for weakly convex stochastic approximation, the prox-linear
algorithm for minimizing compositions of convex functions and smooth maps, and
Catalyst generic acceleration for regularized Empirical Risk Minimization. Comment: 11 pages, submitted to SIAG/OPT Views and News
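For reference, the basic proximal point iteration surveyed here is:

```latex
% Each step minimizes the objective plus a quadratic that keeps the new
% iterate close to the previous one.
\[
  x_{k+1} \;=\; \operatorname*{argmin}_{x}\;
  \Bigl\{\, f(x) \;+\; \tfrac{1}{2\lambda_k}\,\|x - x_k\|^2 \,\Bigr\},
  \qquad \lambda_k > 0 .
\]
```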
First-order Convergence Theory for Weakly-Convex-Weakly-Concave Min-max Problems
In this paper, we consider first-order convergence theory and algorithms for
solving a class of non-convex non-concave min-max saddle-point problems, whose
objective function is weakly convex in the variables of minimization and weakly
concave in the variables of maximization. It has many important applications in
machine learning including training Generative Adversarial Nets (GANs). We
propose an algorithmic framework motivated by the inexact proximal point
method, where the weakly monotone variational inequality (VI) corresponding to
the original min-max problem is solved through approximately solving a sequence
of strongly monotone VIs constructed by adding a strongly monotone mapping to
the original gradient mapping. We prove first-order convergence of the generic
algorithmic framework to a nearly stationary solution of the original min-max
problem and establish different rates by employing different algorithms for
solving each strongly monotone VI. Experiments verify the convergence theory
and also demonstrate the effectiveness of the proposed methods on training
GANs. Comment: In this revised version, we changed the title to "First-order
Convergence Theory for Weakly-Convex-Weakly-Concave Min-max Problems" and
added more experimental results
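In LaTeX, the construction described in the abstract reads as follows, where $F(z) = (\nabla_x f(x,y), -\nabla_y f(x,y))$ is the gradient mapping of the min-max problem, $\Omega$ its feasible set, and $\gamma > 0$ a regularization parameter; these symbols are ours and need not match the paper's notation.

```latex
% k-th subproblem: a strongly monotone VI obtained by adding a proximal term
% centered at the previous iterate z_k; it is solved only approximately.
\[
  \text{find } z_{k+1} \in \Omega \ \text{ such that } \
  \bigl\langle F(z_{k+1}) + \gamma\,(z_{k+1} - z_k),\; z - z_{k+1} \bigr\rangle
  \;\ge\; 0
  \qquad \forall\, z \in \Omega ,
\]
% with \gamma chosen large enough, relative to the weak-monotonicity modulus,
% to make the regularized operator strongly monotone.
```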
Frank-Wolfe Network: An Interpretable Deep Structure for Non-Sparse Coding
The problem of $\ell_p$-norm constrained coding is to convert a signal into a
code that lies inside an $\ell_p$-ball and most faithfully reconstructs the
signal. Previous works under the name of sparse coding considered the cases of
the $\ell_0$ and $\ell_1$ norms. The cases with other $p$ values, i.e. the
non-sparse coding studied in this paper, remain a difficulty. We propose an
interpretable deep structure, namely the Frank-Wolfe Network (F-W Net), whose
architecture is inspired by unrolling and truncating the Frank-Wolfe algorithm
for solving an $\ell_p$-norm constrained problem with $p \geq 1$. We show that
the Frank-Wolfe solver for the $\ell_p$-norm constraint leads to a novel
closed-form nonlinear unit, which is parameterized by $p$ and termed
$\mathrm{pool}_p$. The $\mathrm{pool}_p$ unit links the conventional pooling,
activation, and normalization operations, making F-W Net distinct from
existing deep networks that are either heuristically designed or converted
from projected gradient descent algorithms. We further show that the
hyper-parameter $p$ can be made learnable instead of pre-chosen in F-W Net,
which gracefully solves the non-sparse coding problem even with unknown $p$. We
evaluate the performance of F-W Net on an extensive range of simulations as
well as the task of handwritten digit recognition, where F-W Net exhibits
strong learning capability. We then propose a convolutional version of F-W Net
and apply it to image denoising and super-resolution tasks, where F-W Net
demonstrates impressive effectiveness, flexibility, and robustness. Comment:
Accepted to IEEE Transactions on Circuits and Systems for Video Technology.
Code and pretrained models: https://github.com/sunke123/FW-Ne
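The classical iteration that F-W Net unrolls and truncates can be sketched in plain numpy as below; the dictionary D, the radius r, and the fixed p are toy choices, and none of F-W Net's learned per-layer parameters appear here.

```python
import numpy as np

# Truncated Frank-Wolfe for   min_z 0.5*||x - D @ z||^2   s.t.  ||z||_p <= r,
# p > 1. The l_p-ball linear-minimization oracle has the closed form below
# (via Hoelder's inequality); in F-W Net this nonlinearity becomes a learnable
# unit parameterized by p.
def lp_ball_lmo(g, p, r):
    """argmin_{||s||_p <= r} <g, s> for p > 1 (q is the conjugate exponent)."""
    q = p / (p - 1.0)
    a = np.abs(g) ** (q - 1.0)
    denom = np.linalg.norm(g, ord=q) ** (q - 1.0) + 1e-12
    return -r * np.sign(g) * a / denom

def frank_wolfe_coding(x, D, p=1.5, r=1.0, n_iters=20):
    z = np.zeros(D.shape[1])
    for t in range(n_iters):             # fixed depth, like the unrolled network
        g = D.T @ (D @ z - x)            # gradient of the reconstruction loss
        s = lp_ball_lmo(g, p, r)
        gamma = 2.0 / (t + 2.0)          # standard Frank-Wolfe step size
        z = (1.0 - gamma) * z + gamma * s
    return z

rng = np.random.default_rng(0)
D = rng.normal(size=(32, 64))
x = rng.normal(size=32)
z = frank_wolfe_coding(x, D)
print("code norm (p=1.5):", np.linalg.norm(z, 1.5),
      "recon error:", np.linalg.norm(x - D @ z))
```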
Tradeoffs between Convergence Speed and Reconstruction Accuracy in Inverse Problems
Solving inverse problems with iterative algorithms is popular, especially for
large data. Due to time constraints, the number of possible iterations is
usually limited, potentially affecting the achievable accuracy. Given an error
one is willing to tolerate, an important question is whether it is possible to
modify the original iterations to obtain faster convergence to a minimizer
achieving the allowed error without increasing the computational cost of each
iteration considerably. Relying on recent recovery techniques developed for
settings in which the desired signal belongs to some low-dimensional set, we
show that using a coarse estimate of this set may lead to faster convergence at
the cost of an additional reconstruction error related to the accuracy of the
set approximation. Our theory connects to recent advances in sparse recovery,
compressed sensing, and deep learning. In particular, it may provide a
possible explanation for the successful approximation of the l1-minimization
solution by neural networks with layers representing iterations, as practiced
in the learned iterative shrinkage-thresholding algorithm (LISTA). Comment: To
appear in IEEE Transactions on Signal Processing
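For context, the iteration that LISTA unrolls, ISTA for l1-regularized least squares, is sketched below; in LISTA the two matrices and the threshold become learned, layer-specific parameters, whereas here they are fixed by the model.

```python
import numpy as np

# ISTA for  min_x 0.5*||A x - b||^2 + lam*||x||_1 , written in the
# "recurrent" form that LISTA turns into a fixed number of layers.
def soft_threshold(v, tau):
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def ista(A, b, lam, n_iters=20):
    L = np.linalg.norm(A, 2) ** 2              # Lipschitz constant of the gradient
    W1 = np.eye(A.shape[1]) - (A.T @ A) / L    # "recurrent" matrix
    W2 = A.T / L                               # "input" matrix
    x = np.zeros(A.shape[1])
    for _ in range(n_iters):                   # few iterations = few layers
        x = soft_threshold(W1 @ x + W2 @ b, lam / L)
    return x

rng = np.random.default_rng(0)
A = rng.normal(size=(40, 100))
x_true = np.zeros(100); x_true[:4] = 1.0
b = A @ x_true + 0.01 * rng.normal(size=40)
print("error after 20 iterations:", np.linalg.norm(ista(A, b, 0.05) - x_true))
```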