A Unified Successive Pseudo-Convex Approximation Framework
In this paper, we propose a successive pseudo-convex approximation algorithm
to efficiently compute stationary points for a large class of possibly
nonconvex optimization problems. The stationary points are obtained by solving
a sequence of successively refined approximate problems, each of which is much
easier to solve than the original problem. To achieve convergence, the
approximate problem only needs to exhibit a weak form of convexity, namely,
pseudo-convexity. We show that the proposed framework not only includes as
special cases a number of existing methods, for example, the gradient method
and the Jacobi algorithm, but also leads to new algorithms which enjoy easier
implementation and faster convergence speed. We also propose a novel line
search method for nondifferentiable optimization problems, which is carried out
over a properly constructed differentiable function with the benefit of a
simplified implementation as compared to state-of-the-art line search
techniques that directly operate on the original nondifferentiable objective
function. The advantages of the proposed algorithm are shown, both
theoretically and numerically, by several example applications, namely, MIMO
broadcast channel capacity computation, energy efficiency maximization in
massive MIMO systems and LASSO in sparse signal recovery.
Comment: submitted to IEEE Transactions on Signal Processing; original title:
A Novel Iterative Convex Approximation Metho
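The abstract above names the gradient method as one special case of the framework. The following is a minimal sketch of that special case only, not the paper's algorithm: the surrogate is assumed to be a quadratic model with a hypothetical curvature constant `c`, the step size comes from a standard Armijo backtracking rule, and the toy nonconvex objective `f` is chosen purely for illustration.

```python
import numpy as np

def f(x):
    # Toy nonconvex objective (illustrative assumption): stationary points at 0 and +/-1.
    return float(np.sum(x**4 - 2.0 * x**2))

def grad_f(x):
    return 4.0 * x**3 - 4.0 * x

def successive_approximation(x0, c=10.0, tol=1e-6, max_iter=1000):
    """Minimize a convex surrogate at each iterate, then line-search towards its minimizer."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad_f(x)
        if np.linalg.norm(g) < tol:
            break
        # Minimizer of the surrogate f(x_t) + g^T (y - x_t) + (c/2) * ||y - x_t||^2.
        x_hat = x - g / c
        d = x_hat - x  # descent direction of the original objective
        # Armijo backtracking on the step size gamma in (0, 1].
        gamma = 1.0
        while f(x + gamma * d) > f(x) + 1e-4 * gamma * g.dot(d) and gamma > 1e-12:
            gamma *= 0.5
        x = x + gamma * d
    return x

x_star = successive_approximation([0.5, -2.0])
print(x_star, np.linalg.norm(grad_f(x_star)))
```

With this quadratic surrogate the update reduces to a gradient step of length `gamma / c`; other surrogates would give other algorithms under the same outer loop.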
A fast and recursive algorithm for clustering large datasets with k-medians
Clustering large samples of high-dimensional data with fast algorithms is an
important challenge in computational statistics. Borrowing ideas from MacQueen
(1967), who introduced a sequential version of the k-means algorithm, a new
class of recursive stochastic gradient algorithms designed for the k-medians
loss criterion is proposed. By their recursive nature, these algorithms are
very fast and are well adapted to deal with large samples of data that are
allowed to arrive sequentially. It is proved that the stochastic gradient
algorithm converges almost surely to the set of stationary points of the
underlying loss criterion. Particular attention is paid to the averaged
versions, which are known to perform better, and a data-driven
procedure that allows automatic selection of the value of the descent step is
proposed.
The performance of the averaged sequential estimator is compared in a
simulation study, both in terms of computation speed and accuracy of the
estimates, with more classical partitioning techniques such as k-means,
trimmed k-means and PAM (partitioning around medoids). Finally, this new
online clustering technique is illustrated on determining television audience
profiles with a sample of more than 5000 individual television audiences
measured every minute over a period of 24 hours.
Comment: Under revision for Computational Statistics and Data Analysi
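The recursive update described in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions rather than the authors' exact procedure: the function name `online_k_medians`, the step schedule `step / n**decay`, and the plain running average of the iterates are all hypothetical choices, and the data-driven step selection mentioned in the abstract is not reproduced.

```python
import numpy as np

def online_k_medians(data, init_centers, step=1.0, decay=0.75):
    """One pass of a stochastic gradient k-medians update, with averaged iterates."""
    centers = np.array(init_centers, dtype=float)
    averaged = centers.copy()
    for n, x in enumerate(data, start=1):
        # Assign the new observation to its nearest current center.
        dists = np.linalg.norm(centers - x, axis=1)
        j = int(np.argmin(dists))
        if dists[j] > 0:
            # Gradient of ||x - c|| with respect to c is -(x - c) / ||x - c||,
            # so the center moves a distance gamma towards x.
            gamma = step / n**decay  # assumed step schedule
            centers[j] += gamma * (x - centers[j]) / dists[j]
        # Running mean of all iterates so far (the averaged estimator).
        averaged += (centers - averaged) / n
    return centers, averaged

rng = np.random.default_rng(0)
sample = rng.permutation(np.vstack([
    rng.normal(0.0, 1.0, size=(2000, 2)),
    rng.normal(10.0, 1.0, size=(2000, 2)),
]))
final, avg = online_k_medians(sample, [[1.0, 1.0], [9.0, 9.0]])
print(avg)
```

Each observation is touched once and then discarded, which is what makes this kind of update suited to data arriving sequentially.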
Visualising Basins of Attraction for the Cross-Entropy and the Squared Error Neural Network Loss Functions
Quantification of the stationary points and the associated basins of
attraction of neural network loss surfaces is an important step towards a
better understanding of neural network loss surfaces at large. This work
proposes a novel method to visualise basins of attraction together with the
associated stationary points via gradient-based random sampling. The proposed
technique is used to perform an empirical study of the loss surfaces generated
by two different error metrics: quadratic loss and entropic loss. The empirical
observations confirm the theoretical hypothesis regarding the nature of neural
network attraction basins. Entropic loss is shown to exhibit stronger gradients
and fewer stationary points than quadratic loss, indicating that entropic loss
has a more searchable landscape. Quadratic loss is shown to be more resilient
to overfitting than entropic loss. Both losses are shown to exhibit local
minima, but the number of local minima is shown to decrease with an increase in
dimensionality. Thus, the proposed visualisation technique successfully
captures the local minima properties exhibited by the neural network loss
surfaces, and can be used for the purpose of fitness landscape analysis of
neural networks.
Comment: Preprint submitted to the Neural Networks journa
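The abstract reports stronger gradients for entropic loss as an empirical finding. One standard calculation consistent with that observation, though not taken from the paper, concerns a single sigmoid output unit: the squared-error gradient with respect to the pre-activation carries an extra factor p(1 - p) <= 1/4 that the cross-entropy gradient lacks. The sketch below assumes the 1/2-scaled squared error and a scalar pre-activation `z`; the function names are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def grad_squared_error(z, y):
    # d/dz of 0.5 * (sigmoid(z) - y)^2
    p = sigmoid(z)
    return (p - y) * p * (1.0 - p)

def grad_cross_entropy(z, y):
    # d/dz of -[y * log(p) + (1 - y) * log(1 - p)] with p = sigmoid(z)
    return sigmoid(z) - y

z = np.linspace(-6.0, 6.0, 7)
for zi in z:
    print(f"z={zi:+.1f}  squared={grad_squared_error(zi, 1.0):+.5f}  "
          f"entropic={grad_cross_entropy(zi, 1.0):+.5f}")
```

For a badly misclassified input (large negative `z` with target 1), the squared-error gradient vanishes while the cross-entropy gradient approaches -1.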