Inexact Alternating Optimization for Phase Retrieval in the Presence of Outliers
Phase retrieval has been mainly considered in the presence of Gaussian noise.
However, the performance of the algorithms proposed under the Gaussian noise
model severely degrades when grossly corrupted data, i.e., outliers, exist.
This paper investigates techniques for phase retrieval in the presence of
heavy-tailed noise -- which is considered a better model for situations where
outliers exist. An $\ell_p$-norm based estimator is proposed for
fending against such noise, and two-block inexact alternating optimization is
proposed as the algorithmic framework to tackle the resulting optimization
problem. Two specific algorithms are devised by exploring different local
approximations within this framework. Interestingly, the core conditional
minimization steps can be interpreted as iteratively reweighted least squares
and gradient descent. Convergence properties of the algorithms are discussed,
and the Cram\'er-Rao bound (CRB) is derived. Simulations demonstrate that the
proposed algorithms approach the CRB and outperform state-of-the-art algorithms
in heavy-tailed noise.
Comment: 23 pages, 16 figures
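The iteratively reweighted least squares interpretation of the conditional minimization step can be sketched in a real-valued toy setting. This is a hedged reconstruction, not the authors' exact algorithm: the damping constant `eps`, the least-squares initialization, and the default `p` are assumptions.

```python
import numpy as np

def irls_phase_retrieval(A, y, p=1.0, iters=50, eps=1e-6):
    """Iteratively reweighted least squares for the amplitude-based l_p fit
    min_x sum_i | y_i - |a_i^T x| |^p   (real-valued sketch)."""
    x = np.linalg.lstsq(A, y, rcond=None)[0]   # crude initialization (assumption)
    for _ in range(iters):
        z = A @ x
        r = y - np.abs(z)                       # amplitude residuals
        w = (np.abs(r) + eps) ** (p - 2)        # IRLS weights, damped by eps
        b = np.sign(z) * y                      # freeze the current sign pattern
        Aw = A * w[:, None]                     # row-weighted design matrix
        # weighted normal equations: (A^T W A) x = A^T W b
        x = np.linalg.solve(A.T @ Aw, Aw.T @ b)
    return x
```

Each pass alternates a sign (phase) update with a weighted least-squares solve, which is exactly the "iteratively reweighted least squares" reading of the two-block scheme.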
Relax-and-split method for nonsmooth nonconvex problems
We develop and analyze a new `relax-and-split' (RS) approach for compositions
of separable nonconvex nonsmooth functions with linear maps. RS uses a
relaxation technique together with partial minimization, and brings classic
techniques including direct factorization, matrix decompositions, and fast
iterative methods to bear on nonsmooth nonconvex problems. We also extend the
approach to trimmed nonconvex-composite formulations; the resulting Trimmed RS
(TRS) can fit models while detecting outliers in the data.
We then test RS and TRS on a diverse set of applications: (1) phase
retrieval, (2) stochastic shortest path problems, (3) semi-supervised
classification, and (4) new clustering approaches. RS and TRS can be applied
to models with very weak functional assumptions, are easy to implement, are
competitive with existing methods, and enable new modeling formulations to be
put forward to address emerging challenges in the mathematical sciences.
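A minimal sketch of the relax-and-split idea, assuming an $\ell_0$-regularized least-squares model as one instance of a separable nonconvex nonsmooth term: the nonsmooth term moves onto a relaxed copy `w` coupled by a quadratic penalty, the `x`-update is a direct linear solve, and the `w`-update is a hard-thresholding prox. Parameter names and defaults are illustrative.

```python
import numpy as np

def relax_and_split_l0(A, b, lam=0.1, nu=0.5, iters=100):
    """Relax-and-split sketch for  min_x (1/2)||Ax-b||^2 + lam*||x||_0.
    Relaxation: min_{x,w} (1/2)||Ax-b||^2 + (1/2nu)||x-w||^2 + lam*||w||_0."""
    n = A.shape[1]
    w = np.zeros(n)
    H = A.T @ A + np.eye(n) / nu        # x-update system (could be factored once)
    Atb = A.T @ b
    thresh = np.sqrt(2 * lam * nu)      # hard-threshold level for prox of nu*lam*||.||_0
    for _ in range(iters):
        x = np.linalg.solve(H, Atb + w / nu)          # partial minimization in x
        w = np.where(np.abs(x) > thresh, x, 0.0)      # prox of the l0 term
    return w
```

The `x`-update is a plain linear solve, which is how RS "brings direct factorization and fast iterative methods to bear" on a nonsmooth nonconvex problem.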
Convex-Concave Backtracking for Inertial Bregman Proximal Gradient Algorithms in Non-Convex Optimization
Backtracking line-search is an old yet powerful strategy for finding better
step sizes to use in proximal gradient algorithms. The main principle is to
locally find a simple convex upper bound of the objective function, which in
turn controls the step size that is used. In the case of inertial proximal gradient
algorithms, the situation becomes much more difficult and usually leads to very
restrictive rules on the extrapolation parameter. In this paper, we show that
the extrapolation parameter can also be controlled by locally finding a simple
concave lower bound of the objective function. This gives rise to a double
convex-concave backtracking procedure which allows for an adaptive choice of
both the step size and extrapolation parameters. We apply this procedure to the
class of inertial Bregman proximal gradient methods, and prove that any
sequence generated by these algorithms converges globally to a critical point
of the function at hand. Numerical experiments on a number of challenging
non-convex problems in image processing and machine learning demonstrate the
power of combining the inertial step with the double backtracking strategy in
achieving improved performance.
Comment: 29 pages
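The classic upper-bound backtracking that this work builds on can be sketched as follows; the concave lower-bound search that controls the inertial parameter is the paper's addition and is not reproduced here. All names are illustrative.

```python
import numpy as np

def prox_grad_backtracking(grad_f, f, prox_g, x0, L0=1.0, eta=2.0, iters=100):
    """Proximal gradient with classic backtracking: at each step, increase L
    until the quadratic model majorizes f at the trial point (descent-lemma
    check), i.e., a simple convex upper bound of f is found locally."""
    x, L = x0.copy(), L0
    for _ in range(iters):
        g = grad_f(x)
        while True:
            z = prox_g(x - g / L, 1.0 / L)
            d = z - x
            # accept when f(z) <= f(x) + <g, d> + (L/2)||d||^2
            if f(z) <= f(x) + g @ d + 0.5 * L * (d @ d) + 1e-12:
                break
            L *= eta                      # upper bound too loose: tighten
        x = z
    return x
```

With `g = 0` (identity prox) this reduces to gradient descent with an adaptively estimated local Lipschitz constant.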
Level-set methods for convex optimization
Convex optimization problems arising in applications often have favorable
objective functions and complicated constraints, thereby precluding first-order
methods from being immediately applicable. We describe an approach that
exchanges the roles of the objective and constraint functions, and instead
approximately solves a sequence of parametric level-set problems. A
zero-finding procedure, based on inexact function evaluations and possibly
inexact derivative information, leads to an efficient solution scheme for the
original problem. We describe the theoretical and practical properties of this
approach for a broad range of problems, including low-rank semidefinite
optimization, sparse optimization, and generalized linear models for inference.
Comment: 38 pages
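An illustrative level-set loop, assuming a basis-pursuit-denoise-style pair (minimize the $\ell_1$ norm subject to a residual bound) and plain bisection for the zero-finding step. The inner projected-gradient solver and all tolerances are assumptions, not the paper's scheme.

```python
import numpy as np

def project_l1(v, tau):
    """Euclidean projection onto the l1 ball of radius tau."""
    if tau <= 0:
        return np.zeros_like(v)
    if np.abs(v).sum() <= tau:
        return v
    u = np.sort(np.abs(v))[::-1]
    css = np.cumsum(u)
    k = np.nonzero(u > (css - tau) / np.arange(1, len(u) + 1))[0][-1]
    theta = (css[k] - tau) / (k + 1)
    return np.sign(v) * np.maximum(np.abs(v) - theta, 0.0)

def value_fn(A, b, tau, iters=300):
    """phi(tau) = min ||Ax-b||_2 over ||x||_1 <= tau, by projected gradient."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        x = project_l1(x - A.T @ (A @ x - b) / L, tau)
    return np.linalg.norm(A @ x - b)

def level_set_bpdn(A, b, sigma, tol=1e-3):
    """Roles exchanged: solve  min ||x||_1 s.t. ||Ax-b|| <= sigma  by
    root-finding phi(tau) = sigma over the constraint level tau."""
    lo = 0.0
    hi = np.abs(np.linalg.lstsq(A, b, rcond=None)[0]).sum()
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if value_fn(A, b, mid) > sigma:
            lo = mid        # tau too small: residual still above sigma
        else:
            hi = mid        # feasible: shrink tau
    return hi
```

Bisection only needs inexact evaluations of `value_fn`, which is the point of the approach; derivative information would allow a faster (Newton-type) zero-finder.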
Defending Against Saddle Point Attack in Byzantine-Robust Distributed Learning
We study robust distributed learning that involves minimizing a non-convex
loss function with saddle points. We consider the Byzantine setting where some
worker machines have abnormal or even arbitrary and adversarial behavior. In
this setting, the Byzantine machines may create fake local minima near a saddle
point that is far away from any true local minimum, even when robust gradient
estimators are used. We develop ByzantinePGD, a robust first-order algorithm
that can provably escape saddle points and fake local minima, and converge to
an approximate true local minimizer with low iteration complexity. As a
by-product, we give a simpler algorithm and analysis for escaping saddle points
in the usual non-Byzantine setting. We further discuss three robust gradient
estimators that can be used in ByzantinePGD, including median, trimmed mean,
and iterative filtering. We characterize their performance in concrete
statistical settings, and argue for their near-optimality in low and high
dimensional regimes.
Comment: ICML 201
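Two of the robust gradient estimators mentioned, coordinate-wise median and trimmed mean, can be sketched directly (the iterative filtering estimator is more involved and omitted here):

```python
import numpy as np

def coordinate_median(grads):
    """Coordinate-wise median of worker gradients (rows = workers)."""
    return np.median(grads, axis=0)

def trimmed_mean(grads, beta):
    """Coordinate-wise beta-trimmed mean: in each coordinate, drop the
    beta-fraction largest and smallest reported values, average the rest."""
    m = grads.shape[0]
    k = int(beta * m)
    s = np.sort(grads, axis=0)
    return s[k:m - k].mean(axis=0)
```

If the fraction of Byzantine workers is below the trimming level, their arbitrary reports fall in the discarded tails and the aggregate stays near the honest gradient.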
Nonconvex Optimization Meets Low-Rank Matrix Factorization: An Overview
Substantial progress has been made recently on developing provably accurate
and efficient algorithms for low-rank matrix factorization via nonconvex
optimization. While conventional wisdom often takes a dim view of nonconvex
optimization algorithms due to their susceptibility to spurious local minima,
simple iterative methods such as gradient descent have been remarkably
successful in practice. The theoretical footings, however, had been largely
lacking until recently.
In this tutorial-style overview, we highlight the important role of
statistical models in enabling efficient nonconvex optimization with
performance guarantees. We review two contrasting approaches: (1) two-stage
algorithms, which consist of a tailored initialization step followed by
successive refinement; and (2) global landscape analysis and
initialization-free algorithms. Several canonical matrix factorization problems
are discussed, including but not limited to matrix sensing, phase retrieval,
matrix completion, blind deconvolution, robust principal component analysis,
phase synchronization, and joint alignment. Special care is taken to illustrate
the key technical insights underlying their analyses. This article serves as a
testament that the integrated consideration of optimization and statistics
leads to fruitful research findings.
Comment: Invited overview article
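The initialization step of a two-stage algorithm can be illustrated with the standard spectral initialization for phase retrieval. This is a generic textbook sketch under a Gaussian measurement assumption, not the method of any one paper surveyed.

```python
import numpy as np

def spectral_init(A, y):
    """Spectral initialization sketch for phase retrieval: the leading
    eigenvector of  Y = (1/m) sum_i y_i a_i a_i^T  correlates with the
    signal when y_i = |<a_i, x>|^2 and the rows a_i are Gaussian."""
    m = A.shape[0]
    Y = (A * y[:, None]).T @ A / m        # (1/m) sum_i y_i a_i a_i^T
    vals, vecs = np.linalg.eigh(Y)
    v = vecs[:, -1]                       # top eigenvector
    scale = np.sqrt(y.mean())             # E[y_i] = ||x||^2 for Gaussian a_i
    return scale * v
```

Stage two would then refine this estimate by successive (gradient-type) updates, as the abstract describes.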
Accurate and Scalable Image Clustering Based On Sparse Representation of Camera Fingerprint
Clustering images according to their acquisition devices is a well-known
problem in multimedia forensics, which is typically faced by means of camera
Sensor Pattern Noise (SPN). The task is challenging since SPN is a
noise-like signal that is hard to estimate and easy to attenuate or destroy
by many factors. Moreover, the high dimensionality of SPN hinders large-scale
applications. Existing approaches are typically based on the correlation among
SPNs in the pixel domain, which might not capture the intrinsic data
structure in a union of vector subspaces. In this paper, we propose an accurate
clustering framework, which exploits linear dependencies among SPNs in their
intrinsic vector subspaces. Such dependencies are encoded under sparse
representations which are obtained by solving a LASSO problem with a
non-negativity constraint. The proposed framework is highly accurate in
estimating the number of clusters and in associating images. Moreover, our framework is
scalable to the number of images and robust against double JPEG compression as
well as the presence of outliers, showing great potential for real-world
applications. Experimental results on the Dresden and Vision databases show
that our proposed framework adapts well to both medium-scale and large-scale
contexts, and outperforms state-of-the-art methods.
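The non-negative LASSO encoding step can be sketched with a projected ISTA iteration; the solver choice and parameters are illustrative, not the paper's implementation. Here `D` would hold the other fingerprints as columns and `y` the fingerprint being encoded.

```python
import numpy as np

def nonneg_lasso(D, y, lam=0.1, iters=500):
    """Projected ISTA sketch for the non-negative LASSO
    min_c  0.5*||y - D c||^2 + lam*sum(c)   s.t.  c >= 0."""
    L = np.linalg.norm(D, 2) ** 2          # Lipschitz constant of the gradient
    c = np.zeros(D.shape[1])
    for _ in range(iters):
        grad = D.T @ (D @ c - y)
        # gradient step, l1 shrinkage (lam acts linearly on c >= 0), projection
        c = np.maximum(c - (grad + lam) / L, 0.0)
    return c
```

Because the coefficients are non-negative, the $\ell_1$ penalty reduces to a linear term, so the prox step is just a shift followed by clipping at zero.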
Solving Systems of Random Quadratic Equations via Truncated Amplitude Flow
This paper presents a new algorithm, termed \emph{truncated amplitude flow}
(TAF), to recover an unknown vector $x$ from a system of quadratic
equations of the form $y_i = |\langle a_i, x\rangle|^2$, where
the $a_i$'s are given random measurement vectors. This problem is known to be
\emph{NP-hard} in general. We prove that as soon as the number of equations is
on the order of the number of unknowns, TAF recovers the solution exactly (up
to a global unimodular constant) with high probability and complexity growing
linearly with both the number of unknowns and the number of equations. Our TAF
approach adopts the \emph{amplitude-based} empirical loss function, and
proceeds in two stages. In the first stage, we introduce an
\emph{orthogonality-promoting} initialization that can be obtained with a few
power iterations. Stage two refines the initial estimate by successive updates
of scalable \emph{truncated generalized gradient iterations}, which are able to
handle the rather challenging nonconvex and nonsmooth amplitude-based objective
function. In particular, when the vectors $x$ and $a_i$'s are
real-valued, our gradient truncation rule provably eliminates erroneously
estimated signs with high probability to markedly improve upon its untruncated
version. Numerical tests using synthetic data and real images demonstrate that
our initialization returns more accurate and robust estimates relative to
spectral initializations. Furthermore, even under the same initialization, the
proposed amplitude-based refinement outperforms existing Wirtinger flow
variants, corroborating the superior performance of TAF over state-of-the-art
algorithms.
Comment: 37 pages, 16 figures
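The stage-two truncated generalized gradient iteration can be sketched in the real-valued case. The truncation threshold `gamma`, the step size `mu`, and the assumption of a given initial estimate `z0` follow the general description above, but the exact values are illustrative.

```python
import numpy as np

def taf_refine(A, y, z0, gamma=0.7, mu=0.6, iters=200):
    """Truncated generalized gradient sketch (real case), with amplitudes
    y_i = |a_i^T x|. Components whose current magnitude |a_i^T z| is small
    relative to y_i are dropped, since their sign estimate sign(a_i^T z)
    is likely erroneous."""
    m = len(y)
    z = z0.copy()
    for _ in range(iters):
        az = A @ z
        keep = np.abs(az) >= y / (1.0 + gamma)     # truncation rule
        # generalized gradient of the amplitude-based loss on kept indices
        grad = A[keep].T @ (az[keep] - y[keep] * np.sign(az[keep])) / m
        z = z - mu * grad
    return z
```

Dropping the unreliable components is what lets the iteration cope with the nonconvex, nonsmooth amplitude-based objective.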
Amplitude Retrieval for Channel Estimation of MIMO Systems with One-Bit ADCs
This letter revisits the channel estimation problem for MIMO systems with
one-bit analog-to-digital converters (ADCs) through a novel
algorithm--Amplitude Retrieval (AR). Unlike the state-of-the-art methods such
as those based on one-bit compressive sensing, AR takes a different approach.
It accounts for the lost amplitudes of the one-bit quantized measurements, and
performs channel estimation and amplitude completion jointly. This way, the
direction information of the propagation paths can be estimated via accurate
direction finding algorithms in array processing, e.g., maximum likelihood. The
upshot is that AR is able to handle off-grid angles and provide more accurate
channel estimates. Simulation results are included to showcase the advantages
of AR.
Robust Wavefield Inversion via Phase Retrieval
An extended formulation of Full Waveform Inversion (FWI), called Wavefield
Reconstruction Inversion (WRI), offers the potential benefit of decreasing the
nonlinearity of the inverse problem by replacing the explicit inverse of the
ill-conditioned wave-equation operator of classical FWI (the oscillating Green
functions) with a suitably defined data-driven regularized inverse. This
regularization relaxes the wave-equation constraint to reconstruct wavefields
that match the data, hence mitigating the risk of cycle skipping. The
subsurface model parameters are then updated in a direction that reduces these
constraint violations. However, in the case of a rough initial model, the phase
errors in the reconstructed wavefields may trap the waveform inversion in a
local minimum leading to inaccurate subsurface models. In this paper, in order
to avoid matching such incorrect phase information during the early WRI
iterations, we design a new cost function based upon phase retrieval, namely a
process which seeks to reconstruct a signal from the amplitude of linear
measurements. This new formulation, called Wavefield Inversion with Phase
Retrieval (WIPR), further improves the robustness of the parameter estimation
subproblem by a suitable phase correction. We implement the resulting WIPR
problem with an alternating-direction approach, which combines the
Majorization-Minimization (MM) algorithm to linearise the phase-retrieval term
and a variable splitting technique based upon the alternating direction method
of multipliers (ADMM). This new workflow equipped with Tikhonov-total variation
(TT) regularization, which is the combination of second-order Tikhonov and
total variation regularizations and bound constraints, successfully
reconstructs the 2004 BP salt model from a sparse fixed-spread acquisition
using a 3~Hz starting frequency and a homogeneous initial velocity model.
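The MM linearization of the phase-retrieval term can be illustrated on a real-valued toy problem, with a plain least-squares inner solve standing in for the full ADMM workflow with TT regularization; the operator name `L` and all defaults are illustrative.

```python
import numpy as np

def mm_phase_fit(L, d, u0, iters=50):
    """Majorization-minimization sketch for  min_u || |L u| - d ||^2.
    Since || L u - d*sign(L u_k) ||^2 majorizes the amplitude misfit
    (with equality at u_k), freezing the sign/phase at the current iterate
    and solving the resulting linear least-squares problem yields a
    monotonically decreasing objective."""
    u = u0.astype(float).copy()
    for _ in range(iters):
        b = d * np.sign(L @ u)                 # freeze the current signs
        u = np.linalg.lstsq(L, b, rcond=None)[0]
    return u
```

Matching only amplitudes in this way is what lets the early iterations ignore incorrect phase information from a rough initial model.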