186 research outputs found
Inexact Alternating Optimization for Phase Retrieval In the Presence of Outliers
Phase retrieval has been mainly considered in the presence of Gaussian noise.
However, the performance of the algorithms proposed under the Gaussian noise
model severely degrades when grossly corrupted data, i.e., outliers, exist.
This paper investigates techniques for phase retrieval in the presence of
heavy-tailed noise -- which is considered a better model for situations where
outliers exist. An $\ell_p$-norm ($0 < p < 2$) based estimator is proposed for
defending against such noise, and two-block inexact alternating optimization is
proposed as the algorithmic framework to tackle the resulting optimization
problem. Two specific algorithms are devised by exploring different local
approximations within this framework. Interestingly, the core conditional
minimization steps can be interpreted as iteratively reweighted least squares
and gradient descent. Convergence properties of the algorithms are discussed,
and the Cram\'er-Rao bound (CRB) is derived. Simulations demonstrate that the
proposed algorithms approach the CRB and outperform state-of-the-art algorithms
in heavy-tailed noise.
Comment: 23 pages, 16 figures
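The iteratively reweighted least squares (IRLS) interpretation mentioned above can be illustrated on a toy problem. The sketch below is not the paper's two-block algorithm, and it fits a scalar regression rather than a phase-retrieval model; the function name `irls_lp`, the data, and all parameter values are illustrative. It minimizes the robust loss $\sum_i |a_i x - y_i|^p$ by repeatedly solving a weighted least-squares problem with weights $w_i = |r_i|^{p-2}$, which damps the influence of outliers.

```python
# Toy IRLS for the scalar l_p fit  min_x  sum_i |a_i*x - y_i|^p,  0 < p < 2.
# Each iteration solves a weighted least-squares problem whose weights
# w_i = |r_i|^(p-2) shrink the influence of large residuals (outliers).

def irls_lp(a, y, p=1.0, iters=50, eps=1e-8):
    # Plain least-squares initialization.
    x = sum(ai * yi for ai, yi in zip(a, y)) / sum(ai * ai for ai in a)
    for _ in range(iters):
        w = [max(abs(ai * x - yi), eps) ** (p - 2) for ai, yi in zip(a, y)]
        x = (sum(wi * ai * yi for wi, ai, yi in zip(w, a, y))
             / sum(wi * ai * ai for wi, ai in zip(w, a)))
    return x

a = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 6.0, 8.0, 50.0]   # last sample is a gross outlier
print(irls_lp(a, y, p=1.0))      # ≈ 2.0, robust to the outlier
```

The plain least-squares fit on the same data is pulled to roughly 5.6 by the single outlier, while the $\ell_1$ (p = 1) IRLS fit stays at the true slope 2.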
Defending Against Saddle Point Attack in Byzantine-Robust Distributed Learning
We study robust distributed learning that involves minimizing a non-convex
loss function with saddle points. We consider the Byzantine setting where some
worker machines have abnormal or even arbitrary and adversarial behavior. In
this setting, the Byzantine machines may create fake local minima near a saddle
point that is far away from any true local minimum, even when robust gradient
estimators are used. We develop ByzantinePGD, a robust first-order algorithm
that can provably escape saddle points and fake local minima, and converge to
an approximate true local minimizer with low iteration complexity. As a
by-product, we give a simpler algorithm and analysis for escaping saddle points
in the usual non-Byzantine setting. We further discuss three robust gradient
estimators that can be used in ByzantinePGD, including median, trimmed mean,
and iterative filtering. We characterize their performance in concrete
statistical settings, and argue for their near-optimality in low and high
dimensional regimes.
Comment: ICML 201
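Two of the robust gradient estimators discussed above, coordinate-wise median and coordinate-wise trimmed mean, admit very short sketches. The plain-Python versions below are illustrative (function names and the toy gradients are invented here, and iterative filtering is omitted); they show how a single Byzantine worker's arbitrarily bad gradient is neutralized.

```python
# Coordinate-wise median of a list of worker gradients (lists of floats).
def coordwise_median(grads):
    out = []
    for j in range(len(grads[0])):
        col = sorted(g[j] for g in grads)
        m = len(col)
        out.append(col[m // 2] if m % 2 else 0.5 * (col[m // 2 - 1] + col[m // 2]))
    return out

# Coordinate-wise trimmed mean: drop the beta-fraction smallest and largest
# entries of each coordinate before averaging.
def trimmed_mean(grads, beta):
    k = int(beta * len(grads))
    out = []
    for j in range(len(grads[0])):
        col = sorted(g[j] for g in grads)[k:len(grads) - k]
        out.append(sum(col) / len(col))
    return out

# Five workers, one Byzantine (sends a huge adversarial gradient).
grads = [[1.0, -1.0], [1.1, -0.9], [0.9, -1.1], [1.0, -1.0], [100.0, 100.0]]
print(coordwise_median(grads))   # [1.0, -1.0]
print(trimmed_mean(grads, 0.2))  # ≈ [1.03, -0.97], Byzantine entry trimmed
```

Both aggregators return something close to the honest workers' consensus gradient even though one fifth of the inputs is adversarial.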
Convex-Concave Backtracking for Inertial Bregman Proximal Gradient Algorithms in Non-Convex Optimization
Backtracking line-search is an old yet powerful strategy for finding better
step sizes to be used in proximal gradient algorithms. The main principle is to
locally find a simple convex upper bound of the objective function, which in
turn controls the step size that is used. In the case of inertial proximal gradient
algorithms, the situation becomes much more difficult and usually leads to very
restrictive rules on the extrapolation parameter. In this paper, we show that
the extrapolation parameter can also be controlled by locally finding a simple
concave lower bound of the objective function. This gives rise to a double
convex-concave backtracking procedure which allows for an adaptive choice of
both the step size and extrapolation parameters. We apply this procedure to the
class of inertial Bregman proximal gradient methods, and prove that any
sequence generated by these algorithms converges globally to a critical point
of the function at hand. Numerical experiments on a number of challenging
non-convex problems in image processing and machine learning were conducted and
show the power of combining the inertial step and the double backtracking strategy in
achieving improved performance.
Comment: 29 pages
Nonconvex Optimization Meets Low-Rank Matrix Factorization: An Overview
Substantial progress has been made recently on developing provably accurate
and efficient algorithms for low-rank matrix factorization via nonconvex
optimization. While conventional wisdom often takes a dim view of nonconvex
optimization algorithms due to their susceptibility to spurious local minima,
simple iterative methods such as gradient descent have been remarkably
successful in practice. The theoretical footings, however, had been largely
lacking until recently.
In this tutorial-style overview, we highlight the important role of
statistical models in enabling efficient nonconvex optimization with
performance guarantees. We review two contrasting approaches: (1) two-stage
algorithms, which consist of a tailored initialization step followed by
successive refinement; and (2) global landscape analysis and
initialization-free algorithms. Several canonical matrix factorization problems
are discussed, including but not limited to matrix sensing, phase retrieval,
matrix completion, blind deconvolution, robust principal component analysis,
phase synchronization, and joint alignment. Special care is taken to illustrate
the key technical insights underlying their analyses. This article serves as a
testament that the integrated consideration of optimization and statistics
leads to fruitful research findings.
Comment: Invited overview article
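A minimal instance of the "simple iterative methods" the overview surveys is gradient descent on a rank-one factorization loss. The sketch below is illustrative only (toy sizes, a cheap initialization rather than the tailored spectral ones discussed above): it minimizes $f(u) = \|uu^\top - xx^\top\|_F^2$ with the gradient $\nabla f(u) = 4(uu^\top - xx^\top)u$, and recovers the planted factor up to global sign.

```python
# Gradient descent on f(u) = ||u u^T - x x^T||_F^2 for a planted rank-1
# matrix M = x x^T, without ever forming the matrices explicitly.

def matvec(u, x, v):
    # Computes (u u^T - x x^T) v via two inner products.
    uv = sum(a * b for a, b in zip(u, v))
    xv = sum(a * b for a, b in zip(x, v))
    return [a * uv - b * xv for a, b in zip(u, x)]

def grad_descent(x, u0, step=0.01, iters=500):
    u = u0[:]
    for _ in range(iters):
        g = [4.0 * gi for gi in matvec(u, x, u)]   # grad f(u) = 4(uu^T - M)u
        u = [a - step * b for a, b in zip(u, g)]
    return u

x = [1.0, 2.0, 3.0]                   # ground-truth factor
u = grad_descent(x, [0.5, 0.5, 0.5])  # crude, non-spectral initialization
print([round(v, 3) for v in u])       # ≈ [1.0, 2.0, 3.0], up to global sign
```

Despite the nonconvexity of f, plain gradient descent from this simple initialization converges to the planted factor, which is the phenomenon the overview's landscape analyses explain.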
Accurate and Scalable Image Clustering Based On Sparse Representation of Camera Fingerprint
Clustering images according to their acquisition devices is a well-known
problem in multimedia forensics, which is typically faced by means of camera
Sensor Pattern Noise (SPN). This task is challenging since SPN is a
noise-like signal that is hard to estimate and easily attenuated or destroyed
by many factors. Moreover, the high dimensionality of SPN hinders large-scale
applications. Existing approaches are typically based on the correlation among
SPNs in the pixel domain, which might not be able to capture intrinsic data
structure in union of vector subspaces. In this paper, we propose an accurate
clustering framework, which exploits linear dependencies among SPNs in their
intrinsic vector subspaces. Such dependencies are encoded under sparse
representations which are obtained by solving a LASSO problem with
a non-negativity constraint. The proposed framework is highly accurate in both
cluster-number estimation and image association. Moreover, our framework is
scalable to the number of images and robust against double JPEG compression as
well as the presence of outliers, showing great potential for real-world
applications. Experimental results on the Dresden and Vision databases show that our
proposed framework can adapt well to both medium-scale and large-scale
contexts, and outperforms state-of-the-art methods.
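The sparse-coding step described above can be sketched generically. The code below is not the authors' solver; it is a minimal projected ISTA for a non-negative LASSO, min over c >= 0 of (1/2)||y - Dc||^2 + lam*sum(c), where the dictionary `D`, data `y`, and parameters are invented toy values. The prox of the l1 term plus the non-negativity constraint is the one-sided soft threshold max(v - lam*step, 0).

```python
# Projected ISTA for the non-negative LASSO:
#   min_{c >= 0}  (1/2)||y - D c||^2 + lam * sum(c)
# Gradient step on the quadratic term, then one-sided soft threshold.

def nn_lasso(D, y, lam=0.1, step=0.1, iters=500):
    n = len(D[0])
    c = [0.0] * n
    for _ in range(iters):
        r = [sum(D[i][j] * c[j] for j in range(n)) - y[i]
             for i in range(len(y))]                       # residual D c - y
        g = [sum(D[i][j] * r[i] for i in range(len(y)))
             for j in range(n)]                            # gradient D^T r
        c = [max(cj - step * gj - step * lam, 0.0)         # prox step
             for cj, gj in zip(c, g)]
    return c

# y is (nearly) the first dictionary atom: the code should load on atom 0.
D = [[1.0, 0.0, 0.5],
     [0.0, 1.0, 0.5],
     [0.0, 0.0, 0.5]]
y = [1.0, 0.05, 0.0]
c = nn_lasso(D, y)
print([round(v, 2) for v in c])   # ≈ [0.9, 0.0, 0.0]
```

The l1 penalty drives the codes of unrelated atoms exactly to zero, which is what makes such sparse representations usable as affinities for clustering.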
Solving Systems of Random Quadratic Equations via Truncated Amplitude Flow
This paper presents a new algorithm, termed \emph{truncated amplitude flow}
(TAF), to recover an unknown vector $\bm{x}$ from a system of quadratic
equations of the form $y_i = |\langle \bm{a}_i, \bm{x} \rangle|^2$, where
$\bm{a}_i$'s are given random measurement vectors. This problem is known to be
\emph{NP-hard} in general. We prove that as soon as the number of equations is
on the order of the number of unknowns, TAF recovers the solution exactly (up
to a global unimodular constant) with high probability and complexity growing
linearly with both the number of unknowns and the number of equations. Our TAF
approach adopts the \emph{amplitude-based} empirical loss function, and
proceeds in two stages. In the first stage, we introduce an
\emph{orthogonality-promoting} initialization that can be obtained with a few
power iterations. Stage two refines the initial estimate by successive updates
of scalable \emph{truncated generalized gradient iterations}, which are able to
handle the rather challenging nonconvex and nonsmooth amplitude-based objective
function. In particular, when the vectors $\bm{x}$ and $\bm{a}_i$'s are
real-valued, our gradient truncation rule provably eliminates erroneously
estimated signs with high probability to markedly improve upon its untruncated
version. Numerical tests using synthetic data and real images demonstrate that
our initialization returns more accurate and robust estimates relative to
spectral initializations. Furthermore, even under the same initialization, the
proposed amplitude-based refinement outperforms existing Wirtinger flow
variants, corroborating the superior performance of TAF over state-of-the-art
algorithms.
Comment: 37 pages, 16 figures
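A simplified sketch of the refinement stage (real-valued case) is given below. It is not the full TAF pipeline: the orthogonality-promoting initialization is replaced by a small perturbation of the truth to keep the sketch short, and the function name `taf_refine` and all parameter values are illustrative. Each iteration takes a gradient step on the amplitude loss $(1/2m)\sum_i (|\bm{a}_i^\top \bm{z}| - y_i)^2$, keeping a summand only if $|\bm{a}_i^\top \bm{z}| \ge y_i/(1+\gamma)$, a truncation that discards measurements whose estimated sign is likely wrong.

```python
import random

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Truncated gradient steps on the amplitude loss (real-valued case).
def taf_refine(A, y, z, gamma=0.7, step=0.1, iters=200):
    m = len(y)
    for _ in range(iters):
        g = [0.0] * len(z)
        for a, yi in zip(A, y):
            az = dot(a, z)
            if abs(az) >= yi / (1.0 + gamma):           # truncation rule
                coef = (az - yi * (1.0 if az >= 0 else -1.0)) / m
                g = [gj + coef * aj for gj, aj in zip(g, a)]
        z = [zj - step * gj for zj, gj in zip(z, g)]
    return z

random.seed(0)
x = [1.0, -2.0]                                         # signal to recover
A = [[random.gauss(0, 1) for _ in x] for _ in range(50)]
y = [abs(dot(a, x)) for a in A]                         # phaseless data
z = taf_refine(A, y, [1.1, -1.9])                       # cheated init near x
print([round(v, 3) for v in z])                         # ≈ x, up to global sign
```

Once the iterate is close enough that every retained sign is correct, the amplitude loss is locally smooth around the truth and the iteration converges to it exactly (up to the inherent sign ambiguity).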
Amplitude Retrieval for Channel Estimation of MIMO Systems with One-Bit ADCs
This letter revisits the channel estimation problem for MIMO systems with
one-bit analog-to-digital converters (ADCs) through a novel
algorithm--Amplitude Retrieval (AR). Unlike the state-of-the-art methods such
as those based on one-bit compressive sensing, AR takes a different approach.
It accounts for the lost amplitudes of the one-bit quantized measurements, and
performs channel estimation and amplitude completion jointly. This way, the
direction information of the propagation paths can be estimated via accurate
direction finding algorithms in array processing, e.g., maximum likelihood. The
upshot is that AR is able to handle off-grid angles and provide more accurate
channel estimates. Simulation results are included to showcase the advantages
of AR.
An Inexact Projected Gradient Method with Rounding and Lifting by Nonlinear Programming for Solving Rank-One Semidefinite Relaxation of Polynomial Optimization
We consider solving high-order semidefinite programming (SDP) relaxations of
nonconvex polynomial optimization problems (POPs) that often admit degenerate
rank-one optimal solutions. Instead of solving the SDP alone, we propose a new
algorithmic framework that blends local search using the nonconvex POP into
global descent using the convex SDP. In particular, we first design a globally
convergent inexact projected gradient method (iPGM) for solving the SDP that
serves as the backbone of our framework. We then accelerate iPGM by taking
long, but safeguarded, rank-one steps generated by fast nonlinear programming
algorithms. We prove that the new framework is still globally convergent for
solving the SDP. To solve the iPGM subproblem of projecting a given point onto
the feasible set of the SDP, we design a two-phase algorithm with phase one
using a symmetric Gauss-Seidel based accelerated proximal gradient method
(sGS-APG) to generate a good initial point, and phase two using a modified
limited-memory BFGS (L-BFGS) method to obtain an accurate solution. We analyze
the convergence for both phases and establish a novel global convergence result
for the modified L-BFGS that does not require the objective function to be
twice continuously differentiable. We conduct numerical experiments for solving
second-order SDP relaxations arising from a diverse set of POPs. Our framework
demonstrates state-of-the-art efficiency, scalability, and robustness in
solving degenerate rank-one SDPs to high accuracy, even in the presence of
millions of equality constraints.
Comment: Code available at https://github.com/MIT-SPARK/STRID
Robust Wavefield Inversion via Phase Retrieval
An extended formulation of Full Waveform Inversion (FWI), called Wavefield
Reconstruction Inversion (WRI), offers the potential benefit of decreasing the
nonlinearity of the inverse problem by replacing the explicit inverse of the
ill-conditioned wave-equation operator of classical FWI (the oscillating Green
functions) with a suitably defined data-driven regularized inverse. This
regularization relaxes the wave-equation constraint to reconstruct wavefields
that match the data, hence mitigating the risk of cycle skipping. The
subsurface model parameters are then updated in a direction that reduces these
constraint violations. However, in the case of a rough initial model, the phase
errors in the reconstructed wavefields may trap the waveform inversion in a
local minimum leading to inaccurate subsurface models. In this paper, in order
to avoid matching such incorrect phase information during the early WRI
iterations, we design a new cost function based upon phase retrieval, namely a
process which seeks to reconstruct a signal from the amplitude of linear
measurements. This new formulation, called Wavefield Inversion with Phase
Retrieval (WIPR), further improves the robustness of the parameter estimation
subproblem by a suitable phase correction. We implement the resulting WIPR
problem with an alternating-direction approach, which combines the
Majorization-Minimization (MM) algorithm to linearize the phase-retrieval term
and a variable splitting technique based upon the alternating direction method
of multipliers (ADMM). This new workflow equipped with Tikhonov-total variation
(TT) regularization, which is the combination of second-order Tikhonov and
total variation regularizations and bound constraints, successfully
reconstructs the 2004 BP salt model from a sparse fixed-spread acquisition
using a 3~Hz starting frequency and a homogeneous initial velocity model.
Unfolded Algorithms for Deep Phase Retrieval
The idea of phase retrieval has intrigued researchers for
decades, due to its appearance in a wide range of applications. The task of a
phase retrieval algorithm is typically to recover a signal from linear
phaseless measurements. In this paper, we approach the problem by proposing a
hybrid model-based data-driven deep architecture, referred to as Unfolded Phase
Retrieval (UPR), that exhibits significant potential in improving the
performance of state-of-the-art data-driven and model-based phase retrieval
algorithms. The proposed method benefits from the versatility and interpretability
of well-established model-based algorithms, while simultaneously benefiting
from the expressive power of deep neural networks. In particular, our proposed
model-based deep architecture is applied to the conventional phase retrieval
problem (via the incremental reshaped Wirtinger flow algorithm) and the sparse
phase retrieval problem (via the sparse truncated amplitude flow algorithm),
showing immense promise in both cases. Furthermore, we consider a joint design
of the sensing matrix and the signal processing algorithm and utilize the deep
unfolding technique in the process. Our numerical results illustrate the
effectiveness of such hybrid model-based and data-driven frameworks and
showcase the untapped potential of data-aided methodologies to enhance the
existing phase retrieval algorithms.
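The core unfolding idea can be sketched without any deep-learning machinery. The code below is not the UPR architecture: it unrolls a few gradient-descent iterations on a toy least-squares objective into "layers," one per iteration, whose per-layer step sizes are the parameters a data-driven method would train (here they are just fixed by hand, and all names and values are illustrative).

```python
# Algorithm unfolding in miniature: K iterations of gradient descent become
# a K-layer network; each entry of `steps` plays the role of one layer's
# trainable step-size parameter.

def unfolded_forward(A, y, steps):
    n = len(A[0])
    x = [0.0] * n
    for t in steps:                                   # layer = one iteration
        r = [sum(A[i][j] * x[j] for j in range(n)) - y[i]
             for i in range(len(y))]                  # residual A x - y
        g = [sum(A[i][j] * r[i] for i in range(len(y)))
             for j in range(n)]                       # gradient A^T r
        x = [xj - t * gj for xj, gj in zip(x, g)]
    return x

A = [[2.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
y = [2.0, 3.0, 4.0]                                   # consistent with x* = [1, 3]
x = unfolded_forward(A, y, steps=[0.2, 0.15, 0.1, 0.1, 0.1])
print([round(v, 2) for v in x])                       # moving toward x* = [1, 3]
```

In a real unfolded network the per-layer parameters (step sizes, thresholds, or even the sensing matrix, as in the joint design above) are fit to training data by backpropagation through this forward pass, which is what lets a handful of layers outperform many iterations of the hand-tuned algorithm.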