Global Convergence and Stability of Stochastic Gradient Descent
In machine learning, stochastic gradient descent (SGD) is widely deployed to
train models using highly non-convex objectives with equally complex noise
models. Unfortunately, SGD theory often makes restrictive assumptions that fail
to capture the non-convexity of real problems, and almost entirely ignore the
complex noise models that exist in practice. In this work, we make substantial
progress on this shortcoming. First, we establish that SGD's iterates will
either globally converge to a stationary point or diverge under nearly
arbitrary non-convexity and noise models. Under a slightly more restrictive
assumption on the joint behavior of the non-convexity and noise model that
generalizes current assumptions in the literature, we show that the objective
function cannot diverge, even if the iterates diverge. As a consequence of our
results, SGD can be applied to a greater range of stochastic optimization
problems with confidence about its global convergence behavior and stability.
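To make the setting concrete, here is a minimal sketch of plain SGD on a toy non-convex objective with additive Gaussian gradient noise. The objective, step size, and noise model are illustrative assumptions, not taken from the paper, which studies far more general non-convexity and noise conditions.

```python
import numpy as np

def sgd(grad_fn, x0, lr=0.01, n_steps=10_000, noise_std=0.1, seed=0):
    """Plain SGD: x_{t+1} = x_t - lr * g_t, where g_t is a noisy gradient."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    for _ in range(n_steps):
        # Stochastic gradient: true gradient plus additive Gaussian noise
        # (a simple stand-in for the more general noise models studied).
        g = grad_fn(x) + noise_std * rng.standard_normal(x.shape)
        x = x - lr * g
    return x

# Toy non-convex objective f(x) = sum(x_i^2 + sin(3 x_i)) and its gradient.
grad = lambda x: 2 * x + 3 * np.cos(3 * x)
x_star = sgd(grad, x0=np.array([2.0, -1.5]))
print(x_star)  # iterates settle near a stationary point (gradient ~ 0)
```

The converge-or-diverge dichotomy above concerns exactly such iterate sequences; on this bounded-below toy objective the iterates settle near a stationary point rather than diverging.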
Universal Compressed Sensing
In this paper, the problem of developing universal algorithms for compressed
sensing of stochastic processes is studied. First, Rényi's notion of
information dimension (ID) is generalized to analog stationary processes. This
provides a measure of complexity for such processes and is connected to the
number of measurements required for their accurate recovery. Then a minimum
entropy pursuit (MEP) optimization approach is proposed, and it is proven that
it can reliably recover any stationary process satisfying some mixing
constraints from a sufficient number of randomized linear measurements, without
having any prior information about the distribution of the process. It is
proved that a Lagrangian-type approximation of the MEP optimization problem,
referred to as Lagrangian-MEP problem, is identical to a heuristic
implementable algorithm proposed by Baron et al. It is shown that for the right
choice of parameters, the Lagrangian-MEP algorithm, in addition to having the
same asymptotic performance as MEP optimization, is also robust to the
measurement noise. For memoryless sources with a discrete-continuous mixture
distribution, the fundamental limits on the minimum number of measurements
required by a non-universal compressed sensing decoder are characterized by
Wu et al. For such sources, it is proved that there is no loss in universal
coding, and both the MEP and the Lagrangian-MEP asymptotically achieve the
optimal performance.
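As a rough illustration of the Lagrangian-type trade-off described above, the sketch below evaluates a cost of the form (empirical entropy of a quantized candidate) + λ · (measurement-fidelity penalty). The uniform quantizer, block length k, and λ here are illustrative assumptions; the actual Lagrangian-MEP algorithm searches over candidate sequences to minimize such a cost, which this toy snippet does not attempt.

```python
import numpy as np
from collections import Counter

def empirical_entropy(symbols, k=2):
    """Empirical entropy (bits per symbol) of order-k blocks of a discrete sequence."""
    blocks = [tuple(symbols[i:i + k]) for i in range(len(symbols) - k + 1)]
    counts = Counter(blocks)
    probs = np.array([c / len(blocks) for c in counts.values()])
    return -np.sum(probs * np.log2(probs)) / k

def lagrangian_mep_cost(x, A, y, lam=1.0, levels=8, k=2):
    """Illustrative Lagrangian-MEP-style cost: entropy of the quantized
    candidate plus lam times the per-sample measurement error."""
    # Uniformly quantize x to a finite alphabet before measuring its complexity.
    q = np.digitize(x, np.linspace(x.min(), x.max(), levels - 1))
    fidelity = np.sum((A @ x - y) ** 2) / len(x)
    return empirical_entropy(q, k) + lam * fidelity

# Toy example: a sparse (hence low-complexity) signal measured by a random matrix.
rng = np.random.default_rng(0)
n, m = 200, 80
x_true = np.zeros(n)
x_true[rng.choice(n, 10, replace=False)] = rng.standard_normal(10)
A = rng.standard_normal((m, n)) / np.sqrt(m)
y = A @ x_true
print(lagrangian_mep_cost(x_true, A, y))                  # low: simple and consistent
print(lagrangian_mep_cost(rng.standard_normal(n), A, y))  # higher: complex, inconsistent
```

The role of the Lagrangian term is visible here: the true low-complexity signal scores well on both entropy and fidelity, while an arbitrary candidate is penalized on both, which is what allows recovery without prior knowledge of the source distribution.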