Non-convex Optimization for Machine Learning

Authors: Jain Prateek, Kar Purushottam
Publication venue: 'Now Publishers'
Publication date: 01/01/2017

A vast majority of machine learning algorithms train their models and perform inference by solving optimization problems. In order to capture the learning and prediction problems accurately, structural constraints such as sparsity or low rank are frequently imposed, or else the objective itself is designed to be a non-convex function. This is especially true of algorithms that operate in high-dimensional spaces or that train non-linear models such as tensor models and deep networks. The freedom to express the learning problem as a non-convex optimization problem gives immense modeling power to the algorithm designer, but often such problems are NP-hard to solve. A popular workaround has been to relax non-convex problems to convex ones and use traditional methods to solve the (convex) relaxed optimization problems. However, this approach may be lossy and nevertheless presents significant challenges for large-scale optimization. On the other hand, direct approaches to non-convex optimization have met with resounding success in several domains and remain the methods of choice for the practitioner, as they frequently outperform relaxation-based techniques; popular heuristics include projected gradient descent and alternating minimization. However, these are often poorly understood in terms of their convergence and other properties. This monograph presents a selection of recent advances that bridge a long-standing gap in our understanding of these heuristics. The monograph will lead the reader through several widely used non-convex optimization techniques, as well as applications thereof. The goal of this monograph is both to introduce the rich literature in this area and to equip the reader with the tools and techniques needed to analyze these simple procedures for non-convex problems.

Comment: The official publication is available from now publishers via http://dx.doi.org/10.1561/220000005
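To make the mention of projected gradient descent concrete, here is a minimal sketch of one well-known instance: gradient steps on a least-squares loss, each followed by projection onto the non-convex set of s-sparse vectors (often called iterative hard thresholding). The function names, the least-squares objective, and the parameters `s`, `step`, and `n_iters` are assumptions made for illustration; they are not taken from the monograph.

```python
import numpy as np


def hard_threshold(v, s):
    """Project v onto the set of s-sparse vectors (keep the s largest-magnitude entries)."""
    out = np.zeros_like(v)
    if s > 0:
        # Indices of the s entries with the largest absolute value.
        keep = np.argpartition(np.abs(v), -s)[-s:]
        out[keep] = v[keep]
    return out


def projected_gradient_descent(A, y, s, step, n_iters=100):
    """Approximately minimize 0.5 * ||A x - y||^2 subject to x having at most s nonzeros.

    Hypothetical illustration: a fixed step size and iteration count are assumed.
    """
    x = np.zeros(A.shape[1])
    for _ in range(n_iters):
        grad = A.T @ (A @ x - y)                 # gradient of the least-squares loss
        x = hard_threshold(x - step * grad, s)   # projection onto the sparse set
    return x
```

With an identity design matrix and unit step size, a single iteration returns the s-sparse signal exactly; for general matrices, convergence depends on properties of `A` and on the step size, which is the kind of question the abstract says the monograph analyzes.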

arXiv.org e-Print Archive

Crossref

CERN Document Server

Simple Bounds for Noisy Linear Inverse Problems with Exact Side Information

Authors: Hassibi Babak, Oymak Samet, Thrampoulidis Christos
Publication date: 05/12/2013

This paper considers the linear inverse problem where we wish to estimate a structured signal $x$ from its corrupted observations. When the problem is ill-posed, it is natural to make use of a convex function $f(\cdot)$ that exploits the structure of the signal. For example, $\ell_1$ norm can be used for sparse signals. To carry out the estimation, we consider two well-known convex programs: 1) Second order cone program (SOCP), and, 2) Lasso. Assuming Gaussian measurements, we show that, if precise information about the value $f(x)$ or the $\ell_2$-norm of the noise is available, one can do a particularly good job at estimation. In particular, the reconstruction error becomes proportional to the "sparsity" of the signal rather than the ambient dimension of the noise vector. We connect our results to existing works and provide a discussion on the relation of our results to the standard least-squares problem. Our error bounds are non-asymptotic and sharp, they apply to arbitrary convex functions and do not assume any distribution on the noise.

Comment: 13 page
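As a rough sketch, the two programs named in the abstract are commonly written as follows. The measurement model $y = A x_0 + z$ and the symbols $y$, $A$, $x_0$, $z$, and $\hat{x}$ are notation assumed here for illustration; the listing itself does not give these formulas.

```latex
% Assumed model: y = A x_0 + z, with Gaussian A, unknown structured x_0, noise z.

% Lasso with exact side information f(x_0):
\hat{x}_{\mathrm{Lasso}} = \arg\min_{x} \; \| y - A x \|_2
    \quad \text{subject to} \quad f(x) \le f(x_0)

% SOCP with exact knowledge of the noise norm \|z\|_2:
\hat{x}_{\mathrm{SOCP}} = \arg\min_{x} \; f(x)
    \quad \text{subject to} \quad \| y - A x \|_2 \le \| z \|_2
```

In each case the side information appears as the right-hand side of the constraint: the value $f(x_0)$ in the first program and the noise norm $\|z\|_2$ in the second.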

arXiv.org e-Print Archive

Caltech Authors

Learning Topic Models and Latent Bayesian Networks Under Expansion Constraints

Authors: Anandkumar Animashree, Hsu Daniel, Javanmard Adel, Kakade Sham M.
Publication date: 24/09/2012

Unsupervised estimation of latent variable models is a fundamental problem central to numerous applications of machine learning and statistics. This work presents a principled approach for estimating broad classes of such models, including probabilistic topic models and latent linear Bayesian networks, using only second-order observed moments. The sufficient conditions for identifiability of these models are primarily based on weak expansion constraints on the topic-word matrix, for topic models, and on the directed acyclic graph, for Bayesian networks. Because no assumptions are made on the distribution among the latent variables, the approach can handle arbitrary correlations among the topics or latent factors. In addition, a tractable learning method via $\ell_1$ optimization is proposed and studied in numerical experiments.

Comment: 38 pages, 6 figures, 2 tables, applications in topic models and Bayesian networks are studied. Simulation section is adde

arXiv.org e-Print Archive

CiteSeerX

eScholarship - University of California

Measure What Should be Measured: Progress and Challenges in Compressive Sensing

Author: Strohmer Thomas
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 24/10/2012

Is compressive sensing overrated? Or can it live up to our expectations? What will come after compressive sensing and sparsity? And what has Galileo Galilei got to do with it? Compressive sensing has taken the signal processing community by storm. A large corpus of research devoted to the theory and numerics of compressive sensing has been published in the last few years. Moreover, compressive sensing has inspired and initiated intriguing new research directions, such as matrix completion. Potential new applications emerge at a dazzling rate. Yet some important theoretical questions remain open, and seemingly obvious applications keep escaping the grip of compressive sensing. In this paper I discuss some of the recent progress in compressive sensing and point out key challenges and opportunities as the area of compressive sensing and sparse representations keeps evolving. I also attempt to assess the long-term impact of compressive sensing

arXiv.org e-Print Archive

CiteSeerX

eScholarship - University of California