Search CORE

56,905 research outputs found

Non-convex Optimization for Machine Learning

Author: Jain Prateek
Kar Purushottam
Publication venue: 'Now Publishers'
Publication date: 01/01/2017
Field of study

A vast majority of machine learning algorithms train their models and perform inference by solving optimization problems. In order to capture the learning and prediction problems accurately, structural constraints such as sparsity or low rank are frequently imposed or else the objective itself is designed to be a non-convex function. This is especially true of algorithms that operate in high-dimensional spaces or that train non-linear models such as tensor models and deep networks. The freedom to express the learning problem as a non-convex optimization problem gives immense modeling power to the algorithm designer, but often such problems are NP-hard to solve. A popular workaround to this has been to relax non-convex problems to convex ones and use traditional methods to solve the (convex) relaxed optimization problems. However this approach may be lossy and nevertheless presents significant challenges for large scale optimization. On the other hand, direct approaches to non-convex optimization have met with resounding success in several domains and remain the methods of choice for the practitioner, as they frequently outperform relaxation-based techniques - popular heuristics include projected gradient descent and alternating minimization. However, these are often poorly understood in terms of their convergence and other properties. This monograph presents a selection of recent advances that bridge a long-standing gap in our understanding of these heuristics. The monograph will lead the reader through several widely used non-convex optimization techniques, as well as applications thereof. The goal of this monograph is to both, introduce the rich literature in this area, as well as equip the reader with the tools and techniques needed to analyze these simple procedures for non-convex problems.Comment: The official publication is available from now publishers via http://dx.doi.org/10.1561/220000005

arXiv.org e-Print Archive

Tensor Computation: A New Framework for High-Dimensional Problems in EDA

Author: Batselier Kim
Daniel Luca
Liu Haotian
Wong Ngai
Zhang Zheng
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/10/2016
Field of study

Many critical EDA problems suffer from the curse of dimensionality, i.e. the very fast-scaling computational burden produced by large number of parameters and/or unknown variables. This phenomenon may be caused by multiple spatial or temporal factors (e.g. 3-D field solvers discretizations and multi-rate circuit simulation), nonlinearity of devices and circuits, large number of design or optimization parameters (e.g. full-chip routing/placement and circuit sizing), or extensive process variations (e.g. variability/reliability analysis and design for manufacturability). The computational challenges generated by such high dimensional problems are generally hard to handle efficiently with traditional EDA core algorithms that are based on matrix and vector computation. This paper presents "tensor computation" as an alternative general framework for the development of efficient EDA algorithms and tools. A tensor is a high-dimensional generalization of a matrix and a vector, and is a natural choice for both storing and solving efficiently high-dimensional EDA problems. This paper gives a basic tutorial on tensors, demonstrates some recent examples of EDA applications (e.g., nonlinear circuit modeling and high-dimensional uncertainty quantification), and suggests further open EDA problems where the use of tensor computation could be of advantage.Comment: 14 figures. Accepted by IEEE Trans. CAD of Integrated Circuits and System

arXiv.org e-Print Archive

HKU Scholars Hub