Non-convex Optimization for Machine Learning
The vast majority of machine learning algorithms train their models and perform
inference by solving optimization problems. To capture the learning and
prediction problems accurately, structural constraints such as sparsity or
low rank are frequently imposed, or else the objective itself is designed to be
a non-convex function. This is especially true of algorithms that operate in
high-dimensional spaces or that train non-linear models such as tensor models
and deep networks.
The freedom to express the learning problem as a non-convex optimization
problem gives immense modeling power to the algorithm designer, but often such
problems are NP-hard to solve. A popular workaround to this has been to relax
non-convex problems to convex ones and use traditional methods to solve the
(convex) relaxed optimization problems. However, this approach may be lossy
and, even so, presents significant challenges for large-scale optimization.
On the other hand, direct approaches to non-convex optimization have met with
resounding success in several domains and remain the methods of choice for the
practitioner, as they frequently outperform relaxation-based techniques;
popular heuristics include projected gradient descent and alternating
minimization. However, these are often poorly understood in terms of their
convergence and other properties.
This monograph presents a selection of recent advances that bridge a
long-standing gap in our understanding of these heuristics. The monograph will
lead the reader through several widely used non-convex optimization techniques,
as well as applications thereof. The goal of this monograph is both to
introduce the rich literature in this area and to equip the reader with the
tools and techniques needed to analyze these simple procedures for non-convex
problems.
Comment: The official publication is available from now publishers via
http://dx.doi.org/10.1561/220000005
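Projected gradient descent, one of the heuristics named above, can be sketched in a few lines. The example below applies it to a small least-squares problem with a sparsity constraint (iterative hard thresholding); the matrix, step size, and sparsity level are illustrative choices, not taken from the monograph.

```python
import numpy as np

def projected_gradient_descent(grad, project, x0, step, iters):
    """Minimize f over a constraint set via x <- P(x - step * grad f(x))."""
    x = x0
    for _ in range(iters):
        x = project(x - step * grad(x))
    return x

# Illustrative least-squares problem with a sparsity constraint.
A = np.array([[1.0, 0.0, 0.5],
              [0.0, 1.0, 0.5]])
b = np.array([1.0, 2.0])

def grad(x):
    # Gradient of 0.5 * ||A x - b||^2
    return A.T @ (A @ x - b)

def project_sparse(x, k=2):
    # Hard thresholding: Euclidean projection onto {x : ||x||_0 <= k}
    out = np.zeros_like(x)
    keep = np.argsort(np.abs(x))[-k:]
    out[keep] = x[keep]
    return out

x = projected_gradient_descent(grad, project_sparse, np.zeros(3),
                               step=0.2, iters=1000)
# x is a 2-sparse vector with A @ x very close to b
```

The projection step is the only change from plain gradient descent, which is why the method is so popular in practice despite the non-convex constraint set.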
System Level Synthesis
This article surveys the System Level Synthesis framework, which presents a
novel perspective on constrained robust and optimal controller synthesis for
linear systems. We show how SLS shifts the controller synthesis task from the
design of a controller to the design of the entire closed loop system, and
highlight the benefits of this approach in terms of scalability and
transparency. We emphasize two particular applications of SLS, namely
large-scale distributed optimal control and robust control. In the case of
distributed control, we show how SLS allows for localized controllers to be
computed, extending robust and optimal control methods to large-scale systems
under practical and realistic assumptions. In the case of robust control, we
show how SLS allows for novel design methodologies that, for the first time,
quantify the degradation in performance of a robust controller due to model
uncertainty -- such transparency is key in allowing robust control methods to
interact, in a principled way, with modern techniques from machine learning and
statistical inference. Throughout, we emphasize practical and efficient
computational solutions, and demonstrate our methods on easy-to-understand case
studies.
Comment: To appear in Annual Reviews in Control
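The shift from designing a controller to designing the closed-loop system can be illustrated numerically. The sketch below, for an assumed toy double-integrator system, computes the finite-horizon system responses induced by a static state-feedback gain and checks that they satisfy the SLS affine constraint; the dynamics and gain are hypothetical values, not from the article.

```python
import numpy as np

# Toy double integrator x[t+1] = A x[t] + B u[t] (illustrative values).
A = np.array([[1.0, 1.0],
              [0.0, 1.0]])
B = np.array([[0.0],
              [1.0]])
K = np.array([[-0.5, -1.0]])   # a stabilizing static state-feedback gain

# Closed-loop system responses map disturbances w to state and input:
# for u = K x they are Phi_x(k) = (A + B K)^(k-1), Phi_u(k) = K Phi_x(k).
T = 10
Acl = A + B @ K
Phi_x = [np.linalg.matrix_power(Acl, k) for k in range(T)]  # Phi_x[0] = Phi_x(1) = I
Phi_u = [K @ P for P in Phi_x]

# SLS affine constraint: Phi_x(1) = I and
# Phi_x(k+1) = A Phi_x(k) + B Phi_u(k). SLS designs {Phi_x, Phi_u}
# directly subject to this constraint, rather than designing K.
assert np.allclose(Phi_x[0], np.eye(2))
for k in range(T - 1):
    assert np.allclose(Phi_x[k + 1], A @ Phi_x[k] + B @ Phi_u[k])
```

Because the constraint is affine in the system responses, performance objectives and structural (e.g. locality) constraints on the closed loop become convex in this parametrization.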
Meta-Processing: A robust framework for multi-task seismic processing
Machine learning-based seismic processing models are typically trained
separately to perform specific seismic processing tasks (SPTs), and as a
result, require plenty of training data. However, preparing training data sets
is not trivial, especially for supervised learning (SL). Fortunately, seismic
data of different types and from different regions generally share common
features, such as their sinusoidal nature and geometric texture. To learn the
shared features, and thus, quickly adapt to various SPTs, we develop a unified
paradigm for neural network-based seismic processing, called Meta-Processing,
that uses limited training data for meta learning a common network
initialization, which offers universal adaptability features. The proposed
Meta-Processing framework consists of two stages: meta-training and
meta-testing. In the meta-training stage, each SPT is treated as a separate
task and the training dataset is divided into support and query sets. Unlike
conventional SL methods, here, the neural network (NN) parameters are updated
by a bilevel gradient descent from the support set to the query set, iterating
through all tasks. In the meta-testing stage, we also utilize limited data to
fine-tune the optimized NN parameters in an SL fashion to conduct various SPTs,
such as denoising, interpolation, ground-roll attenuation, image enhancement,
and velocity estimation, aiming to converge quickly to ideal performance.
Comprehensive numerical examples are performed to evaluate the performance of
Meta-Processing on both synthetic and field data. The results demonstrate that
our method significantly improves the convergence speed and prediction accuracy
of the NN.
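The bilevel support-to-query update described above can be sketched with a first-order MAML-style loop. The example below uses toy 1-D regression "tasks" in place of seismic processing tasks; the task distribution, step sizes, and first-order approximation (dropping second derivatives) are illustrative assumptions, not the paper's actual networks or data.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_task():
    """A toy task: fit y = w_true * x from a few samples (synthetic stand-in
    for one seismic processing task's support and query sets)."""
    w_true = rng.uniform(-2, 2)
    xs, xq = rng.normal(size=5), rng.normal(size=5)   # support / query inputs
    return (xs, w_true * xs), (xq, w_true * xq)

def grad(w, x, y):
    # d/dw of the mean squared error 0.5 * mean((w*x - y)^2)
    return np.mean((w * x - y) * x)

w_meta, alpha, beta = 0.0, 0.1, 0.05   # meta-init, inner and outer step sizes
for _ in range(2000):
    (xs, ys), (xq, yq) = make_task()
    w_inner = w_meta - alpha * grad(w_meta, xs, ys)   # inner step on support set
    # Outer (meta) step: gradient of the query loss at the adapted weights
    # (first-order MAML approximation).
    w_meta = w_meta - beta * grad(w_inner, xq, yq)
```

At meta-test time, `w_meta` plays the role of the common network initialization: a few supervised fine-tuning steps on a new task's limited data adapt it to that task.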
Fast and Accurate Retrieval of Methane Concentration From Imaging Spectrometer Data Using Sparsity Prior
The strong radiative forcing by atmospheric methane has stimulated interest in identifying natural and anthropogenic sources of this potent greenhouse gas. Point sources are important targets for quantification, and anthropogenic targets have the potential for emissions reduction. Methane point-source plume detection and concentration retrieval have been previously demonstrated using data from the Airborne Visible InfraRed Imaging Spectrometer-Next Generation (AVIRIS-NG). Current quantitative methods have tradeoffs between computational requirements and retrieval accuracy, creating obstacles for processing real-time data or large data sets from flight campaigns. We present a new computationally efficient algorithm that applies sparsity and an albedo correction to matched filter retrieval of trace gas concentration path length. The new algorithm was tested using AVIRIS-NG data acquired over several point-source plumes in Ahmedabad, India. The algorithm was validated using simulated AVIRIS-NG data, including synthetic plumes of known methane concentration. Sparsity and albedo correction together reduced the root-mean-squared error of retrieved methane concentration-path length enhancement by 60.7% compared with a previous robust matched filter method. Background noise was reduced by a factor of 2.64. The new algorithm was able to process the entire 300-flight-line 2016 AVIRIS-NG India campaign in just over 8 h on a desktop computer with GPU acceleration.
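The classical matched filter that the paper builds on can be sketched as follows, with a crude hard threshold standing in for the sparsity prior. All spectra here are synthetic and the target signature, covariance handling, and threshold are illustrative assumptions; the paper's albedo correction and actual retrieval pipeline are omitted.

```python
import numpy as np

# Synthetic "imaging spectrometer" cube, flattened to (pixels, bands).
rng = np.random.default_rng(1)
bands, pixels = 20, 500

t = rng.normal(size=bands)                  # assumed unit-target gas signature
mu = rng.normal(size=bands)                 # background mean spectrum
bg = mu + 0.1 * rng.normal(size=(pixels, bands))
truth = np.zeros(pixels)
truth[:25] = 3.0                            # sparse plume: a few enhanced pixels
x = bg + truth[:, None] * t                 # observed radiances

# Classical matched filter, per pixel:
# alpha = (x - mu)^T S^-1 t / (t^T S^-1 t), with S the background covariance.
S = np.cov(x, rowvar=False) + 1e-6 * np.eye(bands)   # regularized estimate
Sinv_t = np.linalg.solve(S, t)
alpha = (x - mu) @ Sinv_t / (t @ Sinv_t)

# Crude stand-in for a sparsity prior: zero out small retrievals, reflecting
# the assumption that most pixels contain no plume enhancement.
alpha_sparse = np.where(np.abs(alpha) > 1.0, alpha, 0.0)
```

Because the filter is a single linear projection per pixel, it maps naturally onto GPU batch operations, which is consistent with the throughput the abstract reports for full flight campaigns.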