231 research outputs found

    A New Optimal Stepsize For Approximate Dynamic Programming

    Full text link
    Approximate dynamic programming (ADP) has proven itself in a wide range of applications spanning large-scale transportation problems, health care, revenue management, and energy systems. The design of effective ADP algorithms has many dimensions, but one crucial factor is the stepsize rule used to update a value function approximation. Many operations research applications are computationally intensive, and it is important to obtain good results quickly. Furthermore, the most popular stepsize formulas use tunable parameters and can produce very poor results if tuned improperly. We derive a new stepsize rule that optimizes the prediction error in order to improve the short-term performance of an ADP algorithm. With only one, relatively insensitive tunable parameter, the new rule adapts to the level of noise in the problem and produces faster convergence in numerical experiments. Comment: Matlab files are included with the paper source
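    Since this entry is about stepsize rules for smoothing noisy value observations, a small illustration may help. The Python sketch below sets the stepsize from running estimates of the bias and variance of the prediction error, in the general spirit of bias-adjusted stepsize rules; it is not the rule derived in the paper, and `adaptive_stepsize_smoothing`, `nu`, and `alpha_min` are illustrative names and constants.

```python
import numpy as np

def adaptive_stepsize_smoothing(observations, alpha_min=0.05, nu=0.2):
    """Smooth noisy value observations with a noise-adaptive stepsize.

    Illustrative sketch only (in the spirit of bias-adjusted stepsizes),
    not the optimal rule derived in the paper.
    """
    v_bar = observations[0]   # current value estimate
    bias = 0.0                # smoothed prediction error (bias proxy)
    var = 0.0                 # smoothed squared prediction error
    estimates = [v_bar]

    for v_hat in observations[1:]:
        err = v_hat - v_bar
        bias = (1 - nu) * bias + nu * err
        var = (1 - nu) * var + nu * err ** 2
        noise_var = max(var - bias ** 2, 1e-12)   # error variance net of bias
        # Persistent bias pushes the stepsize toward 1; pure noise keeps it small.
        alpha = max(bias ** 2 / (noise_var + bias ** 2), alpha_min)
        v_bar = (1 - alpha) * v_bar + alpha * v_hat
        estimates.append(v_bar)
    return np.array(estimates)

# Usage: track a drifting value observed with heavy noise.
rng = np.random.default_rng(0)
truth = np.linspace(0.0, 10.0, 200)
obs = truth + rng.normal(scale=2.0, size=200)
print(adaptive_stepsize_smoothing(obs)[-5:])
```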

    An Online Parallel and Distributed Algorithm for Recursive Estimation of Sparse Signals

    Full text link
    In this paper, we consider a recursive estimation problem for linear regression where the signal to be estimated admits a sparse representation and measurement samples are only sequentially available. We propose a convergent parallel estimation scheme that consists in solving a sequence of ℓ1-regularized least-square problems approximately. The proposed scheme is novel in three aspects: i) all elements of the unknown vector variable are updated in parallel at each time instance, and the convergence speed is much faster than state-of-the-art schemes which update the elements sequentially; ii) both the update direction and stepsize of each element have simple closed-form expressions, so the algorithm is suitable for online (real-time) implementation; and iii) the stepsize is designed to accelerate convergence but does not suffer from the parameter-tuning trouble common in the literature. Both centralized and distributed implementation schemes are discussed. The attractive features of the proposed algorithm are also demonstrated numerically. Comment: Part of this work has been presented at The Asilomar Conference on Signals, Systems, and Computers, Nov. 201
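    As a rough sketch of the general recipe outlined above (under stated assumptions, not the paper's algorithm): maintain running second-order statistics of the stream and, after each new sample, move every coordinate in parallel toward its closed-form soft-thresholding best response. The paper's closed-form stepsize is replaced here by a simple diminishing stepsize, and `online_parallel_lasso`, `mu`, and the diagonal loading are illustrative choices.

```python
import numpy as np

def soft_threshold(z, tau):
    """Elementwise soft-thresholding operator."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def online_parallel_lasso(samples, mu=0.05):
    """Recursive sparse estimation with one parallel (Jacobi-type)
    coordinate update of an l1-regularized least-squares cost per sample.

    Hedged sketch: the paper's closed-form stepsize is replaced by a
    diminishing stepsize; mu is a per-sample regularization weight.
    """
    n = len(samples[0][0])
    R = np.eye(n)            # running sum of x x^T, with small diagonal loading
    r = np.zeros(n)          # running sum of y * x
    w = np.zeros(n)

    for t, (x, y) in enumerate(samples, start=1):
        R += np.outer(x, x)
        r += y * x
        # Best response of every coordinate, computed in parallel:
        # w_i = soft(r_i - sum_{j != i} R_ij w_j, mu * t) / R_ii
        resid = r - R @ w + np.diag(R) * w
        w_best = soft_threshold(resid, mu * t) / np.diag(R)
        gamma = 1.0 / (t + 1.0)               # diminishing stepsize (illustrative)
        w += gamma * (w_best - w)
    return w

# Usage: recover a sparse vector from streaming noisy linear measurements.
rng = np.random.default_rng(1)
w_true = np.zeros(50)
w_true[[3, 17, 42]] = [2.0, -1.5, 3.0]
stream = []
for _ in range(400):
    x = rng.normal(size=50)
    stream.append((x, x @ w_true + 0.1 * rng.normal()))
print(np.round(online_parallel_lasso(stream)[[3, 17, 42]], 2))
```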

    Adaptive learning rates and parallelization for stochastic, sparse, non-smooth gradients

    Full text link
    Recent work has established an empirically successful framework for adapting learning rates for stochastic gradient descent (SGD). This effectively removes the need for tuning, while automatically reducing learning rates over time on stationary problems and permitting learning rates to grow appropriately on non-stationary tasks. Here, we extend the idea in three directions: addressing proper minibatch parallelization, including reweighted updates for sparse or orthogonal gradients; improving robustness on non-smooth loss functions; and, in the process, replacing the diagonal Hessian estimation procedure, which may not always be available, with a robust finite-difference approximation. The final algorithm integrates all these components, has linear complexity, and is hyper-parameter free. Comment: Published at the First International Conference on Learning Representations (ICLR-2013). Public reviews are available at http://openreview.net/document/c14f2204-fd66-4d91-bed4-153523694041#c14f2204-fd66-4d91-bed4-15352369404
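    For orientation, here is a hedged sketch of the element-wise adaptive learning rate idea this abstract builds on: each parameter's rate is roughly (E[g])^2 / (E[g^2] · h), with the curvature h taken from a finite difference of gradients evaluated on the same minibatch. The function name, the memory update for `tau`, and the constants `delta` and `eps` are illustrative, not the paper's exact procedure.

```python
import numpy as np

def adaptive_rate_step(w, grad_fn, g_avg, g_sq_avg, h_avg, tau, delta=1e-4, eps=1e-8):
    """One SGD step with element-wise adaptive learning rates.

    Sketch of the vSGD-style idea: rate = (E[g])^2 / (E[g^2] * h), with the
    curvature h estimated by a finite difference of gradients. Illustrative
    constants; not the exact procedure from the paper.
    """
    g = grad_fn(w)
    # Finite-difference curvature: reuse the same minibatch so noise cancels.
    h = np.abs(grad_fn(w + delta * np.sign(g)) - g) / delta

    # Exponential moving averages with element-wise memory sizes tau.
    g_avg = (1 - 1 / tau) * g_avg + (1 / tau) * g
    g_sq_avg = (1 - 1 / tau) * g_sq_avg + (1 / tau) * g ** 2
    h_avg = (1 - 1 / tau) * h_avg + (1 / tau) * h

    eta = g_avg ** 2 / (g_sq_avg * h_avg + eps)   # per-element learning rate
    w = w - eta * g

    # Short memory when the gradient signal is consistent, longer when noisy.
    tau = (1 - g_avg ** 2 / (g_sq_avg + eps)) * tau + 1
    return w, g_avg, g_sq_avg, h_avg, tau

# Usage: minimize 0.5 * ||w||^2 from noisy minibatch gradients.
rng = np.random.default_rng(2)
w = rng.normal(size=5)
state = (np.zeros(5), np.ones(5), np.ones(5), np.full(5, 2.0))
for _ in range(500):
    noise = 0.1 * rng.normal(size=5)
    minibatch_grad = lambda v, n=noise: v + n   # same noise for both evaluations
    w, *state = adaptive_rate_step(w, minibatch_grad, *state)
print(np.round(w, 3))
```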

    Approximate dynamic programming by practical examples

    Get PDF