Mean Field Equilibrium in Dynamic Games with Strategic Complementarities

Author: Adlakha S
Bertsekas DP
Bertsekas DP
Bertsekas DP
Billingsley P
Doraszelski U
Fudenberg D
Glynn P
Kleene SC
Maitra A
Ramesh Johari
Sachin Adlakha
Stokey NL
Topkis DM
Yin H
Publication venue: 'Institute for Operations Research and the Management Sciences (INFORMS)'

Asynchronous Stochastic Variational Inference

Author: C Andrieu
DM Blei
DP Bertsekas
HR Feyzmahdavian
MD Hoffman
MJ Wainwright
Publication venue: 'Center for Open Science'
Publication date: 12/01/2018

Stochastic variational inference (SVI) employs stochastic optimization to scale up Bayesian computation to massive data. Since SVI is at its core a stochastic gradient-based algorithm, horizontal parallelism can be harnessed to allow larger-scale inference. We propose a lock-free parallel implementation of SVI that allows distributed computation over multiple slaves in an asynchronous style. We show that our implementation leads to linear speed-up while guaranteeing an asymptotic ergodic convergence rate of O(1/√T), given that the number of slaves is bounded by √T (where T is the total number of iterations). The implementation is done in a high-performance computing (HPC) environment using the message passing interface (MPI) for Python (MPI4py). An extensive empirical evaluation shows that our parallel SVI is lossless, performing comparably well to its serial counterpart while achieving linear speed-up.

arXiv.org e-Print Archive

Crossref

Bournemouth University Research Online

Online optimal and adaptive integral tracking control for varying discrete‐time systems using reinforcement learning

Author: Bertsekas DP
Levine WS
Lewis FL
Sutton RS
Werbos PJ
Åström KJ
Publication venue: 'Wiley'
Publication date: 16/04/2020

The conventional closed-form solution to the optimal control problem using optimal control theory is only available under the assumption that the system dynamics/models are known and described as differential equations. Without such models, reinforcement learning (RL) has been successfully applied as a candidate technique to iteratively solve the optimal control problem for unknown or varying systems. For the optimal tracking control problem, existing RL techniques in the literature assume either the use of a predetermined feedforward input for the tracking control, restrictive assumptions on the reference model dynamics, or discounted tracking costs. Furthermore, when discounted tracking costs are used, zero steady-state error cannot be guaranteed by the existing RL methods. This article therefore presents an optimal online RL tracking control framework for discrete-time (DT) systems, which does not impose any of the restrictive assumptions of the existing methods and also guarantees zero steady-state tracking error. This is achieved by augmenting the original system dynamics with the integral of the error between the reference inputs and the tracked outputs for use in the online RL framework. It is further shown that the resulting value function for the DT linear quadratic tracker using the augmented formulation with integral control is also quadratic. This enables the development of Bellman equations, which use only the system measurements to solve the corresponding DT algebraic Riccati equation and obtain the optimal tracking control inputs online. Two RL strategies are thereafter proposed, based on value function approximation and on Q-learning, along with bounds on excitation for the convergence of the parameter estimates. Simulation case studies show the effectiveness of the proposed approach.
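The integral augmentation described in this abstract can be illustrated with a minimal sketch. It is not the article's method: the article solves the Riccati equation online from measurements only, whereas this sketch assumes the model (A, B, C) is known and uses a plain model-based Riccati iteration as a stand-in. The function names, the integrator convention z[k+1] = z[k] + (r - y[k]), and the choice of weights are assumptions made for illustration.

```python
import numpy as np

def augment_with_integrator(A, B, C):
    """Stack the plant state x with the integrated tracking error z.

    Assumed convention: z[k+1] = z[k] + (r - C x[k]), so the augmented
    state is X = [x; z] and the reference r enters as an external input.
    """
    n, m = B.shape
    p = C.shape[0]
    A_aug = np.block([[A, np.zeros((n, p))], [-C, np.eye(p)]])
    B_aug = np.vstack([B, np.zeros((p, m))])
    return A_aug, B_aug

def solve_dare_by_iteration(A, B, Q, R, iters=2000):
    """Model-based Riccati (value) iteration; returns cost matrix P and gain K."""
    P = Q.copy()
    for _ in range(iters):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
    return P, K

def simulate(A, B, C, K, r, steps=2000):
    """Run u = -K [x; z] against a constant reference r; return the final output."""
    x = np.zeros(A.shape[0])
    z = np.zeros(C.shape[0])
    for _ in range(steps):
        u = -K @ np.concatenate([x, z])
        y = C @ x
        z = z + (r - y)      # integrate the tracking error
        x = A @ x + B @ u    # plant update
    return C @ x
```

Because the closed loop is stable and the integrator state can only settle when r - y is zero, the output converges to a constant reference without any feedforward term, which is the property the abstract emphasizes.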

Crossref

White Rose Research Online

Privacy-Preserving Outsourcing of Large-Scale Nonlinear Programming to the Cloud

Author: C Wang
DP Bertsekas
F Chen
F Chen
J Katz
K Ren
K-M Chung
L Zhou
M Barbosa
MS Bazaraa
S Murugesan
X Chen
X Chen
X Lei
X Lei
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/10/2018

The increasingly massive data generated by various sources has given birth to big data analytics. Solving large-scale nonlinear programming problems (NLPs) is one important big data analytics task that has applications in many domains such as transport and logistics. However, NLPs are usually too computationally expensive for resource-constrained users. Fortunately, cloud computing provides an alternative and economical service for resource-constrained users to outsource their computation tasks to the cloud. However, one major concern with outsourcing NLPs is the leakage of the user's private information contained in NLP formulations and results. Although much work has been done on privacy-preserving outsourcing of computation tasks, little attention has been paid to NLPs. In this paper, we investigate, for the first time, secure outsourcing of general large-scale NLPs with nonlinear constraints. A secure and efficient transformation scheme at the user side is proposed to protect the user's private information; at the cloud side, the generalized reduced gradient method is applied to effectively solve the transformed large-scale NLPs. The proposed protocol is implemented on a cloud computing testbed. Experimental evaluations demonstrate that significant time can be saved for users and that the proposed mechanism has the potential for practical use. Comment: Ang Li and Wei Du equally contributed to this work. This work was done when Wei Du was at the University of Arkansas. 2018 EAI International Conference on Security and Privacy in Communication Networks (SecureComm

arXiv.org e-Print Archive

Crossref

Sparsity and cosparsity for audio declipping: a flexible non-convex approach

Author: A Adler
A Janssen
B Defraene
DP Bertsekas
M Elad
M Goto
M Kahrs
M Kowalski
MD Plumbley
S Boyd
S Foucart
S Nam
SK Naik
T Blumensath
Y Tachioka
YC Eldar
Publication date: 09/06/2015

This work investigates the empirical performance of sparse synthesis versus sparse analysis regularization for the ill-posed inverse problem of audio declipping. We develop a versatile non-convex heuristic which can be readily used with both data models. Based on this algorithm, we report that, in most cases, the two models perform almost identically in terms of signal enhancement. However, the analysis version is shown to be amenable to real-time audio processing when certain analysis operators are considered. Both versions outperform state-of-the-art methods in the field, especially for severely saturated signals.

arXiv.org e-Print Archive

HAL-CentraleSupelec

Crossref

INRIA a CCSD electronic archive server

Hal-Diderot

HAL-Rennes 1

Tracking Target Signal Strengths on a Grid using Sparsity

Author: AK Jain
BN Vo
C Coue
CM Bishop
D Angelosante
D Angelosante
DE Clark
DE Clark
DP Bertsekas
DP Bertsekas
EJ Candes
G Taylor
HW Kuhn
JA Bazerque
K Mekhnacha
K Panta
K Panta
L Lin
M Tanaka
N Vaswani
O Erdinc
Q Ling
R Mahler
R Mahler
R Xu
S Blackman
S Farahmand
S Farahmand
S Nannuru
T Hastie
V Cevher
Y Bar-Shalom
Y Bar-Shalom
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 27/04/2011

Multi-target tracking is mainly challenged by the nonlinearity present in the measurement equation, and the difficulty in fast and accurate data association. To overcome these challenges, the present paper introduces a grid-based model in which the state captures target signal strengths on a known spatial grid (TSSG). This model leads to linear state and measurement equations, which bypass data association and can afford state estimation via sparsity-aware Kalman filtering (KF). Leveraging the grid-induced sparsity of the novel model, two types of sparsity-cognizant TSSG-KF trackers are developed: one effects sparsity through $\ell_1$-norm regularization, and the other invokes sparsity as an extra measurement. Iterative extended KF and Gauss-Newton algorithms are developed for reduced-complexity tracking, along with accurate error covariance updates for assessing performance of the resultant sparsity-aware state estimators. Based on TSSG state estimates, more informative target position and track estimates can be obtained in a follow-up step, ensuring that track association and position estimation errors do not propagate back into TSSG state estimates. The novel TSSG trackers do not require knowing the number of targets or their signal strengths, and exhibit considerably lower complexity than the benchmark hidden Markov model filter, especially for a large number of targets. Numerical simulations demonstrate that sparsity-cognizant trackers enjoy improved root mean-square error performance at reduced complexity when compared to their sparsity-agnostic counterparts. Comment: Submitted to IEEE Trans. on Signal Processin

arXiv.org e-Print Archive

Crossref

Michigan Technological University

TU Delft Repository

Springer - Publisher Connector

Approximate Consensus in Highly Dynamic Networks: The Role of Averaging Algorithms

Author: AD Fekete
B Charron-Bost
D Angluin
D Dolev
DP Bertsekas
FR Chung
H Attiya
H Attiya
I Daubechies
M Biely
M Cao
M Cao
MJ Fischer
N Santoro
NA Lynch
V Blondel
É Coulouma
Publication date: 12/11/2014

In this paper, we investigate the approximate consensus problem in highly dynamic networks in which topology may change continually and unpredictably. We prove that in both synchronous and partially synchronous systems, approximate consensus is solvable if and only if the communication graph in each round has a rooted spanning tree, i.e., there is a coordinator at each time. The striking point in this result is that the coordinator is not required to be unique and can change arbitrarily from round to round. Interestingly, the class of averaging algorithms, which are memoryless and require no process identifiers, entirely captures the solvability issue of approximate consensus in that the problem is solvable if and only if it can be solved using any averaging algorithm. Concerning the time complexity of averaging algorithms, we show that approximate consensus can be achieved with precision of $\varepsilon$ in a coordinated network model in $O(n^{n+1} \log\frac{1}{\varepsilon})$ synchronous rounds, and in $O(\Delta n^{n\Delta+1} \log\frac{1}{\varepsilon})$ rounds when the maximum round delay for a message to be delivered is $\Delta$. While in general, an upper bound on the time complexity of averaging algorithms has to be exponential, we investigate various network models in which this exponential bound in the number of nodes reduces to a polynomial bound. We apply our results to networked systems with a fixed topology and classical benign fault models, and deduce both known and new results for approximate consensus in these systems. In particular, we show that for solving approximate consensus, a complete network can tolerate up to 2n-3 arbitrarily located link faults at every round, in contrast with the impossibility result established by Santoro and Widmayer (STACS '89) showing that exact consensus is not solvable with n-1 link faults per round originating from the same node.
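To make the notion of an averaging algorithm concrete, here is a minimal sketch of one synchronous instance. It assumes equal weights over the values a process hears in a round (its own included), and it assumes a deliberately simple dynamic network: in each round a randomly chosen coordinator sends to everyone, so the graph is a star rooted at a coordinator that changes from round to round. The function names and this network model are illustrative assumptions, not taken from the paper.

```python
import random

def averaging_round(values, in_neighbors):
    """One synchronous round: each process replaces its value with the
    equal-weight average of its own value and those it received."""
    new_values = []
    for i in range(len(values)):
        heard = in_neighbors[i] | {i}  # a process always hears itself
        new_values.append(sum(values[j] for j in heard) / len(heard))
    return new_values

def rooted_star(n, root):
    """Communication graph in which only the coordinator `root` is heard by
    everyone; it is a (trivial) rooted spanning tree."""
    return [set() if i == root else {root} for i in range(n)]

def run(values, rounds, seed=0):
    """Run the averaging rule while the coordinator changes every round."""
    rng = random.Random(seed)
    n = len(values)
    for _ in range(rounds):
        root = rng.randrange(n)
        values = averaging_round(values, rooted_star(n, root))
    return values
```

The update is memoryless and uses no process identifiers beyond the set of senders heard in the current round. In this particular star model each round moves every value halfway toward the current coordinator's value, so the spread of values shrinks geometrically even though the coordinator keeps changing.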

arXiv.org e-Print Archive

Use of approximations of Hamilton-Jacobi-Bellman inequality for solving periodic optimization problems

Author: AG Bhatt
D Hernandez-Hernandez
DP Bertsekas
EJ Anderson
G Grammel
KR Parthasarathy
L Finlay
L Finlay
M Bardi
P Billingsley
R Vinter
RB Ash
TG Kurtz
V Gaitsgory
V Gaitsgory
V Gaitsgory
V Gaitsgory
WH Fleming
Publication date: 07/09/2013

We show that necessary and sufficient conditions of optimality in periodic optimization problems can be stated in terms of a solution of the corresponding HJB inequality, the latter being equivalent to a max-min type variational problem considered on the space of continuously differentiable functions. We approximate the latter with a maximin problem on a finite dimensional subspace of the space of continuously differentiable functions and show that a solution of this problem (existing under natural controllability conditions) can be used for construction of near optimal controls. We illustrate the construction with a numerical example. Comment: 29 pages, 2 figure

arXiv.org e-Print Archive

Crossref

Macquarie University ResearchOnline

Flinders Academic Commons

Separable and Low-Rank Continuous Games

Author: Asuman Ozdaglar
CE Lemke
CH Papadimitriou
D Bertsimas
D Monderer
DP Bertsekas
H Scarf
IL Glicksberg
KR Parthasarathy
M Dresher
Noah D. Stein
Pablo A. Parrilo
RD McKelvey
S Karlin
S Karlin
S Vavasis
T Başar
W Rudin
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 24/07/2007

In this paper, we study nonzero-sum separable games, which are continuous games whose payoffs take a sum-of-products form. Included in this subclass are all finite games and polynomial games. We investigate the structure of equilibria in separable games. We show that these games admit finitely supported Nash equilibria. Motivated by the bounds on the supports of mixed equilibria in two-player finite games in terms of the ranks of the payoff matrices, we define the notion of the rank of an n-player continuous game and use this to provide bounds on the cardinality of the support of equilibrium strategies. We present a general characterization theorem that states that a continuous game has finite rank if and only if it is separable. Using our rank results, we present an efficient algorithm for computing approximate equilibria of two-player separable games with fixed strategy spaces in time polynomial in the rank of the game.
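The "sum-of-products form" mentioned in this abstract can be written out as follows. This is a sketch of one common way to state it; the symbols $a_i$, $f_k^{j}$ and $m_k$ are notation assumed here for illustration and may differ from the paper's own.

```latex
% Payoff of player i in an n-player separable game, where s_k is player k's
% strategy, each f_k^{j} is a continuous function of s_k alone, and the
% coefficients a_i^{j_1 \cdots j_n} are real numbers (assumed notation).
u_i(s_1, \dots, s_n)
  = \sum_{j_1 = 1}^{m_1} \cdots \sum_{j_n = 1}^{m_n}
    a_i^{j_1 \cdots j_n} \, f_1^{j_1}(s_1) \cdots f_n^{j_n}(s_n)

% Illustrative two-player polynomial payoff (a hypothetical example):
% every term is a product of a function of s_1 and a function of s_2.
u_1(s_1, s_2) = s_1 s_2 - s_1^2
```

Taking each $f_k^{j}$ to be a monomial in $s_k$ gives polynomial games, which is consistent with the abstract's statement that polynomial games belong to this subclass.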

arXiv.org e-Print Archive

CiteSeerX

Crossref

Research Papers in Economics

Mean-Payoff Optimization in Continuous-Time Markov Chains with Parametric Alarms

Author: A Jovanovic
A Jovanović
C Haase
C Lindemann
DLP Minh
DP Bertsekas
EG Amparore
EM Hahn
H Choi
JR Norris
L Alfaro
L-M Traonouez
M Češka
ML Puterman
PJ Haas
R German
SK Jha
T Brázdil
T Brázdil
W Nelson
Publication date: 20/06/2017

Continuous-time Markov chains with alarms (ACTMCs) allow for alarm events that can be non-exponentially distributed. Within parametric ACTMCs, the parameters of alarm-event distributions are not given explicitly and can be subject to parameter synthesis. An algorithm solving the $\varepsilon$-optimal parameter synthesis problem for parametric ACTMCs with long-run average optimization objectives is presented. Our approach is based on reduction of the problem to finding long-run average optimal strategies in semi-Markov decision processes (semi-MDPs) and sufficient discretization of the parameter (i.e., action) space. Since the set of actions in the discretized semi-MDP can be very large, a straightforward approach based on explicit action-space construction fails to solve even simple instances of the problem. The presented algorithm uses an enhanced policy iteration on symbolic representations of the action space. The soundness of the algorithm is established for parametric ACTMCs with alarm-event distributions satisfying four mild assumptions that are shown to hold for uniform, Dirac and Weibull distributions in particular, but are satisfied for many other distributions as well. An experimental implementation shows that the symbolic technique substantially improves the efficiency of the synthesis algorithm and allows instances of realistic size to be solved. Comment: This article is a full version of a paper accepted to the Conference on Quantitative Evaluation of SysTems (QEST) 201

arXiv.org e-Print Archive

Crossref