
464 research outputs found

$QD$-Learning: A Collaborative Distributed Strategy for Multi-Agent Reinforcement Learning Through Consensus + Innovations

Author: Kar Soummya
Moura José M. F.
Poor H. Vincent
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/04/2012

The paper considers a class of multi-agent Markov decision processes (MDPs), in which the network agents respond differently (as manifested by the instantaneous one-stage random costs) to a global controlled state and the control actions of a remote controller. The paper investigates a distributed reinforcement learning setup with no prior information on the global state transition and local agent cost statistics. Specifically, with the agents' objective consisting of minimizing a network-averaged infinite horizon discounted cost, the paper proposes a distributed version of $Q$-learning, $\mathcal{QD}$-learning, in which the network agents collaborate by means of local processing and mutual information exchange over a sparse (possibly stochastic) communication network to achieve the network goal. Under the assumption that each agent is only aware of its local online cost data and the inter-agent communication network is weakly connected, the proposed distributed scheme is shown to yield, almost surely (a.s.) and asymptotically, the desired value function and the optimal stationary control policy at each network agent. The analytical techniques developed in the paper to address the mixed time-scale stochastic dynamics of the consensus + innovations form, which arise as a result of the proposed interactive distributed scheme, are of independent interest.

Comment: Submitted to the IEEE Transactions on Signal Processing, 33 pages
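The consensus + innovations idea in the abstract above can be illustrated with a minimal sketch. It is not taken from the paper: the function name `qd_sweep`, the table layout `Q[i][x][u]`, and the fixed weights `alpha` and `beta` are assumptions made for illustration. Each agent pulls its estimate toward its neighbors' estimates (consensus) and corrects it with a temporal-difference term built from its own local cost (innovation).

```python
def qd_sweep(Q, neighbors, x, u, costs, x_next, alpha, beta, gamma):
    """Synchronously update entry (x, u) of every agent's Q-table.

    Q[i][s][a] is agent i's estimate; costs[i] is the cost agent i observed.
    alpha weights the innovation term, beta the consensus term (both assumed).
    """
    new_vals = []
    for i in range(len(Q)):
        q = Q[i][x][u]
        # Consensus: disagreement with neighbors on the same table entry.
        consensus = sum(q - Q[j][x][u] for j in neighbors[i])
        # Innovation: local TD error; costs are minimized, hence the min.
        innovation = costs[i] + gamma * min(Q[i][x_next]) - q
        new_vals.append(q - beta * consensus + alpha * innovation)
    # Write back only after all agents have used the old values.
    for i, v in enumerate(new_vals):
        Q[i][x][u] = v
    return Q


# Two agents, two states, two actions; the agents start out disagreeing.
Q = [[[1.0, 2.0], [0.0, 3.0]], [[3.0, 2.0], [0.0, 3.0]]]
neighbors = {0: [1], 1: [0]}
qd_sweep(Q, neighbors, x=0, u=0, costs=[1.0, 2.0], x_next=1,
         alpha=0.5, beta=0.25, gamma=0.5)
print(Q[0][0][0], Q[1][0][0])  # 1.5 2.0
```

In this toy call the two estimates of entry (0, 0) move from 1.0 and 3.0 to 1.5 and 2.0, so the agents drift toward agreement while each still reacts to its own cost. A full scheme would use time-varying weight sequences rather than the constants assumed here.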

arXiv.org e-Print Archive

CiteSeerX

Crossref

FigShare

Exploiting Chordality in Optimization Algorithms for Model Predictive Control

Author: A Alessio
A Hansson
D Axehill
D Koller
E. Arnold
J Gondzio
JL Jerez
L Vandenberghe
M Diehl
M Åkerblad
Marc C. Steinbach
S Khoshfetrat Pakazad
SJ Qin
SJ Wright
T Cormen
V Gopal
Y Wang
Publication date: 28/11/2017

In this chapter we show that chordal structure can be used to devise efficient optimization methods for many common model predictive control problems. The chordal structure is used both for computing search directions efficiently and for distributing all the other computations in an interior-point method for solving the problem. The chordal structure can stem from the sequential nature of the problem as well as from distributed formulations of the problem related to scenario trees or other formulations. The framework enables efficient parallel computations.

Comment: arXiv admin note: text overlap with arXiv:1502.0638

arXiv.org e-Print Archive

Crossref

Indefinite metric spaces in estimation, control and adaptive filtering

Author: Hassibi Babak
Publication date: 01/08/1996

The goal of this thesis is two-fold: first, to present a unified mathematical framework (based upon optimization in indefinite metric spaces) for a wide range of problems in estimation and control, and second, to motivate and introduce the problem of robust estimation and control, and to study its implications for the area of adaptive signal processing. Robust estimation (and control) is concerned with the design of estimators (and controllers) that have acceptable performance in the face of model uncertainties and lack of statistical information, and can be considered an outgrowth and extension of (the now classical) LQG theory, developed in the 1950s and 1960s, which assumed perfect models and complete statistical knowledge. It has particular significance in adaptive signal processing, where one needs to cope with time-variations of system parameters and to compensate for lack of a priori knowledge of the statistics of the input data and disturbances. One method of addressing the above problem is the so-called H∞ approach, which was introduced by G. Zames in 1980 and has recently been solved by various authors. Despite the "fundamental differences" between the philosophies of the H∞ and LQG approaches to control and estimation, there are striking "formal similarities" between the controllers and estimators obtained from these two methodologies. In an attempt to explain these similarities, we shall describe a new approach to H∞ estimation (and control), different from the existing (e.g., interpolation-theoretic-based, game-theoretic-based, etc.) approaches, that is based upon setting up estimation (and control) problems not in the usual Hilbert space of random variables, but in an indefinite (so-called Krein) space

Caltech Authors

Learning Pose Estimation for UAV Autonomous Navigation and Landing Using Visual-Inertial Sensor Data

Author: bateux
byravan
clark
eigen
engel
han
hongtao
karami
kingma
lecun
mourikis
shah
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/07/2020

In this work, we propose a robust network-in-the-loop control system for autonomous navigation and landing of an unmanned aerial vehicle (UAV). To estimate the UAV's absolute pose, we develop a deep neural network (DNN) architecture for visual-inertial odometry, which provides a robust alternative to traditional methods. We first evaluate the accuracy of the estimation by comparing the prediction of our model to traditional visual-inertial approaches on the publicly available EuRoC MAV dataset. The results indicate a clear improvement in the accuracy of the pose estimation, of up to 25% over the baseline. Finally, we integrate the data-driven estimator in the closed-loop flight control system of AirSim, a simulator available as a plugin for Unreal Engine, and we provide simulation results for autonomous navigation and landing

Crossref

Caltech Authors

Distributed Constrained Recursive Nonlinear Least-Squares Estimation: Algorithms and Asymptotics

Author: Kar Soummya
Moura José M. F.
Poor H. Vincent
Sahu Anit Kumar
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 19/10/2016

This paper focuses on the problem of recursive nonlinear least squares parameter estimation in multi-agent networks, in which the individual agents observe sequentially over time an independent and identically distributed (i.i.d.) time-series consisting of a nonlinear function of the true but unknown parameter corrupted by noise. A distributed recursive estimator of the consensus + innovations type, namely $\mathcal{CIWNLS}$, is proposed, in which the agents update their parameter estimates at each observation sampling epoch in a collaborative way by simultaneously processing the latest locally sensed information (innovations) and the parameter estimates from other agents (consensus) in the local neighborhood conforming to a pre-specified inter-agent communication topology. Under rather weak conditions on the connectivity of the inter-agent communication and a global observability criterion, it is shown that at every network agent, the proposed algorithm leads to consistent parameter estimates. Furthermore, under standard smoothness assumptions on the local observation functions, the distributed estimator is shown to yield order-optimal convergence rates, i.e., as far as the order of pathwise convergence is concerned, the local parameter estimates at each agent are as good as the optimal centralized nonlinear least squares estimator, which would require access to all the observations across all the agents at all times. In order to benchmark the performance of the proposed distributed $\mathcal{CIWNLS}$ estimator against that of the centralized nonlinear least squares estimator, the asymptotic normality of the estimate sequence is established and the asymptotic covariance of the distributed estimator is evaluated. Finally, simulation results are presented which illustrate and verify the analytical findings.

Comment: 28 pages. Initial Submission: Feb. 2016, Revised: July 2016, Accepted: September 2016, To appear in IEEE Transactions on Signal and Information Processing over Networks: Special Issue on Inference and Learning over Networks

arXiv.org e-Print Archive

Princeton University Open Access Repository

Multitask Diffusion Adaptation over Networks

Author: Chen Jie
Richard Cédric
Sayed Ali H.
Publication date: 01/01/2013

Adaptive networks are suitable for decentralized inference tasks, e.g., to monitor complex natural phenomena. Recent research works have intensively studied distributed optimization problems in the case where the nodes have to estimate a single optimum parameter vector collaboratively. However, there are many important applications that are multitask-oriented in the sense that there are multiple optimum parameter vectors to be inferred simultaneously, in a collaborative manner, over the area covered by the network. In this paper, we employ diffusion strategies to develop distributed algorithms that address multitask problems by minimizing an appropriate mean-square error criterion with $\ell_2$-regularization. The stability and convergence of the algorithm in the mean and in the mean-square sense are analyzed. Simulations are conducted to verify the theoretical findings, and to illustrate how the distributed strategy can be used in several useful applications related to spectral sensing, target localization, and hyperspectral data unmixing.

Comment: 29 pages, 11 figures, submitted for publication

arXiv.org e-Print Archive

CiteSeerX

Sequential and adaptive Bayesian computation for inference and optimization

Author: Akyildiz Omer Deniz
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 11/03/2019

With the advent of cheap and ubiquitous measurement devices, today more data is measured, recorded, and archived in a relatively short span of time than all data recorded throughout history. Moreover, advances in computation have made it possible to model much more complicated phenomena and to use the vast amounts of data to calibrate the resulting high-dimensional models. In this thesis, we are interested in two fundamental problems which are repeatedly being faced in practice as the dimensions of models and datasets grow steadily: the problem of inference in high-dimensional models and the problem of optimization when the number of data points is very large. The inference problem gets difficult when the model one wants to calibrate and estimate is defined in a high-dimensional space. The behavior of computational algorithms in high-dimensional spaces is complicated and defies intuition. Computational methods which work accurately for inferring low-dimensional models, for example, may fail to deliver the same performance on high-dimensional models. In recent years, due to the significant interest in high-dimensional models, there has been a plethora of work in signal processing and machine learning to develop computational methods which are robust in high-dimensional spaces. In particular, the high-dimensional stochastic filtering problem has attracted significant attention as it arises in multiple fields which are of crucial importance, such as geophysics, aerospace, and control. A class of algorithms called particle filters has received attention and become a fruitful field of research because of their accuracy and robustness in low-dimensional systems. In short, these methods keep a cloud of particles (samples in a state space), which describe the empirical probability distribution over the state variable of interest.
The particle filters use a model of the phenomenon of interest to propagate and predict the future states and use an observation model to assimilate the observations to correct the state estimates. The most common particle filter, called the bootstrap particle filter (BPF), consists of an iterative sampling-weighting-resampling scheme. However, BPFs also largely fail at inferring high-dimensional dynamical systems for a number of reasons. In this work, we propose a novel particle filter, named the nudged particle filter (NuPF), which specifically aims at improving the performance of particle filters in high-dimensional systems. The algorithm relies on the idea of nudging, which has been widely used in the geophysics literature to tackle high-dimensional inference problems. In particular, in addition to the standard sampling-weighting-resampling steps of the particle filter, we define a general nudging step based on the gradient of the likelihoods, which generalizes some of the nudging schemes proposed in the literature. This step is based on modifying the particles, generated in the sampling step, using the gradients of the likelihoods. In particular, the nudging step moves a fraction of the particles to the regions where they have high likelihoods. This scheme results in significantly improved behavior in high-dimensional models. The resulting NuPF is able to track high-dimensional systems successfully. Unlike the nudging schemes proposed in the literature, the NuPF does not rely on Gaussianity assumptions and can be defined for a general likelihood. We analytically prove that, because we only move a fraction of the particles and not all of them, the algorithm has a convergence rate that matches standard Monte Carlo algorithms. More precisely, the NuPF has the same asymptotic convergence guarantees as the bootstrap particle filter. As a byproduct, we also show that the nudging step improves the robustness of the particle filter against model misspecification.
In particular, model misspecification occurs when the true data-generating system and the model posed by the user of the algorithm differ significantly. In this case, a majority of computational inference methods fail due to the discrepancy between the modeling assumptions and the observed data. We show that the nudging step increases the robustness of particle filters against model misspecification. Specifically, we prove that the NuPF generates particle systems which have provably higher marginal likelihoods compared to the standard bootstrap particle filter. This theoretical result is attained by showing that the NuPF can be interpreted as a bootstrap particle filter for a modified state-space model. Finally, we demonstrate the empirical behavior of the NuPF with several examples. In particular, we show results on high-dimensional linear state-space models, a misspecified Lorenz 63 model, a high-dimensional Lorenz 96 model, and a misspecified object tracking model. In all examples, the NuPF infers the states successfully. The second problem, the so-called scalability problem in optimization, occurs because of the large number of data points in modern datasets. With the increasing abundance of data, many problems in signal processing, statistical inference, and machine learning turn into large-scale optimization problems. For example, in signal processing, one might be interested in estimating a sparse signal given a large number of corrupted observations. Similarly, maximum-likelihood inference problems in statistics result in large-scale optimization problems. Another significant application domain is machine learning, where all important training methods are defined as optimization problems. To tackle these problems, computational optimization methods developed over the past decades are inefficient since they need to compute function evaluations or gradients over all the data for a single iteration.
For this reason, a class of optimization methods, termed stochastic optimization methods, has emerged. The algorithms of this class are designed to tackle problems which are defined over a large number of data points. In short, these methods utilize a subsample of the dataset in order to update the parameter estimate and do so iteratively until some convergence criterion is met. However, there is a major difficulty that has to be addressed: although the convergence theory for these algorithms is understood, they can have unstable behavior in practice. In particular, the most commonly used stochastic optimization method, namely stochastic gradient descent, can diverge easily if its step-size is poorly set. Over the years, practitioners have developed a number of rules of thumb to alleviate stability issues. We argue in this thesis that one way to develop robust stochastic optimization methods is to frame them as inference methods. In particular, we show that stochastic optimization schemes can be recast and understood as inference algorithms. Framing the problem as an inference problem opens the way to compare these methods to the optimal inference algorithms and understand why they might be failing or producing unstable behavior. In this vein, we show that there is an intrinsic relationship between a class of stochastic optimization methods, called incremental proximal methods, and Kalman (and extended Kalman) filters. The filtering approach to stochastic optimization results in an automatic calibration of the step-size, which removes the instability problems depending on the step-sizes. The probabilistic interpretation of stochastic optimization problems also paves the way to develop new optimization methods based on strategies which are popular in the inference literature. In particular, one can use a set of sampling methods in order to solve the inference problem and hence obtain the global minimum.
In this manner, we propose a parallel sequential Monte Carlo optimizer (PSMCO), which aims to solve stochastic optimization problems. The PSMCO is designed as a zeroth-order method which does not use gradients. It only uses subsets of the data points in order to move at each iteration. The PSMCO obtains an estimate of a global minimum at each iteration by utilizing a cheap kernel density estimator. We prove that the resulting estimator converges to a global minimum almost surely as the number of Monte Carlo samples tends to infinity. We also empirically demonstrate that the algorithm is able to reconstruct multiple global minima and solve difficult global optimization problems. By further exploiting the relationship between inference and optimization, we also propose a probabilistic and online matrix factorization method, termed the dictionary filter, to solve large-scale matrix factorization problems. Matrix factorization methods have received significant interest from the machine learning community due to their expressive representations of high-dimensional data and the interpretability of their estimates. As the majority of the matrix factorization methods are defined as optimization problems, they suffer from the same issues as stochastic optimization methods. In particular, when using stochastic gradient descent, one might need to go through many rounds of trial and error before settling on a step-size. To alleviate these problems, we introduce a matrix-variate probabilistic model for which inference results in a matrix factorization scheme. The scheme is online, in the sense that it only uses a single data point at a time to update the factors. The algorithm is related to optimization schemes, namely to the incremental proximal method defined over a matrix-variate cost function.
Using the intuition we developed for the optimization-inference relationship, we devise a model which results in similar update rules for matrix factorization as for the incremental proximal method. However, the probabilistic updates are more stable and efficient. Moreover, the algorithm does not have a step-size parameter to tune, as its role is played by the posterior covariance matrix. We demonstrate the utility of the algorithm on a missing data problem and a video processing problem. We show that the algorithm can be successfully used in machine learning problems and that several promising extensions of the method can be constructed easily.

Official Doctoral Program in Multimedia and Communications. President: Ricardo Cao Abad. Secretary: Michael Peter Wiper. Member: Nicholas Paul Whitele
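The sampling, nudging, weighting, and resampling steps described in the abstract above can be illustrated with a minimal sketch. It is not the thesis's implementation: the scalar linear-Gaussian model, the parameter defaults, and the names `nudge` and `nudged_pf` are assumptions chosen to keep the example short. A random fraction of the particles takes one gradient step up the log-likelihood, and the weights are then computed exactly as in a bootstrap particle filter.

```python
import math
import random


def nudge(particles, y, step, sigma_y, indices):
    """Move the selected particles one gradient step up the Gaussian log-likelihood."""
    out = list(particles)
    for i in indices:
        # Gradient of log N(y; x, sigma_y^2) with respect to x is (y - x) / sigma_y^2.
        out[i] += step * (y - out[i]) / sigma_y ** 2
    return out


def nudged_pf(ys, n=200, a=0.9, sigma_x=1.0, sigma_y=0.5,
              frac=0.2, step=0.1, seed=0):
    """Filter observations ys of the assumed model x_t = a*x_{t-1} + noise, y_t = x_t + noise."""
    rng = random.Random(seed)
    particles = [rng.gauss(0.0, 1.0) for _ in range(n)]
    means = []
    for y in ys:
        # Sampling: propagate every particle through the state model.
        particles = [a * p + rng.gauss(0.0, sigma_x) for p in particles]
        # Nudging: only a random fraction of the particles is moved.
        chosen = rng.sample(range(n), int(frac * n))
        particles = nudge(particles, y, step, sigma_y, chosen)
        # Weighting: plain likelihood weights, as in the bootstrap filter.
        logw = [-0.5 * ((y - p) / sigma_y) ** 2 for p in particles]
        top = max(logw)
        w = [math.exp(l - top) for l in logw]
        total = sum(w)
        w = [wi / total for wi in w]
        means.append(sum(wi * p for wi, p in zip(w, particles)))
        # Resampling: multinomial draw according to the weights.
        particles = rng.choices(particles, weights=w, k=n)
    return means


estimates = nudged_pf([5.0] * 20)
print(len(estimates))  # 20
```

Setting `frac=0.0` in this sketch skips the nudging and leaves a plain bootstrap particle filter, which makes it easy to compare the two on the same observations.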

Universidad Carlos III de Madrid e-Archivo