
6,277 research outputs found

Two Timescale Convergent Q-learning for Sleep-Scheduling in Wireless Sensor Networks

Author: A. Prashanth L.
Bhatnagar Shalabh
Chatterjee Abhranil
Publication date: 23/03/2014

In this paper, we consider an intrusion detection application for Wireless Sensor Networks (WSNs). We study the problem of scheduling the sleep times of the individual sensors to maximize the network lifetime while keeping the tracking error to a minimum. We formulate this problem as a partially-observable Markov decision process (POMDP) with continuous state-action spaces, in a manner similar to (Fuemmeler and Veeravalli [2008]). However, unlike their formulation, we consider infinite horizon discounted and average cost objectives as performance criteria. For each criterion, we propose a convergent on-policy Q-learning algorithm that operates on two timescales, while employing function approximation to handle the curse of dimensionality associated with the underlying POMDP. Our proposed algorithm incorporates a policy gradient update using a one-simulation simultaneous perturbation stochastic approximation (SPSA) estimate on the faster timescale, while the Q-value parameter (arising from a linear function approximation for the Q-values) is updated in an on-policy temporal difference (TD) algorithm-like fashion on the slower timescale. The feature selection scheme employed in each of our algorithms manages the energy and tracking components in a manner that assists the search for the optimal sleep-scheduling policy. For the sake of comparison, in both discounted and average settings, we also develop a function approximation analogue of the Q-learning algorithm. This algorithm, unlike the two-timescale variant, does not possess theoretical convergence guarantees. Finally, we also adapt our algorithms to include a stochastic iterative estimation scheme for the intruder's mobility model. Our simulation results on a 2-dimensional network setting suggest that our algorithms result in better tracking accuracy at the cost of only a few additional sensors, in comparison to a recent prior work.
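As a rough illustration of the one-simulation SPSA estimate mentioned in the abstract above, the following minimal sketch estimates a gradient from a single perturbed evaluation per sample. It is a generic textbook form, not the paper's algorithm: the function names, the toy linear cost, and the perturbation size are all assumptions chosen for illustration.

```python
import random


def spsa_one_sim_gradient(f, theta, delta, rng):
    """One-simulation SPSA estimate of the gradient of f at theta (generic sketch)."""
    # Rademacher perturbation: each coordinate is +1 or -1 with equal probability.
    perturbation = [rng.choice((-1.0, 1.0)) for _ in theta]
    # A single evaluation (one "simulation") at the perturbed point.
    perturbed = [t + delta * p for t, p in zip(theta, perturbation)]
    value = f(perturbed)
    # Every coordinate reuses the same scalar value, divided by its own perturbation.
    return [value / (delta * p) for p in perturbation]


def averaged_estimate(f, theta, delta, num_samples, seed=0):
    """Average many one-simulation estimates to reduce their variance."""
    rng = random.Random(seed)
    total = [0.0] * len(theta)
    for _ in range(num_samples):
        estimate = spsa_one_sim_gradient(f, theta, delta, rng)
        total = [s + e for s, e in zip(total, estimate)]
    return [s / num_samples for s in total]


# Hypothetical toy cost: a linear function whose true gradient is COEFFS.
COEFFS = [1.0, -2.0, 0.5]


def linear_cost(x):
    return sum(c * xi for c, xi in zip(COEFFS, x))


gradient_estimate = averaged_estimate(linear_cost, [0.0, 0.0, 0.0], 0.1, 20000)
```

A single estimate is very noisy, which is why such estimates are normally fed into a stochastic approximation recursion with decaying step sizes rather than used directly; the averaging here only stands in for that smoothing.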

arXiv.org e-Print Archive

Open Access Repository of IISc Research Publications

Sequential Design for Ranking Response Surfaces

Author: Hu Ruimeng
Ludkovski Mike
Publication venue: 'Society for Industrial & Applied Mathematics (SIAM)'
Publication date: 12/07/2016

We propose and analyze sequential design methods for the problem of ranking several response surfaces. Namely, given $L \ge 2$ response surfaces over a continuous input space $\mathcal{X}$, the aim is to efficiently find the index of the minimal response across the entire $\mathcal{X}$. The response surfaces are not known and have to be noisily sampled one-at-a-time. This setting is motivated by stochastic control applications and requires joint experimental design both in space and response-index dimensions. To generate sequential design heuristics we investigate stepwise uncertainty reduction approaches, as well as sampling based on posterior classification complexity. We also make connections between our continuous-input formulation and the discrete framework of pure regret in multi-armed bandits. To model the response surfaces we utilize kriging surrogates. Several numerical examples using both synthetic data and an epidemics control problem are provided to illustrate our approach and the efficacy of respective adaptive designs.

Comment: 26 pages, 7 figures (updated several sections and figures)
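One way to write the ranking target described in the abstract above, as a sketch only: the symbols $\mu_\ell$ (the $\ell$-th response surface), $\mathcal{C}$ (the index of the minimal response), $Y_\ell$ and $\epsilon_\ell$ (a noisy sample and its noise term) are assumed notation, not taken from the abstract.

\[
\mathcal{C}(x) := \arg\min_{\ell \in \{1, \dots, L\}} \mu_\ell(x), \qquad x \in \mathcal{X},
\]

where each surface can only be observed through noisy samples of the form

\[
Y_\ell(x) = \mu_\ell(x) + \epsilon_\ell(x),
\]

drawn one at a time at a chosen pair $(x, \ell)$.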

arXiv.org e-Print Archive

eScholarship - University of California

Consensus-based Distributed Algorithm for Multisensor-Multitarget Tracking under Unknown-but-Bounded Disturbances

Author: Amelina Natalia
Erofeeva Victoria
Granichin Oleg
Ivanskiy Yury
Jiang Yuming
Proskurnikov Anton
Sergeenko Anna
Publication venue: 'Elsevier BV'
Publication date: 01/01/2020

We consider a dynamic network of sensors that cooperate to estimate the parameters of multiple targets. Each sensor can observe the parameters of a few targets and reconstructs the trajectories of the remaining targets via interactions with “neighbouring” sensors. Multi-target tracking has to be provided in the face of uncertainties, which include unknown-but-bounded drift of parameters, noise in observations, and distortions introduced by communication channels. To provide tracking in the presence of these uncertainties, we employ a distributed algorithm that is an “offspring” of a consensus protocol and stochastic gradient descent. The mathematical results on the algorithm’s convergence are illustrated by numerical simulations.
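The combination of a consensus protocol with a gradient-type correction, as described in the abstract above, can be sketched as follows. This is a simplified, noise-free toy with a single static target, not the paper's algorithm: the ring topology, the gains, and all names are assumptions made for illustration.

```python
def consensus_tracking_step(estimates, neighbours, observations, obs_gain, cons_gain):
    """One synchronous update: a gradient step on local data plus a consensus step."""
    new_estimates = []
    for i, theta in enumerate(estimates):
        updated = theta
        # Gradient step on 0.5 * (theta - y)^2, only for sensors that see the target.
        if observations[i] is not None:
            updated += obs_gain * (observations[i] - theta)
        # Consensus step: move towards the neighbours' current estimates.
        updated += cons_gain * sum(estimates[j] - theta for j in neighbours[i])
        new_estimates.append(updated)
    return new_estimates


# Hypothetical setup: four sensors on a ring, only sensor 0 observes the target value 5.0.
neighbours = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
observations = [5.0, None, None, None]
estimates = [0.0, 0.0, 0.0, 0.0]

for _ in range(500):
    estimates = consensus_tracking_step(estimates, neighbours, observations, 0.5, 0.25)
```

In this toy, the sensors that never observe the target still recover its value through their neighbours, which is the role the consensus term plays; drift, observation noise, and channel distortions are deliberately left out.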

PORTO@iris (Publications Open Repository TOrino - Politecnico di Torino)

Non-Uniform Stochastic Average Gradient Method for Training Conditional Random Fields

Author: Ahmed Mohamed Osama
Babanezhad Reza
Clifton Ann
Defazio Aaron
Sarkar Anoop
Schmidt Mark
Publication date: 16/04/2015

We apply stochastic average gradient (SAG) algorithms for training conditional random fields (CRFs). We describe a practical implementation that uses structure in the CRF gradient to reduce the memory requirement of this linearly-convergent stochastic gradient method, propose a non-uniform sampling scheme that substantially improves practical performance, and analyze the rate of convergence of the SAGA variant under non-uniform sampling. Our experimental results reveal that our method often significantly outperforms existing methods in terms of the training objective, and performs as well as or better than optimally-tuned stochastic gradient methods in terms of test error.

Comment: AI/Stats 2015, 24 pages
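For readers unfamiliar with SAG, the basic update can be sketched on a toy scalar least-squares problem. This is a minimal illustration with uniform sampling, not the CRF implementation or the non-uniform scheme from the abstract above: the data, the conservative step size, and the function name are assumptions.

```python
import random


def sag_least_squares(features, targets, step_size, num_iters, seed=0):
    """Fit a scalar w minimizing the mean of 0.5 * (w * a_i - b_i)**2 with basic SAG."""
    rng = random.Random(seed)
    n = len(features)
    w = 0.0
    stored = [0.0] * n  # most recent gradient computed for each example
    grad_sum = 0.0      # running sum of the stored gradients
    for _ in range(num_iters):
        i = rng.randrange(n)  # uniform sampling over examples
        grad_i = (w * features[i] - targets[i]) * features[i]
        # Swap the stale gradient of example i for the fresh one in the running sum.
        grad_sum += grad_i - stored[i]
        stored[i] = grad_i
        # Step along the average of the stored (partly stale) gradients.
        w -= step_size * grad_sum / n
    return w


# Hypothetical data generated with true weight 3.0.
features = [1.0, 2.0, 3.0, 4.0]
targets = [3.0 * a for a in features]
lipschitz = max(a * a for a in features)  # largest per-example gradient Lipschitz constant
w_hat = sag_least_squares(features, targets, 1.0 / (16.0 * lipschitz), 2000)
```

The `stored` table holds one gradient per training example, which is the memory cost the abstract refers to reducing by exploiting structure in the CRF gradient.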

arXiv.org e-Print Archive

CiteSeerX

The Australian National University