6,277 research outputs found

    Two Timescale Convergent Q-learning for Sleep--Scheduling in Wireless Sensor Networks

    Full text link
    In this paper, we consider an intrusion detection application for Wireless Sensor Networks (WSNs). We study the problem of scheduling the sleep times of the individual sensors to maximize the network lifetime while keeping the tracking error to a minimum. We formulate this problem as a partially-observable Markov decision process (POMDP) with continuous state-action spaces, in a manner similar to (Fuemmeler and Veeravalli [2008]). However, unlike their formulation, we consider infinite horizon discounted and average cost objectives as performance criteria. For each criterion, we propose a convergent on-policy Q-learning algorithm that operates on two timescales, while employing function approximation to handle the curse of dimensionality associated with the underlying POMDP. Our proposed algorithm incorporates a policy gradient update using a one-simulation simultaneous perturbation stochastic approximation (SPSA) estimate on the faster timescale, while the Q-value parameter (arising from a linear function approximation for the Q-values) is updated in an on-policy temporal difference (TD) algorithm-like fashion on the slower timescale. The feature selection scheme employed in each of our algorithms manages the energy and tracking components in a manner that assists the search for the optimal sleep-scheduling policy. For the sake of comparison, in both discounted and average settings, we also develop a function approximation analogue of the Q-learning algorithm. This algorithm, unlike the two-timescale variant, does not possess theoretical convergence guarantees. Finally, we also adapt our algorithms to include a stochastic iterative estimation scheme for the intruder's mobility model. Our simulation results on a 2-dimensional network setting suggest that our algorithms result in better tracking accuracy at the cost of only a few additional sensors, in comparison to a recent prior work

    Sequential Design for Ranking Response Surfaces

    Full text link
    We propose and analyze sequential design methods for the problem of ranking several response surfaces. Namely, given L2L \ge 2 response surfaces over a continuous input space X\cal X, the aim is to efficiently find the index of the minimal response across the entire X\cal X. The response surfaces are not known and have to be noisily sampled one-at-a-time. This setting is motivated by stochastic control applications and requires joint experimental design both in space and response-index dimensions. To generate sequential design heuristics we investigate stepwise uncertainty reduction approaches, as well as sampling based on posterior classification complexity. We also make connections between our continuous-input formulation and the discrete framework of pure regret in multi-armed bandits. To model the response surfaces we utilize kriging surrogates. Several numerical examples using both synthetic data and an epidemics control problem are provided to illustrate our approach and the efficacy of respective adaptive designs.Comment: 26 pages, 7 figures (updated several sections and figures

    Consensus-based Distributed Algorithm for Multisensor-Multitarget Tracking under Unknown–but–Bounded Disturbances

    Get PDF
    We consider a dynamic network of sensors that cooperate to estimate parameters of multiple targets. Each sensor can observe parameters of a few targets, reconstructing the trajectories of the remaining targets via interactions with “neighbouring” sensors. The multi-target tracking has to be provided in the face of uncertainties, which include unknown-but-bounded drift of parameters, noise in observations and distortions introduced by communication channels. To provide tracking in presence of these uncertainties, we employ a distributed algorithm, being an “offspring” of a consensus protocol and the stochastic gradient descent. The mathematical results on the algorithm’s convergence are illustrated by numerical simulations

    Non-Uniform Stochastic Average Gradient Method for Training Conditional Random Fields

    Full text link
    We apply stochastic average gradient (SAG) algorithms for training conditional random fields (CRFs). We describe a practical implementation that uses structure in the CRF gradient to reduce the memory requirement of this linearly-convergent stochastic gradient method, propose a non-uniform sampling scheme that substantially improves practical performance, and analyze the rate of convergence of the SAGA variant under non-uniform sampling. Our experimental results reveal that our method often significantly outperforms existing methods in terms of the training objective, and performs as well or better than optimally-tuned stochastic gradient methods in terms of test error.Comment: AI/Stats 2015, 24 page
    corecore