Multitarget tracking via restless bandit marginal productivity indices and Kalman Filter in discrete time
This paper designs, evaluates, and tests a tractable priority-index policy for scheduling target updates in a discrete-time multitarget tracking model, aiming to be near optimal with respect to a discounted or average performance objective that accounts for tracking-error variance and measurement costs. The policy is to be used by a sensor system composed of M phased-array radars coordinated to track the positions of N targets moving according to independent scalar Gauss-Markov linear dynamics, which allows the Kalman Filter to be used for track estimation. The paper exploits the natural formulation of the problem as a multiarmed restless bandit problem (MARBP) with real-state projects subject to deterministic dynamics by deploying Whittle's (1988) index policy for the MARBP. The challenging issues of indexability (existence of the index) and index evaluation are resolved by applying a method recently introduced by the first author for the analysis of real-state restless bandits. Computational results are reported demonstrating the tractability of index evaluation, the substantial performance gains that Whittle's marginal productivity (MP) index policy achieves over the myopic policies advocated in previous work, and the resulting index policy's suboptimality gaps. Further, a preliminary small-scale computational study shows that the MP index policy exhibits nearly optimal behavior as the number of distinct targets grows with the number of radars per target held constant.
Keywords: Multitarget tracking, Sensor management, Phased-array radar, Radar scheduling, Scaled track-error variance (STEV), Kalman Filter, Index policy, Marginal productivity (MP) index, Real-state multiarmed restless bandit problems (MARBP)
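To make the abstract's setting concrete, the following is a minimal sketch of the scalar Kalman-filter error-variance recursion that underlies such a tracking model, together with the myopic "measure the largest predicted variance" rule that the paper compares against. All parameter values (the dynamics coefficient `a`, process-noise variance `q`, measurement-noise variance `r`) are illustrative placeholders, not values from the paper, and this is not the paper's MP index computation.

```python
# Scalar Gauss-Markov target: x_{t+1} = a*x_t + w_t, Var(w_t) = q.
# Measuring a target applies the standard scalar Riccati (Kalman) update
# to its tracking-error variance; unmeasured targets only predict forward.

def predict_variance(p, a=1.0, q=1.0):
    """Time update: error variance after one step of the dynamics."""
    return a * a * p + q

def measure_variance(p, r=1.0):
    """Measurement update: error variance after a Kalman correction
    with measurement-noise variance r."""
    return p * r / (p + r)

def step(variances, scheduled, a=1.0, q=1.0, r=1.0):
    """Advance all targets one time slot; only targets whose index is in
    `scheduled` receive a radar measurement."""
    out = []
    for i, p in enumerate(variances):
        p = predict_variance(p, a, q)
        if i in scheduled:
            p = measure_variance(p, r)
        out.append(p)
    return out

def myopic_schedule(variances, m, a=1.0, q=1.0):
    """Baseline myopic rule: measure the m targets with the largest
    one-step-ahead predicted variance."""
    predicted = [predict_variance(p, a, q) for p in variances]
    order = sorted(range(len(predicted)), key=lambda i: -predicted[i])
    return set(order[:m])
```

Under this sketch, an index policy would replace `myopic_schedule` with a rule that measures the targets of highest marginal productivity index, computed from the same variance state.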
Sensor scheduling for hunting elusive hiding targets: a restless bandit index policy
We consider a sensor scheduling model in which a set of identical sensors is used to hunt a larger set of heterogeneous targets, each located at a corresponding site. Target states change randomly over discrete time slots between "exposed" and "hidden", according to Markovian transition probabilities that depend on whether sites are searched or not, which makes the targets elusive. Sensors are imperfect, failing to detect an exposed target when searching its site with a positive misdetection probability. We formulate the problem of scheduling the sensors to search the sites, so as to maximize the expected total discounted value of rewards earned (when targets are hunted) minus search costs incurred, as a partially observable Markov decision process. Given the intractability of finding an optimal policy, we introduce a tractable heuristic search policy of priority-index type based on Whittle's index for restless bandits. Preliminary computational results are reported showing that such a policy is nearly optimal and can substantially outperform the myopic policy and other simple heuristics.
This work has been supported in part by the Spanish Ministry of Education and Science project MTM2007-63140 and by the Ministry of Science and Innovation project MTM2010-2080
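The partially observed state in this model is a belief: the posterior probability that each target is currently exposed. A minimal sketch of the two updates the abstract describes follows; the transition probabilities and the misdetection probability are illustrative assumptions, not values from the paper.

```python
# Belief update for one hide/expose target. b = P(target is exposed).

def bayes_no_detection(b, miss=0.2):
    """Posterior P(exposed) after searching the site and detecting nothing.

    An exposed target escapes detection with the misdetection probability
    `miss`; a hidden target is never detected."""
    return b * miss / (b * miss + (1.0 - b))

def transition(b, searched,
               p_expose_searched=0.1, p_expose_unsearched=0.4,
               p_stay_exposed_searched=0.3, p_stay_exposed_unsearched=0.8):
    """One-slot Markov transition of P(exposed). Elusiveness is modelled by
    making exposure less likely at a site that was just searched."""
    if searched:
        return b * p_stay_exposed_searched + (1.0 - b) * p_expose_searched
    return b * p_stay_exposed_unsearched + (1.0 - b) * p_expose_unsearched
```

A priority-index policy of the kind proposed would then rank sites by an index computed from these beliefs and search the top-ranked ones each slot.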
- …