97 research outputs found

Optimal Estimation with Limited Measurements and Noisy Communication

Author: Akyol Emrah
Basar Tamer
Gao Xiaobin
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 28/09/2015

This paper considers a sequential estimation and sensor scheduling problem with one sensor and one estimator. The sensor makes sequential observations about the state of an underlying memoryless stochastic process, and makes a decision as to whether or not to send this measurement to the estimator. The sensor and the estimator have the common objective of minimizing expected distortion in the estimation of the state of the process, over a finite time horizon, with the constraint that the sensor can transmit its observation only a limited number of times. As opposed to the prior work where communication between the sensor and the estimator was assumed to be perfect (noiseless), in this work an additive noise channel with fixed power constraint is considered; hence, the sensor has to encode its message before transmission. For some specific source and channel noise densities, we obtain the optimal encoding and estimation policies in conjunction with the optimal transmission schedule. The impact of the presence of a noisy channel is analyzed numerically based on dynamic programming. This analysis yields some rather surprising results such as a phase-transition phenomenon in the number of used transmission opportunities, which was not encountered in the noiseless communication setting. Comment: X. Gao, E. Akyol, and T. Basar. Optimal estimation with limited measurements and noisy communication. In 54th IEEE Conference on Decision and Control (CDC15), 2015, to appea

arXiv.org e-Print Archive

Crossref
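
The dynamic-programming idea behind a limited-transmission schedule can be illustrated with a small sketch. This is not the paper's noisy-channel model: it assumes the simpler noiseless baseline, a hypothetical five-point uniform source, squared-error distortion, and an estimator that guesses the source mean (zero) whenever nothing is sent. The names SOURCE and expected_distortion are made up for this illustration.

```python
from fractions import Fraction
from functools import lru_cache

# Hypothetical memoryless source: uniform on {-2, -1, 0, 1, 2} (mean zero).
SOURCE = {x: Fraction(1, 5) for x in (-2, -1, 0, 1, 2)}


def expected_distortion(horizon, budget):
    """Minimum expected total squared error over `horizon` steps when the
    sensor may transmit at most `budget` times over a noiseless channel.

    Assumptions of this sketch: a transmitted sample is recovered exactly
    (zero error); otherwise the estimator outputs 0 and the error is x**2.
    """

    @lru_cache(maxsize=None)
    def value(t, remaining):
        if t == horizon:
            return Fraction(0)
        total = Fraction(0)
        for x, prob in SOURCE.items():
            # Cost of staying silent now and keeping the budget.
            skip = x * x + value(t + 1, remaining)
            if remaining > 0:
                # Cost of sending now: no error, one fewer transmission left.
                send = value(t + 1, remaining - 1)
                total += prob * min(skip, send)
            else:
                total += prob * skip
        return total

    return value(0, budget)


print(expected_distortion(2, 1))  # expected total distortion as a Fraction
```

The comparison of the two branches inside the loop is what produces a threshold-like schedule: the sensor sends only when the current squared error exceeds the value of saving the transmission for later.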

Rollout algorithm based duty cycle control with joint optimisation of delay and energy efficiency for beacon-enabled IEEE 802.15.4 networks

Author: Chai Kok Keong
Chen Yue
Li Yun
Loo Jonathan
Publication date: 14/05/2014

Duty cycle control is applied in the IEEE 802.15.4 medium access control (MAC) protocol to reduce energy consumption. A low duty cycle improves energy efficiency but reduces the available transmission time, thereby increasing the end-to-end delay. Thus, it is challenging to achieve a good trade-off between energy efficiency and delay. In this paper, we study a duty cycle control problem with the aim of minimising the joint cost of energy consumption and end-to-end delay. By applying dynamic programming (DP), the optimal duty cycle control is derived. Furthermore, to ensure the feasibility of implementing the control on computation-limited sensor devices, a low-complexity rollout-algorithm-based duty cycle control (RADutyCon) is proposed. The joint-cost upper bound of the proposed RADutyCon is investigated. Simulation results show that RADutyCon can effectively reduce the joint cost of energy consumption and end-to-end delay under various network traffic conditions. In addition, RADutyCon achieves an exponential reduction of computation complexity compared with DP optimal control.

UWL Repository
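
As a rough illustration of the rollout idea mentioned in the abstract above, the sketch below applies one-step lookahead on top of a simple base policy. It is not RADutyCon: the queue model, the cost weights, the action set, and the known arrival sequence are all hypothetical and chosen only to keep the example small and deterministic.

```python
ENERGY_WEIGHT = 1.0   # hypothetical cost per unit of active time
DELAY_WEIGHT = 0.6    # hypothetical cost per queued packet per slot
ACTIONS = (0, 1, 2)   # hypothetical duty-cycle levels (packets served per slot)
ARRIVALS = (2, 0, 1, 3, 0, 1, 2, 0)  # assumed known packet arrivals per slot


def step(queue, action, arrival):
    """Advance the toy queue by one slot and return (next_queue, stage_cost)."""
    served = min(queue, action)
    next_queue = queue - served + arrival
    return next_queue, ENERGY_WEIGHT * action + DELAY_WEIGHT * next_queue


def base_policy(queue):
    """Base heuristic: serve one packet whenever the queue is non-empty."""
    return 1 if queue > 0 else 0


def simulate_base(queue, t):
    """Cost of following the base policy from slot t to the end of the horizon."""
    total = 0.0
    for k in range(t, len(ARRIVALS)):
        queue, cost = step(queue, base_policy(queue), ARRIVALS[k])
        total += cost
    return total


def rollout_action(queue, t):
    """Pick the action minimising stage cost plus the base policy's cost-to-go."""
    def lookahead_cost(action):
        next_queue, cost = step(queue, action, ARRIVALS[t])
        return cost + simulate_base(next_queue, t + 1)
    return min(ACTIONS, key=lookahead_cost)


def run(policy_fn, queue=0):
    """Total cost of applying policy_fn(queue, t) over the whole horizon."""
    total = 0.0
    for t in range(len(ARRIVALS)):
        queue, cost = step(queue, policy_fn(queue, t), ARRIVALS[t])
        total += cost
    return total


base_cost = run(lambda q, t: base_policy(q))
rollout_cost = run(rollout_action)
print(base_cost, rollout_cost)
```

Each rollout decision costs one base-policy simulation per action, instead of a sweep over every state as in exact DP. In this deterministic setting the rollout policy is never worse than the base policy it builds on.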

Q-Learning and Enhanced Policy Iteration in Discounted Dynamic Programming

Author: Bertsekas Dimitri P.
Yu Huizhen
Publication date: 15/06/2010

We consider the classical finite-state discounted Markovian decision problem, and we introduce a new policy iteration-like algorithm for finding the optimal Q-factors. Instead of policy evaluation by solving a linear system of equations, our algorithm requires (possibly inexact) solution of a nonlinear system of equations, involving estimates of state costs as well as Q-factors. This is Bellman's equation for an optimal stopping problem that can be solved with simple Q-learning iterations, in the case where a lookup table representation is used; it can also be solved with the Q-learning algorithm of Tsitsiklis and Van Roy [TsV99], in the case where feature-based Q-factor approximations are used. In exact/lookup table representation form, our algorithm admits asynchronous and stochastic iterative implementations, in the spirit of asynchronous/modified policy iteration, with lower overhead and more reliable convergence advantages over existing Q-learning schemes. Furthermore, for large-scale problems, where linear basis function approximations and simulation-based temporal difference implementations are used, our algorithm resolves effectively the inherent difficulties of existing schemes due to inadequate exploration.

Helsingin yliopiston digitaalinen arkisto
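
For readers unfamiliar with Q-factors in a discounted, cost-minimising Markovian decision problem, the sketch below runs plain value iteration on a lookup table of Q-factors. It is the textbook baseline, not the enhanced policy iteration algorithm described in the abstract above, and the two-state problem (P, g, and the discount factor 0.5) is hypothetical.

```python
def q_value_iteration(P, g, alpha, iters=500):
    """Value iteration on Q-factors with a lookup table.

    P[s][a] is a list of (next_state, probability) pairs,
    g[s][a] is the expected stage cost, and alpha is the discount factor.
    """
    n_states = len(P)
    Q = [[0.0 for _ in P[s]] for s in range(n_states)]
    for _ in range(iters):
        # Optimal cost estimate of each state: minimum over its Q-factors.
        J = [min(Q[s]) for s in range(n_states)]
        # Bellman update: Q(s, a) = g(s, a) + alpha * E[J(next state)].
        Q = [
            [g[s][a] + alpha * sum(p * J[t] for t, p in P[s][a])
             for a in range(len(P[s]))]
            for s in range(n_states)
        ]
    return Q


# Hypothetical problem: in each state, action 0 moves to state 0
# and action 1 moves to state 1, both deterministically.
P = [
    [[(0, 1.0)], [(1, 1.0)]],
    [[(0, 1.0)], [(1, 1.0)]],
]
g = [
    [1.0, 2.0],
    [0.0, 3.0],
]

Q = q_value_iteration(P, g, alpha=0.5)
# Greedy policy: in each state, the action with the smallest Q-factor.
policy = [min(range(len(q)), key=q.__getitem__) for q in Q]
print(Q, policy)
```

Storing Q-factors rather than state costs means the greedy policy can be read off without knowing the transition model at decision time, which is what makes simulation-based variants such as Q-learning possible.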