    Optimal Scheduling Policy for Minimizing Age of Information with a Relay

    We consider an IoT sensor network in which multiple sensors are connected to their corresponding destination nodes via a relay: the relay schedules sensors to sample and destination nodes to update, and it can select multiple sensors and destination nodes in each time slot. To minimize the average weighted sum age of information (AoI), we investigate the joint optimization of the relay's sampling and updating policies. For the error-free, symmetric case where all weights are equal, we find necessary and sufficient conditions for optimality. Using this result, we obtain the minimum average sum AoI in closed form, which can be interpreted as a fundamental limit on the sum AoI in a single-relay network. For the error-prone, symmetric case, we prove that a greedy policy achieves the minimum average sum AoI at the destination nodes. For the general case, we propose a scheduling policy obtained via reinforcement learning.

    Comment: 30 pages
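    For intuition, here is a minimal Python simulation of a greedy policy of this kind in a single-relay network. It is a sketch under our own assumptions: the function name, parameters, and the AoI-update convention (ages increase by one per slot; a successful sample resets the relay-side age to 1; a delivered packet gives the destination the relay's age plus one) are illustrative, not the paper's exact model.

```python
import random

def greedy_aoi_relay(num_nodes=4, sample_slots=2, update_slots=2,
                     success_prob=0.8, horizon=100_000, seed=0):
    """Illustrative greedy AoI scheduler for a single-relay network.

    All names and conventions here are assumptions for illustration,
    not the paper's exact system model.
    """
    rng = random.Random(seed)
    aoi_relay = [1] * num_nodes  # age of each source's freshest sample at the relay
    aoi_dest = [1] * num_nodes   # age of each source's data at its destination
    total = 0
    for _ in range(horizon):
        # Greedy sampling: pick the sensors whose relay-side samples are oldest.
        sampled = sorted(range(num_nodes), key=lambda i: aoi_relay[i],
                         reverse=True)[:sample_slots]
        # Greedy updating: pick destinations with the largest age reduction
        # if the relay's current sample were delivered.
        updated = sorted(range(num_nodes),
                         key=lambda i: aoi_dest[i] - aoi_relay[i],
                         reverse=True)[:update_slots]
        # Everything ages by one slot; successful deliveries then reset ages.
        aoi_relay_next = [a + 1 for a in aoi_relay]
        aoi_dest_next = [a + 1 for a in aoi_dest]
        for i in sampled:
            if rng.random() < success_prob:  # sensor -> relay link succeeds
                aoi_relay_next[i] = 1
        for i in updated:
            if rng.random() < success_prob:  # relay -> destination link succeeds
                aoi_dest_next[i] = aoi_relay[i] + 1
        aoi_relay, aoi_dest = aoi_relay_next, aoi_dest_next
        total += sum(aoi_dest)
    return total / horizon

if __name__ == "__main__":
    print(f"average sum AoI at destinations ~ {greedy_aoi_relay():.2f}")
```

    In the error-free symmetric case (success_prob = 1), the greedy choice of the oldest sources is exactly where the abstract's closed-form fundamental limit applies.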

    Minimizing the AoI in Resource-Constrained Multi-Source Relaying Systems: Dynamic and Learning-based Scheduling

    We consider a multi-source relaying system in which independent sources randomly generate status update packets that are sent to the destination, with the aid of a relay, over unreliable links. We develop transmission scheduling policies that minimize the sum average age of information (AoI) subject to transmission capacity and long-run average resource constraints. We formulate a stochastic control optimization problem and propose two solution approaches: a constrained Markov decision process (CMDP) formulation and a drift-plus-penalty method. The CMDP problem is solved by transforming it into an MDP using Lagrangian relaxation. We theoretically analyze the structure of optimal policies for the MDP and then propose a structure-aware algorithm that returns a practical near-optimal policy. Using the drift-plus-penalty method, we devise a dynamic, near-optimal, low-complexity policy. We also develop a model-free deep reinforcement learning policy that does not require full knowledge of the system statistics; to this end, we employ Lyapunov optimization theory and a dueling double deep Q-network. Simulation results assess the performance of our policies and validate the theoretical results, showing up to a 91% performance improvement over a baseline policy.

    Comment: 30 pages; preliminary results of this paper were presented at IEEE Globecom 2021, https://ieeexplore.ieee.org/document/968594
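    The drift-plus-penalty idea can be sketched compactly: a virtual queue prices the long-run resource constraint, and each slot the scheduler serves the sources whose weighted expected AoI reduction exceeds that price. The sketch below is our own minimal rendering under assumed parameters (a per-slot capacity, an average transmissions-per-slot budget, i.i.d. link successes), not the paper's algorithm.

```python
import random

def drift_plus_penalty_scheduler(num_sources=3, capacity=1, budget=0.4,
                                 success_prob=0.7, V=100.0,
                                 horizon=100_000, seed=0):
    """Illustrative drift-plus-penalty AoI scheduler.

    Minimizes sum AoI subject to an average transmissions-per-slot budget;
    all names and parameters are assumptions for illustration.
    """
    rng = random.Random(seed)
    aoi = [1] * num_sources  # destination-side AoI of each source
    q = 0.0                  # virtual queue enforcing the resource constraint
    aoi_sum = 0.0
    used = 0
    for _ in range(horizon):
        # Expected one-slot AoI reduction from serving source i is p * aoi[i];
        # serve i only if the weighted gain V * p * aoi[i] exceeds the queue price.
        gains = sorted(((success_prob * aoi[i], i) for i in range(num_sources)),
                       reverse=True)
        chosen = [i for g, i in gains[:capacity] if V * g > q]
        # Age every source by one slot, then reset successful deliveries.
        aoi = [a + 1 for a in aoi]
        for i in chosen:
            if rng.random() < success_prob:
                aoi[i] = 1
        # Virtual-queue update: grows with resource use, drains with the budget.
        q = max(q + len(chosen) - budget, 0.0)
        used += len(chosen)
        aoi_sum += sum(aoi)
    return aoi_sum / horizon, used / horizon

if __name__ == "__main__":
    avg_aoi, avg_use = drift_plus_penalty_scheduler()
    print(f"avg sum AoI ~ {avg_aoi:.2f}, avg transmissions/slot ~ {avg_use:.2f}")
```

    Larger V weights AoI more heavily relative to the constraint, trading slower constraint convergence for lower average age, which is the standard drift-plus-penalty tradeoff.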