6,819 research outputs found

    Stability Enforced Bandit Algorithms for Channel Selection in Remote State Estimation of Gauss-Markov Processes

    Full text link
    In this paper we consider the problem of remote state estimation of a Gauss-Markov process, where a sensor can, at each discrete time instant, transmit on one out of M different communication channels. A key difficulty of the situation at hand is that the channel statistics are unknown. We study the case where both learning of the channel reception probabilities and state estimation is carried out simultaneously. Methods for choosing the channels based on techniques for multi-armed bandits are presented, and shown to provide stability. Furthermore, we define the performance notion of estimation regret, and derive bounds on how it scales with time for the considered algorithms.Comment: to appear in IEEE Transactions on Automatic Contro

    Structure-Enhanced DRL for Optimal Transmission Scheduling

    Full text link
    Remote state estimation of large-scale distributed dynamic processes plays an important role in Industry 4.0 applications. In this paper, we focus on the transmission scheduling problem of a remote estimation system. First, we derive some structural properties of the optimal sensor scheduling policy over fading channels. Then, building on these theoretical guidelines, we develop a structure-enhanced deep reinforcement learning (DRL) framework for optimal scheduling of the system to achieve the minimum overall estimation mean-square error (MSE). In particular, we propose a structure-enhanced action selection method, which tends to select actions that obey the policy structure. This explores the action space more effectively and enhances the learning efficiency of DRL agents. Furthermore, we introduce a structure-enhanced loss function to add penalties to actions that do not follow the policy structure. The new loss function guides the DRL to converge to the optimal policy structure quickly. Our numerical experiments illustrate that the proposed structure-enhanced DRL algorithms can save the training time by 50% and reduce the remote estimation MSE by 10% to 25% when compared to benchmark DRL algorithms. In addition, we show that the derived structural properties exist in a wide range of dynamic scheduling problems that go beyond remote state estimation.Comment: Paper submitted to IEEE. Copyright may be transferred without notice, after which this version may no longer be accessible. arXiv admin note: substantial text overlap with arXiv:2211.1082

    Energy Sharing for Multiple Sensor Nodes with Finite Buffers

    Full text link
    We consider the problem of finding optimal energy sharing policies that maximize the network performance of a system comprising of multiple sensor nodes and a single energy harvesting (EH) source. Sensor nodes periodically sense the random field and generate data, which is stored in the corresponding data queues. The EH source harnesses energy from ambient energy sources and the generated energy is stored in an energy buffer. Sensor nodes receive energy for data transmission from the EH source. The EH source has to efficiently share the stored energy among the nodes in order to minimize the long-run average delay in data transmission. We formulate the problem of energy sharing between the nodes in the framework of average cost infinite-horizon Markov decision processes (MDPs). We develop efficient energy sharing algorithms, namely Q-learning algorithm with exploration mechanisms based on the ϵ\epsilon-greedy method as well as upper confidence bound (UCB). We extend these algorithms by incorporating state and action space aggregation to tackle state-action space explosion in the MDP. We also develop a cross entropy based method that incorporates policy parameterization in order to find near optimal energy sharing policies. Through simulations, we show that our algorithms yield energy sharing policies that outperform the heuristic greedy method.Comment: 38 pages, 10 figure

    Thirty Years of Machine Learning: The Road to Pareto-Optimal Wireless Networks

    Full text link
    Future wireless networks have a substantial potential in terms of supporting a broad range of complex compelling applications both in military and civilian fields, where the users are able to enjoy high-rate, low-latency, low-cost and reliable information services. Achieving this ambitious goal requires new radio techniques for adaptive learning and intelligent decision making because of the complex heterogeneous nature of the network structures and wireless services. Machine learning (ML) algorithms have great success in supporting big data analytics, efficient parameter estimation and interactive decision making. Hence, in this article, we review the thirty-year history of ML by elaborating on supervised learning, unsupervised learning, reinforcement learning and deep learning. Furthermore, we investigate their employment in the compelling applications of wireless networks, including heterogeneous networks (HetNets), cognitive radios (CR), Internet of things (IoT), machine to machine networks (M2M), and so on. This article aims for assisting the readers in clarifying the motivation and methodology of the various ML algorithms, so as to invoke them for hitherto unexplored services as well as scenarios of future wireless networks.Comment: 46 pages, 22 fig

    Distributed Channel Access for Control Over Unknown Memoryless Communication Channels

    Get PDF
    We consider the distributed channel access problem for a system consisting of multiple control subsystems that close their loop over a shared wireless network. We propose a distributed method for providing deterministic channel access without requiring explicit information exchange between the subsystems. This is achieved by utilizing timers for prioritizing channel access with respect to a local cost which we derive by transforming the control objective cost to a form that allows its local computation. This property is then exploited for developing our distributed deterministic channel access scheme. A framework to verify the stability of the system under the resulting scheme is then proposed. Next, we consider a practical scenario in which the channel statistics are unknown. We propose learning algorithms for learning the parameters of imperfect communication links for estimating the channel quality and, hence, define the local cost as a function of this estimation and control performance. We establish that our learning approach results in collision-free channel access. The behavior of the overall system is exemplified via a proof-of-concept illustrative example, and the efficacy of this mechanism is evaluated for large-scale networks via simulations.Comment: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessibl
    corecore