4 research outputs found

    Fast reinforcement learning for decentralized MAC optimization

    In this paper, we propose a novel decentralized framework for optimizing the transmission strategy of the Irregular Repetition Slotted ALOHA (IRSA) protocol in sensor networks. We consider a hierarchical communication framework that adapts to changing network conditions and does not require centralized control. The proposed solution is inspired by the reinforcement learning literature, in particular Q-learning. To deal with sensor nodes' limited lifetime and communication range, we allow them to decide how many packet replicas to transmit considering only their own buffer state. We show that this information is sufficient and can help avoid packet collisions and improve throughput significantly. We solve the problem using the decentralized partially observable Markov Decision Process (Dec-POMDP) framework, where each node decides independently of the others how many packet replicas to transmit. We enhance the proposed Q-learning based method with the concept of virtual experience, and we prove theoretically and experimentally that convergence time is thereby significantly reduced. The experiments show that our method leads to large throughput gains, in particular when network traffic is heavy, and scales well with the size of the network. To understand how the nature of the problem affects the learning dynamics and vice versa, we investigate the waterfall effect, a severe degradation in performance above a particular traffic load that is typical of codes-on-graphs, and show that our algorithm learns to alleviate it.
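As a rough illustration of the idea in this abstract, the following minimal sketch shows a tabular Q-learning step in which a node picks a replica count from its own buffer state. It is not the authors' algorithm: the action set `ACTIONS`, the reward, the learning rate `alpha`, the discount `gamma`, and the exploration rate `epsilon` are all assumptions, and the virtual-experience enhancement is not modeled.

```python
import random
from collections import defaultdict

# Hypothetical action set: how many packet replicas a node may send in a frame.
ACTIONS = [0, 1, 2, 3]

def choose_replicas(q, buffer_state, epsilon=0.1, rng=random):
    """Epsilon-greedy choice of a replica count from the node's own buffer state."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(buffer_state, a)])

def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard tabular Q-learning update; q maps (state, action) to a value."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])

# Each node keeps its own table, so no centralized controller is required.
q = defaultdict(float)
# Assumed example: buffer held 2 packets, 1 replica was sent, reward 1.0 was observed.
q_update(q, state=2, action=1, reward=1.0, next_state=1)
```

With exploration switched off (`epsilon=0.0`), `choose_replicas(q, 2)` picks the action whose learned value is highest for that buffer state.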

    A Regret Minimization Approach to Frameless Irregular Repetition Slotted Aloha: IRSA-RM

    Wireless communications play an important part in the systems of the Internet of Things (IoT). Recently, there has been a trend towards long-range communications systems for the IoT, including cellular networks. For many use cases, such as massive machine-type communications (mMTC), performance can be gained by moving away from the classical model of connection establishment and adopting random access methods. Associated with physical layer techniques such as Successive Interference Cancellation (SIC), or Non-Orthogonal Multiple Access (NOMA), the performance of random access can be dramatically improved, giving rise to novel random access protocol designs. This article studies one of these modern random access protocols: Irregular Repetition Slotted Aloha (IRSA). Since optimizing its parameters is not an easily solved problem, in this article we use a reinforcement learning approach for that purpose. We adopt one specific variant of reinforcement learning, Regret Minimization, to learn the protocol parameters. We explain why it is selected, how to apply it to our problem with centralized learning, and finally, we provide both simulation results and insights into the learning process. The results obtained show the excellent performance of IRSA when it is optimized with Regret Minimization.
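The abstract does not say which regret-minimization rule IRSA-RM uses. As a hedged illustration only, the sketch below shows generic regret matching, a standard technique that is swapped in here as an assumption: each candidate action (for example, a repetition degree) is played with probability proportional to its positive cumulative regret. The function names and the utility values are hypothetical.

```python
def regret_matching_strategy(cum_regret):
    """Return action probabilities proportional to positive cumulative regret."""
    positive = [max(r, 0.0) for r in cum_regret]
    total = sum(positive)
    n = len(cum_regret)
    if total == 0:
        # No action has positive regret yet: fall back to a uniform strategy.
        return [1.0 / n] * n
    return [p / total for p in positive]

def update_regrets(cum_regret, utilities, played):
    """Add, for every action, how much better it would have done than the one played."""
    for action, utility in enumerate(utilities):
        cum_regret[action] += utility - utilities[played]

# Assumed example with three candidate actions and made-up utilities.
cum_regret = [0.0, 0.0, 0.0]
update_regrets(cum_regret, utilities=[1.0, 0.5, 0.0], played=1)
strategy = regret_matching_strategy(cum_regret)
```

After this single update only the first action has positive regret, so the resulting strategy puts all probability on it.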

    Q-learning Channel Access Methods for Wireless Powered Internet of Things Networks

    The Internet of Things (IoT) is becoming critical in our daily life. A key technology of interest in this thesis is Radio Frequency (RF) charging. The ability to charge devices wirelessly creates so-called RF-energy harvesting IoT networks. In particular, there is a hybrid access point (HAP) that provides energy in an on-demand manner to RF-energy harvesting devices. These devices then collect data and transmit it to the HAP. In this respect, a key issue is ensuring devices have a high number of successful transmissions. There are a number of issues to consider when scheduling the transmissions of devices in such a network. First, the channel gain to/from devices varies over time. This means the efficiency of delivering energy to devices and of transmitting the same amount of data differs over time. Second, during channel access, devices are not aware of the energy level of other devices nor whether they will transmit data. Third, devices have non-causal knowledge of their energy arrivals and channel gain information. Consequently, they do not know whether they should delay their transmissions in the hope of better channel conditions or less contention in future time slots, or whether doing so would result in energy overflow.