
    Towards Energy Efficient LPWANs through Learning-based Multi-hop Routing

    Full text link
    Low-power wide area networks (LPWANs) have been identified as one of the top emerging wireless technologies due to their autonomy and wide range of applications. Yet, the limited energy resources of battery-powered sensor nodes are a top constraint, especially in single-hop topologies, where nodes located far from the base station must conduct uplink (UL) communications at high power levels. On this point, multi-hop routing in the UL is starting to gain attention due to its capability of reducing energy consumption by enabling transmissions to closer hops. Nonetheless, identifying energy-efficient multi-hop routes a priori is not trivial due to the unpredictable factors affecting the communication links in large LPWAN areas. In this paper, we propose epsilon multi-hop (EMH), a simple reinforcement learning (RL) algorithm based on epsilon-greedy, to enable reliable and low-consumption LPWAN multi-hop topologies. Results from a real testbed show that multi-hop topologies based on EMH achieve significant energy savings with respect to the default single-hop approach, savings that become more pronounced as the network operation progresses.
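    As a rough illustration of the epsilon-greedy hop selection the abstract describes, the sketch below lets a node keep a per-hop reward estimate (here taken as the negative energy spent per delivered packet) and choose its next uplink hop epsilon-greedily. The class, candidate-hop names, and reward model are illustrative assumptions, not the paper's exact EMH design.

```python
import random


class EpsilonGreedyHopSelector:
    """Epsilon-greedy selection of the next hop for uplink transmissions.

    Each candidate hop (a neighbouring relay or the base station) keeps a
    running estimate of the reward observed when routing through it.
    """

    def __init__(self, candidate_hops, epsilon=0.1):
        self.epsilon = epsilon
        self.q = {hop: 0.0 for hop in candidate_hops}      # reward estimates
        self.counts = {hop: 0 for hop in candidate_hops}   # times each hop was used

    def select_hop(self):
        # Explore with probability epsilon, otherwise exploit the best estimate.
        if random.random() < self.epsilon:
            return random.choice(list(self.q))
        return max(self.q, key=self.q.get)

    def update(self, hop, reward):
        # Incremental sample-average update of the reward estimate.
        self.counts[hop] += 1
        self.q[hop] += (reward - self.q[hop]) / self.counts[hop]


if __name__ == "__main__":
    node = EpsilonGreedyHopSelector(["base_station", "relay_A", "relay_B"], epsilon=0.2)
    for _ in range(1000):
        hop = node.select_hop()
        # Toy reward model (assumed): closer relays cost less energy per delivered packet.
        energy_cost = {"base_station": 1.0, "relay_A": 0.4, "relay_B": 0.6}[hop]
        node.update(hop, -energy_cost + random.gauss(0, 0.05))
    print(node.q)  # the lowest-energy hop should end up with the highest estimate
```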

    Implications of decentralized Q-learning resource allocation in wireless networks

    No full text
    Paper presented at the 2017 IEEE 28th Annual International Symposium on Personal, Indoor, and Mobile Radio Communications (PIMRC), held 8-13 October 2017 in Montreal, Canada. Reinforcement Learning is gaining attention from the wireless networking community due to its potential to learn well-performing configurations from observed results alone. In this work we propose a stateless variation of Q-learning, which we apply to exploit spatial reuse in a wireless network. In particular, we allow networks to modify both their transmission power and the channel used, based solely on the experienced throughput. We concentrate on a completely decentralized scenario in which no information about neighbouring nodes is available to the learners. Our results show that although the algorithm is able to find the best-performing actions to enhance aggregate throughput, there is high variability in the throughput experienced by the individual networks. We identify the cause of this variability as the adversarial nature of our setup, in which the most played actions provide intermittently good or poor performance depending on the neighbouring decisions. We also evaluate the effect of the intrinsic learning parameters of the algorithm on this variability. This work has been partially supported by the Spanish Ministry of Economy and Competitiveness under the Maria de Maeztu Units of Excellence Programme (MDM-2015-0502), and by the European Regional Development Fund under grant TEC2015-71303-R (MINECO/FEDER).
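    The stateless Q-learning the abstract outlines collapses the Q-table to one value per joint (channel, transmit power) action, updated directly from the experienced throughput. A minimal sketch under those assumptions follows; the epsilon-greedy exploration, parameter values, and toy throughput model are assumptions made for illustration, not the paper's implementation.

```python
import random


class StatelessQLearner:
    """Stateless Q-learning over joint (channel, transmit power) actions.

    With no state, the update reduces to Q(a) <- (1 - alpha) * Q(a) + alpha * reward,
    where the reward is the throughput experienced after playing action a.
    """

    def __init__(self, channels, tx_powers, alpha=0.1, epsilon=0.1):
        self.actions = [(c, p) for c in channels for p in tx_powers]
        self.q = {a: 0.0 for a in self.actions}
        self.alpha = alpha
        self.epsilon = epsilon

    def select_action(self):
        # Epsilon-greedy exploration over the joint action set (assumed policy).
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.q, key=self.q.get)

    def update(self, action, throughput):
        # Exponentially weighted update using the experienced throughput as reward.
        self.q[action] = (1 - self.alpha) * self.q[action] + self.alpha * throughput


if __name__ == "__main__":
    learner = StatelessQLearner(channels=[1, 6, 11], tx_powers=[5, 15, 20])
    for _ in range(500):
        channel, power = learner.select_action()
        # Hypothetical environment feedback: throughput measured after acting.
        throughput = random.random() * power / (1 + channel % 3)
        learner.update((channel, power), throughput)
    best = max(learner.q, key=learner.q.get)
    print("preferred (channel, power):", best)
```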
