8 research outputs found

    Decentralized Smart Charging of Large-Scale EVs using Adaptive Multi-Agent Multi-Armed Bandits

    Full text link
    The rapid growth of electric vehicles and photovoltaics can introduce new challenges, such as electrical current congestion and voltage limit violations due to peak load demands. These issues can be mitigated by controlling the operation of electric vehicles, i.e., smart charging. Centralized smart charging solutions have already been proposed in the literature, but such solutions may lack scalability and suffer from the inherent drawbacks of centralization, such as a single point of failure and data privacy concerns. Decentralization can help tackle these challenges. In this paper, a fully decentralized smart charging system is proposed using the philosophy of adaptive multi-agent systems. The proposed system utilizes multi-armed bandit learning to handle uncertainties in the system. The presented system is decentralized, scalable, real-time, and model-free, and it takes fairness among the different players into account. A detailed case study is also presented for performance evaluation. Comment: CIRED 2023 International Conference & Exhibition on Electricity Distribution, Jun 2023, Rome, Italy
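    The abstract does not give the concrete policy, so the following is only an illustrative sketch of the decentralized idea: each EV runs its own epsilon-greedy bandit over a hypothetical set of charging rates and observes nothing but a local congestion signal, with no central coordinator. All rates, the feeder capacity, and the reward shaping are invented for illustration.

```python
import random

RATES = [0.0, 3.7, 7.4, 11.0]   # hypothetical charging rates in kW
CAPACITY = 30.0                 # illustrative shared feeder limit in kW

class EVAgent:
    """One independent epsilon-greedy learner per EV: no central
    coordinator, only a locally observed congestion signal."""
    def __init__(self, seed, eps=0.1):
        self.rng = random.Random(seed)
        self.eps = eps
        self.counts = [0] * len(RATES)
        self.means = [0.0] * len(RATES)

    def choose(self):
        if self.rng.random() < self.eps:
            return self.rng.randrange(len(RATES))  # explore
        return max(range(len(RATES)), key=lambda k: self.means[k])

    def update(self, k, r):
        # Incremental running mean of the reward for arm k.
        self.counts[k] += 1
        self.means[k] += (r - self.means[k]) / self.counts[k]

agents = [EVAgent(seed=i) for i in range(6)]
for step in range(3000):
    choices = [a.choose() for a in agents]
    congested = sum(RATES[k] for k in choices) > CAPACITY
    for a, k in zip(agents, choices):
        # Reward the delivered power; flip its sign on congestion events.
        r = -RATES[k] if congested else RATES[k]
        a.update(k, r)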

    On the relevance of bandit algorithms in digital world

    No full text
    In the digital world, more and more autonomous agents make automatic decisions to optimize a criterion by learning from their past decisions. Their rapid multiplication implies that these agents will interact with each other, whereas the algorithms they use were not necessarily designed for this. Unexpected emergent behaviors could therefore occur. In this thesis, we contribute a first step towards the control of a large number of autonomous agents that learn by interacting with an unknown environment, but also with other agents. We focus on bandit algorithms: an agent aims to minimize its regret with respect to the best possible policy by choosing the best actions. As the agent observes only the outcomes of the actions it has chosen, it must handle the exploration-exploitation dilemma: should one choose the action whose outcome is only loosely estimated, or play the action whose result is empirically the best? Our contributions begin with the study of bandits in evolving environments, which is often the case in applications. We then propose contributions on applications of contextual bandits in the digital world: dynamic allocation of resources for cloud computing, optimization of marketing campaigns, and automatic dialogue. We then state the problem of massively decentralized bandits to handle a very common problem in the digital world, A/B testing, under the constraint of guaranteeing privacy. Finally, to optimize communications in the Internet of Things, we propose massively multi-player bandits.

    Massive Multi-Player Multi-Armed Bandits for IoT Networks: An Application on LoRa Networks

    No full text
    More and more manufacturers, as part of the transition toward Industry 4.0, are using Internet of Things (IoT) networks for more efficient production. The rapid expansion of IoT devices and the variety of applications generate different challenges, mainly in terms of reliability and energy efficiency. In this paper, we propose an approach to optimize the performance of IoT networks by making the IoT devices intelligent using machine learning techniques. We formulate the optimization problem as a massive multi-player multi-armed bandit and introduce two novel policies: Decreasing-Order-Reward-Greedy (DORG) focuses on the number of successful transmissions, while Decreasing-Order-Fair-Greedy (DOFG) also guarantees some measure of fairness between the devices. We then present an efficient way to manage the trade-off between energy consumption and packet losses in Long Range (LoRa) networks using our algorithms, by which LoRa nodes adjust their emission parameters (spreading factor and transmission power). We implement our algorithms on a LoRa network simulator and show that such learning techniques largely outperform the Adaptive Data Rate (ADR) algorithm currently implemented in LoRa devices, in terms of both energy consumption and packet losses.

    Horizontal Scaling in Cloud Using Contextual Bandits

    No full text
    One characteristic of the Cloud is elasticity: it provides the ability to adapt the resources allocated to applications as needed at runtime. This capacity relies on scaling and scheduling. In this article, online horizontal scaling is studied. The aim is to dynamically determine application deployment parameters and to adjust them in order to respect a Quality of Service level, without any human parameter tuning. This work focuses on CaaS (container-based) environments and proposes an algorithm based on contextual bandits (HSLinUCB). Our proposal has been evaluated on a simulated platform and on a real Kubernetes platform. The comparison has been done against several baselines: a threshold-based auto-scaler, Q-Learning, and Deep Q-Learning. The results show that HSLinUCB gives very good results compared to the other baselines, even when used without any training period.