13 research outputs found

    Reinforcement learning method for plug-in electric vehicle bidding

    Get PDF
    This study proposes a novel multi-agent method for electric vehicle (EV) owners who take part in the electricity market. Each EV is treated as an agent, and all EVs have vehicle-to-grid capability. The agents aim to minimise charging costs and to increase owner privacy by omitting the aggregator role from the system. Each agent has two independent decision cores, one for buying and one for selling energy. These cores are built on a reinforcement learning (RL) algorithm, namely Q-learning, chosen for its high efficiency and good performance in multi-agent settings. With the proposed method, agents can buy and sell energy with the goal of cost minimisation while always retaining enough energy for the trip, accounting for the uncertain behaviour of EV owners. Numerical simulations on an illustrative example with one agent and a test system with 500 agents demonstrate the effectiveness of the proposed method.
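    The buy/sell decision-core scheme described above can be sketched with tabular Q-learning. This is an illustrative toy, not the paper's implementation: the discretised price levels, the hold/trade action set, and the reward shaping (negative price for buying, positive for selling) are all assumptions.

```python
import random

ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1
PRICES = [1, 2, 3]          # assumed discretised market price levels
ACTIONS = [0, 1]            # 0 = hold, 1 = trade at the current price

# Two independent decision cores: one Q-table for buying, one for selling.
q_buy = {(p, a): 0.0 for p in PRICES for a in ACTIONS}
q_sell = {(p, a): 0.0 for p in PRICES for a in ACTIONS}

def choose(q, price):
    """Epsilon-greedy action selection over one decision core."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q[(price, a)])

def update(q, price, action, reward, next_price):
    """Q-learning update: Q += alpha * (r + gamma * max_a' Q' - Q)."""
    best_next = max(q[(next_price, a)] for a in ACTIONS)
    q[(price, action)] += ALPHA * (reward + GAMMA * best_next - q[(price, action)])

random.seed(0)
for _ in range(5000):
    p, p_next = random.choice(PRICES), random.choice(PRICES)
    a = choose(q_buy, p)
    update(q_buy, p, a, -p if a else 0.0, p_next)    # buying incurs a cost
    a = choose(q_sell, p)
    update(q_sell, p, a, p if a else 0.0, p_next)    # selling earns revenue
```

    After training, the sell core learns that trading at high prices is valuable while the buy core learns that trading is costly; the paper's full method additionally constrains trades so the EV always keeps enough energy for its trip.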

    Task allocation based multi-agent reinforcement learning for LoRa nodes in gas wellhead monitoring service

    Get PDF
    This paper investigates a new alternative approach to the task allocation problem associated with numerous Long Range (LoRa) nodes in the High-Pressure High-Temperature (HPHT) gas wellhead monitoring service. A multi-agent reinforcement learning approach is proposed, with Proximal Policy Optimization (PPO) chosen as the policy gradient method. The action space is the spreading factor, while other parameters such as frequency and transmission power are kept constant. The reward function for training is determined by two parameters: the acknowledgement flag (ACK) and collisions between packets. Nodes are distributed across a disc of defined radius, and each node is represented as an agent. Each agent transmits packets, which are evaluated according to the reward function. The results show that PPO with multi-agent reinforcement learning was able to determine the optimal configuration for each LoRa node, and the total reward value scales with the total number of nodes. Furthermore, since the study also employs CUDA, training completed in 200 steps and 45 minutes.
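    The ACK/collision reward signal described above can be sketched as follows. The exact reward values and the collision model (same spreading factor in the same slot collides; different spreading factors are treated as quasi-orthogonal) are assumptions for illustration, not taken from the paper.

```python
def lora_reward(ack: bool, collided: bool) -> float:
    """Illustrative reward shaping: penalise collisions, reward acknowledged
    delivery, give nothing for a lost but uncollided packet."""
    if collided:
        return -1.0
    return 1.0 if ack else 0.0

def simulate_slot(sf_a: int, sf_b: int):
    """Two nodes transmit in the same slot: equal spreading factors collide;
    different spreading factors are assumed quasi-orthogonal."""
    collided = sf_a == sf_b
    ack = not collided            # assume delivery succeeds absent collision
    r = lora_reward(ack, collided)
    return r, r
```

    In a PPO setup, each agent's action would be its spreading-factor choice, and this scalar reward would drive the policy-gradient update.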

    Solving reward-collecting problems with UAVs: a comparison of online optimization and Q-learning

    Get PDF
    Uncrewed autonomous vehicles (UAVs) have made significant contributions to reconnaissance and surveillance missions in past US military campaigns. As the prevalence of UAVs increases, there have also been improvements in counter-UAV technology that make it difficult for them to successfully obtain valuable intelligence within an area of interest. Hence, it has become important that modern UAVs can accomplish their missions while maximizing their chances of survival. In this work, we study the problem of identifying a short path from a designated start to a goal while collecting all rewards and avoiding adversaries that move randomly on the grid. We also provide a possible application of the framework in a military setting, that of autonomous casualty evacuation. We present a comparison of three methods to solve this problem: a Deep Q-Learning model, an ε-greedy tabular Q-Learning model, and an online optimization framework. Our computational experiments, designed using simple grid-world environments with random adversaries, showcase how these approaches work and compare them in terms of performance, accuracy, and computational time. R.Y. is partially supported by NSF DMS 1916037 and the Consortium for Robotics and Unmanned Systems Education and Research (CRUSER).
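    The ε-greedy tabular Q-Learning baseline can be sketched on a tiny grid world. The grid size, step penalty, goal reward, and hyperparameters below are illustrative assumptions; the paper's environment additionally includes collectible rewards and moving adversaries.

```python
import random

N = 4                                         # 4x4 grid: start (0,0), goal (3,3)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
ALPHA, GAMMA, EPS = 0.5, 0.95, 0.1

Q = {((r, c), a): 0.0 for r in range(N) for c in range(N) for a in range(4)}

def step(s, a):
    """Move within grid bounds; small step penalty favours short paths."""
    dr, dc = ACTIONS[a]
    ns = (min(max(s[0] + dr, 0), N - 1), min(max(s[1] + dc, 0), N - 1))
    done = ns == (N - 1, N - 1)
    return ns, (10.0 if done else -1.0), done

random.seed(1)
for _ in range(2000):                          # training episodes
    s, done = (0, 0), False
    while not done:
        if random.random() < EPS:              # epsilon-greedy exploration
            a = random.randrange(4)
        else:
            a = max(range(4), key=lambda x: Q[(s, x)])
        ns, r, done = step(s, a)
        Q[(s, a)] += ALPHA * (r + GAMMA * max(Q[(ns, x)] for x in range(4)) - Q[(s, a)])
        s = ns
```

    After training, following the greedy policy from the start reaches the goal in a short path; the Deep Q-Learning variant replaces the table with a neural network over the same state-action interface.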

    Thirty Years of Machine Learning: The Road to Pareto-Optimal Wireless Networks

    Full text link
    Future wireless networks have substantial potential for supporting a broad range of complex, compelling applications in both military and civilian fields, where users can enjoy high-rate, low-latency, low-cost and reliable information services. Achieving this ambitious goal requires new radio techniques for adaptive learning and intelligent decision making because of the complex, heterogeneous nature of network structures and wireless services. Machine learning (ML) algorithms have had great success in supporting big data analytics, efficient parameter estimation and interactive decision making. Hence, in this article, we review the thirty-year history of ML by elaborating on supervised learning, unsupervised learning, reinforcement learning and deep learning. Furthermore, we investigate their employment in compelling applications of wireless networks, including heterogeneous networks (HetNets), cognitive radios (CR), the Internet of Things (IoT), machine-to-machine (M2M) networks, and so on. This article aims to assist readers in understanding the motivation and methodology of the various ML algorithms, so as to invoke them for hitherto unexplored services and scenarios of future wireless networks.

    Energy-Aware Real-time Tasks Processing for FPGA Based Heterogeneous Cloud

    Get PDF
    Cloud computing is becoming a popular model of computing. Due to the increasing complexity of cloud service requests, it often exploits heterogeneous architectures. Moreover, some service requests (SRs)/tasks exhibit real-time features and must be handled within a specified duration. Along with meeting these timing requirements, the strategy should also be energy efficient, as energy consumption in cloud computing is a major challenge. In this paper, we propose a strategy called "Efficient Resource Allocation of Service Requests" (ERASER) for energy-efficient allocation and scheduling of periodic real-time SRs on a cloud platform. Our target cloud platform consists of Field Programmable Gate Arrays (FPGAs) as Processing Elements (PEs) along with General Purpose Processors (GPPs). We further propose an SR migration technique to service the maximum number of SRs. Simulation-based experimental results demonstrate that the proposed methodology achieves up to 90% resource utilization with only a 26% SR rejection rate across different experimental scenarios. Comparison with other state-of-the-art techniques reveals that the proposed strategy outperforms the existing technique with a 17% reduction in SR rejection rate and 21% less energy consumption. Further, the simulation outcomes have been validated on a real test-bed based on a Xilinx Zynq SoC with benchmark tasks.
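    A minimal sketch in the spirit of the allocation step described above: place each task on the lowest-power processing element that still has capacity. This is not the ERASER algorithm itself; the PE power figures, the utilisation-based admission test, and the task parameters are illustrative assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class PE:
    name: str
    power: float                     # assumed busy power in watts
    load: float = 0.0                # committed utilisation
    tasks: list = field(default_factory=list)

@dataclass
class Task:
    name: str
    util: float                      # execution time / period

def allocate(tasks, pes):
    """Place each task on the lowest-power PE with spare capacity
    (total utilisation <= 1.0); tasks that fit nowhere are rejected."""
    rejected = []
    for t in sorted(tasks, key=lambda t: -t.util):       # largest task first
        for pe in sorted(pes, key=lambda p: p.power):    # lowest-power PE first
            if pe.load + t.util <= 1.0:
                pe.load += t.util
                pe.tasks.append(t.name)
                break
        else:
            rejected.append(t.name)
    return rejected

pes = [PE("fpga0", power=2.0), PE("gpp0", power=5.0)]
tasks = [Task("t1", 0.6), Task("t2", 0.5), Task("t3", 0.8)]
rejected = allocate(tasks, pes)      # t3 -> fpga0, t1 -> gpp0, t2 rejected
```

    The full strategy additionally migrates SRs between PEs to lower the rejection rate, which a one-pass first-fit heuristic like this cannot do.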

    Energy-Efficient Scheduling for Real-Time Systems Based on Deep Q-Learning Model

    No full text