As renewable energy resources replace fossil fuels, the unbalanced production
of intermittent wind and photovoltaic (PV) power becomes a critical issue for
peer-to-peer (P2P) power trading. To resolve this problem, this paper
introduces a reinforcement learning (RL) technique.
For RL, a graph convolutional network (GCN) and a bi-directional long
short-term memory (Bi-LSTM) network are jointly applied to P2P power trading
between nanogrid clusters based on cooperative game theory. The flexible and
reliable DC nanogrid is well suited to integrating renewable energy into the
distribution system.
Each local nanogrid cluster acts as a prosumer, producing and consuming power
simultaneously. For the power management of nanogrid clusters, multi-objective
optimization is applied to each local nanogrid cluster using Internet of
Things (IoT) technology. Charging and discharging of electric vehicles (EVs)
are performed considering the intermittent characteristics of wind and PV
power production. RL algorithms, such as deep Q-learning network
(DQN), deep recurrent Q-learning network (DRQN), Bi-DRQN, proximal policy
optimization (PPO), GCN-DQN, GCN-DRQN, GCN-Bi-DRQN, and GCN-PPO, are used for
simulations. Consequently, the cooperative P2P power trading system maximizes
profit using the time-of-use (ToU) tariff-based electricity cost and the
system marginal price (SMP), and minimizes the amount of grid power
consumption. Power management of nanogrid clusters with P2P power trading is
simulated in real time on the distribution test feeder, and the proposed
GCN-PPO technique reduces the electricity cost of nanogrid clusters by 36.7%.

Comment: 22 pages, 8 figures, to be submitted to Applied Energy of Elsevier