
Research on optimization of data transmission strategies based on deep reinforcement learning

Abstract

Based on the theoretical framework of deep reinforcement learning, a hierarchical, progressive solution is proposed. First, a heterogeneous data transmission architecture integrating edge computing nodes is constructed, and a Markov decision process with a time-varying, multi-dimensional state space is established. Second, an entropy regularization term is embedded in the loss of the classical deep Q-network (DQN) algorithm and combined with an on-policy experience replay mechanism, yielding the enhanced ESERDQN optimizer (an improved DQN based on entropy regularization and on-policy experience replay). Finally, a five-dimensional evaluation index system (convergence rate, cumulative reward, energy consumption, end-to-end delay, transmission cost) is designed to carry out multi-algorithm comparison experiments. Simulation results show that ESERDQN converges stably within 1,500 training episodes, improving convergence speed by 49.2%, 41.7%, 30.1%, and 13.3% over the greedy, random, DDPG, and PPO baselines, respectively. On comprehensive business indicators, unit energy cost is reduced by 27.8% and the latency of critical tasks is kept within 12.3 ms, verifying the superiority of the proposed method in complex transmission scenarios of smart cities.
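The core algorithmic idea of the abstract, adding an entropy term to the DQN target, can be illustrated with a soft (entropy-regularized) Bellman backup, where the hard max over next-state Q-values is replaced by a temperature-scaled log-sum-exp. This is a minimal sketch of that general technique, not the paper's ESERDQN implementation; the function name and the temperature parameter `tau` are illustrative assumptions.

```python
import numpy as np

def soft_bellman_target(q_next, reward, gamma=0.99, tau=0.1):
    """Entropy-regularized Bellman target for one transition.

    q_next : Q-values over actions in the next state.
    tau    : temperature; as tau -> 0 this recovers the standard
             DQN target reward + gamma * max(q_next).
    """
    # log-sum-exp with a max-shift for numerical stability
    q_max = np.max(q_next)
    soft_value = q_max + tau * np.log(np.sum(np.exp((q_next - q_max) / tau)))
    # soft_value >= max(q_next): the gap is the entropy bonus that
    # keeps the induced softmax policy exploratory.
    return reward + gamma * soft_value
```

Because log-sum-exp strictly exceeds the max (by at most `tau * log(n_actions)`), the entropy bonus is bounded and shrinks as the temperature is annealed, which is one common way such regularizers are scheduled during training.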
