To ensure that data aggregation, storage, and processing are all performed in
a decentralized but trusted manner, we propose using a blockchain with mining
pools to support IoT services based on cognitive radio networks. In this
setting, the secondary user sends its sensing data, i.e.,
transactions, to the mining pools. After being verified by miners, the
transactions are added to the blocks. However, given the dynamics of the
primary channel and the uncertainty in the mempool state of the mining pool, it
is challenging for the secondary user to determine an optimal transaction
transmission policy. In this paper, we propose using a deep reinforcement
learning algorithm to derive an optimal transaction transmission policy for the
secondary user. Specifically, we adopt a Double Deep Q-Network (DDQN) that
allows the secondary user to learn the optimal policy. The simulation results
clearly show that the proposed deep reinforcement learning algorithm
outperforms the conventional Q-learning scheme in terms of reward and learning
speed.
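
For readers unfamiliar with the distinction, the following is a minimal sketch of how a Double DQN learning target differs from the standard max-based Q-learning target. It is illustrative only: the function names, the discount factor `gamma`, and the example Q-values are assumptions made for this sketch, not the network, state space, or reward design used in this work.

```python
def q_learning_target(reward, next_q_target, gamma=0.99, done=False):
    """Standard max-based target: one set of Q-values both selects
    and evaluates the next action."""
    if done:
        return reward
    return reward + gamma * max(next_q_target)


def ddqn_target(reward, next_q_online, next_q_target, gamma=0.99, done=False):
    """Double DQN target: the online network selects the next action,
    and the target network evaluates that action."""
    if done:
        return reward
    # Action selection uses the online network's Q-values (hypothetical inputs).
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    # Action evaluation uses the target network's Q-value for that action.
    return reward + gamma * next_q_target[best_action]


if __name__ == "__main__":
    # Hypothetical Q-values for three actions in the next state.
    online = [0.2, 0.9, 0.1]
    target = [0.5, 0.3, 0.8]
    print(q_learning_target(1.0, target, gamma=0.5))
    print(ddqn_target(1.0, online, target, gamma=0.5))
```

Decoupling action selection from action evaluation is the general mechanism by which Double DQN reduces the overestimation bias of max-based targets. In the example values above, the max-based target uses the largest target-network value, while the Double DQN target uses the target-network value of the action preferred by the online network.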