Learning from interaction is the primary way biological agents come to know their environment and themselves. Modern deep reinforcement learning (DRL) explores a computational approach to learning from interaction and has made significant progress on a variety of tasks. However, powerful as DRL is, it remains far less energy-efficient than biological agents. Although the underlying mechanisms are not fully understood, we believe that the integration of spiking communication between neurons and biologically plausible synaptic plasticity plays a prominent role. Following this biological intuition, we optimize a
spiking policy network (SPN) by a genetic algorithm as an energy-efficient
alternative to DRL. Our SPN mimics the sensorimotor neuron pathway of insects
and communicates through event-based spikes. Inspired by biological findings that the brain forms memories by growing new synaptic connections and rewiring them in light of new experiences, we tune the synaptic connections of the SPN, rather than its weights, to solve the given task. Experimental results on several robotic control tasks show that our method matches the performance of mainstream DRL methods while exhibiting significantly higher energy efficiency.
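
To make the connection-tuning idea concrete, below is a minimal, self-contained sketch, not the authors' implementation: a genetic algorithm evolves binary connection masks of a small two-layer leaky integrate-and-fire (LIF) network whose synaptic weights stay fixed. The layer sizes, the rate-coded input encoding, the toy fitness function, and names such as `lif_forward` are illustrative assumptions.

```python
# Hypothetical sketch: GA over binary connection masks of a fixed-weight LIF net.
import numpy as np

rng = np.random.default_rng(0)

N_IN, N_HID, N_OUT = 8, 16, 2          # toy layer sizes (assumed)
T = 20                                  # simulation timesteps per decision
W1 = rng.normal(0, 1, (N_IN, N_HID))    # weights are fixed; only masks evolve
W2 = rng.normal(0, 1, (N_HID, N_OUT))

def lif_forward(x, mask1, mask2, tau=0.5, v_th=1.0):
    """Run the two-layer LIF network for T steps; return output spike counts."""
    v1, v2 = np.zeros(N_HID), np.zeros(N_OUT)
    out = np.zeros(N_OUT)
    for _ in range(T):
        # Input encoded as Bernoulli spike trains (rate coding, assumed).
        s_in = (rng.random(N_IN) < x).astype(float)
        v1 = tau * v1 + s_in @ (W1 * mask1)     # leaky integration
        s1 = (v1 >= v_th).astype(float)         # threshold -> spike
        v1 = v1 * (1 - s1)                      # reset after spike
        v2 = tau * v2 + s1 @ (W2 * mask2)
        s2 = (v2 >= v_th).astype(float)
        v2 = v2 * (1 - s2)
        out += s2
    return out

def fitness(genome):
    """Toy stand-in for episodic return: prefer output 0 for weak inputs
    and output 1 for strong inputs."""
    mask1 = genome[: N_IN * N_HID].reshape(N_IN, N_HID)
    mask2 = genome[N_IN * N_HID:].reshape(N_HID, N_OUT)
    score = 0.0
    for x, target in [(np.full(N_IN, 0.1), 0), (np.full(N_IN, 0.9), 1)]:
        counts = lif_forward(x, mask1, mask2)
        score += counts[target] - counts[1 - target]
    return score

# Simple elitist GA over binary connection genomes (bit-flip mutation only).
GENOME = N_IN * N_HID + N_HID * N_OUT
pop = (rng.random((32, GENOME)) < 0.5).astype(float)
for gen in range(30):
    scores = np.array([fitness(g) for g in pop])
    elite = pop[np.argsort(scores)[-8:]]              # keep best 8
    children = elite[rng.integers(0, 8, size=24)].copy()
    flips = rng.random(children.shape) < 0.02         # mutate connections
    children[flips] = 1 - children[flips]
    pop = np.vstack([elite, children])
print("best fitness:", max(fitness(g) for g in pop))
```

In this sketch the genome is purely binary, so mutation adds or removes synapses rather than perturbing weights, which mirrors the abstract's connection-tuning idea; the actual network architecture, encoding, and evolutionary operators used in the paper may differ.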