Conceived in the early 1990s, Experience Replay (ER) has been shown to be a
successful mechanism to allow online learning algorithms to reuse past
experiences. In principle, ER can be applied to all machine learning paradigms
(i.e., unsupervised, supervised, and reinforcement learning). Recently, ER has
contributed to improving the performance of deep reinforcement learning. Yet,
its application to many practical settings is still limited by its memory
requirements, since ER must explicitly store previous observations. To
remedy this issue, we explore a novel approach, Online Contrastive Divergence
with Generative Replay (OCD_GR), which uses the generative capability of
Restricted Boltzmann Machines (RBMs) instead of recorded past experiences. The
RBM is trained online, and does not require the system to store any of the
observed data points. We compare OCD_GR to ER on 9 real-world datasets,
considering a worst-case scenario (data points arriving in sorted order) as
well as a more realistic one (sequential random-order data points). Our results
show that OCD_GR outperforms ER in 64.28% of the cases and performs almost
equally in the remaining 35.72%, while having a considerably reduced
space complexity (i.e., memory usage) at a comparable time complexity.
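
To make the idea of generative replay concrete, the following is a minimal sketch, not the authors' implementation. It assumes a small binary RBM trained with one-step contrastive divergence (CD-1); the names `TinyRBM`, `online_step`, `n_replay`, and `n_gibbs`, as well as the learning rate and the random initialization of the Gibbs chain, are hypothetical choices made for illustration. At each step the model draws replay samples from its own distribution rather than from a stored buffer, mixes them with the newly arrived data point, and updates on the combined batch.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


class TinyRBM:
    """Binary RBM used only to illustrate generative replay (assumed design)."""

    def __init__(self, n_visible, n_hidden, rng):
        self.rng = rng
        self.W = 0.01 * rng.standard_normal((n_visible, n_hidden))
        self.a = np.zeros(n_visible)  # visible biases
        self.b = np.zeros(n_hidden)   # hidden biases

    def sample_h(self, v):
        # Hidden activation probabilities and a binary sample given visibles.
        p = sigmoid(v @ self.W + self.b)
        return p, (self.rng.random(p.shape) < p).astype(float)

    def sample_v(self, h):
        # Visible activation probabilities and a binary sample given hiddens.
        p = sigmoid(h @ self.W.T + self.a)
        return p, (self.rng.random(p.shape) < p).astype(float)

    def generate(self, n_samples, n_gibbs=5):
        # Draw replay samples by Gibbs sampling from a random binary start.
        v = (self.rng.random((n_samples, self.a.size)) < 0.5).astype(float)
        for _ in range(n_gibbs):
            _, h = self.sample_h(v)
            _, v = self.sample_v(h)
        return v

    def cd1_update(self, v0, lr=0.05):
        # One CD-1 parameter update on the batch v0.
        ph0, h0 = self.sample_h(v0)
        _, v1 = self.sample_v(h0)
        ph1, _ = self.sample_h(v1)
        n = v0.shape[0]
        self.W += lr * (v0.T @ ph0 - v1.T @ ph1) / n
        self.a += lr * (v0 - v1).mean(axis=0)
        self.b += lr * (ph0 - ph1).mean(axis=0)


def online_step(rbm, new_points, n_replay=10):
    # Replay data comes from the model itself, so no past data is stored.
    replay = rbm.generate(n_replay)
    batch = np.vstack([new_points, replay])
    rbm.cd1_update(batch)
    return batch.shape[0]


rng = np.random.default_rng(0)
rbm = TinyRBM(n_visible=6, n_hidden=4, rng=rng)
stream = (rng.random((20, 6)) < 0.5).astype(float)  # synthetic binary stream
for x in stream:
    batch_size = online_step(rbm, x[None, :])
```

In this sketch the memory footprint is fixed by the RBM parameters and does not grow with the number of observations, which is the property the comparison with ER turns on.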