This thesis studies the effect of adding a term usually neglected during
the training phase of energy-based models. It is called ``KL term'' because
of its dependence on the Kullback-Leibler divergence, and it does not have a
significant impact on the training in terms of running time and
computational cost. I will initially present an analysis of its impact on training stability, along with some considerations on the general structure of the learning model. I will then study the denoising capabilities of the model by implementing top-down processes applied to different types of noisy input. Third, to assess the quality of the internal representations emerging in the hidden layers, I will apply a read-out classifier to the deepest hidden layer and compute psychometric curves at different noise levels. The final analysis will explore the model's capability to resist adversarial attacks using forward-backward iterations, also considering the spontaneous generative activity of the network elicited by stimulating the read-out neurons associated with a specific class. Interestingly, the results of this last investigation are very encouraging for the MNIST dataset, suggesting that this type of energy-based model has the potential to improve current defenses against adversarial attacks. However, for the CIFAR-10 dataset it seems that more powerful computing hardware is needed to train larger models.