Contrastive Learning for Lifted Networks

Author: Estellers Virginia
Zach Christopher
Publication date: 26/07/2019

In this work we address supervised learning of neural networks via lifted network formulations. Lifted networks are interesting because they allow training on massively parallel hardware and assign energy models to discriminatively trained neural networks. We demonstrate that the training methods for lifted networks proposed in the literature have significant limitations and show how to use a contrastive loss to address those limitations. We demonstrate that this contrastive training approximates back-propagation in theory and in practice and that it is superior to the training objective regularly used for lifted networks.

Comment: 9 pages, BMVC 201

arXiv.org e-Print Archive

Chalmers Research

Lifted Regression/Reconstruction Networks

Author: Kjær Høier Rasmus
Zach Christopher
Publication date: 01/01/2020

In this work we propose lifted regression/reconstruction networks (LRRNs), which combine lifted neural networks with a guaranteed Lipschitz continuity property for the output layer. Lifted neural networks explicitly optimize an energy model to infer the unit activations and therefore, in contrast to standard feed-forward neural networks, allow bidirectional feedback between layers. So far lifted neural networks have been modelled around standard feed-forward architectures. We propose to take further advantage of the feedback property by letting the layers simultaneously perform regression and reconstruction. The resulting lifted network architecture allows control of the desired amount of Lipschitz continuity, which is an important feature for obtaining adversarially robust regression and classification methods. We analyse and numerically demonstrate applications for unsupervised and supervised learning.

Chalmers Research

Lifted Regression/Reconstruction Networks

Author: Høier Rasmus Kjær
Zach Christopher
Publication date: 01/01/2020

In this work we propose lifted regression/reconstruction networks (LRRNs), which combine lifted neural networks with a guaranteed Lipschitz continuity property for the output layer. Lifted neural networks explicitly optimize an energy model to infer the unit activations and therefore, in contrast to standard feed-forward neural networks, allow bidirectional feedback between layers. So far lifted neural networks have been modelled around standard feed-forward architectures. We propose to take further advantage of the feedback property by letting the layers simultaneously perform regression and reconstruction. The resulting lifted network architecture allows control of the desired amount of Lipschitz continuity, which is an important feature for obtaining adversarially robust regression and classification methods. We analyse and numerically demonstrate applications for unsupervised and supervised learning.

Comment: 12 pages, 8 figures
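
For readers unfamiliar with the term, the link between Lipschitz continuity and adversarial robustness mentioned in the abstract can be stated with the standard textbook definition. The symbols f, L, x and δ are chosen here for illustration and are not taken from the paper:

```latex
% f is L-Lipschitz (with respect to a chosen norm) if, for all inputs x and x',
\[
  \lVert f(x) - f(x') \rVert \;\le\; L \,\lVert x - x' \rVert .
\]
% Consequently, an input perturbation \delta with \lVert \delta \rVert \le \varepsilon
% can change the output by at most L\varepsilon:
\[
  \lVert f(x + \delta) - f(x) \rVert \;\le\; L \,\varepsilon .
\]
```

A smaller constant L therefore bounds how far a small adversarial perturbation can move the prediction.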

arXiv.org e-Print Archive

Chalmers Research

You Only Propagate Once: Accelerating Adversarial Training via Maximal Principle

Author: Dong Bin
Lu Yiping
Zhang Dinghuai
Zhang Tianyuan
Zhu Zhanxing
Publication date: 01/11/2019

Deep learning achieves state-of-the-art results in many tasks in computer vision and natural language processing. However, recent works have shown that deep networks can be vulnerable to adversarial perturbations, which raises a serious robustness issue for deep networks. Adversarial training, typically formulated as a robust optimization problem, is an effective way of improving the robustness of deep networks. A major drawback of existing adversarial training algorithms is the computational overhead of generating adversarial examples, which is typically far greater than that of the network training. This leads to an unbearable overall computational cost of adversarial training. In this paper, we show that adversarial training can be cast as a discrete time differential game. Through analyzing the Pontryagin's Maximal Principle (PMP) of the problem, we observe that the adversary update is only coupled with the parameters of the first layer of the network. This inspires us to restrict most of the forward and back propagation within the first layer of the network during adversary updates. This effectively reduces the total number of full forward and backward propagations to only one for each group of adversary updates. Therefore, we refer to this algorithm as YOPO (You Only Propagate Once). Numerical experiments demonstrate that YOPO can achieve comparable defense accuracy with approximately 1/5 to 1/4 of the GPU time of the projected gradient descent (PGD) algorithm. Our code is available at https://github.com/a1600012888/YOPO-You-Only-Propagate-Once.

Comment: Accepted as a conference paper at NeurIPS 201
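
To make the cost that the abstract refers to concrete, here is a minimal sketch of the PGD baseline (not of YOPO itself), assuming a toy logistic-regression model and an L-infinity perturbation budget. The names `pgd_attack`, `loss_and_input_grad`, `eps`, `step` and `n_steps`, as well as all numbers, are hypothetical and chosen only for illustration. Every PGD step needs a full forward and backward pass through the model, which is the overhead the abstract says YOPO reduces.

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def loss_and_input_grad(w, b, x, y):
    """Logistic loss and its gradient with respect to the input x (y in {0, 1})."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    loss = -(y * math.log(p) + (1 - y) * math.log(1 - p))
    grad = [(p - y) * wi for wi in w]  # d(loss)/dx for a linear logit
    return loss, grad

def sign(v):
    return (v > 0) - (v < 0)

def pgd_attack(w, b, x, y, eps=0.1, step=0.025, n_steps=10):
    """L-infinity PGD: ascend the loss, then project back into the eps-ball around x."""
    x_adv = list(x)
    for _ in range(n_steps):
        # One full forward/backward pass per adversary update.
        _, grad = loss_and_input_grad(w, b, x_adv, y)
        x_adv = [xa + step * sign(g) for xa, g in zip(x_adv, grad)]
        # Projection: clip each coordinate to [x_i - eps, x_i + eps].
        x_adv = [min(max(xa, xi - eps), xi + eps) for xa, xi in zip(x_adv, x)]
    return x_adv

# Toy model and data point (assumed values).
w, b = [2.0, -1.0], 0.5
x, y = [0.3, 0.8], 1

x_adv = pgd_attack(w, b, x, y)
clean_loss, _ = loss_and_input_grad(w, b, x, y)
adv_loss, _ = loss_and_input_grad(w, b, x_adv, y)
```

In adversarial training, the model parameters would then be updated on `x_adv` instead of `x`; with a deep network, the inner loop above dominates the running time.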

arXiv.org e-Print Archive

Southampton (e-Prints Soton)