Robust learning with implicit residual networks
In this effort, we propose a new deep architecture utilizing residual blocks
inspired by implicit discretization schemes. In contrast to standard
feed-forward networks, the outputs of the proposed implicit residual blocks are
defined as the fixed points of appropriately chosen nonlinear
transformations. We show that this choice leads to improved stability of
both forward and backward propagation, has a favorable impact on
generalization, and makes it possible to control the robustness of the network
with only a few hyperparameters. In addition, the proposed reformulation of
ResNet does not introduce new parameters and can potentially lead to a
reduction in the number of required layers due to improved forward stability.
Finally, we derive a memory-efficient training algorithm, propose a stochastic
regularization technique, and provide numerical results in support of our
findings.
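
As a rough illustration of the fixed-point formulation, the sketch below shows a residual block whose output is obtained by iterating z ← x + f(z). The block structure, activation, iteration count, and the use of plain fixed-point iteration are illustrative assumptions, not the paper's actual implicit scheme or its memory-efficient backward pass.

```python
import torch
import torch.nn as nn

class ImplicitResidualBlock(nn.Module):
    """Sketch only: the block output z* is a fixed point of z = x + f(z),
    approximated here by a few steps of plain fixed-point iteration."""

    def __init__(self, dim, n_iters=10):
        super().__init__()
        self.f = nn.Sequential(nn.Linear(dim, dim), nn.Tanh(), nn.Linear(dim, dim))
        self.n_iters = n_iters

    def forward(self, x):
        z = x
        for _ in range(self.n_iters):
            z = x + self.f(z)  # iterate towards the fixed point z* = x + f(z*)
        return z
```

Note that such an implicit block reuses the same parameters as an ordinary residual block f, consistent with the abstract's claim that the reformulation introduces no new parameters.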
Reversible GANs for Memory-efficient Image-to-Image Translation
The Pix2pix and CycleGAN losses have vastly improved the qualitative and
quantitative visual quality of results in image-to-image translation tasks. We
extend this framework by exploring approximately invertible architectures which
are well suited to these losses. These architectures are approximately
invertible by design and thus partially satisfy cycle-consistency before
training even begins. Furthermore, since invertible architectures have constant
memory complexity in depth, these models can be built arbitrarily deep. We
demonstrate superior quantitative results on the Cityscapes and Maps datasets
at a near-constant memory budget.
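
The constant-memory-in-depth property rests on reversible (coupling-style) blocks whose inputs can be reconstructed exactly from their outputs, so intermediate activations need not be cached for backpropagation. Below is a minimal additive-coupling sketch of such a block; the functions f and g and the two-stream split are generic RevNet-style choices for illustration, not the specific generators used in the paper.

```python
import torch
import torch.nn as nn

class ReversibleBlock(nn.Module):
    """Additive coupling: (x1, x2) -> (y1, y2) with an exact inverse,
    so inputs can be recomputed from outputs instead of being stored."""

    def __init__(self, f: nn.Module, g: nn.Module):
        super().__init__()
        self.f, self.g = f, g

    def forward(self, x1, x2):
        y1 = x1 + self.f(x2)
        y2 = x2 + self.g(y1)
        return y1, y2

    def inverse(self, y1, y2):
        x2 = y2 - self.g(y1)
        x1 = y1 - self.f(x2)
        return x1, x2
```

Because the inverse is exact, stacking many such blocks keeps activation memory roughly constant in depth, which is what allows these translation models to be built arbitrarily deep.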
Sparsely Aggregated Convolutional Networks
We explore a key architectural aspect of deep convolutional neural networks:
the pattern of internal skip connections used to aggregate outputs of earlier
layers for consumption by deeper layers. Such aggregation is critical to
facilitate training of very deep networks in an end-to-end manner. This is a
primary reason for the widespread adoption of residual networks, which
aggregate outputs via cumulative summation. While subsequent works investigate
alternative aggregation operations (e.g. concatenation), we focus on an
orthogonal question: which outputs to aggregate at a particular point in the
network. We propose a new internal connection structure which aggregates only a
sparse set of previous outputs at any given depth. Our experiments demonstrate
that this simple design change offers superior performance with fewer parameters and
lower computational requirements. Moreover, we show that sparse aggregation
allows networks to scale more robustly to 1000+ layers, thereby opening future
avenues for training long-running visual processes.
Comment: Accepted to ECCV 2018
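
As a toy illustration of sparse aggregation, the sketch below aggregates, at each layer, only the outputs lying 1, 2, 4, 8, ... steps back (one common exponentially spaced pattern). The fully connected layers, ReLU activations, and concatenation-based aggregation are illustrative assumptions, not the paper's exact design.

```python
import torch
import torch.nn as nn

def sparse_sources(position):
    """Indices of earlier outputs aggregated at `position`: step back by
    exponentially growing offsets 1, 2, 4, 8, ..."""
    sources, offset = [], 1
    while position - offset >= 0:
        sources.append(position - offset)
        offset *= 2
    return sources

class SparseAggregationNet(nn.Module):
    """Toy network: layer i consumes a concatenation of a sparse set of
    earlier outputs instead of all of them."""

    def __init__(self, dim, n_layers):
        super().__init__()
        self.layers = nn.ModuleList(
            [nn.Linear(dim * len(sparse_sources(i + 1)), dim) for i in range(n_layers)]
        )

    def forward(self, x):
        outputs = [x]  # outputs[0] is the network input
        for i, layer in enumerate(self.layers):
            agg = torch.cat([outputs[j] for j in sparse_sources(i + 1)], dim=-1)
            outputs.append(torch.relu(layer(agg)))
        return outputs[-1]
```

With this pattern each layer aggregates only O(log depth) earlier outputs rather than the O(depth) of dense concatenation, which is what keeps parameters and computation low as networks scale to 1000+ layers.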