CrossNorm: Normalization for Off-Policy TD Reinforcement Learning
Off-policy temporal difference (TD) methods are a powerful class of
reinforcement learning (RL) algorithms. Intriguingly, deep off-policy TD
algorithms are not commonly used in combination with feature normalization
techniques, despite positive effects of normalization in other domains. We show
that naive application of existing normalization techniques is indeed not
effective, but that well-designed normalization improves optimization stability
and removes the necessity of target networks. In particular, we introduce a
normalization based on a mixture of on- and off-policy transitions, which we
call cross-normalization. It can be regarded as an extension of batch
normalization that re-centers data for two different distributions, as present
in off-policy learning. Applied to DDPG and TD3, cross-normalization improves
over the state of the art across a range of MuJoCo benchmark tasks.
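A minimal sketch of the idea, assuming PyTorch and illustrative names (cross_norm and the mixture weight alpha are assumptions, not the paper's exact formulation), re-centers features with means mixed across the on- and off-policy batches:

import torch

def cross_norm(on_feats, off_feats, alpha=0.5):
    # Illustrative cross-normalization: re-center features using a
    # convex mixture of per-feature means from the on-policy and
    # off-policy batches; alpha is an assumed mixture weight.
    mu_on = on_feats.mean(dim=0)
    mu_off = off_feats.mean(dim=0)
    # Shared statistics computed across both distributions
    mu = alpha * mu_on + (1.0 - alpha) * mu_off
    # Re-center both batches with the same mixed mean (no re-scaling)
    return on_feats - mu, off_feats - mu

In a DDPG or TD3 critic, such a layer would take the place of per-batch centering, so that features from both transition distributions are normalized against one shared set of statistics.
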
Stabilizing Training of Generative Adversarial Networks through Regularization
Deep generative models based on Generative Adversarial Networks (GANs) have
demonstrated impressive sample quality, but in practice they require a
careful choice of architecture, parameter initialization, and
hyper-parameters. This fragility is in part due to a dimensional mismatch or
non-overlapping support between the model distribution and the data
distribution, causing their density ratio and the associated f-divergence to be
undefined. We overcome this fundamental limitation and propose a new
regularization approach with low computational cost that yields a stable GAN
training procedure. We demonstrate the effectiveness of this regularizer across
several architectures trained on common benchmark image generation tasks. Our
regularization turns GAN models into reliable building blocks for deep
learning.
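The abstract leaves the exact regularizer unspecified; as a hedged sketch, one low-cost way to keep the density ratio well-behaved when the two supports barely overlap is to penalize the discriminator's gradient norm at both real and generated samples (names such as discriminator_grad_penalty and gamma below are illustrative, not the paper's API):

import torch

def discriminator_grad_penalty(discriminator, real, fake, gamma=10.0):
    # Illustrative gradient-norm penalty evaluated at real and fake
    # samples; gamma is an assumed penalty weight.
    penalty = real.new_zeros(())
    for batch in (real, fake):
        batch = batch.detach().requires_grad_(True)
        d_out = discriminator(batch)
        # Gradient of D w.r.t. its inputs, kept in the graph so that
        # the penalty itself is differentiable
        (grads,) = torch.autograd.grad(d_out.sum(), batch, create_graph=True)
        penalty = penalty + grads.pow(2).flatten(1).sum(dim=1).mean()
    return 0.5 * gamma * penalty

Such a term is simply added to the discriminator loss before the optimizer step, which keeps the extra computational overhead small.
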
MIHash: Online Hashing with Mutual Information
Learning-based hashing methods are widely used for nearest neighbor
retrieval, and recently, online hashing methods have demonstrated good
performance-complexity trade-offs by learning hash functions from streaming
data. In this paper, we first address a key challenge for online hashing: the
binary codes for indexed data must be recomputed to keep pace with updates to
the hash functions. We propose an efficient quality measure for hash functions,
based on an information-theoretic quantity, mutual information, and use it
successfully as a criterion to eliminate unnecessary hash table updates. Next,
we show how to optimize the mutual information objective using stochastic
gradient descent. We thus develop a novel hashing method, MIHash, that can be
used in both online and batch settings. Experiments on image retrieval
benchmarks (including a 2.5M image dataset) confirm the effectiveness of our
formulation, both in reducing hash table recomputations and in learning
high-quality hash functions.
Comment: International Conference on Computer Vision (ICCV), 201
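As a hedged illustration of the quality measure (not the paper's exact estimator; all names below are assumptions), a hash function can be scored by the mutual information between a query's neighbor indicator and the Hamming distance of its code to the indexed items:

import numpy as np

def mi_quality(codes, query_code, neighbor_mask, eps=1e-12):
    # codes: (N, B) binary codes in {0, 1}; query_code: (B,);
    # neighbor_mask: (N,) boolean, True where the item is a true
    # neighbor of the query. Returns I(neighbor; Hamming distance).
    B = codes.shape[1]
    dists = (codes != query_code).sum(axis=1)  # Hamming distances
    # Conditional distance histograms over the 0..B distance bins
    p_d_n = np.bincount(dists[neighbor_mask], minlength=B + 1)
    p_d_x = np.bincount(dists[~neighbor_mask], minlength=B + 1)
    p_n = neighbor_mask.mean()
    p_d_n = p_d_n / max(p_d_n.sum(), 1)
    p_d_x = p_d_x / max(p_d_x.sum(), 1)
    p_d = p_n * p_d_n + (1 - p_n) * p_d_x
    h = lambda p: -(p * np.log2(p + eps)).sum()  # entropy in bits
    # I(N; D) = H(D) - H(D | N)
    return h(p_d) - (p_n * h(p_d_n) + (1 - p_n) * h(p_d_x))

A high score means neighbors and non-neighbors are well separated by Hamming distance; in the spirit of the abstract, an update to the hash functions would be committed to the hash table only when it improves such a score, skipping unnecessary recomputations of the indexed codes.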