Towards Arbitrary Noise Augmentation - Deep Learning for Sampling from Arbitrary Probability Distributions
Accurate noise modelling is important for the training of deep learning
reconstruction algorithms. While noise models are well known for traditional
imaging techniques, the noise distribution of a novel sensor may be difficult
to determine a priori. We therefore propose to learn arbitrary noise
distributions. To do so, this paper proposes a fully connected neural network
model that maps samples from a uniform distribution to samples of any explicitly
known probability density function. During training, the Jensen-Shannon
divergence between the distribution of the model's output and the target
distribution is minimized. We experimentally demonstrate that our model
converges towards the desired state. It provides an alternative to existing
sampling methods such as inversion sampling, rejection sampling, Gaussian
mixture models, and Markov chain Monte Carlo. Our model has high sampling
efficiency and is easily applied to any probability distribution, without the
need for further analytical or numerical calculations.
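The following is a minimal sketch of the approach described above, not the authors' exact architecture or training setup: a small fully connected network maps uniform samples to a one-dimensional target density and is trained by minimizing a Jensen-Shannon divergence between a differentiable kernel-density estimate of the model's output and the target pdf evaluated on a fixed grid. The target mixture, network size, bandwidth, and optimizer settings are assumptions made only for illustration.

```python
# Illustrative sketch (assumptions: 1-D Gaussian-mixture target, KDE-based JS estimate).
import math
import torch
import torch.nn as nn

torch.manual_seed(0)

def target_pdf(x):
    # Example target density: mixture of two Gaussians (chosen for illustration).
    g1 = torch.exp(-0.5 * ((x + 2.0) / 0.5) ** 2) / (0.5 * math.sqrt(2 * math.pi))
    g2 = torch.exp(-0.5 * ((x - 1.0) / 0.8) ** 2) / (0.8 * math.sqrt(2 * math.pi))
    return 0.4 * g1 + 0.6 * g2

# Fully connected network mapping a uniform sample to a sample of the target.
net = nn.Sequential(nn.Linear(1, 64), nn.ReLU(),
                    nn.Linear(64, 64), nn.ReLU(),
                    nn.Linear(64, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

grid = torch.linspace(-5.0, 5.0, 200)
p_target = target_pdf(grid)
p_target = p_target / p_target.sum()      # normalize the target on the grid
bandwidth = 0.1                           # KDE bandwidth (assumed hyperparameter)
eps = 1e-12

for step in range(2000):
    u = torch.rand(512, 1)                # uniform inputs
    y = net(u).squeeze(-1)                # generated samples
    # Differentiable kernel-density estimate of the output distribution on the grid.
    k = torch.exp(-0.5 * ((grid[None, :] - y[:, None]) / bandwidth) ** 2)
    p_model = k.mean(dim=0)
    p_model = p_model / (p_model.sum() + eps)
    # Jensen-Shannon divergence between model and target distributions.
    m = 0.5 * (p_model + p_target)
    js = 0.5 * (p_model * ((p_model + eps) / (m + eps)).log()).sum() \
       + 0.5 * (p_target * ((p_target + eps) / (m + eps)).log()).sum()
    opt.zero_grad()
    js.backward()
    opt.step()
```

After training, drawing new samples only requires a forward pass on fresh uniform inputs, which is what gives this kind of learned sampler its efficiency compared to rejection or Markov chain Monte Carlo methods.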
Initialization of ReLUs for Dynamical Isometry
Deep learning relies on good initialization schemes and hyperparameter
choices prior to training a neural network. Random weight initializations
induce random network ensembles, which determine the trainability, training
speed, and sometimes also the generalization ability of an instance. In addition,
such ensembles provide theoretical insights into the space of candidate models
of which one is selected during training. The results obtained so far rely on
mean field approximations that assume infinite layer width and that study
average squared signals. We derive the joint signal output distribution
exactly, without mean field assumptions, for fully-connected networks with
Gaussian weights and biases, and analyze deviations from the mean field
results. For rectified linear units, we further discuss limitations of the
standard initialization scheme, such as its lack of dynamical isometry, and
propose a simple alternative that overcomes these limitations by initial parameter sharing.
Comment: NeurIPS 201
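The sketch below illustrates what dynamical isometry means in this context; it is not the paper's derivation or its proposed parameter-sharing scheme. It empirically computes the singular values of the input-output Jacobian of a deep fully connected ReLU network under the standard He initialization, where the spectrum typically spreads out with depth instead of concentrating near one. Depth, width, and the input distribution are assumptions chosen for the example.

```python
# Illustrative sketch: Jacobian singular values of a deep ReLU network at initialization.
import numpy as np

rng = np.random.default_rng(0)
depth, width = 20, 256                       # assumed depth and layer width

x = rng.standard_normal(width)               # assumed standard-normal input
J = np.eye(width)                            # accumulates the input-output Jacobian
for _ in range(depth):
    # Standard He initialization for ReLU layers (variance 2 / fan_in).
    W = rng.standard_normal((width, width)) * np.sqrt(2.0 / width)
    pre = W @ x
    mask = (pre > 0).astype(float)           # derivative of ReLU at the pre-activations
    J = (mask[:, None] * W) @ J              # chain rule: D_l W_l applied to running Jacobian
    x = np.maximum(pre, 0.0)

sv = np.linalg.svd(J, compute_uv=False)
# Exact dynamical isometry would mean all singular values close to 1; with standard
# Gaussian initialization of ReLU networks the spectrum typically widens with depth.
print("singular values  mean:", sv.mean(), " min:", sv.min(), " max:", sv.max())
```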