Greedy Bayesian Posterior Approximation with Deep Ensembles
Ensembles of independently trained neural networks are a state-of-the-art
approach to estimate predictive uncertainty in Deep Learning, and can be
interpreted as an approximation of the posterior distribution via a mixture of
delta functions. The training of ensembles relies on non-convexity of the loss
landscape and random initialization of their individual members, making the
resulting posterior approximation uncontrolled. This paper proposes a novel and
principled method to tackle this limitation, minimizing an f-divergence
between the true posterior and a kernel density estimator in a function space.
We analyze this objective from a combinatorial point of view, and show that it
is submodular with respect to mixture components for any f. Subsequently, we
consider the problem of ensemble construction, and from the marginal gain of
the total objective, we derive a novel diversity term for training ensembles
greedily. The performance of our approach is demonstrated on computer vision
out-of-distribution detection benchmarks in a range of architectures trained on
multiple datasets. The source code of our method is publicly available at
https://github.com/MIPT-Oulu/greedy_ensembles_training
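The greedy marginal-gain construction the abstract describes can be illustrated with a toy one-dimensional analogue. This is a sketch under heavy assumptions: the candidates below are plain Gaussian kernel centers rather than trained networks, the objective is an average KDE log-density rather than the paper's divergence objective, and the bandwidth `bw` is arbitrary; only the greedy loop that adds the component with the largest marginal gain mirrors the method.

```python
import numpy as np

rng = np.random.default_rng(0)

# Samples from a hypothetical bimodal "posterior" we want to approximate.
posterior = np.concatenate([rng.normal(-2, 0.5, 200), rng.normal(2, 0.5, 200)])

# Candidate mixture components, represented here by 1-D kernel centers
# (in the paper these would be ensemble members in a function space).
candidates = rng.uniform(-4, 4, 30)
bw = 0.5  # kernel bandwidth (arbitrary choice for this sketch)

def kde_log_lik(centers):
    """Average log-density of the posterior samples under a Gaussian KDE."""
    c = np.asarray(centers)
    d = posterior[:, None] - c[None, :]
    dens = np.exp(-0.5 * (d / bw) ** 2).mean(axis=1) / (bw * np.sqrt(2 * np.pi))
    return np.log(dens + 1e-12).mean()

# Greedy construction: at each step, add the candidate whose inclusion
# yields the largest marginal gain of the total objective.
chosen = []
for _ in range(5):
    gains = [kde_log_lik(chosen + [c]) for c in candidates]
    best = int(np.argmax(gains))
    chosen.append(candidates[best])
    candidates = np.delete(candidates, best)

print(sorted(round(float(c), 2) for c in chosen))
```

Because the objective is evaluated on the whole mixture, each greedy step implicitly rewards candidates that cover mass the current mixture misses, which is the role of the diversity term derived in the paper.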
Improving Convolutional Neural Network Design via Variable Neighborhood Search
An unsupervised method for convolutional neural network (CNN) architecture design is proposed. The method relies on a variable neighborhood search-based approach for finding CNN architectures and hyperparameter values that improve classification performance. For this purpose, t-Distributed Stochastic Neighbor Embedding (t-SNE) is applied to effectively represent the solution space in 2D. Then, k-Means clustering divides this representation space, taking into account the relative distances between neighbors. The algorithm is tested on the CIFAR-10 image dataset. The obtained solution improves the CNN validation loss by over 15% and the respective accuracy by 5%. Moreover, the network shows higher predictive power and robustness, validating our method for the optimization of CNN design. © Springer International Publishing AG 2017
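The embedding-and-partitioning step of this pipeline can be sketched as follows. The numeric encoding of CNN configurations below is a hypothetical assumption (the paper does not specify one here), and the variable neighborhood search itself is omitted; the sketch only shows the t-SNE projection to 2D followed by k-means partitioning of that space.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.manifold import TSNE

rng = np.random.default_rng(1)

# Hypothetical encoding of 40 candidate CNN configurations as vectors:
# [n_filters, kernel_size, n_layers, log10(learning_rate), dropout]
configs = np.column_stack([
    rng.integers(16, 129, 40),
    rng.choice([3, 5, 7], 40),
    rng.integers(2, 9, 40),
    rng.uniform(-4, -1, 40),
    rng.uniform(0.0, 0.5, 40),
]).astype(float)

# Step 1: represent the solution space in 2D with t-SNE.
emb = TSNE(n_components=2, perplexity=10, random_state=0).fit_transform(configs)

# Step 2: divide the 2D representation with k-means; a VNS procedure
# would then explore neighborhoods within and across these clusters.
labels = KMeans(n_clusters=4, n_init=10, random_state=0).fit_predict(emb)
print(np.bincount(labels))
```

The number of clusters (4 here) is an arbitrary choice for illustration; in practice it would be tied to the neighborhood structure the search uses.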
Neural Ordinary Differential Equation Control of Dynamics on Graphs
We study the ability of neural networks to calculate feedback control signals
that steer trajectories of continuous time non-linear dynamical systems on
graphs, which we represent with neural ordinary differential equations (neural
ODEs). To do so, we present a neural-ODE control (NODEC) framework and find
that it can learn feedback control signals that drive graph dynamical systems
into desired target states. While we use loss functions that do not constrain
the control energy, our results show, in accordance with related work, that
NODEC produces low energy control signals. Finally, we evaluate the performance
and versatility of NODEC against well-known feedback controllers and deep
reinforcement learning. We use NODEC to generate feedback controls for systems
of more than one thousand coupled, non-linear ODEs that represent epidemic
processes and coupled oscillators.
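The idea of learning a feedback control by differentiating through the integrated dynamics can be illustrated with a deliberately simplified analogue. Everything here is an assumption for illustration: the dynamics are linear diffusion on a ring graph rather than the paper's non-linear systems, the controller is a single scalar feedback gain rather than a neural network, and gradients come from finite differences rather than autodiff through a neural ODE.

```python
import numpy as np

# Diffusion dynamics on a 6-node ring graph: dx/dt = A x + u.
n = 6
A = -2.0 * np.eye(n)
for i in range(n):                      # couple each node to its ring neighbors
    A[i, (i - 1) % n] = A[i, (i + 1) % n] = 1.0

x0 = np.zeros(n)
target = np.ones(n)                     # desired steady state

def simulate(K, dt=0.01, steps=300):
    """Euler-integrate the dynamics under feedback control u = K (target - x)."""
    x = x0.copy()
    for _ in range(steps):
        u = K * (target - x)
        x = x + dt * (A @ x + u)
    return x

def loss(K):
    """Terminal-state loss: squared distance to the target at the end time."""
    return float(np.sum((simulate(K) - target) ** 2))

# "Train" the controller: gradient descent on the scalar gain K,
# with finite-difference gradients standing in for backprop through the ODE.
K, lr, eps = 0.0, 0.5, 1e-4
for _ in range(50):
    g = (loss(K + eps) - loss(K - eps)) / (2 * eps)
    K -= lr * g

print(loss(K) < loss(0.0))
```

The structure mirrors NODEC's training loop (integrate the controlled system, score the trajectory against a target, update the controller), while the learned object here is a single gain instead of a neural network.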