35 research outputs found

    Greedy Bayesian Posterior Approximation with Deep Ensembles

    Ensembles of independently trained neural networks are a state-of-the-art approach to estimate predictive uncertainty in Deep Learning, and can be interpreted as an approximation of the posterior distribution via a mixture of delta functions. The training of ensembles relies on non-convexity of the loss landscape and random initialization of their individual members, making the resulting posterior approximation uncontrolled. This paper proposes a novel and principled method to tackle this limitation, minimizing an f-divergence between the true posterior and a kernel density estimator in a function space. We analyze this objective from a combinatorial point of view, and show that it is submodular with respect to mixture components for any f. Subsequently, we consider the problem of ensemble construction, and from the marginal gain of the total objective, we derive a novel diversity term for training ensembles greedily. The performance of our approach is demonstrated on computer vision out-of-distribution detection benchmarks in a range of architectures trained on multiple datasets. The source code of our method is publicly available at https://github.com/MIPT-Oulu/greedy_ensembles_training
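    The greedy construction the abstract describes — at each step, add the member with the largest marginal gain of a submodular objective — can be illustrated with a minimal sketch. The toy set-coverage objective below stands in for the paper's f-divergence objective in function space, and `greedy_select` is an illustrative name, not the authors' API:

    ```python
    def greedy_select(candidates, objective, k):
        """Greedily build an ensemble of k members: each step adds the
        candidate whose marginal gain under `objective` is largest."""
        selected = []
        for _ in range(k):
            base = objective(selected)
            best, best_gain = None, float("-inf")
            for c in candidates:
                if c in selected:
                    continue
                gain = objective(selected + [c]) - base  # marginal gain
                if gain > best_gain:
                    best, best_gain = c, gain
            selected.append(best)
        return selected

    # Toy stand-in objective: set coverage (monotone submodular), where
    # each "candidate member" covers a set of items.
    coverage = lambda S: len(set().union(*S)) if S else 0
    members = [{1, 2, 3}, {2, 3, 4}, {5, 6}, {1}]
    ensemble = greedy_select(members, coverage, k=2)  # → [{1, 2, 3}, {5, 6}]
    ```

    For a monotone submodular objective, this greedy-by-marginal-gain scheme carries the classic (1 − 1/e) approximation guarantee, which is what makes the combinatorial analysis in the paper attractive.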

    Improving Convolutional Neural Network Design via Variable Neighborhood Search

    An unsupervised method for convolutional neural network (CNN) architecture design is proposed. The method relies on a variable neighborhood search-based approach for finding CNN architectures and hyperparameter values that improve classification performance. For this purpose, t-Distributed Stochastic Neighbor Embedding (t-SNE) is applied to effectively represent the solution space in 2D. Then, k-Means clustering divides this representation space, taking into account the relative distance between neighbors. The algorithm is tested on the CIFAR-10 image dataset. The obtained solution improves the CNN validation loss by over 15% and the respective accuracy by 5%. Moreover, the network shows higher predictive power and robustness, validating our method for the optimization of CNN design. © Springer International Publishing AG 2017
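    The clustering step of this pipeline can be sketched as follows. t-SNE itself is omitted; the synthetic 2D points stand in for t-SNE-embedded CNN configurations, and the plain Lloyd's k-means below is a minimal stand-in for the k-Means step (names are illustrative, not from the paper's code):

    ```python
    import random

    def kmeans(points, k, iters=20, seed=0):
        """Plain Lloyd's k-means on 2D points (e.g. a t-SNE embedding
        of CNN architecture/hyperparameter configurations)."""
        rng = random.Random(seed)
        centers = rng.sample(points, k)
        for _ in range(iters):
            # Assignment step: each point joins its nearest center.
            clusters = [[] for _ in range(k)]
            for p in points:
                j = min(range(k),
                        key=lambda i: (p[0] - centers[i][0]) ** 2
                                      + (p[1] - centers[i][1]) ** 2)
                clusters[j].append(p)
            # Update step: recompute each center as its cluster mean.
            centers = [(sum(p[0] for p in c) / len(c),
                        sum(p[1] for p in c) / len(c)) if c else centers[i]
                       for i, c in enumerate(clusters)]
        return centers, clusters

    # Two well-separated groups of "embedded configurations".
    pts = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
           (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
    centers, clusters = kmeans(pts, k=2)
    ```

    Each resulting cluster groups configurations that are close in the embedded space, which is what lets the neighborhood search exploit "the relative distance between neighbors" when exploring the solution space.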

    Neural Ordinary Differential Equation Control of Dynamics on Graphs

    We study the ability of neural networks to calculate feedback control signals that steer trajectories of continuous-time non-linear dynamical systems on graphs, which we represent with neural ordinary differential equations (neural ODEs). To do so, we present a neural-ODE control (NODEC) framework and find that it can learn feedback control signals that drive graph dynamical systems into desired target states. While we use loss functions that do not constrain the control energy, our results show, in accordance with related work, that NODEC produces low-energy control signals. Finally, we evaluate the performance and versatility of NODEC against well-known feedback controllers and deep reinforcement learning. We use NODEC to generate feedback controls for systems of more than one thousand coupled, non-linear ODEs that represent epidemic processes and coupled oscillators.
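    The overall control loop can be condensed into a toy sketch: integrate the controlled graph dynamics, score the final state against a target, and adjust the controller. This is only the shape of the NODEC idea, not the authors' implementation — the neural controller is replaced by a single learnable scalar feedback gain (u = −K·x) and backpropagation through the ODE solver by a finite-difference gradient on the terminal loss; all names are illustrative:

    ```python
    def simulate(K, x0, A, T=1.0, steps=100):
        """Euler-integrate the controlled graph dynamics dx/dt = A x + u
        with linear state feedback u = -K x (A is the coupling matrix)."""
        dt = T / steps
        x = list(x0)
        n = len(x)
        for _ in range(steps):
            dx = [sum(A[i][j] * x[j] for j in range(n)) - K * x[i]
                  for i in range(n)]
            x = [x[i] + dt * dx[i] for i in range(n)]
        return x

    def train_gain(x0, A, target, lr=0.05, iters=100, eps=1e-4):
        """Fit the gain K by finite-difference gradient descent on the
        squared distance between the final state and the target state."""
        loss = lambda K: sum((a - b) ** 2
                             for a, b in zip(simulate(K, x0, A), target))
        K = 0.0
        for _ in range(iters):
            grad = (loss(K + eps) - loss(K - eps)) / (2 * eps)
            K -= lr * grad
        return K

    # Triangle graph (each node coupled to the other two); the uncontrolled
    # dynamics grow, and the learned feedback drives all states to zero.
    A = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
    K = train_gain([1.0, 1.0, 1.0], A, target=[0.0, 0.0, 0.0])
    ```

    As in the paper's observation about control energy, nothing here penalizes the magnitude of u explicitly; the terminal loss alone drives the gain to a value that stabilizes the system.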