
    Beating the Perils of Non-Convexity: Guaranteed Training of Neural Networks using Tensor Methods

    Training neural networks is a challenging non-convex optimization problem, and backpropagation or gradient descent can get stuck in spurious local optima. We propose a novel algorithm based on tensor decomposition for guaranteed training of two-layer neural networks. We provide risk bounds for our proposed method, with a polynomial sample complexity in the relevant parameters, such as input dimension and number of neurons. While learning arbitrary target functions is NP-hard, we provide transparent conditions on the function and the input for learnability. Our training method is based on tensor decomposition, which provably converges to the global optimum under a set of mild non-degeneracy conditions. It consists of simple, embarrassingly parallel linear and multi-linear operations, and is competitive with standard stochastic gradient descent (SGD) in terms of computational complexity. Thus, we propose a computationally efficient method with guaranteed risk bounds for training neural networks with one hidden layer. Comment: The tensor decomposition analysis is expanded, and the analysis of ridge regression is added for recovering the parameters of the last layer of the neural network.
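
To make the tensor-based approach concrete, here is a minimal numerical sketch, not the authors' exact estimator: it forms a third-order cross-moment tensor under an assumed Gaussian input (the paper handles more general inputs via score functions), extracts hidden-unit directions with a deflation-based tensor power method, and fits the output layer by ridge regression. All function and parameter names are illustrative.

```python
# Illustrative sketch (not the paper's exact estimator): recover hidden-layer
# directions of a two-layer network from a third-order cross-moment tensor via
# the tensor power method, then fit the output layer with ridge regression.
import numpy as np

def score3_gaussian(x):
    """Third-order score S3(x) for x ~ N(0, I): x(x)x minus symmetrized identity-times-x terms."""
    d = x.shape[0]
    T = np.einsum('i,j,k->ijk', x, x, x)
    I = np.eye(d)
    T -= np.einsum('ij,k->ijk', I, x)
    T -= np.einsum('ik,j->ijk', I, x)
    T -= np.einsum('jk,i->ijk', I, x)
    return T

def moment_tensor(X, y):
    """Empirical cross moment E[y * S3(x)] over samples (rows of X)."""
    d = X.shape[1]
    T = np.zeros((d, d, d))
    for xi, yi in zip(X, y):
        T += yi * score3_gaussian(xi)
    return T / len(y)

def tensor_power_method(T, k, iters=100, seed=0):
    """Deflation-based power iteration extracting k (approximately) rank-1 components."""
    rng = np.random.default_rng(seed)
    d = T.shape[0]
    T = T.copy()
    W = []
    for _ in range(k):
        v = rng.standard_normal(d)
        v /= np.linalg.norm(v)
        for _ in range(iters):
            v = np.einsum('ijk,j,k->i', T, v, v)
            v /= np.linalg.norm(v)
        lam = np.einsum('ijk,i,j,k->', T, v, v, v)
        T -= lam * np.einsum('i,j,k->ijk', v, v, v)   # deflate the recovered component
        W.append(v)
    return np.stack(W)   # rows are estimated hidden-unit directions (up to sign and permutation)

def fit_last_layer(X, y, W, lam=1e-3):
    """Ridge regression for the output weights given recovered first-layer directions."""
    H = np.tanh(X @ W.T)   # assumed activation; the paper treats general smooth activations
    return np.linalg.solve(H.T @ H + lam * np.eye(H.shape[1]), H.T @ y)
```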

    Learning Two-layer Neural Networks with Symmetric Inputs

    We give a new algorithm for learning a two-layer neural network under a general class of input distributions. Assuming there is a ground-truth two-layer network y = Aσ(Wx) + ξ, where A, W are weight matrices, ξ represents noise, and the number of neurons in the hidden layer is no larger than the input or output dimension, our algorithm is guaranteed to recover the parameters A, W of the ground-truth network. The only requirement on the input x is that its distribution is symmetric, which still allows highly complicated and structured inputs. Our algorithm is based on the method-of-moments framework and extends several results in tensor decompositions. We use spectral algorithms to avoid the complicated non-convex optimization in learning neural networks. Experiments show that our algorithm can robustly learn the ground-truth neural network with a small number of samples for many symmetric input distributions.
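
A hedged illustration of the spectral idea behind such method-of-moments algorithms, not the paper's actual moment construction: when two moment matrices share the form M1 = A D1 Aᵀ and M2 = A D2 Aᵀ with diagonal D1, D2 and invertible A, the columns of A can be recovered by a single eigendecomposition instead of non-convex training. All names below are illustrative.

```python
# Sketch of simultaneous diagonalization, a standard building block of spectral
# method-of-moments algorithms (not the paper's exact construction).
import numpy as np

def recover_components(M1, M2):
    """Recover A (up to column scaling, sign, permutation) from M1 = A D1 A^T, M2 = A D2 A^T."""
    # M1 @ inv(M2) = A (D1 D2^{-1}) A^{-1}, so its eigenvectors are the columns of A.
    vals, vecs = np.linalg.eig(M1 @ np.linalg.inv(M2))
    return vecs.real

# Tiny synthetic check with a known ground-truth A:
rng = np.random.default_rng(0)
k = 4
A = rng.standard_normal((k, k))
D1, D2 = np.diag(rng.uniform(1, 2, k)), np.diag(rng.uniform(1, 2, k))
A_hat = recover_components(A @ D1 @ A.T, A @ D2 @ A.T)
# Columns of A_hat match columns of A up to sign, scale, and permutation.
```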

    Parametrised polyconvex hyperelasticity with physics-augmented neural networks

    In the present work, neural networks are applied to formulate parametrised hyperelastic constitutive models. The models fulfill all common mechanical conditions of hyperelasticity by construction. In particular, partially input-convex neural network (pICNN) architectures are applied based on feed-forward neural networks. Receiving two different sets of input arguments, pICNNs are convex in one of them, while for the other they represent arbitrary relationships which are not necessarily convex. In this way, the model can fulfill convexity conditions stemming from mechanical considerations without being too restrictive on the functional relationship in the additional parameters, which may not necessarily be convex. Two different models are introduced, where one can represent arbitrary functional relationships in the additional parameters, while the other is monotonic in the additional parameters. As a first proof of concept, the model is calibrated to data generated with two differently parametrised analytical potentials, whereby three different pICNN architectures are investigated. In all cases, the proposed model shows excellent performance.
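
For intuition, a minimal partially input-convex network in the spirit described above could look as follows; the layer sizes, conditioning scheme, and names are illustrative choices, not the authors' exact architecture. The z-path weights are made non-negative via a softplus reparametrisation and the z-path activations are convex and non-decreasing, so the output is convex in the input x while depending arbitrarily on the parameter input p.

```python
# Minimal sketch of a partially input-convex network (illustrative architecture).
import torch
import torch.nn as nn
import torch.nn.functional as F

class PICNN(nn.Module):
    def __init__(self, dim_x, dim_p, hidden=32, depth=3):
        super().__init__()
        widths = [hidden] * depth + [1]
        self.Wx = nn.ModuleList([nn.Linear(dim_x, w) for w in widths])   # affine path in x
        self.Wp = nn.ModuleList([nn.Linear(dim_p, w) for w in widths])   # unconstrained path in p
        # z-path weights (layer i-1 -> layer i); made non-negative in forward()
        self.Wz = nn.ParameterList([
            nn.Parameter(0.1 * torch.randn(widths[i + 1], widths[i]))
            for i in range(len(widths) - 1)
        ])

    def forward(self, x, p):
        # First layer: convex in x because softplus is convex and non-decreasing.
        z = F.softplus(self.Wx[0](x) + self.Wp[0](p))
        for i in range(1, len(self.Wx)):
            a = self.Wx[i](x) + self.Wp[i](p) + F.linear(z, F.softplus(self.Wz[i - 1]))
            z = F.softplus(a) if i < len(self.Wx) - 1 else a   # no activation on the output
        return z   # shape (batch, 1); convex in x for any fixed p

# Usage sketch:
# model = PICNN(dim_x=3, dim_p=2)
# y = model(torch.randn(8, 3), torch.randn(8, 2))   # shape (8, 1)
```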

    Principled Weight Initialisation for Input-Convex Neural Networks

    Input-Convex Neural Networks (ICNNs) are networks that guarantee convexity in their input-output mapping. These networks have been successfully applied for energy-based modelling, optimal transport problems, and learning invariances. The convexity of ICNNs is achieved by using non-decreasing convex activation functions and non-negative weights. Because of these peculiarities, previous initialisation strategies, which implicitly assume centred weights, are not effective for ICNNs. By studying signal propagation through layers with non-negative weights, we are able to derive a principled weight initialisation for ICNNs. Concretely, we generalise signal propagation theory by removing the assumption that weights are sampled from a centred distribution. In a set of experiments, we demonstrate that our principled initialisation effectively accelerates learning in ICNNs and leads to better generalisation. Moreover, we find that, in contrast to common belief, ICNNs can be trained without skip-connections when initialised correctly. Finally, we apply ICNNs to a real-world drug discovery task and show that they allow for more effective molecular latent space exploration. Comment: Presented at NeurIPS 2023.
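
The paper derives a closed-form initialisation for the non-negative weights; that formula is not reproduced here. The sketch below is only a data-dependent stand-in in the same spirit (an LSUV-style calibration): draw non-negative weights, then rescale and re-centre each layer so that pre-activation statistics stay stable on a calibration batch. Function names and constants are illustrative.

```python
# Hedged stand-in for a principled non-negative-weight initialisation:
# data-dependent rescaling of an ICNN z-path (not the paper's closed-form rule).
import torch
import torch.nn as nn

@torch.no_grad()
def init_nonneg_layers(layers, x, target_std=1.0):
    """layers: nn.Linear modules whose weights must stay non-negative (ICNN z-path)."""
    h = x
    for lin in layers:
        lin.weight.copy_(torch.rand_like(lin.weight))    # non-negative draw, uniform on [0, 1)
        lin.bias.zero_()
        pre = lin(h)
        lin.weight.mul_(target_std / pre.std())          # rescale to the target pre-activation std
        lin.bias.copy_(-lin(h).mean(dim=0))              # re-centre via the (unconstrained) bias
        h = torch.relu(lin(h))                           # convex, non-decreasing activation
    return layers

# Usage on a toy z-path and a calibration batch:
layers = [nn.Linear(16, 32), nn.Linear(32, 32), nn.Linear(32, 1)]
init_nonneg_layers(layers, torch.randn(256, 16))
```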

    Stochastic Training of Neural Networks via Successive Convex Approximations

    This paper proposes a new family of algorithms for training neural networks (NNs). These are based on recent developments in the field of non-convex optimization, going under the general name of successive convex approximation (SCA) techniques. The basic idea is to iteratively replace the original (non-convex, high-dimensional) learning problem with a sequence of (strongly convex) approximations, which are both accurate and simple to optimize. In contrast to similar ideas (e.g., quasi-Newton algorithms), the approximations can be constructed using only first-order information about the neural network function, in a stochastic fashion, while exploiting the overall structure of the learning problem for faster convergence. We discuss several use cases, based on different choices for the loss function (e.g., squared loss and cross-entropy loss) and for the regularization of the NN's weights. We experiment on several medium-sized benchmark problems and on a large-scale dataset involving simulated physical data. The results show that the algorithm outperforms state-of-the-art techniques, providing faster convergence to a better minimum. Additionally, we show how the algorithm can be easily parallelized over multiple computational units without hindering its performance. In particular, each computational unit can optimize a tailored surrogate function defined on a randomly assigned subset of the input variables, whose dimension can be selected depending entirely on the available computational power. Comment: Preprint submitted to IEEE Transactions on Neural Networks and Learning Systems.
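
A minimal sketch of one stochastic SCA update, not the paper's exact surrogate (which can exploit more of the network's structure): build a strongly convex surrogate from a minibatch gradient plus an l1 regulariser, minimise it in closed form by soft-thresholding, and move toward the minimiser with a diminishing step size. The names grad_fn, w0, and the step-size schedule are illustrative placeholders.

```python
# Sketch of stochastic successive convex approximation (SCA) with an l1-regularised
# quadratic surrogate; illustrative, not the paper's exact algorithm.
import numpy as np

def soft_threshold(v, t):
    """Elementwise minimiser of 0.5*(u - v)^2 + t*|u|."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def sca_train(grad_fn, w0, n_steps=1000, tau=1.0, lam=1e-3):
    """grad_fn(w, k) -> stochastic (minibatch) gradient of the non-convex loss at w."""
    w = w0.copy()
    for k in range(n_steps):
        g = grad_fn(w, k)
        # Strongly convex surrogate around w:  g^T (u - w) + tau/2 ||u - w||^2 + lam ||u||_1.
        # Its minimiser is a soft-thresholded gradient step:
        w_hat = soft_threshold(w - g / tau, lam / tau)
        gamma = 1.0 / (k + 2) ** 0.6          # diminishing, non-summable step size
        w = w + gamma * (w_hat - w)           # move toward the surrogate minimiser
    return w
```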