256 research outputs found

    Principled Weight Initialisation for Input-Convex Neural Networks

    Input-Convex Neural Networks (ICNNs) are networks that guarantee convexity in their input-output mapping. These networks have been successfully applied for energy-based modelling, optimal transport problems and learning invariances. The convexity of ICNNs is achieved by using non-decreasing convex activation functions and non-negative weights. Because of these peculiarities, previous initialisation strategies, which implicitly assume centred weights, are not effective for ICNNs. By studying signal propagation through layers with non-negative weights, we are able to derive a principled weight initialisation for ICNNs. Concretely, we generalise signal propagation theory by removing the assumption that weights are sampled from a centred distribution. In a set of experiments, we demonstrate that our principled initialisation effectively accelerates learning in ICNNs and leads to better generalisation. Moreover, we find that, in contrast to common belief, ICNNs can be trained without skip-connections when initialised correctly. Finally, we apply ICNNs to a real-world drug discovery task and show that they allow for more effective molecular latent space exploration. Comment: Presented at NeurIPS 202
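    As a rough illustration of the ICNN construction this abstract refers to (non-negative weights combined with convex, non-decreasing activations), below is a minimal PyTorch sketch of a single ICNN hidden layer. The class name, the softplus reparameterisation of the non-negative weights, and the initial scale are illustrative assumptions; this is not the principled initialisation derived in the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class ICNNLayer(nn.Module):
    """One hidden layer of an input-convex network (illustrative sketch).

    Computes z_next = g(W_z z + W_x x + b), where W_z is constrained to be
    non-negative and g is a convex, non-decreasing activation (softplus here).
    The direct connection W_x x from the raw input is the standard ICNN
    skip path; the paper argues it can be dropped with a suitable
    initialisation.
    """
    def __init__(self, in_dim, hidden_dim, out_dim):
        super().__init__()
        # Unconstrained parameter, mapped to non-negative weights via softplus.
        # The 0.1 scale is an arbitrary placeholder, not the paper's scheme.
        self.Wz_raw = nn.Parameter(torch.randn(out_dim, hidden_dim) * 0.1)
        self.Wx = nn.Linear(in_dim, out_dim)   # input skip path, may be signed
        self.bias = nn.Parameter(torch.zeros(out_dim))

    def forward(self, x, z):
        Wz = F.softplus(self.Wz_raw)            # enforce W_z >= 0
        return F.softplus(z @ Wz.T + self.Wx(x) + self.bias)
```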

    Data-Driven Mirror Descent with Input-Convex Neural Networks

    Learning-to-optimize is an emerging framework that seeks to speed up the solution of certain optimization problems by leveraging training data. Learned optimization solvers have been shown to outperform classical optimization algorithms in terms of convergence speed, especially for convex problems. Many existing data-driven optimization methods are based on parameterizing the update step and learning the optimal parameters (typically scalars) from the available data. We propose a novel functional parameterization approach for learned convex optimization solvers based on the classical mirror descent (MD) algorithm. Specifically, we seek to learn the optimal Bregman distance in MD by modeling the underlying convex function using an input-convex neural network (ICNN). The parameters of the ICNN are learned by minimizing the target objective function evaluated at the MD iterate after a predetermined number of iterations. The inverse of the mirror map is modeled approximately using another neural network, as the exact inverse is intractable to compute. We derive convergence rate bounds for the proposed learned mirror descent (LMD) approach with an approximate inverse mirror map and perform extensive numerical evaluation on various convex problems such as image inpainting, denoising, and learning a two-class support vector machine (SVM) classifier and a multi-class linear classifier on fixed features.
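    A minimal sketch of one learned mirror descent step in the spirit of this abstract, assuming the Bregman potential `psi` is an ICNN and `psi_inv` is a second network approximating the inverse mirror map. The function and argument names are hypothetical, and the paper's convergence analysis for the approximate inverse is not reflected here.

```python
import torch

def lmd_step(x, grad_f, psi, psi_inv, step_size):
    """One learned mirror descent (LMD) iteration, illustrative only.

    psi      : ICNN modelling the Bregman potential, maps R^d -> R
    psi_inv  : network approximating the inverse mirror map (grad psi)^{-1}
    grad_f   : gradient of the target objective evaluated at x
    """
    if not x.requires_grad:
        x = x.detach().requires_grad_(True)
    # Forward mirror map: y = grad psi(x), obtained by autodiff through the ICNN.
    # create_graph=True keeps the graph so the ICNN can be trained through
    # the unrolled iterates.
    grad_psi = torch.autograd.grad(psi(x).sum(), x, create_graph=True)[0]
    # Gradient step in the dual (mirror) space.
    y = grad_psi - step_size * grad_f
    # Approximate inverse mirror map returns the next primal iterate.
    return psi_inv(y)
```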

    A computational framework for nanotrusses: input convex neural networks approach

    The present research aims to provide a practical numerical tool for the mechanical analysis of nanoscale trusses with similar accuracy to molecular dynamics (MD). As a first step, MD simulations of uniaxial tensile and compression tests of all possible chiralities of single-walled carbon nanotubes up to 4 nm in diameter were performed using the AIREBO potential. The results form a dataset of stress-strain curves that was then used to develop a neural network serving as a surrogate constitutive model for all nanotubes considered. The cornerstone of the new framework is a partially input convex integrable neural network. It turns out that convexity enables the favorable convergence properties required for implementation in the classical nonlinear truss finite element available in Abaqus. This completes a molecular dynamics-machine learning-finite element framework suitable for the static analysis of large, nanoscale, truss-like structures. The performance is verified through a comprehensive set of examples that demonstrate ease of use, accuracy, and robustness.
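    A simplified sketch of what a partially input-convex, integrable surrogate of this kind might look like: the learned energy is convex in the strain input, unconstrained in the nanotube descriptors, and the stress follows by differentiating the energy (which is what "integrable" points to). Layer sizes, activations, and the exact coupling of the two input paths are assumptions for illustration, not the architecture used in the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PartiallyConvexSurrogate(nn.Module):
    """Illustrative partially input-convex energy surrogate W(strain; params)."""

    def __init__(self, n_params, hidden=32):
        super().__init__()
        self.u1 = nn.Linear(n_params, hidden)    # unconstrained descriptor path
        self.z1 = nn.Linear(1, hidden)           # strain enters the convex path
        # Non-negative output weights keep the energy convex in strain.
        self.Wz_raw = nn.Parameter(torch.randn(1, hidden) * 0.1)
        self.zx = nn.Linear(1, 1)                # affine term in strain

    def energy(self, strain, params):
        u = torch.tanh(self.u1(params))          # descriptors shift the convex path
        z = F.softplus(self.z1(strain) + u)      # convex, non-decreasing in strain
        Wz = F.softplus(self.Wz_raw)             # enforce non-negative weights
        return z @ Wz.T + self.zx(strain)        # convex in strain

    def stress(self, strain, params):
        # Stress as the derivative of the learned strain energy.
        strain = strain.detach().requires_grad_(True)
        w = self.energy(strain, params).sum()
        return torch.autograd.grad(w, strain, create_graph=True)[0]
```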