47,304 research outputs found

    The star-shaped space of solutions of the spherical negative perceptron

    Empirical studies on the landscape of neural networks have shown that low-energy configurations are often found in complex connected structures, where zero-energy paths between pairs of distant solutions can be constructed. Here we consider the spherical negative perceptron, a prototypical non-convex neural network model framed as a continuous constraint satisfaction problem. We introduce a general analytical method for computing energy barriers in the simplex with vertex configurations sampled from the equilibrium. We find that in the over-parameterized regime the solution manifold displays simple connectivity properties. There exists a large geodesically convex component that is attractive for a wide range of optimization dynamics. Inside this region we identify a subset of atypical high-margin solutions that are geodesically connected with most other solutions, giving rise to a star-shaped geometry. We analytically characterize the organization of the connected space of solutions and show numerical evidence of a transition, at larger constraint densities, where the aforementioned simple geodesic connectivity breaks down. Comment: 27 pages, 16 figures, comments are welcome
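
As a rough numerical illustration of what "geodesically connected" means in the abstract above, the sketch below measures the energy along the great-circle arc between two weight vectors. It is not the paper's analytical method. It assumes the common textbook convention for the spherical perceptron: weights `w` on the sphere `|w|^2 = N`, a pattern `xi` counted as satisfied when `xi . w / sqrt(N) >= kappa` with `kappa < 0`, and energy equal to the number of violated constraints. The function names (`energy`, `geodesic`, `barrier`) are hypothetical.

```python
import numpy as np


def energy(w, patterns, kappa):
    """Count violated margin constraints (assumed convention: xi.w/sqrt(N) >= kappa)."""
    n = w.shape[0]
    margins = patterns @ w / np.sqrt(n)
    return int(np.sum(margins < kappa))


def geodesic(w1, w2, t):
    """Point at fraction t along the great-circle arc from w1 to w2.

    Both endpoints are assumed to satisfy |w|^2 = N and not to be antipodal.
    """
    n = w1.shape[0]
    cos_omega = np.clip(w1 @ w2 / n, -1.0, 1.0)
    omega = np.arccos(cos_omega)
    if omega < 1e-12:  # endpoints coincide, so the path is a single point
        return w1.copy()
    return (np.sin((1.0 - t) * omega) * w1 + np.sin(t * omega) * w2) / np.sin(omega)


def barrier(w1, w2, patterns, kappa, steps=101):
    """Largest energy met on a uniform grid of points along the geodesic."""
    return max(
        energy(geodesic(w1, w2, t), patterns, kappa)
        for t in np.linspace(0.0, 1.0, steps)
    )
```

Under these assumptions, two zero-energy solutions are geodesically connected (up to the grid resolution) when `barrier` returns 0 for them.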

    Modification of the mean-square error principle to double the convergence speed of a special case of Hopfield neural network used to segment pathological liver color images

    BACKGROUND: This paper analyzes the effect of the mean-square error principle on the optimization process using a Special Case of Hopfield Neural Network (SCHNN). METHODS: The segmentation of multidimensional medical and colour images can be formulated as an energy function composed of two terms: the sum of squared errors, and a noise term used to keep the network from getting stuck in early local minima of the energy landscape. RESULTS: Here, we show that a sum of weighted errors, higher than the simple squared error, leads the SCHNN classifier to reach a local minimum closer to the global minimum faster, while still ensuring acceptable segmentation results. CONCLUSIONS: The proposed segmentation method is used to segment 20 pathological liver colour images, and is shown to be efficient and effective enough for use in clinics.

    Proving Linear Mode Connectivity of Neural Networks via Optimal Transport

    The energy landscape of high-dimensional non-convex optimization problems is crucial to understanding the effectiveness of modern deep neural network architectures. Recent works have experimentally shown that two different solutions found after two runs of stochastic training are often connected by very simple continuous paths (e.g., linear) modulo a permutation of the weights. In this paper, we provide a framework theoretically explaining this empirical observation. Based on convergence rates in Wasserstein distance of empirical measures, we show that, with high probability, two wide enough two-layer neural networks trained with stochastic gradient descent are linearly connected. Additionally, we express upper and lower bounds on the width of each layer of two deep neural networks with independent neuron weights to be linearly connected. Finally, we empirically demonstrate the validity of our approach by showing how the dimension of the support of the weight distribution of neurons, which dictates Wasserstein convergence rates, is correlated with linear mode connectivity.

    Artificial Neural Network in Cosmic Landscape

    In this paper we propose that artificial neural networks, the basis of machine learning, are useful for generating the inflationary landscape from a cosmological point of view. Traditional numerical simulations of a global cosmic landscape typically have exponential complexity when the number of fields is large. However, a basic application of an artificial neural network could solve the problem, based on the universal approximation theorem of the multilayer perceptron. A toy model of inflation with multiple light fields is investigated numerically as an example of such an application. Comment: v2, add some new content