26 research outputs found

    Representation mitosis in wide neural networks

    Deep neural networks (DNNs) defy the classical bias-variance trade-off: adding parameters to a DNN that interpolates its training data will typically improve its generalization performance. Explaining the mechanism behind this "benign overfitting" in deep networks remains an outstanding challenge. Here, we study the last hidden layer representations of various state-of-the-art convolutional neural networks and find evidence for an underlying mechanism that we call "representation mitosis": if the last hidden representation is wide enough, its neurons tend to split into groups that carry identical information and differ from each other only by statistically independent noise. As in a mitosis process, the number of such groups, or "clones", increases linearly with the width of the layer, but only if the width is above a critical value. We show that a key ingredient to activate mitosis is continuing the training process until the training error is zero

    Intrinsic dimension estimation for discrete metrics

    Real-world datasets characterized by discrete features are ubiquitous: from categorical surveys to clinical questionnaires, from unweighted networks to DNA sequences. Nevertheless, the most common unsupervised dimensionality reduction methods are designed for continuous spaces, and their use for discrete spaces can lead to errors and biases. In this letter we introduce an algorithm to infer the intrinsic dimension (ID) of datasets embedded in discrete spaces. We demonstrate its accuracy on benchmark datasets, and we apply it to analyze a metagenomic dataset for species fingerprinting, finding a surprisingly small ID, of order 2. This suggests that evolutionary pressure acts on a low-dimensional manifold despite the high dimensionality of the sequence space. Comment: RevTeX4.2, 13 pages, 10 figure

    Efficient nonparametric n-body force fields from machine learning

    We provide a definition and explicit expressions for n-body Gaussian Process (GP) kernels which can learn any interatomic interaction occurring in a physical system, up to n-body contributions, for any value of n. The series is complete, as it can be shown that the "universal approximator" squared exponential kernel can be written as a sum of n-body kernels. These recipes enable the choice of optimally efficient force models for each target system, as confirmed by extensive testing on various materials. We furthermore describe how the n-body kernels can be "mapped" on equivalent representations that provide database-size-independent predictions and are thus crucially more efficient. We explicitly carry out this mapping procedure for the first non-trivial (3-body) kernel of the series, and show that this reproduces the GP-predicted forces with meV/Å accuracy while being orders of magnitude faster. These results open the way to using novel force models (here named "M-FFs") that are computationally as fast as their corresponding standard parametrised n-body force fields, while retaining the nonparametric character, the ease of training and validation, and the accuracy of the best recently proposed machine learning potentials. Comment: 13 pages, 8 captioned figure

    Accurate interatomic force fields via machine learning with covariant kernels

    We present a novel scheme to accurately predict atomic forces as vector quantities, rather than sets of scalar components, by Gaussian Process (GP) Regression. This is based on matrix-valued kernel functions, on which we impose the requirements that the predicted force rotates with the target configuration and is independent of any rotations applied to the configuration database entries. We show that such covariant GP kernels can be obtained by integration over the elements of the rotation group SO(d) for the relevant dimensionality d. Remarkably, in specific cases the integration can be carried out analytically and yields a conservative force field that can be recast into a pair interaction form. Finally, we show that restricting the integration to a summation over the elements of a finite point group relevant to the target system is sufficient to recover an accurate GP. The accuracy of our kernels in predicting quantum-mechanical forces in real materials is investigated by tests on pure and defective Ni, Fe and Si crystalline systems
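The finite-group construction mentioned in this abstract can be illustrated with a small numerical sketch. This is not the authors' code. It assumes a 2D toy setting, the cyclic group C4 as the point group, and a simple Gaussian-overlap base kernel. The function names (`rotation`, `base_kernel`, `covariant_kernel`) and the width `sigma` are hypothetical choices made for illustration. The sketch builds a matrix-valued kernel by summing over group elements and checks that it rotates with the target configuration for rotations belonging to the group.

```python
import numpy as np

def rotation(theta):
    """2x2 rotation matrix for angle theta (radians)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def base_kernel(conf_a, conf_b, sigma=0.5):
    """Scalar base kernel (illustrative assumption): sum of Gaussians of all
    pairwise distances between neighbour vectors of the two configurations.
    It is unchanged when both configurations are rotated together."""
    diff = conf_a[:, None, :] - conf_b[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    return np.exp(-sq_dist / (4.0 * sigma ** 2)).sum()

def covariant_kernel(conf_a, conf_b, group, sigma=0.5):
    """Matrix-valued kernel K(a, b) = (1/|G|) * sum_R R * k(a, R b)."""
    k = np.zeros((2, 2))
    for rot in group:
        # Rows of conf_b are vectors, so rotating them is conf_b @ rot.T
        k += rot * base_kernel(conf_a, conf_b @ rot.T, sigma)
    return k / len(group)

# Finite point group assumed here: C4, rotations by multiples of 90 degrees.
group = [rotation(2.0 * np.pi * i / 4) for i in range(4)]

rng = np.random.default_rng(0)
conf_a = rng.normal(size=(5, 2))  # toy neighbour positions, target
conf_b = rng.normal(size=(6, 2))  # toy neighbour positions, database entry

S = group[1]  # a 90-degree rotation from the group
K = covariant_kernel(conf_a, conf_b, group)

# Rotating the target configuration rotates the kernel: K(S a, b) = S K(a, b)
left_ok = np.allclose(covariant_kernel(conf_a @ S.T, conf_b, group), S @ K)
# Rotating the database entry acts from the right: K(a, S b) = K(a, b) S^T
right_ok = np.allclose(covariant_kernel(conf_a, conf_b @ S.T, group), K @ S.T)
```

With a finite group, the two identities are only guaranteed for rotations that belong to the group. The abstract's claim is that this restriction is sufficient in practice when the group matches the symmetry of the target system.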

    Ab initio Molecular Dynamics Trajectories of Metallic Systems - Positions and Forces

    The files consist of picoseconds-long canonical (and thermalised) trajectories of 4 metallic crystalline systems. Within each file, positions and forces of all the atoms are saved. The time-step was chosen to be 2 fs. The temperature was controlled by a loosely coupled Langevin thermostat. The periodic cell was taken of dimension 4x4x4.
    Details of each file:
    Ni_500K.xyz: Nickel, 500K.
    Ni_1700K.xyz: Nickel, 1700K.
    Fe_500K.xyz: Iron, 500K.
    Fe_500K_vac.xyz: Iron, 500K, with a single vacancy.
    Utility: The data can be used to reproduce the results of the associated publication and for further developments of closely related research.
    A. Glielmo, P. Sollich, A. De Vita, “Accurate Interatomic Force Fields via Machine Learning with Covariant Kernels”, Physical Review B. Submitte