3,542 research outputs found

    A jamming transition from under- to over-parametrization affects loss landscape and generalization

    We argue that in fully-connected networks a phase transition delimits the over- and under-parametrized regimes where fitting can or cannot be achieved. Under some general conditions, we show that this transition is sharp for the hinge loss. In the whole over-parametrized regime, poor minima of the loss are not encountered during training since the number of constraints to satisfy is too small to hamper minimization. Our findings support a link between this transition and the generalization properties of the network: as we increase the number of parameters of a given model, starting from an under-parametrized network, we observe that the generalization error displays three phases: (i) initial decay, (ii) increase until the transition point, where it displays a cusp, and (iii) slow decay toward a constant for the rest of the over-parametrized regime. Thereby we identify the region where the classical phenomenon of over-fitting takes place, and the region where the model keeps improving, in line with previous empirical observations for modern neural networks. Comment: arXiv admin note: text overlap with arXiv:1809.0934
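
    The "constraints" in this abstract can be read as per-sample margin conditions under the hinge loss. The sketch below is only an illustration of that reading: it assumes the standard linear hinge loss with unit margin (the paper may use a different variant), and the function names and sample margins are hypothetical.

```python
def hinge_loss(margins):
    """Mean hinge loss over per-sample margins y_i * f(x_i) (unit margin assumed)."""
    return sum(max(0.0, 1.0 - m) for m in margins) / len(margins)


def unsatisfied_constraints(margins):
    """Count samples whose margin condition y_i * f(x_i) >= 1 is violated."""
    return sum(1 for m in margins if m < 1.0)


# Hypothetical margins for four training samples.
margins = [2.0, 1.5, 0.5, -1.0]
loss = hinge_loss(margins)
n_unsat = unsatisfied_constraints(margins)
```

    Under this reading, the training data are fitted when the count of unsatisfied constraints reaches zero, at which point the hinge loss is exactly zero as well.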

    Structured Training for Neural Network Transition-Based Parsing

    We present structured perceptron training for neural network transition-based dependency parsing. We learn the neural network representation using a gold corpus augmented by a large number of automatically parsed sentences. Given this fixed network representation, we learn a final layer using the structured perceptron with beam-search decoding. On the Penn Treebank, our parser reaches 94.26% unlabeled and 92.41% labeled attachment accuracy, which to our knowledge is the best accuracy on Stanford Dependencies to date. We also provide in-depth ablative analysis to determine which aspects of our model provide the largest gains in accuracy.
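
    For orientation, a generic structured perceptron update can be sketched as follows. This is a minimal illustration, not the authors' implementation: beam-search decoding is omitted, and the feature names, the sparse-dictionary representation, and the `lr` parameter are assumptions made for the example.

```python
from collections import defaultdict


def perceptron_update(weights, gold_features, predicted_features, lr=1.0):
    """One structured perceptron step on sparse feature counts.

    Weights move toward the features of the gold structure and away from
    those of the predicted structure; nothing changes if the two agree.
    """
    if gold_features == predicted_features:
        return weights
    for name, value in gold_features.items():
        weights[name] += lr * value
    for name, value in predicted_features.items():
        weights[name] -= lr * value
    return weights


# Hypothetical feature counts for a gold parse and a wrongly predicted parse.
weights = defaultdict(float)
gold = {"shift": 1, "left_arc": 1}
predicted = {"left_arc": 1, "right_arc": 1}
perceptron_update(weights, gold, predicted)
```

    Features shared by both structures cancel out, so only the features that distinguish the gold parse from the prediction are adjusted.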

    Machine learning plasma-surface interface for coupling sputtering and gas-phase transport simulations

    Thin film processing by means of sputter deposition inherently depends on the interaction of energetic particles with a target surface and the subsequent particle transport. The length and time scales of the underlying physical phenomena span orders of magnitude. A theoretical description which bridges all time and length scales is not practically possible. In particular, advantage can be taken of the well-separated time scales of the fundamental surface and plasma processes. Initially, surface properties may be calculated from a surface model and stored for a number of representative cases. Subsequently, the surface data may be provided to gas-phase transport simulations via appropriate model interfaces (e.g., analytic expressions or look-up tables) and utilized to define insertion boundary conditions. During run-time evaluation, however, the maintained surface data may prove insufficient. In this case, missing data may be obtained by interpolation (common), extrapolation (inaccurate), or be supplied on-demand by the surface model (computationally inefficient). In this work, a potential alternative is established based on machine learning techniques using artificial neural networks. As a proof of concept, a multilayer perceptron network is trained and verified with sputtered particle distributions obtained from transport of ions in matter based simulations for Ar projectiles bombarding a Ti-Al composite. It is demonstrated that the trained network is able to predict the sputtered particle distributions for unknown, arbitrarily shaped incident ion energy distributions. It is consequently argued that the trained network may be readily used as a machine learning based model interface (e.g., by quasi-continuously sampling the desired sputtered particle distributions from the network), which is sufficiently accurate also in scenarios on which it has not previously been trained.

    Flood. An open source neural networks C++ library

    The multilayer perceptron is an important model of neural network, and much of the literature in the field refers to that model. The multilayer perceptron has found a wide range of applications, which include function regression, pattern recognition, time series prediction, optimal control, optimal shape design and inverse problems. All these problems can be formulated as variational problems. This neural network can learn either from databases or from mathematical models. Flood is a comprehensive class library which implements the multilayer perceptron in the C++ programming language. It has been developed following the functional analysis and calculus of variations theories. In this regard, this software tool can be used for the whole range of applications mentioned above. Flood also provides a workaround for the solution of function optimization problems.
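
    As a rough illustration of the model this abstract describes, the forward pass of a multilayer perceptron with one hidden layer can be sketched as below. This is not Flood's API (Flood is a C++ library); the function name, the tanh activation, the linear output layer, and the example weights are all assumptions made for the sketch.

```python
import math


def mlp_forward(x, w_hidden, b_hidden, w_out, b_out):
    """Forward pass of a one-hidden-layer perceptron.

    Each hidden unit applies tanh to a weighted sum of the inputs plus a
    bias; each output unit is a linear combination of the hidden units.
    """
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return [
        sum(w * h for w, h in zip(row, hidden)) + b
        for row, b in zip(w_out, b_out)
    ]


# Hypothetical network: 2 inputs, 2 hidden units, 1 output.
output = mlp_forward(
    [0.0, 0.0],
    w_hidden=[[1.0, 2.0], [3.0, 4.0]],
    b_hidden=[0.0, 0.0],
    w_out=[[1.0, 1.0]],
    b_out=[0.5],
)
```

    With zero inputs and zero hidden biases, every hidden activation is tanh(0) = 0, so the output reduces to the output bias alone.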