A jamming transition from under- to over-parametrization affects loss landscape and generalization
We argue that in fully-connected networks a phase transition delimits the
over- and under-parametrized regimes where fitting can or cannot be achieved.
Under some general conditions, we show that this transition is sharp for the
hinge loss. In the whole over-parametrized regime, poor minima of the loss are
not encountered during training since the number of constraints to satisfy is
too small to hamper minimization. Our findings support a link between this
transition and the generalization properties of the network: as we increase the
number of parameters of a given model, starting from an under-parametrized
network, we observe that the generalization error displays three phases: (i)
initial decay, (ii) increase until the transition point --- where it displays a
cusp --- and (iii) slow decay toward a constant for the rest of the
over-parametrized regime. Thereby we identify the region where the classical
phenomenon of over-fitting takes place, and the region where the model keeps
improving, in line with previous empirical observations for modern neural
networks.
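As a toy numerical illustration of the hinge-loss constraint counting above (values chosen for the example, not taken from the paper's experiments), the sketch below counts the training examples with margin below one, which are exactly the constraints that still contribute to the loss:

```python
import numpy as np

def hinge_loss(margins):
    """Hinge loss: only examples with margin below 1, i.e. unsatisfied
    constraints, contribute; the loss vanishes once all are satisfied."""
    return float(np.maximum(0.0, 1.0 - margins).sum())

# Illustrative margins y_i * f(x_i) for five training points.
margins = np.array([1.5, 0.2, -0.3, 2.0, 1.1])
unsatisfied = int((margins < 1.0).sum())  # constraints still hampering minimization

print(hinge_loss(margins), unsatisfied)
```

In the over-parametrized regime the network has enough freedom to push every margin above 1, so the loss reaches zero; the jamming transition marks where this first becomes possible.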
Structured Training for Neural Network Transition-Based Parsing
We present structured perceptron training for neural network transition-based
dependency parsing. We learn the neural network representation using a gold
corpus augmented by a large number of automatically parsed sentences. Given
this fixed network representation, we learn a final layer using the structured
perceptron with beam-search decoding. On the Penn Treebank, our parser reaches
94.26% unlabeled and 92.41% labeled attachment accuracy, which to our knowledge
is the best accuracy on Stanford Dependencies to date. We also provide in-depth
ablative analysis to determine which aspects of our model provide the largest
gains in accuracy.
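The final-layer learning described above relies on the structured perceptron update; a minimal sketch of that update rule (with hypothetical feature vectors, not the paper's parser features) is:

```python
import numpy as np

def structured_perceptron_update(w, feats_gold, feats_pred, lr=1.0):
    """Classic structured perceptron update: if the decoded sequence
    differs from the gold sequence, move the final-layer weights toward
    the gold feature vector and away from the predicted one."""
    if not np.array_equal(feats_gold, feats_pred):
        w = w + lr * (feats_gold - feats_pred)
    return w

# Hypothetical 4-dimensional feature vectors for two derivations.
w = np.zeros(4)
feats_gold = np.array([1.0, 0.0, 1.0, 0.0])  # features of the gold transition sequence
feats_pred = np.array([0.0, 1.0, 1.0, 0.0])  # features of the decoded (wrong) sequence
w = structured_perceptron_update(w, feats_gold, feats_pred)
print(w)  # shared features cancel; only the disagreeing features update
```

In the paper this update is driven by beam-search decoding over transition sequences; here the predicted feature vector simply stands in for the beam's highest-scoring derivation.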
Machine learning plasma-surface interface for coupling sputtering and gas-phase transport simulations
Thin film processing by means of sputter deposition inherently depends on the
interaction of energetic particles with a target surface and the subsequent
particle transport. The length and time scales of the underlying physical
phenomena span orders of magnitudes. A theoretical description which bridges
all time and length scales is not practically possible. Advantage can be taken
particularly from the well-separated time scales of the fundamental surface and
plasma processes. Initially, surface properties may be calculated from a
surface model and stored for a number of representative cases. Subsequently,
the surface data may be provided to gas-phase transport simulations via
appropriate model interfaces (e.g., analytic expressions or look-up tables) and
utilized to define insertion boundary conditions. During run-time evaluation,
however, the maintained surface data may prove insufficient. In this
case, missing data may be obtained by interpolation (common), extrapolation
(inaccurate), or be supplied on-demand by the surface model (computationally
inefficient). In this work, a potential alternative is established based on
machine learning techniques using artificial neural networks. As a proof of
concept, a multilayer perceptron network is trained and verified with sputtered
particle distributions obtained from transport-of-ions-in-matter based
simulations for Ar projectiles bombarding a Ti-Al composite. It is demonstrated
that the trained network is able to predict the sputtered particle
distributions for unknown, arbitrarily shaped incident ion energy
distributions. It is consequently argued that the trained network may be
readily used as a machine-learning-based model interface (e.g., by
quasi-continuously sampling the desired sputtered particle distributions from
the network), which remains sufficiently accurate even in scenarios on which it
has not been previously trained.
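As a shape-level sketch of such a model interface (untrained random weights and hypothetical bin counts; the paper trains the network on transport-of-ions-in-matter simulation data), a one-hidden-layer perceptron mapping a normalized incident energy distribution to a sputtered particle distribution could look like:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    """Normalize a score vector into a probability distribution."""
    e = np.exp(z - z.max())
    return e / e.sum()

# Hypothetical bin counts: 16 incident-energy bins in, 8 sputtered bins out.
W1 = rng.normal(size=(16, 32)); b1 = np.zeros(32)
W2 = rng.normal(size=(32, 8));  b2 = np.zeros(8)

incident = softmax(rng.normal(size=16))  # a normalized incident ion energy distribution
hidden = np.tanh(incident @ W1 + b1)     # hidden layer of the perceptron
sputtered = softmax(hidden @ W2 + b2)    # predicted sputtered particle distribution

print(sputtered.shape, round(float(sputtered.sum()), 6))  # (8,) 1.0
```

A gas-phase transport code could query such a network quasi-continuously instead of interpolating a look-up table, which is the interface role the abstract describes.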
Flood. An open source neural networks C++ library
The multilayer perceptron is an important neural network model, and much of
the literature in the field refers to that model. The multilayer perceptron
has found a wide range of applications, which include function regression,
pattern recognition, time series prediction, optimal control, optimal shape
design and inverse problems. All these problems can be formulated as
variational problems. This type of neural network can learn either from
databases or from mathematical models.
Flood is a comprehensive class library which implements the multilayer
perceptron in the C++ programming language. It has been developed following
the theories of functional analysis and calculus of variations. In this
regard, this software tool can be used for the whole range of applications
mentioned above. Flood also provides a workaround for the solution of
function optimization problems.
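A minimal sketch of the kind of function regression problem described above, shown in Python rather than C++ for brevity: a one-hidden-layer perceptron trained by gradient descent on the mean-squared-error functional over samples of sin(x). All sizes and learning rates are illustrative, not Flood defaults.

```python
import numpy as np

rng = np.random.default_rng(1)

# Data set for function regression: samples of sin(x) on [-pi, pi].
x = np.linspace(-np.pi, np.pi, 64)[:, None]
y = np.sin(x)

# One-hidden-layer perceptron; the variational formulation is to find the
# network minimizing the mean-squared-error functional over the data.
W1 = rng.normal(scale=0.5, size=(1, 16)); b1 = np.zeros(16)
W2 = rng.normal(scale=0.5, size=(16, 1)); b2 = np.zeros(1)
lr = 0.1

for _ in range(5000):
    h = np.tanh(x @ W1 + b1)
    pred = h @ W2 + b2
    err = pred - y                         # gradient of the error functional w.r.t. pred
    gW2 = h.T @ err / len(x); gb2 = err.mean(0)
    dh = (err @ W2.T) * (1 - h ** 2)       # backpropagate through tanh
    gW1 = x.T @ dh / len(x); gb1 = dh.mean(0)
    W1 -= lr * gW1; b1 -= lr * gb1
    W2 -= lr * gW2; b2 -= lr * gb2

mse = float(np.mean((np.tanh(x @ W1 + b1) @ W2 + b2 - y) ** 2))
print(mse)  # should fall well below the ~0.5 variance of the target
```

The same minimization could be written against Flood's C++ classes; the point here is only the variational framing: regression as minimizing an error functional over a parametrized function class.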