Generalization Properties and Implicit Regularization for Multiple Passes SGM
We study the generalization properties of stochastic gradient methods for
learning with convex loss functions and linearly parameterized functions. We
show that, in the absence of penalizations or constraints, the stability and
approximation properties of the algorithm can be controlled by tuning either
the step-size or the number of passes over the data. In this view, these
parameters can be seen to control a form of implicit regularization. Numerical
results complement the theoretical findings. Comment: 26 pages, 4 figures. To appear in ICML 201
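The trade-off described in the abstract can be illustrated with a minimal sketch (not the paper's exact setup): multiple passes of plain stochastic gradient descent on an unpenalized least-squares problem with a linear model, where the step size and the number of passes are the only tuning knobs. All names and parameter values below are illustrative assumptions.

```python
import numpy as np

def sgm(X, y, step_size=0.01, n_passes=5, seed=0):
    """Multiple-pass SGD for least squares, with no explicit penalty.

    step_size and n_passes play the role of implicit regularization
    parameters: fewer passes / smaller steps keep the iterate closer
    to the initialization, more passes fit the data more closely.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(n_passes):                # one pass = one sweep over the data
        for i in rng.permutation(n):         # stochastic step on a single point
            grad = (X[i] @ w - y[i]) * X[i]  # gradient of 0.5 * (x.w - y)^2
            w -= step_size * grad
    return w
```

On noiseless linear data, increasing `n_passes` drives the iterate toward an interpolating solution; early stopping (few passes) returns a smoother, implicitly regularized estimate.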
Bayesian Free Energy of Deep ReLU Neural Network in Overparametrized Cases
In many research fields in artificial intelligence, it has been shown that
deep neural networks are useful to estimate unknown functions on high
dimensional input spaces. However, their generalization performance is not yet
completely clarified from the theoretical point of view because they are
nonidentifiable and singular learning machines. Moreover, a ReLU function is
not differentiable, to which algebraic or analytic methods in singular learning
theory cannot be applied. In this paper, we study a deep ReLU neural network in
overparametrized cases and prove that the Bayesian free energy, which is equal
to the minus log marginal likelihood or the Bayesian stochastic complexity, is
bounded even if the number of layers is larger than necessary to estimate an
unknown data-generating function. Since the Bayesian generalization error is
equal to the increase of the free energy as a function of a sample size, our
result also shows that the Bayesian generalization error does not increase even
if a deep ReLU neural network is designed to be sufficiently large or in an
overparametrized state. Comment: 20 pages, 2 figures
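The relation the abstract relies on is standard in singular learning theory; in the usual notation (a sketch, not the paper's exact statement), with $F_n$ the Bayesian free energy for a sample of size $n$ and $G_n$ the Bayesian generalization error (the expected KL divergence from the true distribution to the predictive distribution):

```latex
F_n = -\log \int \prod_{i=1}^{n} p(X_i \mid w)\,\varphi(w)\,dw,
\qquad
\mathbb{E}[G_n] = \mathbb{E}[F_{n+1}] - \mathbb{E}[F_n],
```

so a bound on how the free energy grows with $n$ directly controls the generalization error.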
Learning Multiplicity Tree Automata
In this paper, we present a theoretical approach to the problem of learning multiplicity tree automata. These automata allow one to define functions that compute a number for each tree. They can be seen as a strict generalization of stochastic tree automata, since they allow one to define functions over any field K. A multiplicity automaton admits a support, which is a nondeterministic automaton. From a grammatical inference point of view, this paper presents an original contribution due to the combination of two important aspects. This is the first time, as far as we know, that a learning method focuses on nondeterministic tree automata that compute functions over a field. The algorithm proposed in this paper works in Angluin's exact learning model, where a learner is allowed to use membership and equivalence queries. We show that this algorithm runs in time polynomial in the size of the representation.
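The kind of function such an automaton computes can be sketched concretely. Below is a toy multiplicity tree automaton over the field of reals (all weights and names are illustrative assumptions, not from the paper): a leaf symbol `a` maps to a state vector, a binary symbol `f` maps pairs of state vectors multilinearly via an order-3 weight tensor, and the value of a tree is the dot product of a final weight vector with the tree's state vector. The particular weights chosen make the automaton count the internal `f`-nodes of a tree.

```python
import numpy as np

n = 2  # number of states

# mu(a): state vector assigned to the leaf symbol 'a'
mu_a = np.array([1.0, 0.0])

# mu(f): order-3 tensor; mu(f)(u, v)[i] = sum_{j,k} T[i, j, k] * u[j] * v[k]
T = np.zeros((n, n, n))
T[0, 0, 0] = 1.0  # state 0 just propagates the constant 1
T[1, 0, 0] = 1.0  # state 1 accumulates: count(f(l, r)) = 1 + count(l) + count(r)
T[1, 0, 1] = 1.0
T[1, 1, 0] = 1.0

final = np.array([0.0, 1.0])  # final weight vector (reads off state 1)

def mu(tree):
    """Bottom-up evaluation: the state vector of a tree.

    Trees are "a" for leaves or ("f", left, right) for internal nodes.
    """
    if tree == "a":
        return mu_a
    _, left, right = tree
    return np.einsum("ijk,j,k->i", T, mu(left), mu(right))

def value(tree):
    """The series computed by the automaton: final . mu(tree)."""
    return float(final @ mu(tree))
```

For example, `value(("f", ("f", "a", "a"), "a"))` evaluates to 2.0, the number of `f`-nodes. A stochastic tree automaton is the special case where the weights form probability distributions; dropping that constraint is what yields functions over an arbitrary field.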