
    A Kernel Perspective for Regularizing Deep Neural Networks

    We propose a new point of view for regularizing deep neural networks by using the norm of a reproducing kernel Hilbert space (RKHS). Even though this norm cannot be computed, it admits upper and lower approximations that lead to various practical strategies. Specifically, this perspective (i) provides a common umbrella for many existing regularization principles, including spectral norm penalties, gradient penalties, and adversarial training, (ii) leads to new effective regularization penalties, and (iii) suggests hybrid strategies combining lower and upper bounds to obtain better approximations of the RKHS norm. We show experimentally that this approach is effective when learning on small datasets and for obtaining adversarially robust models.
    Comment: ICML
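    To make the lower-bound idea concrete, here is a minimal PyTorch sketch (not the authors' code; model, criterion, and lambda_reg are placeholder assumptions) of a gradient penalty of the kind the abstract lists among the practical strategies:

        import torch

        def penalized_loss(model, x, y, criterion, lambda_reg=0.1):
            # Task loss plus a penalty on the input-gradient norm; such
            # gradient penalties act as lower approximations of the RKHS
            # norm in the paper's kernel perspective.
            x = x.detach().requires_grad_(True)
            task_loss = criterion(model(x), y)
            grads, = torch.autograd.grad(task_loss, x, create_graph=True)
            penalty = grads.flatten(1).norm(dim=1).pow(2).mean()
            return task_loss + lambda_reg * penalty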

    Estimating labels from label proportions

    Consider the following problem: given sets of unlabeled observations, each set with known label proportions, predict the labels of another set of observations, also with known label proportions. This problem appears in areas like e-commerce, spam filtering and improper content detection. We present consistent estimators which can reconstruct the correct labels with high probability in a uniform convergence sense. Experiments show that our method works well in practice.
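    A minimal NumPy sketch (hypothetical names, not the authors' estimator) of the underlying idea: with known proportions, each bag's mean feature vector is approximately a mixture of the unknown class means, so the class means can be recovered by least squares and used to label new points:

        import numpy as np

        def estimate_class_means(bag_means, proportions):
            # bag_means: (n_bags, d) empirical mean of each unlabeled bag.
            # proportions: (n_bags, n_classes) known label proportions.
            # Solve proportions @ class_means ~= bag_means in least squares.
            class_means, *_ = np.linalg.lstsq(proportions, bag_means, rcond=None)
            return class_means  # (n_classes, d)

        def predict(X, class_means):
            # Assign each point to the nearest estimated class mean.
            d2 = ((X[:, None, :] - class_means[None, :, :]) ** 2).sum(-1)
            return d2.argmin(axis=1)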

    A system of ODEs for a Perturbation of a Minimal Mass Soliton

    We study soliton solutions to a nonlinear Schrödinger equation with a saturated nonlinearity. Such nonlinearities are known to possess minimal mass soliton solutions. We consider a small perturbation of a minimal mass soliton and identify a system of ODEs, similar to those of Comech and Pelinovsky (2003), which models the behavior of the perturbation for short times. We then provide numerical evidence that under this system of ODEs there are two possible dynamical outcomes, in accord with the conclusions of Pelinovsky, Afanasjev, and Kivshar (1996). For initial data which supports a soliton structure, a generic initial perturbation oscillates around the stable family of solitons. For initial data which is expected to disperse, the finite-dimensional dynamics follow the unstable portion of the soliton curve.
    Comment: Minor edit
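    For concreteness, one standard saturated nonlinearity (the paper's exact form may differ) and the soliton ansatz read:

        \begin{equation}
          i\,\partial_t u + \Delta u + \frac{|u|^{2}}{1+|u|^{2}}\,u = 0,
          \qquad u(t,x) = e^{i\lambda t} R_{\lambda}(x),
        \end{equation}

    where the nonlinearity grows like |u|^2 u for small amplitudes but saturates for large ones, which is what produces a soliton family with a mass minimum.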

    Spectral Dimensionality Reduction

    In this paper, we study and put under a common framework a number of non-linear dimensionality reduction methods, such as Locally Linear Embedding (LLE), Isomap, Laplacian Eigenmaps, and kernel PCA, which are based on performing an eigen-decomposition (hence the name 'spectral'). The framework also includes classical methods such as PCA and metric multidimensional scaling (MDS), as well as the data transformation step used in spectral clustering. We show that in all of these cases the learning algorithm estimates the principal eigenfunctions of an operator that depends on the unknown data density and on a kernel that is not necessarily positive semi-definite. This view helps to generalize some of these algorithms so as to predict an embedding for out-of-sample examples without having to retrain the model. It also makes more transparent what these algorithms are minimizing on the empirical data and gives a corresponding notion of generalization error.
    Keywords: non-parametric models, non-linear dimensionality reduction, kernel models
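    A minimal NumPy sketch (placeholder RBF kernel, with the kernel-centering step omitted for brevity; not the paper's code) of the out-of-sample extension: a new point is embedded through the Nystrom formula using the eigenvectors of the training kernel matrix, without retraining:

        import numpy as np

        def rbf(A, B, gamma=1.0):
            d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
            return np.exp(-gamma * d2)

        def fit_embedding(X, n_components=2, gamma=1.0):
            # Eigendecompose the training kernel matrix.
            K = rbf(X, X, gamma)
            evals, evecs = np.linalg.eigh(K)            # ascending order
            idx = np.argsort(evals)[::-1][:n_components]
            return evals[idx], evecs[:, idx]

        def embed_new(x_new, X, evals, evecs, gamma=1.0):
            # Nystrom extension: e_k(x) = (1 / lambda_k) sum_i v_ik K(x, x_i)
            k = rbf(x_new[None, :], X, gamma)[0]        # shape (n,)
            return (k @ evecs) / evals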

    On the net reproduction rate of continuous structured populations with distributed states at birth

    We consider a nonlinear structured population model with a distributed recruitment term. The question of the existence of non-trivial steady states can be treated in (at least) three different ways. One approach is to study spectral properties of a parametrized family of unbounded operators. A second approach, on which we focus here, is based on reformulating the problem as an integral equation. In this context we introduce a density dependent net reproduction rate and discuss its relationship to a biologically meaningful quantity. Finally, we briefly discuss a third approach, which is based on a finite rank approximation of the recruitment operator.
    Comment: To appear in Computers and Mathematics with Applications
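    Schematically (notation assumed here, not taken from the paper), the integral-equation approach characterizes a non-trivial steady state u_* as a fixed point whose recruitment operator has spectral radius one:

        \begin{equation}
          u_*(x) = \int_0^m \beta\bigl(x, y, u_*\bigr)\, u_*(y)\, \mathrm{d}y,
          \qquad R(u_*) := r\bigl(\mathcal{B}_{u_*}\bigr) = 1,
        \end{equation}

    where \mathcal{B}_{u_*} is the density dependent recruitment operator and its spectral radius r plays the role of the net reproduction rate.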