27 research outputs found

    A note concerning polyhyperbolic and related splines

    This note concerns the finite interpolation problem for two parametrized families of splines related to polynomial spline interpolation. We address the question of uniqueness and establish basic convergence rates for splines of the form $s_\alpha = p\cosh(\alpha\,\cdot) + q\sinh(\alpha\,\cdot)$ and $t_\alpha = p + q\tanh(\alpha\,\cdot)$ between the nodes, where $p, q \in \Pi_{k-1}$.
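To make the form of these interpolants concrete, here is a minimal sketch of the $k = 1$ case of the first family: on each subinterval, $p$ and $q$ are constants chosen so that $s_\alpha = p\cosh(\alpha x) + q\sinh(\alpha x)$ matches the data at both endpoints. The function name and piecewise construction are illustrative assumptions, not the smooth spline analyzed in the note.

```python
import numpy as np

def hyperbolic_interpolant(nodes, values, alpha):
    """Piecewise interpolant s(x) = p*cosh(alpha*x) + q*sinh(alpha*x),
    with constant p, q per subinterval (the k = 1 case, p, q in Pi_0).
    Illustrative sketch only, not the spline of the note."""
    nodes = np.asarray(nodes, float)
    values = np.asarray(values, float)
    coeffs = []
    for i in range(len(nodes) - 1):
        # Two unknowns (p, q) matched to the two endpoint values.
        A = np.array([[np.cosh(alpha * nodes[i]),     np.sinh(alpha * nodes[i])],
                      [np.cosh(alpha * nodes[i + 1]), np.sinh(alpha * nodes[i + 1])]])
        p, q = np.linalg.solve(A, values[i:i + 2])
        coeffs.append((p, q))

    def s(x):
        # Locate the subinterval containing x and evaluate its piece.
        i = np.clip(np.searchsorted(nodes, x, side="right") - 1, 0, len(coeffs) - 1)
        p, q = coeffs[i]
        return p * np.cosh(alpha * x) + q * np.sinh(alpha * x)
    return s
```

Because $\cosh$ itself lies in the span $\{\cosh(\alpha\,\cdot), \sinh(\alpha\,\cdot)\}$ for $\alpha = 1$, this interpolant reproduces it exactly between the nodes.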

    A stochastic optimization approach to train non-linear neural networks with regularization of higher-order total variation

    While highly expressive parametric models, including deep neural networks, have an advantage in modeling complicated concepts, training such highly non-linear models is known to carry a high risk of notorious overfitting. To address this issue, this study considers a $k$th order total variation ($k$-TV) regularization, defined as the squared integral of the $k$th order derivative of the parametric model to be trained; penalizing the $k$-TV is expected to yield a smoother function and thereby to avoid overfitting. While $k$-TV terms applied to general parametric models are computationally intractable due to the integration, this study provides a stochastic optimization algorithm that can efficiently train general models with the $k$-TV regularization without conducting explicit numerical integration. The proposed approach can be applied to the training of even deep neural networks whose structure is arbitrary, as it can be implemented with only a simple stochastic gradient descent algorithm and automatic differentiation. Our numerical experiments demonstrate that neural networks trained with the $k$-TV terms are more ``resilient'' than those with conventional parameter regularization. The proposed algorithm can also be extended to the physics-informed training of neural networks (PINNs).
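The key computational idea is that the intractable integral can be replaced by a Monte Carlo average over sampled inputs. The sketch below estimates the $k$-TV term $\int_a^b \big(f^{(k)}(x)\big)^2\,dx$ this way; a finite-difference stencil stands in for the $k$th derivative (the abstract's method would use automatic differentiation instead). All names and the stencil choice are illustrative assumptions, not the paper's algorithm.

```python
import math
import numpy as np

rng = np.random.default_rng(0)

def stochastic_ktv(f, k, a, b, n_samples=4096, h=1e-2):
    """Monte Carlo estimate of int_a^b (f^(k)(x))^2 dx.
    Sketch under stated assumptions: the k-th derivative is
    approximated by a k-th order central finite difference."""
    x = rng.uniform(a, b, n_samples)
    # Stencil: f^(k)(x) ~ (1/h^k) * sum_j (-1)^j C(k, j) f(x + (k/2 - j) h)
    coeffs = [(-1) ** j * math.comb(k, j) for j in range(k + 1)]
    dk = sum(c * f(x + (k / 2 - j) * h) for j, c in enumerate(coeffs)) / h ** k
    # Average of the squared derivative times the interval length.
    return (b - a) * np.mean(dk ** 2)
```

In a training loop this scalar would be added to the data-fitting loss; because it is an average over sampled points, a fresh mini-batch of integration points can be drawn at every SGD step.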

    Deep Neural Networks with Trainable Activations and Controlled Lipschitz Constant

    We introduce a variational framework to learn the activation functions of deep neural networks. Our aim is to increase the capacity of the network while controlling an upper bound on the actual Lipschitz constant of the input-output relation. To that end, we first establish a global bound for the Lipschitz constant of neural networks. Based on the obtained bound, we then formulate a variational problem for learning activation functions. Our variational problem is infinite-dimensional and not computationally tractable. However, we prove that there always exists a solution with continuous and piecewise-linear (linear-spline) activations. This reduces the original problem to a finite-dimensional minimization in which an $\ell_1$ penalty on the parameters of the activations favors the learning of sparse nonlinearities. We numerically compare our scheme with standard ReLU networks and their variations, PReLU and LeakyReLU, and we empirically demonstrate the practical aspects of our framework.
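A continuous piecewise-linear activation can be parameterized as an affine part plus a ReLU expansion, with one coefficient per knot giving the jump in slope there; an $\ell_1$ penalty on those coefficients then promotes sparse knots. The class below is a minimal sketch of that parameterization; the names and the identity initialization are assumptions for illustration, not the paper's exact construction.

```python
import numpy as np

class SplineActivation:
    """Linear-spline activation as a ReLU expansion:
    sigma(x) = b0 + b1*x + sum_i a_i * max(x - t_i, 0).
    Each a_i is the jump in slope at knot t_i, so an l1 penalty
    on a favors sparse (few-knot) nonlinearities."""
    def __init__(self, knots):
        self.t = np.asarray(knots, float)
        self.a = np.zeros_like(self.t)   # slope jumps at the knots
        self.b0, self.b1 = 0.0, 1.0      # affine part (identity init)

    def __call__(self, x):
        x = np.asarray(x, float)
        return self.b0 + self.b1 * x + np.maximum(x[..., None] - self.t, 0.0) @ self.a

    def l1_penalty(self, lam):
        # Sparsity-promoting term added to the training loss.
        return lam * np.abs(self.a).sum()
```

With `b1 = 0` and a single knot at 0 carrying slope jump 1, this recovers the plain ReLU, which is consistent with ReLU networks being a special case of the learned-spline scheme.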

    Stable interpolation with exponential-polynomial splines and node selection via greedy algorithms

    In this work we extend some ideas about greedy algorithms, which are well-established tools for, e.g., kernel bases, to exponential-polynomial splines, whose main drawback is possible overfitting and the consequent oscillation of the approximant. To partially overcome this issue, we develop some results on theoretically optimal interpolation points. Moreover, we introduce two algorithms that perform an adaptive selection of the spline interpolation points based on the minimization of either the sample residuals (f-greedy) or an upper bound for the approximation error based on the spline Lebesgue function (λ-greedy). Both methods allow us to obtain an adaptive selection of the sampling points, i.e., the spline nodes. While the f-greedy selection is tailored to one specific target function, the λ-greedy algorithm enables us to define target-data-independent interpolation nodes.
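The f-greedy strategy can be sketched in a few lines: start from the interval endpoints and repeatedly add the sample where the current interpolant's residual is largest. In the sketch below, linear splines (`np.interp`) stand in for the exponential-polynomial splines of the paper; the function name and stopping rule are illustrative assumptions.

```python
import numpy as np

def f_greedy_nodes(x, y, n_nodes):
    """f-greedy node selection sketch: grow the node set by adding,
    at each step, the sample point with the largest residual of the
    current (here: linear-spline) interpolant."""
    idx = [int(np.argmin(x)), int(np.argmax(x))]   # start with the endpoints
    while len(idx) < n_nodes:
        order = np.argsort(x[idx])
        s = np.interp(x, x[idx][order], y[idx][order])  # current interpolant
        residual = np.abs(y - s)
        residual[idx] = -np.inf                    # never re-pick a chosen node
        idx.append(int(np.argmax(residual)))
    return np.sort(x[idx])
```

On a target with a kink, such as $|x - 0.3|$, the first added node lands at the kink, where the residual of the endpoint interpolant peaks; this is the sense in which the selection is tailored to one specific target function.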

    Duality for Neural Networks through Reproducing Kernel Banach Spaces

    Reproducing Kernel Hilbert Spaces (RKHS) have been a very successful tool in various areas of machine learning. Recently, Barron spaces have been used to prove bounds on the generalisation error for neural networks. Unfortunately, Barron spaces cannot be understood in terms of RKHS due to the strong nonlinear coupling of the weights. This can be solved by using the more general Reproducing Kernel Banach Spaces (RKBS). We show that these Barron spaces belong to a class of integral RKBS. This class can also be understood as an infinite union of RKHS. Furthermore, we show that the dual space of such an RKBS is again an RKBS in which the roles of the data and the parameters are interchanged, forming an adjoint pair of RKBS with a reproducing kernel. This allows us to construct the saddle-point problem for neural networks, which can be used in the whole field of primal-dual optimisation.