Towards fast hybrid deep kernel learning methods
This work studies the best way to build hybrid neural networks with kernel methods using two different kernel approximations, random Fourier features and the Nyström method, and the best way to train them, with RMSprop and stochastic gradient descent.
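Both approximations named in the abstract have compact forms. Below is a minimal Python sketch of random Fourier features and the Nyström feature map for an RBF kernel; the dimensions, `gamma`, and landmark choice are illustrative assumptions, not details taken from the thesis.

```python
import numpy as np

def rbf(A, B, gamma=1.0):
    """Exact RBF kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def random_fourier_features(X, D=512, gamma=1.0, rng=None):
    """Rahimi & Recht (2007): explicit D-dim map, phi(X) @ phi(X).T ~ K."""
    rng = np.random.default_rng(rng)
    # Frequencies sampled from the RBF kernel's spectral density.
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(X.shape[1], D))
    b = rng.uniform(0.0, 2.0 * np.pi, size=D)
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)

def nystrom_features(X, landmarks, gamma=1.0, eps=1e-10):
    """Nystrom map: K ~ K_nm @ K_mm^{-1} @ K_nm.T via m landmarks."""
    K_mm = rbf(landmarks, landmarks, gamma)
    K_nm = rbf(X, landmarks, gamma)
    # Pseudo-inverse square root of K_mm for numerical stability.
    w, V = np.linalg.eigh(K_mm)
    inv_sqrt = np.where(w > eps, 1.0 / np.sqrt(np.maximum(w, eps)), 0.0)
    return K_nm @ (V * inv_sqrt[None, :]) @ V.T

X = np.random.default_rng(0).normal(size=(200, 5))
Z_rff = random_fourier_features(X, rng=1)
Z_nys = nystrom_features(X, landmarks=X[:20])
# Both Z @ Z.T matrices approximate the exact kernel rbf(X, X).
```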
The Neural Tangent Link Between CNN Denoisers and Non-Local Filters
Convolutional Neural Networks (CNNs) are now a well-established tool for
solving computational imaging problems. Modern CNN-based algorithms obtain
state-of-the-art performance in diverse image restoration problems.
Furthermore, it has been recently shown that, despite being highly
overparameterized, networks trained with a single corrupted image can still
perform as well as fully trained networks. We introduce a formal link between
such networks, through their neural tangent kernel (NTK), and well-known
non-local filtering techniques, such as non-local means or BM3D. The filtering
function associated with a given network architecture can be obtained in closed
form without the need to train the network, being fully characterized by the random
initialization of the network weights. While the NTK theory accurately predicts
the filter associated with networks trained using standard gradient descent,
our analysis shows that it falls short of explaining the behaviour of networks
trained using the popular Adam optimizer. The latter achieves a larger change
of weights in hidden layers, adapting the non-local filtering function during
training. We evaluate our findings via extensive image denoising experiments.
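To make the link concrete, here is a minimal sketch of an empirical NTK computation for a toy fully-connected network: each kernel entry is the inner product of parameter gradients at the random initialization. The paper works with CNN denoisers; the architecture, sizes, and manual backprop below are illustrative assumptions.

```python
import numpy as np

def init_params(d_in, width, rng):
    """Random Gaussian initialization of a two-layer ReLU network."""
    return {
        "W1": rng.normal(size=(width, d_in)) / np.sqrt(d_in),
        "W2": rng.normal(size=(1, width)) / np.sqrt(width),
    }

def forward_and_grad(params, x):
    """Scalar output f(x) and its gradient w.r.t. all parameters."""
    h = params["W1"] @ x               # hidden pre-activations
    a = np.maximum(h, 0.0)             # ReLU
    f = (params["W2"] @ a).item()
    # Manual backprop for this two-layer network.
    dW2 = a                            # df/dW2
    dh = params["W2"][0] * (h > 0.0)   # df/dh
    dW1 = np.outer(dh, x)              # df/dW1
    return f, np.concatenate([dW1.ravel(), dW2.ravel()])

def empirical_ntk(params, X):
    """K[i, j] = <grad_theta f(x_i), grad_theta f(x_j)> at init."""
    G = np.stack([forward_and_grad(params, x)[1] for x in X])
    return G @ G.T

rng = np.random.default_rng(0)
params = init_params(d_in=8, width=512, rng=rng)
X = rng.normal(size=(16, 8))
K = empirical_ntk(params, X)  # plays the role of the non-local filter
```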
End-to-End Kernel Learning with Supervised Convolutional Kernel Networks
In this paper, we introduce a new image representation based on a multilayer
kernel machine. Unlike traditional kernel methods where data representation is
decoupled from the prediction task, we learn how to shape the kernel with
supervision. We proceed by first proposing improvements of the
recently-introduced convolutional kernel networks (CKNs) in the context of
unsupervised learning; then, we derive backpropagation rules to take advantage
of labeled training data. The resulting model is a new type of convolutional
neural network, where optimizing the filters at each layer is equivalent to
learning a linear subspace in a reproducing kernel Hilbert space (RKHS). We
show that our method achieves reasonably competitive performance for image
classification on some standard "deep learning" datasets such as CIFAR-10 and
SVHN, and also for image super-resolution, demonstrating the applicability of
our approach to a large variety of image-related tasks.
Comment: to appear in Advances in Neural Information Processing Systems (NIPS)
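A single CKN layer can be sketched as kernel approximation on image patches: patches are normalized, compared to a small set of learned anchor filters through a Gaussian kernel, and the result is whitened so that inner products of the outputs approximate the patch kernel. The sizes, `alpha`, and the flattened-patch input below are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def ckn_layer(patches, Z, alpha=1.0, eps=1e-6):
    """One convolutional kernel layer (in the spirit of CKNs):
    approximate a Gaussian kernel on unit-norm patches by projecting
    onto filters Z (n_filters x patch_dim, rows unit-norm)."""
    norms = np.linalg.norm(patches, axis=1, keepdims=True) + eps
    U = patches / norms                         # normalized patches
    # For unit vectors: exp(-alpha*||u - z||^2) = exp(2*alpha*(u.z - 1)).
    K_uz = np.exp(2 * alpha * (U @ Z.T - 1.0))
    K_zz = np.exp(2 * alpha * (Z @ Z.T - 1.0))
    # Whiten by K_zz^{-1/2} so output inner products approximate the
    # kernel; multiplying by the norms restores homogeneity.
    w, V = np.linalg.eigh(K_zz)
    K_inv_sqrt = V @ np.diag(1.0 / np.sqrt(np.maximum(w, eps))) @ V.T
    return norms * (K_uz @ K_inv_sqrt)

# Usage: 100 flattened 3x3 grayscale patches, 16 learned filters.
rng = np.random.default_rng(0)
patches = rng.normal(size=(100, 9))
Z = rng.normal(size=(16, 9))
Z /= np.linalg.norm(Z, axis=1, keepdims=True)
features = ckn_layer(patches, Z)   # (100, 16) feature map
```

In the supervised variant described in the abstract, the filters Z would be updated by backpropagation rather than fixed at random.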
NySALT: Nyström-type inference-based schemes adaptive to large time-stepping
Large time-stepping is important for efficient long-time simulations of
deterministic and stochastic Hamiltonian dynamical systems. Conventional
structure-preserving integrators, while being successful for generic systems,
have limited tolerance to time step size due to stability and accuracy
constraints. We propose to use data to innovate classical integrators so that
they can be adaptive to large time-stepping and are tailored to each specific
system. In particular, we introduce NySALT, Nyström-type inference-based
schemes adaptive to large time-stepping. NySALT's optimal parameters for each
time step are learnt from data by minimizing the one-step prediction error.
Thus, it is tailored for each time step size and the specific system to achieve
optimal performance and tolerate large time-stepping in an adaptive fashion. We
prove and numerically verify the convergence of the estimators as data size
increases. Furthermore, analysis and numerical tests on the deterministic and
stochastic Fermi-Pasta-Ulam (FPU) models show that NySALT enlarges the maximal
admissible step size for linear stability, and quadruples the time step size of
the Störmer–Verlet and BAOAB integrators while maintaining similar levels of
accuracy.
Comment: 26 pages, 7 figures
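The idea of innovating a classical integrator with data can be sketched as follows: take a Störmer–Verlet step, expose a free coefficient, and fit it by minimizing the one-step prediction error against reference trajectory data. This toy harmonic-oscillator version is an illustrative assumption; the actual NySALT parameterization and the FPU experiments are more involved.

```python
import numpy as np

def stormer_verlet_step(q, p, dt, force, a=1.0):
    """One Stormer-Verlet step with a tunable coefficient `a`
    scaling the force (a = 1 recovers the classical scheme)."""
    p_half = p + 0.5 * dt * a * force(q)
    q_new = q + dt * p_half
    p_new = p_half + 0.5 * dt * a * force(q_new)
    return q_new, p_new

def fit_coefficient(data, dt, force, grid=np.linspace(0.5, 1.5, 201)):
    """Pick `a` minimizing mean one-step prediction error over
    observed transitions (q_k, p_k) -> (q_{k+1}, p_{k+1})."""
    def loss(a):
        err = 0.0
        for (q0, p0), (q1, p1) in data:
            qh, ph = stormer_verlet_step(q0, p0, dt, force, a)
            err += (qh - q1) ** 2 + (ph - p1) ** 2
        return err / len(data)
    return min(grid, key=loss)

# Toy system: harmonic oscillator, force(q) = -q, exact flow known.
dt = 1.2  # deliberately large step
def exact_step(q, p, t):
    return q * np.cos(t) + p * np.sin(t), -q * np.sin(t) + p * np.cos(t)

rng = np.random.default_rng(0)
data = [((q0, p0), exact_step(q0, p0, dt))
        for q0, p0 in rng.normal(size=(200, 2))]
a_star = fit_coefficient(data, dt, force=lambda q: -q)
# The fitted scheme tracks the exact flow at this large step better
# than the classical choice a = 1.
```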
Complex-valued embeddings of generic proximity data
Proximities are at the heart of almost all machine learning methods. If the
input data are given as numerical vectors of equal length, the Euclidean distance,
or a Hilbertian inner product is frequently used in modeling algorithms. In a
more generic view, objects are compared by a (symmetric) similarity or
dissimilarity measure, which may not obey particular mathematical properties.
This renders many machine learning methods invalid, leading to convergence
problems and the loss of guarantees, like generalization bounds. In many cases,
the preferred dissimilarity measure is non-metric, like the earth mover's
distance, or the similarity measure may not be a simple inner product in a
Hilbert space but in its generalization, a Krein space. If the input data are
non-vectorial, like text sequences, proximity-based learning is used, or n-gram
embedding techniques can be applied. Standard embeddings lead to the desired
fixed-length vector encoding, but are costly and have substantial limitations
in preserving the original data's full information. As an information
preserving alternative, we propose a complex-valued vector embedding of
proximity data. This allows suitable machine learning algorithms to use these
fixed-length, complex-valued vectors for further processing. The complex-valued
data can serve as an input to complex-valued machine learning algorithms. In
particular, we address supervised learning and use extensions of
prototype-based learning. The proposed approach is evaluated on a variety of
standard benchmarks and shows strong performance compared to traditional
techniques in processing non-metric or non-PSD proximity data.
Comment: proximity learning, embedding, complex values, complex-valued embedding, learning vector quantization
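One concrete way to build such an information-preserving embedding (a minimal sketch; the paper's exact construction may differ) uses the eigendecomposition of the symmetric similarity matrix: the negative eigenvalues that make the matrix non-PSD are absorbed into imaginary coordinates, so the plain (non-conjugated) bilinear form of the embeddings reproduces the original similarities exactly.

```python
import numpy as np

def complex_embedding(S):
    """Embed a symmetric (possibly indefinite) similarity matrix S
    into complex vectors X with X @ X.T == S (no conjugation):
    negative eigenvalues become imaginary coordinates."""
    w, V = np.linalg.eigh(S)              # S = V diag(w) V.T
    sqrt_w = np.sqrt(w.astype(complex))   # sqrt(-|w|) = 1j * sqrt(|w|)
    return V * sqrt_w[None, :]            # rows are the embeddings

# Usage: a small indefinite (non-PSD) similarity matrix.
S = np.array([[ 1.0,  0.9, -0.4],
              [ 0.9,  1.0,  0.2],
              [-0.4,  0.2,  1.0]])
X = complex_embedding(S)
assert np.allclose(X @ X.T, S)            # similarities preserved exactly
```

The fixed-length complex vectors X can then feed complex-valued learners such as the prototype-based methods the abstract mentions.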