Towards fast hybrid deep kernel learning methods
This work studies the best way to build hybrid neural networks with kernel methods using two different kernel approximations, random Fourier features and the Nyström method, and the best way to train them, with RMSprop and stochastic gradient descent.
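Both approximations named in the abstract have compact forms. Below is a minimal Python sketch of random Fourier features and the Nyström feature map for an RBF kernel; the dimensions, `gamma`, and landmark choice are illustrative assumptions, not details taken from the thesis.

```python
import numpy as np

def rbf(A, B, gamma=1.0):
    """Exact RBF kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def random_fourier_features(X, D=512, gamma=1.0, rng=None):
    """Rahimi & Recht (2007): explicit D-dim map, phi(X) @ phi(X).T ~ K."""
    rng = np.random.default_rng(rng)
    # Frequencies sampled from the RBF kernel's spectral density.
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(X.shape[1], D))
    b = rng.uniform(0.0, 2.0 * np.pi, size=D)
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)

def nystrom_features(X, landmarks, gamma=1.0, eps=1e-10):
    """Nystrom map: K ~ K_nm @ K_mm^{-1} @ K_nm.T via m landmarks."""
    K_mm = rbf(landmarks, landmarks, gamma)
    K_nm = rbf(X, landmarks, gamma)
    # Pseudo-inverse square root of K_mm for numerical stability.
    w, V = np.linalg.eigh(K_mm)
    inv_sqrt = np.where(w > eps, 1.0 / np.sqrt(np.maximum(w, eps)), 0.0)
    return K_nm @ (V * inv_sqrt[None, :]) @ V.T

X = np.random.default_rng(0).normal(size=(200, 5))
Z_rff = random_fourier_features(X, rng=1)
Z_nys = nystrom_features(X, landmarks=X[:20])
# Both Z @ Z.T matrices approximate the exact kernel rbf(X, X).
```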
The Neural Tangent Link Between CNN Denoisers and Non-Local Filters
Convolutional Neural Networks (CNNs) are now a well-established tool for
solving computational imaging problems. Modern CNN-based algorithms obtain
state-of-the-art performance in diverse image restoration problems.
Furthermore, it has been recently shown that, despite being highly
overparameterized, networks trained with a single corrupted image can still
perform as well as fully trained networks. We introduce a formal link between
such networks, through their neural tangent kernel (NTK), and well-known
non-local filtering techniques, such as non-local means or BM3D. The filtering
function associated with a given network architecture can be obtained in closed
form without the need to train the network, being fully characterized by the random
initialization of the network weights. While the NTK theory accurately predicts
the filter associated with networks trained using standard gradient descent,
our analysis shows that it falls short of explaining the behaviour of networks
trained using the popular Adam optimizer. The latter achieves a larger change
of weights in hidden layers, adapting the non-local filtering function during
training. We evaluate our findings via extensive image denoising experiments.
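To make the link concrete, here is a minimal sketch of an empirical NTK computation for a toy fully-connected network: each kernel entry is the inner product of parameter gradients at the random initialization. The paper works with CNN denoisers; the architecture, sizes, and manual backprop below are illustrative assumptions.

```python
import numpy as np

def init_params(d_in, width, rng):
    """Random Gaussian initialization of a two-layer ReLU network."""
    return {
        "W1": rng.normal(size=(width, d_in)) / np.sqrt(d_in),
        "W2": rng.normal(size=(1, width)) / np.sqrt(width),
    }

def forward_and_grad(params, x):
    """Scalar output f(x) and its gradient w.r.t. all parameters."""
    h = params["W1"] @ x               # hidden pre-activations
    a = np.maximum(h, 0.0)             # ReLU
    f = (params["W2"] @ a).item()
    # Manual backprop for this two-layer network.
    dW2 = a                            # df/dW2
    dh = params["W2"][0] * (h > 0.0)   # df/dh
    dW1 = np.outer(dh, x)              # df/dW1
    return f, np.concatenate([dW1.ravel(), dW2.ravel()])

def empirical_ntk(params, X):
    """K[i, j] = <grad_theta f(x_i), grad_theta f(x_j)> at init."""
    G = np.stack([forward_and_grad(params, x)[1] for x in X])
    return G @ G.T

rng = np.random.default_rng(0)
params = init_params(d_in=8, width=512, rng=rng)
X = rng.normal(size=(16, 8))
K = empirical_ntk(params, X)  # plays the role of the non-local filter
```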
End-to-End Kernel Learning with Supervised Convolutional Kernel Networks
In this paper, we introduce a new image representation based on a multilayer
kernel machine. Unlike traditional kernel methods where data representation is
decoupled from the prediction task, we learn how to shape the kernel with
supervision. We proceed by first proposing improvements of the
recently-introduced convolutional kernel networks (CKNs) in the context of
unsupervised learning; then, we derive backpropagation rules to take advantage
of labeled training data. The resulting model is a new type of convolutional
neural network, where optimizing the filters at each layer is equivalent to
learning a linear subspace in a reproducing kernel Hilbert space (RKHS). We
show that our method achieves reasonably competitive performance for image
classification on some standard "deep learning" datasets such as CIFAR-10 and
SVHN, and also for image super-resolution, demonstrating the applicability of
our approach to a large variety of image-related tasks.
Comment: to appear in Advances in Neural Information Processing Systems (NIPS)
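A single CKN layer can be sketched as kernel approximation on image patches: patches are normalized, compared to a small set of learned anchor filters through a Gaussian kernel, and the result is whitened so that inner products of the outputs approximate the patch kernel. The sizes, `alpha`, and the flattened-patch input below are illustrative assumptions, not the paper's exact implementation.

```python
import numpy as np

def ckn_layer(patches, Z, alpha=1.0, eps=1e-6):
    """One convolutional kernel layer (in the spirit of CKNs):
    approximate a Gaussian kernel on unit-norm patches by projecting
    onto filters Z (n_filters x patch_dim, rows unit-norm)."""
    norms = np.linalg.norm(patches, axis=1, keepdims=True) + eps
    U = patches / norms                         # normalized patches
    # For unit vectors: exp(-alpha*||u - z||^2) = exp(2*alpha*(u.z - 1)).
    K_uz = np.exp(2 * alpha * (U @ Z.T - 1.0))
    K_zz = np.exp(2 * alpha * (Z @ Z.T - 1.0))
    # Whiten by K_zz^{-1/2} so output inner products approximate the
    # kernel; multiplying by the norms restores homogeneity.
    w, V = np.linalg.eigh(K_zz)
    K_inv_sqrt = V @ np.diag(1.0 / np.sqrt(np.maximum(w, eps))) @ V.T
    return norms * (K_uz @ K_inv_sqrt)

# Usage: 100 flattened 3x3 grayscale patches, 16 learned filters.
rng = np.random.default_rng(0)
patches = rng.normal(size=(100, 9))
Z = rng.normal(size=(16, 9))
Z /= np.linalg.norm(Z, axis=1, keepdims=True)
features = ckn_layer(patches, Z)   # (100, 16) feature map
```

In the supervised variant described in the abstract, the filters Z would be updated by backpropagation rather than fixed at random.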
NySALT: Nyström-type inference-based schemes adaptive to large time-stepping
Large time-stepping is important for efficient long-time simulations of
deterministic and stochastic Hamiltonian dynamical systems. Conventional
structure-preserving integrators, while being successful for generic systems,
have limited tolerance to time step size due to stability and accuracy
constraints. We propose to use data to innovate classical integrators so that
they can be adaptive to large time-stepping and are tailored to each specific
system. In particular, we introduce NySALT, Nyström-type inference-based
schemes adaptive to large time-stepping. NySALT's optimal parameters for each
time step are learnt from data by minimizing the one-step prediction error.
Thus, it is tailored for each time step size and the specific system to achieve
optimal performance and tolerate large time-stepping in an adaptive fashion. We
prove and numerically verify the convergence of the estimators as data size
increases. Furthermore, analysis and numerical tests on the deterministic and
stochastic Fermi-Pasta-Ulam (FPU) models show that NySALT enlarges the maximal
admissible step size for linear stability, and quadruples the time step size of
the Störmer–Verlet and BAOAB integrators while maintaining similar levels of
accuracy.
Comment: 26 pages, 7 figures
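The idea of innovating a classical integrator with data can be sketched as follows: take a Störmer–Verlet step, expose a free coefficient, and fit it by minimizing the one-step prediction error against reference trajectory data. This toy harmonic-oscillator version is an illustrative assumption; the actual NySALT parameterization and the FPU experiments are more involved.

```python
import numpy as np

def stormer_verlet_step(q, p, dt, force, a=1.0):
    """One Stormer-Verlet step with a tunable coefficient `a`
    scaling the force (a = 1 recovers the classical scheme)."""
    p_half = p + 0.5 * dt * a * force(q)
    q_new = q + dt * p_half
    p_new = p_half + 0.5 * dt * a * force(q_new)
    return q_new, p_new

def fit_coefficient(data, dt, force, grid=np.linspace(0.5, 1.5, 201)):
    """Pick `a` minimizing mean one-step prediction error over
    observed transitions (q_k, p_k) -> (q_{k+1}, p_{k+1})."""
    def loss(a):
        err = 0.0
        for (q0, p0), (q1, p1) in data:
            qh, ph = stormer_verlet_step(q0, p0, dt, force, a)
            err += (qh - q1) ** 2 + (ph - p1) ** 2
        return err / len(data)
    return min(grid, key=loss)

# Toy system: harmonic oscillator, force(q) = -q, exact flow known.
dt = 1.2  # deliberately large step
def exact_step(q, p, t):
    return q * np.cos(t) + p * np.sin(t), -q * np.sin(t) + p * np.cos(t)

rng = np.random.default_rng(0)
data = [((q0, p0), exact_step(q0, p0, dt))
        for q0, p0 in rng.normal(size=(200, 2))]
a_star = fit_coefficient(data, dt, force=lambda q: -q)
# The fitted scheme tracks the exact flow at this large step better
# than the classical choice a = 1.
```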
Complex-valued embeddings of generic proximity data
Proximities are at the heart of almost all machine learning methods. If the
input data are given as numerical vectors of equal length, the Euclidean distance,
or a Hilbertian inner product is frequently used in modeling algorithms. In a
more generic view, objects are compared by a (symmetric) similarity or
dissimilarity measure, which may not obey particular mathematical properties.
This renders many machine learning methods invalid, leading to convergence
problems and the loss of guarantees, like generalization bounds. In many cases,
the preferred dissimilarity measure is non-metric, like the earth mover's
distance, or the similarity measure may not be a simple inner product in a
Hilbert space but in its generalization, a Krein space. If the input data are
non-vectorial, like text sequences, proximity-based learning is used, or n-gram
embedding techniques can be applied. Standard embeddings lead to the desired
fixed-length vector encoding, but are costly and have substantial limitations
in preserving the original data's full information. As an information
preserving alternative, we propose a complex-valued vector embedding of
proximity data. This allows suitable machine learning algorithms to use these
fixed-length, complex-valued vectors for further processing. The complex-valued
data can serve as an input to complex-valued machine learning algorithms. In
particular, we address supervised learning and use extensions of
prototype-based learning. The proposed approach is evaluated on a variety of
standard benchmarks and shows strong performance compared to traditional
techniques in processing non-metric or non-PSD proximity data.
Comment: proximity learning, embedding, complex values, complex-valued embedding, learning vector quantization
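One concrete way to build such an information-preserving embedding (a minimal sketch; the paper's exact construction may differ) uses the eigendecomposition of the symmetric similarity matrix: the negative eigenvalues that make the matrix non-PSD are absorbed into imaginary coordinates, so the plain (non-conjugated) bilinear form of the embeddings reproduces the original similarities exactly.

```python
import numpy as np

def complex_embedding(S):
    """Embed a symmetric (possibly indefinite) similarity matrix S
    into complex vectors X with X @ X.T == S (no conjugation):
    negative eigenvalues become imaginary coordinates."""
    w, V = np.linalg.eigh(S)              # S = V diag(w) V.T
    sqrt_w = np.sqrt(w.astype(complex))   # sqrt(-|w|) = 1j * sqrt(|w|)
    return V * sqrt_w[None, :]            # rows are the embeddings

# Usage: a small indefinite (non-PSD) similarity matrix.
S = np.array([[ 1.0,  0.9, -0.4],
              [ 0.9,  1.0,  0.2],
              [-0.4,  0.2,  1.0]])
X = complex_embedding(S)
assert np.allclose(X @ X.T, S)            # similarities preserved exactly
```

The fixed-length complex vectors X can then feed complex-valued learners such as the prototype-based methods the abstract mentions.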