Quantized Fourier and Polynomial Features for more Expressive Tensor Network Models
In the context of kernel machines, polynomial and Fourier features are
commonly used to provide a nonlinear extension to linear models by mapping the
data to a higher-dimensional space. Unless one considers the dual formulation
of the learning problem, which renders exact large-scale learning infeasible,
the tensor-product structure of these features causes the number of model
parameters to grow exponentially with the dimensionality of the data, making
it prohibitive to tackle high-dimensional problems. One possible approach to
circumvent this exponential scaling
is to exploit the tensor structure present in the features by constraining the
model weights to be an underparametrized tensor network. In this paper we
quantize, i.e. further tensorize, polynomial and Fourier features. Based on
this feature quantization we propose to quantize the associated model weights,
yielding quantized models. We show that, for the same number of model
parameters, the resulting quantized models have a higher bound on the
VC-dimension than their non-quantized counterparts, at no additional
computational cost while learning from identical features. We verify
experimentally how this additional tensorization regularizes the learning
problem by prioritizing the most salient features in the data and how it
provides models with increased generalization capabilities. We finally
benchmark our approach on a large-scale regression task, achieving
state-of-the-art results on a laptop computer.
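One way to see the feature quantization described above, sketched here in NumPy under my own reading of the abstract (the function names and the choice of pure-power polynomial features are illustrative, not taken from the paper): a feature vector of all powers 0 through 2^k - 1 factorizes exactly into a Kronecker product of k small "quantized" factors [1, x^(2^j)], so the same features can be represented by tensor-network objects with far smaller cores.

```python
import numpy as np

def poly_features(x, degree):
    """Standard pure-power polynomial feature map: [1, x, ..., x^degree]."""
    return np.array([x**p for p in range(degree + 1)])

def quantized_poly_features(x, k):
    """Quantized feature map: the Kronecker product of the k factors
    [1, x^(2^j)], j = 0..k-1, reproduces all powers 0 .. 2^k - 1
    (binary expansion of the exponent)."""
    out = np.array([1.0, x])          # factor for j = 0
    for j in range(1, k):
        out = np.kron(np.array([1.0, x**(2**j)]), out)
    return out

# Sanity check: both maps produce identical features
x, k = 1.3, 3                          # 2^3 = 8 features, powers 0..7
assert np.allclose(quantized_poly_features(x, k), poly_features(x, 2**k - 1))
```

The check confirms the identity; the paper's point is that, once the features factorize this way, the model weights can be constrained to a matching quantized tensor network without changing which features the model learns from.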
Sparse Gaussian Processes in the Longstaff-Schwartz algorithm
In financial applications it is often necessary to determine conditional expectations in Monte Carlo simulations. The current industry standard relies on linear regression, which comes with the inconvenient problem of having to choose the type and number of basis functions used to build the model, a task made harder by the frequent impossibility of using an alternative numerical method to evaluate the "ground truth". In this thesis, Gaussian process regression is investigated as a potential substitute for linear regression, as it is a flexible Bayesian non-parametric regression model that requires little tuning. Its drawback is the computational complexity of its "training" phase, which is cubic in the number of data points and therefore requires the use of algorithmic approximations. The most prominent approximations are reviewed and tested in different scenarios requiring the approximation of conditional expectations by regression, among which the Longstaff-Schwartz algorithm for the pricing of Bermudan options. This thesis was carried out in cooperation with ABN-AMRO bank.
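To make concrete the regression step the thesis is concerned with, here is a minimal sketch of the standard Longstaff-Schwartz algorithm with the linear-regression baseline (a cubic polynomial basis). The contract and market parameters are illustrative, not taken from the thesis; the hard-coded basis in `np.polyfit` is exactly the tuning choice that Gaussian process regression would aim to remove.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative parameters for a Bermudan put (not from the thesis)
S0, K, r, sigma, T = 100.0, 100.0, 0.05, 0.2, 1.0
n_steps, n_paths = 50, 20_000
dt = T / n_steps
disc = np.exp(-r * dt)

# Simulate geometric Brownian motion paths
z = rng.standard_normal((n_paths, n_steps))
S = S0 * np.exp(np.cumsum((r - 0.5 * sigma**2) * dt
                          + sigma * np.sqrt(dt) * z, axis=1))
S = np.hstack([np.full((n_paths, 1), S0), S])

payoff = lambda s: np.maximum(K - s, 0.0)

# Backward induction: estimate the continuation value (a conditional
# expectation) by regressing discounted future cashflows on the stock price.
cashflow = payoff(S[:, -1])
for t in range(n_steps - 1, 0, -1):
    cashflow *= disc
    itm = payoff(S[:, t]) > 0              # regress on in-the-money paths only
    coeffs = np.polyfit(S[itm, t], cashflow[itm], deg=3)   # basis choice!
    continuation = np.polyval(coeffs, S[itm, t])
    exercise = payoff(S[itm, t])
    cashflow[itm] = np.where(exercise > continuation, exercise, cashflow[itm])

price = disc * cashflow.mean()
print(f"Bermudan put LSM estimate: {price:.3f}")
```

Replacing the `np.polyfit`/`np.polyval` pair with a (sparse) Gaussian process regressor is the substitution the thesis investigates; everything else in the algorithm stays the same.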
Towards Green AI with tensor networks -- Sustainability and innovation enabled by efficient algorithms
The current standard to compare the performance of AI algorithms is mainly
based on one criterion: the model's accuracy. In this context, algorithms with
a higher accuracy (or similar measures) are considered better. To achieve
new state-of-the-art results, algorithmic development is accompanied by an
exponentially increasing amount of compute. While this has enabled AI research
to achieve remarkable results, AI progress comes at a cost: it is
unsustainable. In this paper, we present a promising tool for sustainable and
thus Green AI: tensor networks (TNs). Being an established tool from
multilinear algebra, TNs have the capability to improve efficiency without
compromising accuracy. Since they can reduce compute significantly, we would
like to highlight their potential for Green AI. We elaborate, in both a kernel
machine and a deep learning setting, on how efficiency gains can be achieved
with TNs. Furthermore, we argue that better algorithms should be evaluated in terms
TNs. Furthermore, we argue that better algorithms should be evaluated in terms
of both accuracy and efficiency. To that end, we discuss different efficiency
criteria and analyze efficiency in an exemplifying experimental setting for
kernel ridge regression. With this paper, we want to raise awareness about
Green AI and showcase its positive impact on sustainability and AI research.
Our key contribution is to demonstrate that TNs enable efficient algorithms and
therefore contribute towards Green AI. In this sense, TNs pave the way for
better algorithms in AI.
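The efficiency argument can be illustrated with a small NumPy sketch (my own toy example, not from the paper): a full weight tensor over d tensor-product features of size n each needs n^d parameters, while even the simplest rank-1 tensor-network factorization stores only d·n numbers, and the inner product with tensor-product features collapses to d small inner products.

```python
import numpy as np

d, n = 8, 4
full_params = n**d        # dense weight tensor: 4^8 = 65536 parameters
tn_params = d * n         # rank-1 tensor-network factors: 8 * 4 = 32

rng = np.random.default_rng(0)
factors = [rng.standard_normal(n) for _ in range(d)]   # TN weight factors
feats = [rng.standard_normal(n) for _ in range(d)]     # phi(x) = f1 ⊗ ... ⊗ fd

def tn_inner(features):
    """<W, phi(x)> when both factorize: a product of d small inner
    products, O(d*n) work instead of O(n**d)."""
    return np.prod([w @ f for w, f in zip(factors, features)])

# Reference: materialize the full Kronecker products (exponential memory)
W_full, phi_full = factors[0], feats[0]
for w, f in zip(factors[1:], feats[1:]):
    W_full = np.kron(W_full, w)
    phi_full = np.kron(phi_full, f)

assert np.allclose(tn_inner(feats), W_full @ phi_full)
print(full_params, "vs", tn_params, "parameters")
```

Higher-rank tensor trains interpolate between these two extremes, which is how TNs trade a controlled amount of expressiveness for large compute and memory savings in the kernel machine setting the paper discusses.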