11 research outputs found
A representer theorem for deep kernel learning
In this paper we provide a finite-sample and an infinite-sample representer
theorem for the concatenation of (linear combinations of) kernel functions of
reproducing kernel Hilbert spaces. These results serve as mathematical
foundation for the analysis of machine learning algorithms based on
compositions of functions. As a direct consequence in the finite-sample case,
the corresponding infinite-dimensional minimization problems can be recast into
(nonlinear) finite-dimensional minimization problems, which can be tackled with
nonlinear optimization algorithms. Moreover, we show how concatenated machine
learning problems can be reformulated as neural networks and how our
representer theorem applies to a broad class of state-of-the-art deep learning
methods.
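The finite-dimensional recasting mentioned in this abstract can be illustrated with the classical single-kernel representer theorem, which is the well-known special case and not the paper's deep (concatenated) result. The sketch below assumes a Gaussian kernel and a squared loss; the function names, `gamma`, and `lam` are illustrative choices, not taken from the paper. The minimizer over the whole RKHS has the form f(x) = sum_i alpha_i k(x, x_i), so only the n coefficients alpha need to be computed.

```python
import numpy as np

def gaussian_kernel(X, Y, gamma=1.0):
    """Gaussian kernel matrix with entries exp(-gamma * ||x - y||^2)."""
    sq = (np.sum(X**2, axis=1)[:, None]
          + np.sum(Y**2, axis=1)[None, :]
          - 2.0 * X @ Y.T)
    return np.exp(-gamma * np.maximum(sq, 0.0))

def fit_kernel_ridge(X, y, lam=1e-3, gamma=1.0):
    """Minimize (1/n) * sum_i (f(x_i) - y_i)^2 + lam * ||f||^2 over the RKHS.

    By the classical representer theorem the minimizer is a kernel
    expansion over the training points, and its coefficients solve
    (K + lam * n * I) alpha = y.
    """
    n = X.shape[0]
    K = gaussian_kernel(X, X, gamma)
    return np.linalg.solve(K + lam * n * np.eye(n), y)

def predict(X_train, alpha, X_new, gamma=1.0):
    """Evaluate f(x) = sum_i alpha_i * k(x, x_i) at the rows of X_new."""
    return gaussian_kernel(X_new, X_train, gamma) @ alpha

# Toy data (an assumption for illustration only).
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(30, 1))
y = np.sin(3.0 * X[:, 0])

lam = 1e-3
alpha = fit_kernel_ridge(X, y, lam=lam)
y_hat = predict(X, alpha, X)
```

In the concatenated setting described in the abstract, the resulting finite-dimensional problem is nonlinear in the coefficients, so the single linear solve here would be replaced by a nonlinear optimizer.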
Structured Deep Kernel Networks for Data-Driven Closure Terms of Turbulent Flows
Standard kernel methods for machine learning usually struggle when dealing
with large datasets. We review a recently introduced Structured Deep Kernel
Network (SDKN) approach that is capable of dealing with high-dimensional and
huge datasets, and enjoys typical standard machine learning approximation
properties. We extend the SDKN to combine it with standard machine learning
modules and compare it with Neural Networks on the scientific challenge of
data-driven prediction of closure terms of turbulent flows. We show
experimentally that the SDKNs are capable of dealing with large datasets and
achieve near-perfect accuracy on the given application.
Counterfactual Learning with Multioutput Deep Kernels
In this paper, we address the challenge of performing counterfactual
inference with observational data via Bayesian nonparametric regression
adjustment, with a focus on high-dimensional settings featuring multiple
actions and multiple correlated outcomes. We present a general class of
counterfactual multi-task deep kernel models that estimate causal effects and
learn policies proficiently thanks to their sample efficiency gains, while
scaling well with high dimensions. In the first part of the work, we rely on
Structural Causal Models (SCM) to formally introduce the setup and the problem
of identifying counterfactual quantities under observed confounding. We then
discuss the benefits of tackling the task of causal effects estimation via
stacked coregionalized Gaussian Processes and Deep Kernels. Finally, we
demonstrate the use of the proposed methods on simulated experiments that span
individual causal effects estimation, off-policy evaluation and optimization.
On the composition of neural and kernel layers for machine learning
Deep Learning architectures in which neural layers alternate with mappings to infinite-dimensional feature spaces have been proposed in recent years, showing improvements on the results obtained when using either technique separately. However, these new algorithms have been presented without delving into the rich mathematical structure that sustains kernel methods. The main focus of this thesis is not only to review these advances in the field of Deep Learning, but to extend and generalize them by defining a broader family of models that operate under the mathematical framework defined by the composition of a neural layer with a kernel mapping, all of which operate in reproducing kernel Hilbert spaces that are then concatenated. Each of these spaces has a specific reproducing kernel that we can characterize. Together all of this defines a regularization-based learning optimization problem, for which we prove that minimizers exist. This strong mathematical background is complemented by the presentation of a new model, the Kernel Network, which manages to produce successful results on many classification problems.
Be greedy and learn: efficient and certified algorithms for parametrized optimal control problems
We consider parametrized linear-quadratic optimal control problems and
provide their online-efficient solutions by combining greedy reduced basis
methods and machine learning algorithms. To this end, we first extend the
greedy control algorithm, which builds a reduced basis for the manifold of
optimal final time adjoint states, to the setting where the objective
functional consists of a penalty term measuring the deviation from a desired
state and a term describing the control energy. Afterwards, we apply machine
learning surrogates to accelerate the online evaluation of the reduced model.
The error estimates proven for the greedy procedure are further transferred to
the machine learning models and thus allow for efficient a posteriori error
certification. We discuss the computational costs of all considered methods in
detail and show by means of two numerical examples the tremendous potential of
the proposed methodology.
Representer Theorems in Banach Spaces: Minimum Norm Interpolation, Regularized Learning and Semi-Discrete Inverse Problems
Learning a function from a finite number of sampled data points (measurements) is a fundamental problem in science and engineering. This is often formulated as a minimum norm interpolation (MNI) problem, a regularized learning problem or, in general, a semi-discrete inverse problem (SDIP), in either Hilbert spaces or Banach spaces. The goal of this paper is to systematically study solutions of these problems in Banach spaces. We aim at obtaining explicit representer theorems for their solutions, on which convenient solution methods can then be developed. For the MNI problem, the explicit representer theorems enable us to express the infimum in terms of the norm of the linear combination of the interpolation functionals. For the purpose of developing efficient computational algorithms, we establish the fixed-point equation formulation of solutions of these problems. We reveal that unlike in a Hilbert space, in general, solutions of these problems in a Banach space may not be able to be reduced to truly finite-dimensional problems (with certain infinite-dimensional components hidden). We demonstrate how this obstacle can be removed, reducing the original problem to a truly finite-dimensional one, in the special case when the Banach space is ℓ1(N).