Successor features for transfer in reinforcement learning
Transfer in reinforcement learning refers to the notion that generalization should occur not only within a task but also across tasks. Our focus is on transfer where the reward functions vary across tasks while the environment's dynamics remain the same. The method we propose rests on two key ideas: "successor features," a value function representation that decouples the dynamics of the environment from the rewards, and "generalized policy improvement," a generalization of dynamic programming's policy improvement step that considers a set of policies rather than a single one. Put together, the two ideas lead to an approach that integrates seamlessly within the reinforcement learning framework and allows transfer to take place between tasks without any restriction. The proposed method also provides performance guarantees for the transferred policy even before any learning has taken place. We derive two theorems that set our approach on firm theoretical ground and present experiments that show that it successfully promotes transfer in practice.
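As a rough illustration of how the two ideas combine, the sketch below assumes the standard linear formulation, which the abstract itself does not spell out: rewards are written as r = phi · w, so each stored policy's action values on a new task are psi · w, where psi denotes that policy's successor features. Generalized policy improvement then acts greedily with respect to the maximum over stored policies. The function name `gpi_action` and the toy arrays are hypothetical and serve only as an example.

```python
import numpy as np

def gpi_action(psi, w):
    """Pick an action by generalized policy improvement at a single state.

    psi: array of shape (n_policies, n_actions, d), the successor features
         of each stored policy at the current state (assumed given).
    w:   array of shape (d,), reward weights of the new task
         (assumes rewards are linear in the features: r = phi . w).
    """
    q = psi @ w                      # (n_policies, n_actions) action values
    best_over_policies = q.max(axis=0)  # best stored policy per action
    return int(np.argmax(best_over_policies))

# Toy example: two stored policies, two actions, two feature dimensions.
psi = np.array([
    [[1.0, 0.0], [0.0, 1.0]],   # successor features under policy 0
    [[0.5, 0.5], [0.2, 2.0]],   # successor features under policy 1
])

action_task_a = gpi_action(psi, np.array([1.0, 0.0]))
action_task_b = gpi_action(psi, np.array([0.0, 1.0]))
```

Changing only `w` re-evaluates every stored policy on the new task without touching `psi`, which is the sense in which the representation decouples dynamics from rewards.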
Transductive conformal inference with adaptive scores
Conformal inference is a fundamental and versatile tool that provides
distribution-free guarantees for many machine learning tasks. We consider the
transductive setting, where decisions are made on a test sample of new
points, giving rise to conformal p-values. While classical results only
concern their marginal distribution, we show that their joint distribution
follows a Pólya urn model, and establish a concentration inequality for their
empirical distribution function. The results hold for arbitrary exchangeable
scores, including adaptive ones that can use the covariates of the
test+calibration samples at training stage for increased accuracy. We
demonstrate the usefulness of these theoretical results through uniform,
in-probability guarantees for two machine learning tasks of current interest:
interval prediction for transductive transfer learning and novelty detection
based on two-class classification.
Comment: 27 pages, 6 figures
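For readers unfamiliar with conformal p-values, a minimal sketch follows. It uses the usual definition, in which the p-value of a test point is one plus the number of calibration scores at least as large as its own score, divided by the calibration size plus one. The abstract does not state this formula, so it is an assumption here, as are the function name `conformal_p_values` and the toy scores.

```python
import numpy as np

def conformal_p_values(cal_scores, test_scores):
    """Conformal p-values for a batch of test points (transductive setting).

    Larger scores are treated as more atypical. Every test point is compared
    against the same calibration sample, which is why the resulting p-values
    are dependent rather than independent.
    """
    cal = np.asarray(cal_scores, dtype=float)
    test = np.asarray(test_scores, dtype=float)
    # counts[i] = number of calibration scores >= test score i
    counts = (cal[None, :] >= test[:, None]).sum(axis=1)
    return (1 + counts) / (len(cal) + 1)

# Toy example with four calibration scores and three test scores.
p = conformal_p_values([1.0, 2.0, 3.0, 4.0], [2.5, 5.0, 0.0])
```

Because the calibration sample is shared, the empirical distribution of `p` across the test sample is the object whose concentration the abstract refers to.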
Hypothesis Transfer Learning with Surrogate Classification Losses: Generalization Bounds through Algorithmic Stability
Hypothesis transfer learning (HTL) contrasts with domain adaptation by
leveraging a previous task, named the source, in a new one, the target,
without requiring access to the source data. Indeed, HTL relies only on a
hypothesis learnt from such source data, relieving the hurdle of expansive data
storage and providing great practical benefits. Hence, HTL is highly beneficial
for real-world applications relying on big data. The analysis of such a method
from a theoretical perspective faces multiple challenges, particularly in
classification tasks. This paper deals with this problem by studying the
learning theory of HTL through algorithmic stability, an attractive theoretical
framework for analysing machine learning algorithms. In particular, we are
interested in the statistical behaviour of the regularized empirical risk
minimizers in the case of binary classification. Our stability analysis
provides learning guarantees under mild assumptions. Consequently, we derive
several complexity-free generalization bounds for essential statistical
quantities like the training error, the excess risk and cross-validation
estimates. These refined bounds allow understanding the benefits of transfer
learning and comparing the behaviour of standard losses in different scenarios,
leading to valuable insights for practitioners.
Minimax Optimal Transfer Learning for Kernel-based Nonparametric Regression
In recent years, transfer learning has garnered significant attention in the
machine learning community. Its ability to leverage knowledge from related
studies to improve generalization performance in a target study has made it
highly appealing. This paper focuses on investigating the transfer learning
problem within the context of nonparametric regression over a reproducing
kernel Hilbert space. The aim is to bridge the gap between practical
effectiveness and theoretical guarantees. We specifically consider two
scenarios: one where the transferable sources are known and another where they
are unknown. For the known transferable source case, we propose a two-step
kernel-based estimator by solely using kernel ridge regression. For the unknown
case, we develop a novel method based on an efficient aggregation algorithm,
which can automatically detect and alleviate the effects of negative sources.
This paper provides the statistical properties of the desired estimators and
establishes the minimax optimal rate. Through extensive numerical experiments
on synthetic data and real examples, we validate our theoretical findings and
demonstrate the effectiveness of our proposed method.
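The abstract does not say what the two steps of the known-source estimator are. One plausible reading, offered here only as an assumption, is an offset scheme: fit kernel ridge regression on the source data, then fit a second kernel ridge regression on the target residuals and add the two fits. The sketch below implements that reading with a Gaussian kernel; all function names, the kernel choice and the regularization values are hypothetical.

```python
import numpy as np

def rbf(a, b, gamma=1.0):
    """Gaussian kernel matrix between rows of a (n, d) and b (m, d)."""
    d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def krr_fit(x, y, lam=1e-2, gamma=1.0):
    """Kernel ridge regression; returns a prediction function."""
    n = len(x)
    k = rbf(x, x, gamma)
    alpha = np.linalg.solve(k + lam * n * np.eye(n), y)
    return lambda z: rbf(z, x, gamma) @ alpha

def two_step_transfer(xs, ys, xt, yt, lam_s=1e-3, lam_t=1e-2, gamma=1.0):
    """Assumed two-step scheme: source fit, then correction on target residuals."""
    f_src = krr_fit(xs, ys, lam_s, gamma)                # step 1: source data
    f_off = krr_fit(xt, yt - f_src(xt), lam_t, gamma)    # step 2: target offset
    return lambda z: f_src(z) + f_off(z)

# Toy data: the target function is the source function plus a constant shift.
rng = np.random.default_rng(0)
xs = rng.uniform(-1, 1, (50, 1))
ys = np.sin(3 * xs[:, 0])
xt = rng.uniform(-1, 1, (8, 1))
yt = np.sin(3 * xt[:, 0]) + 0.3

predict = two_step_transfer(xs, ys, xt, yt)
pred = predict(np.array([[0.0], [0.5]]))
```

Under this reading, the second step only has to learn the difference between source and target, which is the kind of structure a transfer estimator can exploit when the sources are known to be related.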