Hypothesis Transfer Learning with Surrogate Classification Losses: Generalization Bounds through Algorithmic Stability
Hypothesis transfer learning (HTL) contrasts with domain adaptation by allowing
a previous task, called the source, to be leveraged in a new one, the target,
without requiring access to the source data. Indeed, HTL relies only on a
hypothesis learnt from the source data, relieving the burden of expensive data
storage and providing great practical benefits. Hence, HTL is highly beneficial
for real-world applications relying on big data. The analysis of such a method
from a theoretical perspective faces multiple challenges, particularly in
classification tasks. This paper deals with this problem by studying the
learning theory of HTL through algorithmic stability, an attractive theoretical
framework for the analysis of machine learning algorithms. In particular, we are
interested in the statistical behaviour of the regularized empirical risk
minimizers in the case of binary classification. Our stability analysis
provides learning guarantees under mild assumptions. Consequently, we derive
several complexity-free generalization bounds for essential statistical
quantities like the training error, the excess risk and cross-validation
estimates. These refined bounds make it possible to understand the benefits of
transfer learning and to compare the behaviour of standard losses in different
scenarios, leading to valuable insights for practitioners.
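As a rough illustration of the HTL setup described above (not the paper's exact estimator), the sketch below runs regularized empirical risk minimization for binary classification with a logistic surrogate loss, biased toward a transferred source hypothesis; the names htl_biased_erm, w_src, and lam are hypothetical.

```python
import numpy as np

def htl_biased_erm(X, y, w_src, lam=1.0, lr=0.1, epochs=200):
    """Illustrative sketch: regularized ERM for binary labels in {-1, +1}
    with a logistic surrogate loss, biased toward the source hypothesis w_src.
    Objective: (1/n) * sum_i log(1 + exp(-y_i <w, x_i>)) + lam * ||w - w_src||^2
    """
    n, d = X.shape
    w = w_src.copy()  # start from the transferred source hypothesis
    for _ in range(epochs):
        margins = y * (X @ w)
        # gradient of the averaged logistic loss
        grad = -(X * (y * (1.0 / (1.0 + np.exp(margins))))[:, None]).mean(axis=0)
        # gradient of the biased Tikhonov regularizer pulling w toward w_src
        grad += 2.0 * lam * (w - w_src)
        w -= lr * grad
    return w
```

In practice, w_src would be a hypothesis fitted on the source task alone and shipped to the target learner, so the source data themselves never need to be stored or transferred.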
Data-Dependent Stability of Stochastic Gradient Descent
We establish a data-dependent notion of algorithmic stability for Stochastic
Gradient Descent (SGD), and employ it to develop novel generalization bounds.
This is in contrast to previous distribution-free algorithmic stability results
for SGD which depend on the worst-case constants. By virtue of the
data-dependent argument, our bounds provide new insights into learning with SGD
on convex and non-convex problems. In the convex case, we show that the bound
on the generalization error depends on the risk at the initialization point. In
the non-convex case, we prove that the expected curvature of the objective
function around the initialization point has crucial influence on the
generalization error. In both cases, our results suggest a simple data-driven
strategy to stabilize SGD by pre-screening its initialization. As a corollary,
our results allow us to show optimistic generalization bounds that exhibit fast
convergence rates for SGD subject to a vanishing empirical risk and low noise
of the stochastic gradients.
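A minimal sketch of the data-driven strategy suggested above, assuming pre-screening simply picks the candidate initialization with the lowest empirical risk before running plain SGD; the function names and interfaces are illustrative, not the paper's code.

```python
import numpy as np

def prescreen_init(candidates, loss_fn, X, y):
    """Pick the candidate initialization with the lowest empirical risk,
    since the bounds depend on the risk (convex case) or local curvature
    (non-convex case) at the initialization point."""
    risks = [loss_fn(w, X, y) for w in candidates]
    return candidates[int(np.argmin(risks))]

def sgd(w0, grad_fn, X, y, lr=0.01, epochs=5, seed=0):
    """Plain single-sample SGD started from the pre-screened point."""
    rng = np.random.default_rng(seed)
    w = w0.copy()
    n = X.shape[0]
    for _ in range(epochs):
        for i in rng.permutation(n):
            w -= lr * grad_fn(w, X[i], y[i])
    return w
```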
Stable Feature Selection for Biomarker Discovery
Feature selection techniques have been used as the workhorse in biomarker
discovery applications for a long time. Surprisingly, the stability of feature
selection with respect to sampling variations has long been under-considered.
It is only until recently that this issue has received more and more attention.
In this article, we review existing stable feature selection methods for
biomarker discovery using a generic hierarchical framework. We have two
objectives: (1) to provide an overview of this new yet fast-growing topic for
convenient reference; (2) to categorize existing methods under an expandable
framework for future research and development.
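As a hedged illustration of one ensemble-style family that such reviews cover, the sketch below ranks features by how often an L1-penalized classifier selects them across bootstrap resamples, which damps sensitivity to sampling variation; the function name and threshold are assumptions for illustration, not a method from the article.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def bootstrap_selection_frequency(X, y, n_boot=50, C=0.1, seed=0):
    """Repeat L1-penalized selection on bootstrap resamples and return the
    per-feature selection frequency as a stability-oriented ranking."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    counts = np.zeros(d)
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # bootstrap resample (assumes both classes present)
        clf = LogisticRegression(penalty="l1", solver="liblinear", C=C)
        clf.fit(X[idx], y[idx])
        counts += (np.abs(clf.coef_[0]) > 1e-8)  # count features with nonzero weight
    return counts / n_boot  # selection frequency per feature
```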
- …