3 research outputs found

Scaling-up Empirical Risk Minimization: Optimization of Incomplete U-statistics

Author: Bellet Aurélien
Clémençon Stéphan
Colin Igor
Publication date: 01/01/2016

In a wide range of statistical learning problems such as ranking, clustering or metric learning among others, the risk is accurately estimated by $U$-statistics of degree $d\geq 1$, i.e. functionals of the training data with low variance that take the form of averages over $k$-tuples. From a computational perspective, the calculation of such statistics is highly expensive even for a moderate sample size $n$, as it requires averaging $O(n^d)$ terms. This makes learning procedures relying on the optimization of such data functionals hardly feasible in practice. It is the major goal of this paper to show that, strikingly, such empirical risks can be replaced by drastically computationally simpler Monte-Carlo estimates based on $O(n)$ terms only, usually referred to as incomplete $U$-statistics, without damaging the $O_{\mathbb{P}}(1/\sqrt{n})$ learning rate of Empirical Risk Minimization (ERM) procedures. For this purpose, we establish uniform deviation results describing the error made when approximating a $U$-process by its incomplete version under appropriate complexity assumptions. Extensions to model selection, fast rate situations and various sampling techniques are also considered, as well as an application to stochastic gradient descent for ERM. Finally, numerical examples are displayed in order to provide strong empirical evidence that the approach we promote largely surpasses more naive subsampling techniques.

Comment: To appear in Journal of Machine Learning Research. 34 pages. v2: minor correction to Theorem 4 and its proof, added 1 reference. v3: typo corrected in Proposition 3. v4: improved presentation, added experiments on model selection for clustering, fixed minor typo
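The contrast between a complete and an incomplete U-statistic described in the abstract can be illustrated with a minimal Python sketch. It is not taken from the paper: the kernel (a degree-2 kernel whose complete U-statistic is the unbiased sample variance), the function names and the choice of sampling pairs with replacement are all assumptions made for illustration.

```python
import itertools
import random


def kernel(x, y):
    """Illustrative degree-2 kernel; its complete U-statistic is the
    unbiased sample variance (an assumption chosen for this sketch)."""
    return 0.5 * (x - y) ** 2


def complete_u_statistic(data):
    """Average the kernel over all n*(n-1)/2 pairs, i.e. O(n^2) terms."""
    pairs = list(itertools.combinations(data, 2))
    return sum(kernel(x, y) for x, y in pairs) / len(pairs)


def incomplete_u_statistic(data, num_pairs, rng):
    """Monte-Carlo estimate: average the kernel over num_pairs pairs of
    distinct indices, drawn with replacement from the set of all pairs."""
    n = len(data)
    total = 0.0
    for _ in range(num_pairs):
        i, j = rng.sample(range(n), 2)  # two distinct indices
        total += kernel(data[i], data[j])
    return total / num_pairs


rng = random.Random(0)
data = [rng.gauss(0.0, 1.0) for _ in range(500)]

full = complete_u_statistic(data)  # 124,750 kernel evaluations
approx = incomplete_u_statistic(data, num_pairs=len(data), rng=rng)  # 500 evaluations
print(f"complete: {full:.4f}  incomplete: {approx:.4f}")
```

With `num_pairs` set to `n`, the incomplete version evaluates the kernel $O(n)$ times instead of $O(n^2)$, which is the computational saving the abstract refers to.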

arXiv.org e-Print Archive

HAL - Lille 3

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

Trade-offs in Large-Scale Distributed Tuplewise Estimation and Learning

Author: A Lee
A Van Der Vaart
EP Xing
G Blom
J Dean
M Jordan
P Bertail
P Carbone
R Bekkerman
S Bubeck
S Clémençon
SP Boyd
V de la Pena
V Smith
W Hoeffding
Publication date: 21/06/2019

The development of cluster computing frameworks has allowed practitioners to scale out various statistical estimation and machine learning algorithms with minimal programming effort. This is especially true for machine learning problems whose objective function is nicely separable across individual data points, such as classification and regression. In contrast, statistical learning tasks involving pairs (or more generally tuples) of data points, such as metric learning, clustering or ranking, do not lend themselves as easily to data-parallelism and in-memory computing. In this paper, we investigate how to balance statistical performance and computational efficiency in such distributed tuplewise statistical problems. We first propose a simple strategy based on occasionally repartitioning data across workers between parallel computation stages, where the number of repartitioning steps governs the trade-off between accuracy and runtime. We then present some theoretical results highlighting the benefits brought by the proposed method in terms of variance reduction, and extend our results to design distributed stochastic gradient descent algorithms for tuplewise empirical risk minimization. Our results are supported by numerical experiments in pairwise statistical estimation and learning on synthetic and real-world datasets.

Comment: 23 pages, 6 figures, ECML 201
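The repartitioning idea in the abstract can be sketched in a few lines of Python, simulated on a single machine. This is a hedged illustration, not the authors' algorithm: the one-sample variance kernel, the round-robin block assignment and the names `local_u_statistic` and `repartitioned_estimate` are assumptions introduced here.

```python
import itertools
import random


def kernel(x, y):
    """Illustrative pairwise kernel (assumption): yields the sample variance."""
    return 0.5 * (x - y) ** 2


def local_u_statistic(block):
    """Complete U-statistic computed by one worker on its own block only."""
    pairs = list(itertools.combinations(block, 2))
    return sum(kernel(x, y) for x, y in pairs) / len(pairs)


def repartitioned_estimate(data, num_workers, num_rounds, rng):
    """Average of per-worker estimates over several random repartitions.

    Each round reshuffles the data across workers, so pairs that never met
    in one round can be seen in another; more rounds cost more runtime but
    cover more pairs. Every block must hold at least two points.
    """
    data = list(data)
    round_estimates = []
    for _ in range(num_rounds):
        rng.shuffle(data)  # simulated repartitioning step
        blocks = [data[w::num_workers] for w in range(num_workers)]
        local = [local_u_statistic(block) for block in blocks]
        round_estimates.append(sum(local) / num_workers)
    return sum(round_estimates) / num_rounds


rng = random.Random(0)
data = [rng.gauss(0.0, 1.0) for _ in range(200)]
one_round = repartitioned_estimate(data, num_workers=10, num_rounds=1, rng=rng)
ten_rounds = repartitioned_estimate(data, num_workers=10, num_rounds=10, rng=rng)
print(f"1 round: {one_round:.4f}  10 rounds: {ten_rounds:.4f}")
```

In this sketch `num_rounds` plays the role of the number of repartitioning steps: with a single round, only pairs that fall within the same block contribute, while additional rounds bring in pairs that were previously split across workers.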

arXiv.org e-Print Archive

Crossref

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

Building confidence regions for the ROC surface

Author: Clémençon Stéphan
Robbiano Sylvain
Publication venue: 'Elsevier BV'
Publication date: 01/09/2014

International audienc