Research outputs

    Explicit Learning Curves for Transduction and Application to Clustering and Compression Algorithms

    Inductive learning is based on inferring a general rule from a finite data set and using it to label new data. In transduction one attempts to solve the problem of using a labeled training set to label a set of unlabeled points, which are given to the learner prior to learning. Although transduction seems at the outset to be an easier task than induction, there have not been many provably useful algorithms for transduction. Moreover, the precise relation between induction and transduction has not yet been determined. The main theoretical developments related to transduction were presented by Vapnik more than twenty years ago. One of Vapnik's basic results is a rather tight error bound for transductive classification based on an exact computation of the hypergeometric tail. While tight, this bound is given implicitly via a computational routine. Our first contribution is a somewhat looser but explicit characterization of a slightly extended PAC-Bayesian version of Vapnik's transductive bound. This characterization is obtained using concentration inequalities for the tail of sums of random variables obtained by sampling without replacement. We then derive error bounds for compression schemes such as (transductive) support vector machines and for transduction algorithms based on clustering. The main observation used for deriving these new error bounds and algorithms is that the unlabeled test points, which in the transductive setting are known in advance, can be used to construct useful data-dependent prior distributions over the hypothesis space.
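
    The hypergeometric tail at the heart of Vapnik's implicit bound is straightforward to compute: under a uniformly random split of the full sample into training and test sets, the number of errors that land in the training set follows a hypergeometric distribution. A minimal sketch of that tail computation, assuming scipy; the function name and the toy numbers are purely illustrative and not taken from the paper:

```python
from scipy.stats import hypergeom

def training_error_tail(m, u, total_errors, k):
    """Probability that a uniformly random split of N = m + u points,
    total_errors of which are misclassified, places at most k of those
    errors among the m training points.

    scipy's parameterization: hypergeom(M, n, N) draws N items without
    replacement from a population of M containing n 'successes'.
    """
    return hypergeom.cdf(k, m + u, total_errors, m)

# Toy illustration: 100 training and 100 test points, 30 errors overall.
# How likely is a random split to hide all but 5 errors in the test set?
print(training_error_tail(m=100, u=100, total_errors=30, k=5))
```

    The explicit characterization derived in the paper replaces calls to such a routine with a closed-form bound obtained from concentration inequalities for sampling without replacement.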

    Secure and Efficient Matrix Multiplication with MapReduce

    MapReduce is one of the most popular distributed programming paradigms; it allows processing big data sets in parallel on a cluster. MapReduce users often outsource data and computations to a public cloud, which raises inherent security concerns. In this paper, we consider the problem of matrix multiplication and one of the most efficient matrix multiplication algorithms: the Strassen-Winograd (SW) algorithm. Our first contribution is a distributed MapReduce algorithm based on SW. Then, we tackle the security concerns that occur when outsourcing matrix multiplication to an honest-but-curious cloud, i.e., one that executes tasks dutifully but tries to learn as much information as possible. Our main contribution is a secure distributed MapReduce algorithm called S2M3 (Secure Strassen-Winograd Matrix Multiplication with MapReduce), which guarantees that none of the cloud nodes can learn the input or the output data. We formally prove the security properties of S2M3 and present an empirical evaluation demonstrating its efficiency.
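
    For reference, the Strassen-Winograd recursion that S2M3 distributes trades one product of n-by-n matrices for 7 half-size products and 15 additions. A minimal single-machine sketch, assuming square power-of-two dimensions; it illustrates only the arithmetic schedule, not the paper's distributed MapReduce protocol or its security layer:

```python
import numpy as np

def sw_multiply(A, B, leaf=64):
    """Strassen-Winograd: 7 recursive multiplications, 15 additions.
    Assumes A and B are square with power-of-two size."""
    n = A.shape[0]
    if n <= leaf:                       # fall back to the naive product
        return A @ B
    h = n // 2
    a11, a12, a21, a22 = A[:h, :h], A[:h, h:], A[h:, :h], A[h:, h:]
    b11, b12, b21, b22 = B[:h, :h], B[:h, h:], B[h:, :h], B[h:, h:]

    s1 = a21 + a22; s2 = s1 - a11; s3 = a11 - a21; s4 = a12 - s2
    t1 = b12 - b11; t2 = b22 - t1; t3 = b22 - b12; t4 = t2 - b21

    p1 = sw_multiply(a11, b11, leaf)
    p2 = sw_multiply(a12, b21, leaf)
    p3 = sw_multiply(s4, b22, leaf)
    p4 = sw_multiply(a22, t4, leaf)
    p5 = sw_multiply(s1, t1, leaf)
    p6 = sw_multiply(s2, t2, leaf)
    p7 = sw_multiply(s3, t3, leaf)

    u2 = p1 + p6; u3 = u2 + p7
    c11 = p1 + p2
    c12 = u2 + p5 + p3
    c21 = u3 - p4
    c22 = u3 + p5
    return np.block([[c11, c12], [c21, c22]])

# Sanity check against numpy on a random 256 x 256 instance.
A = np.random.rand(256, 256); B = np.random.rand(256, 256)
assert np.allclose(sw_multiply(A, B), A @ B)
```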

    Secure Joins with MapReduce

    MapReduce is one of the most popular programming paradigms for processing big data sets. Our goal is to add privacy guarantees to the two standard join-computation algorithms for MapReduce: the cascade algorithm and the hypercube algorithm. We assume that the data is externalized to an honest-but-curious server and that a user is allowed to query the join result. We design, implement, and prove the security of two approaches: (i) Secure-Private, assuming that the public cloud and the user do not collude, and (ii) Collusion-Resistant-Secure-Private, which resists collusions between the public cloud and the user, i.e., when the public cloud knows the secret key of the user.
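
    As background, both algorithms build on the standard MapReduce repartition join: mappers key each tuple by the join attribute, and reducers combine the co-grouped tuples from the two relations; the cascade algorithm chains such binary joins to evaluate a multiway join. A minimal, non-secure sketch of one binary join step, with relation names and attributes that are illustrative only (the paper's contribution, the security layer on top of this, is not shown):

```python
from collections import defaultdict
from itertools import product

def map_phase(relation_name, tuples, key_index):
    """Mapper: emit (join key, (origin relation, tuple))."""
    for t in tuples:
        yield t[key_index], (relation_name, t)

def reduce_phase(grouped):
    """Reducer: for each key, output the cross product of the two sides."""
    for key, values in grouped.items():
        left = [t for origin, t in values if origin == "R"]
        right = [t for origin, t in values if origin == "S"]
        for r, s in product(left, right):
            yield r + s[1:]            # drop the duplicated join key

# Toy relations R(a, b) and S(b, c), joined on attribute b.
R = [(1, "x"), (2, "y")]
S = [("x", 10), ("x", 20), ("z", 30)]
grouped = defaultdict(list)
for k, v in list(map_phase("R", R, 1)) + list(map_phase("S", S, 0)):
    grouped[k].append(v)
print(list(reduce_phase(grouped)))     # [(1, 'x', 10), (1, 'x', 20)]
```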

    Stable transductive learning

    We develop a new error bound for transductive learning algorithms. The slack term in the new bound is a function of a relaxed notion of transductive stability, which measures the sensitivity of the algorithm to most pairwise exchanges of training and test set points. Our bound is based on a novel concentration inequality for symmetric functions of permutations. We also present a simple sampling technique that can estimate, with high probability, the weak stability of transductive learning algorithms with respect to a given dataset. We demonstrate the usefulness of our estimation technique on a well-known transductive learning algorithm.
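
    The sampling idea can be pictured as follows: repeatedly swap a random training point with a random test point, retrain, and record how much the algorithm's predictions move. A hedged sketch of such a Monte Carlo estimate; the `train_fn` interface, the use of pseudo-labels for swapped-in test points, and the disagreement measure are illustrative assumptions, not the paper's exact procedure:

```python
import random

def estimate_stability(train_fn, X_train, y_train, X_test,
                       n_samples=200, seed=0):
    """Monte Carlo estimate of sensitivity to train/test exchanges:
    swap one training point with one (pseudo-labeled) test point,
    retrain, and measure how predictions on the test set change."""
    rng = random.Random(seed)
    base = train_fn(X_train, y_train)
    base_preds = [base(x) for x in X_test]
    y_test = base_preds                  # assumption: pseudo-labels
    diffs = []
    for _ in range(n_samples):
        i = rng.randrange(len(X_train))
        j = rng.randrange(len(X_test))
        X2 = X_train[:i] + [X_test[j]] + X_train[i + 1:]
        y2 = y_train[:i] + [y_test[j]] + y_train[i + 1:]
        model = train_fn(X2, y2)
        disagreement = sum(model(x) != p for x, p in zip(X_test, base_preds))
        diffs.append(disagreement / len(X_test))
    return sum(diffs) / n_samples

# Example usage with a toy 1-nearest-neighbour learner on 1-D points.
def one_nn(X, y):
    return lambda x: y[min(range(len(X)), key=lambda i: abs(X[i] - x))]

X_tr, y_tr = [0.0, 1.0, 4.0, 5.0], [0, 0, 1, 1]
X_te = [0.5, 4.5, 2.0]
print(estimate_stability(one_nn, X_tr, y_tr, X_te))
```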