Optimal Rates for the Random Fourier Feature Method
Kernel methods are among the most powerful tools in machine learning for tackling problems expressed in terms of function values and derivatives. While these methods show good versatility, they are computationally intensive and scale poorly to large data, as they require operations on Gram matrices. To mitigate this serious computational limitation, randomized methods have recently been proposed in the literature, which allow the application of fast linear algorithms. Random Fourier features (RFF) are among the most popular and widely applied constructions: they provide an easily computable, low-dimensional feature representation for shift-invariant kernels. Despite the popularity of RFFs, very little is understood theoretically about their approximation quality. In this talk, I am going to present the main ideas and results of a detailed finite-sample theoretical analysis of the approximation quality of RFFs by (i) establishing optimal (in terms of the RFF dimension and the growing set size) performance guarantees in uniform norm, and (ii) providing guarantees in L^r (1 ≤ r < ∞) norms. I will also propose an RFF approximation to derivatives of the kernel, with a theoretical study of its approximation quality.
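As a concrete illustration of the construction described above, here is a minimal NumPy sketch of the RFF approximation for a Gaussian (shift-invariant) kernel. All names and parameter choices are illustrative, not taken from the talk:

```python
import numpy as np

def rff_features(X, D, gamma, rng):
    """Map X (n, d) to D random Fourier features approximating
    the Gaussian kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    n, d = X.shape
    # Frequencies drawn from the kernel's spectral (Gaussian) measure.
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, D))
    b = rng.uniform(0.0, 2.0 * np.pi, size=D)
    return np.sqrt(2.0 / D) * np.cos(X @ W + b)

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
Z = rff_features(X, D=2000, gamma=0.5, rng=rng)

# Compare the exact Gram matrix with its low-rank RFF approximation.
K_exact = np.exp(-0.5 * ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
K_approx = Z @ Z.T
err = np.abs(K_exact - K_approx).max()  # uniform-norm error on this set
```

The uniform-norm error `err` shrinks as the RFF dimension `D` grows, which is exactly the quantity the finite-sample guarantees in the talk control.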
Geometrical Insights for Implicit Generative Modeling
Learning algorithms for implicit generative models can optimize a variety of
criteria that measure how the data distribution differs from the implicit model
distribution, including the Wasserstein distance, the Energy distance, and the
Maximum Mean Discrepancy criterion. A careful look at the geometries induced by
these distances on the space of probability measures reveals interesting
differences. In particular, we can establish surprising approximate global
convergence guarantees for the 1-Wasserstein distance, even when the
parametric generator has a nonconvex parametrization.
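The Maximum Mean Discrepancy criterion mentioned above admits a simple plug-in estimate from two samples. A minimal sketch with a Gaussian kernel (an illustrative kernel choice, not necessarily the one analyzed in the paper):

```python
import numpy as np

def mmd2_biased(X, Y, gamma=1.0):
    """Biased estimate of squared Maximum Mean Discrepancy between the
    samples X and Y, with Gaussian kernel k(x, y) = exp(-gamma*||x - y||^2)."""
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)
    return k(X, X).mean() + k(Y, Y).mean() - 2.0 * k(X, Y).mean()

rng = np.random.default_rng(1)
# Two samples from the same distribution vs. a mean-shifted one.
same = mmd2_biased(rng.normal(size=(200, 2)), rng.normal(size=(200, 2)))
shifted = mmd2_biased(rng.normal(size=(200, 2)),
                      rng.normal(loc=2.0, size=(200, 2)))
# MMD is near zero when the distributions match and grows with the shift.
```

Comparing how such criteria respond to perturbations of the model distribution is one way to probe the geometries the paper contrasts.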
ForestHash: Semantic Hashing With Shallow Random Forests and Tiny Convolutional Networks
Hash codes are efficient data representations for coping with the ever
growing amounts of data. In this paper, we introduce a random forest semantic
hashing scheme that embeds tiny convolutional neural networks (CNN) into
shallow random forests, with near-optimal information-theoretic code
aggregation among trees. We start with a simple hashing scheme, where random
trees in a forest act as hashing functions by setting `1' for the visited tree
leaf, and `0' for the rest. We show that traditional random forests fail to
generate hashes that preserve the underlying similarity between the trees,
rendering the random forests approach to hashing challenging. To address this,
we propose to first randomly group arriving classes at each tree split node
into two groups, obtaining a significantly simplified two-class classification
problem, which can be handled using a light-weight CNN weak learner. Such
random class grouping scheme enables code uniqueness by enforcing each class to
share its code with different classes in different trees. A non-conventional
low-rank loss is further adopted for the CNN weak learners to encourage code
consistency by minimizing intra-class variations and maximizing inter-class
distance for the two random class groups. Finally, we introduce an
information-theoretic approach for aggregating codes of individual trees into a
single hash code, producing a near-optimal unique hash for each class. The
proposed approach significantly outperforms state-of-the-art hashing methods
for image retrieval tasks on large-scale public datasets, while performing at
the level of other state-of-the-art image classification techniques and
utilizing a more compact, efficient, and scalable representation. This work
proposes a principled and robust procedure to train and deploy in parallel an
ensemble of light-weight CNNs, instead of simply going deeper.
Comment: Accepted to ECCV 201
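The simple starting scheme, `1` for the visited tree leaf and `0` for the rest, can be sketched as follows. For brevity, the CNN weak learners at the split nodes are replaced here by a fixed sequence of random threshold splits per tree (closer to a random fern than to the paper's actual trees), so this is an illustrative stand-in only:

```python
import numpy as np

def forest_hash(x, trees):
    """Naive per-tree one-hot leaf code, concatenated across trees.
    Each 'tree' is a list of (feature, threshold) splits of fixed depth,
    a hypothetical stand-in for the paper's CNN-based split functions."""
    code = []
    for splits in trees:
        leaf = 0
        for feat, thr in splits:            # follow one root-to-leaf path
            leaf = 2 * leaf + int(x[feat] > thr)
        one_hot = np.zeros(2 ** len(splits), dtype=np.uint8)
        one_hot[leaf] = 1                   # `1` for the visited leaf
        code.append(one_hot)
    return np.concatenate(code)

rng = np.random.default_rng(2)
# 5 trees of depth 3 over 4 input features.
trees = [[(rng.integers(4), rng.normal()) for _ in range(3)] for _ in range(5)]
h = forest_hash(rng.normal(size=4), trees)
# 5 blocks of 8 bits, exactly one bit set per block.
```

The paper's contribution is precisely what this naive sketch lacks: CNN weak learners on random class groupings, a low-rank loss for code consistency, and information-theoretic aggregation of the per-tree codes into a single near-optimal hash.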
Domain Adaptation Transfer Learning by Kernel Representation Adaptation
Domain adaptation, where no labeled target data is available, is a challenging task. To solve this problem, we first propose a new SVM-based approach with a supplementary Maximum Mean Discrepancy (MMD)-like constraint. With this heuristic, source and target data are projected onto a common subspace of a Reproducing Kernel Hilbert Space (RKHS), where both data distributions are expected to become similar. Therefore, a classifier trained on source data might perform well on target data, provided the conditional probabilities of labels are similar for source and target data, which is the main assumption of this paper. We demonstrate that adding this constraint does not change the quadratic nature of the optimization problem, so we can use common quadratic optimization tools. Secondly, using the same idea that rendering source and target data similar might ensure efficient transfer learning, and under the same assumption, a Kernel Principal Component Analysis (KPCA)-based transfer learning method is proposed. Unlike the first heuristic, this second method also aligns higher-order moments in the RKHS, which leads to better performance. Here again, we select MMD as the similarity measure. A linear transformation is then applied to further improve the alignment between source and target data. We finally compare both methods with other transfer learning methods from the literature to show their efficiency on synthetic and real datasets.
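As an illustration of the general idea that aligning source and target distributions, as measured by MMD, can help adaptation, the sketch below applies a simple per-feature linear transformation (mean and scale matching, far cruder than the paper's KPCA-based method) and checks that a Gaussian-kernel MMD estimate shrinks. All details here are illustrative assumptions, not the paper's procedure:

```python
import numpy as np

def gaussian_mmd2(X, Y, gamma=0.5):
    """Biased squared MMD between samples X and Y with a Gaussian kernel."""
    def k(A, B):
        return np.exp(-gamma * ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1))
    return k(X, X).mean() + k(Y, Y).mean() - 2.0 * k(X, Y).mean()

rng = np.random.default_rng(3)
source = rng.normal(size=(300, 2))
target = rng.normal(loc=1.5, scale=2.0, size=(300, 2))

# Linear transformation: match the target's per-feature mean and scale
# to the source's (first- and second-moment alignment only).
aligned = (target - target.mean(0)) / target.std(0)
aligned = aligned * source.std(0) + source.mean(0)

before = gaussian_mmd2(source, target)
after = gaussian_mmd2(source, aligned)
# The MMD between the domains shrinks after alignment.
```

The paper's KPCA-based method goes further by aligning higher-order moments in the RKHS, which this moment-matching sketch cannot capture.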