Exploring Large Feature Spaces with Hierarchical Multiple Kernel
  Learning

Bach, Francis

research

Exploring Large Feature Spaces with Hierarchical Multiple Kernel Learning

Authors: Francis Bach
Publication date: 1 January 2008
Publisher

Abstract

For supervised and unsupervised learning, positive definite kernels allow to use large and potentially infinite dimensional feature spaces with a computational cost that only depends on the number of observations. This is usually done through the penalization of predictor functions by Euclidean or Hilbertian norms. In this paper, we explore penalizing by sparsity-inducing norms such as the l1-norm or the block l1-norm. We assume that the kernel decomposes into a large sum of individual basis kernels which can be embedded in a directed acyclic graph; we show that it is then possible to perform kernel selection through a hierarchical multiple kernel learning framework, in polynomial time in the number of selected kernels. This framework is naturally applied to non linear variable selection; our extensive simulations on synthetic datasets and datasets from the UCI repository show that efficiently exploring the large feature space through sparsity-inducing norms leads to state-of-the-art predictive performance

Similar works

Full text

Open in the Core reader

Download PDF

Available Versions

INRIA a CCSD electronic archive server

oai:HAL:hal-00319660v1

Last time updated on 09/11/2016