Exploring Large Feature Spaces with Hierarchical Multiple Kernel Learning
For supervised and unsupervised learning, positive definite kernels allow the
use of large and potentially infinite-dimensional feature spaces with a
computational cost that only depends on the number of observations. This is
usually done through the penalization of predictor functions by Euclidean or
Hilbertian norms. In this paper, we explore penalizing by sparsity-inducing
norms such as the l1-norm or the block l1-norm. We assume that the kernel
decomposes into a large sum of individual basis kernels which can be embedded
in a directed acyclic graph; we show that it is then possible to perform kernel
selection through a hierarchical multiple kernel learning framework, in
polynomial time in the number of selected kernels. This framework applies
naturally to nonlinear variable selection; our extensive simulations on
synthetic datasets and datasets from the UCI repository show that efficiently
exploring the large feature space through sparsity-inducing norms leads to
state-of-the-art predictive performance.
Inductive queries for a drug designing robot scientist
It is increasingly clear that machine learning algorithms need to be integrated into an iterative scientific discovery loop, in which data are queried repeatedly by means of inductive queries and the computer provides guidance for the experiments being performed. In this chapter, we summarise several key challenges in integrating machine learning and data mining algorithms into methods for the discovery of Quantitative Structure-Activity Relationships (QSARs). We introduce the concept of a robot scientist, in which all steps of the discovery process are automated; we discuss how to represent molecular data so that knowledge discovery tools can analyse it, and how to adapt machine learning and data mining algorithms to guide QSAR experiments.
Optimization with Sparsity-Inducing Penalties
Sparse estimation methods are aimed at using or obtaining parsimonious
representations of data or models. They were first dedicated to linear variable
selection but numerous extensions have now emerged such as structured sparsity
or kernel selection. It turns out that many of the related estimation problems
can be cast as convex optimization problems by regularizing the empirical risk
with appropriate non-smooth norms. The goal of this paper is to present from a
general perspective optimization tools and techniques dedicated to such
sparsity-inducing penalties. We cover proximal methods, block-coordinate
descent, reweighted l2-penalized techniques, working-set and homotopy
methods, as well as non-convex formulations and extensions, and provide an
extensive set of experiments to compare various algorithms from a computational
point of view.
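As an illustration of the proximal methods mentioned in this abstract, the following is a minimal sketch of proximal gradient descent (ISTA) for l1-penalized least squares. It is not taken from the paper: the function names `soft_threshold` and `ista`, the 1/(2n) scaling of the loss, the fixed step size, and the synthetic data are all assumptions chosen for the example.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1: shrink each entry toward zero by t."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def objective(X, y, w, lam):
    """(1/2n) * ||y - X w||^2 + lam * ||w||_1 (assumed scaling)."""
    n = X.shape[0]
    return 0.5 * np.sum((y - X @ w) ** 2) / n + lam * np.sum(np.abs(w))

def ista(X, y, lam, n_iter=500):
    """Minimise the objective above by proximal gradient descent."""
    n, d = X.shape
    # Step size 1/L, with L the Lipschitz constant of the smooth part's gradient.
    L = np.linalg.norm(X, 2) ** 2 / n
    w = np.zeros(d)
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) / n                # gradient of the squared loss
        w = soft_threshold(w - grad / L, lam / L)   # gradient step, then prox
    return w

# Synthetic data (hypothetical): only the first three coefficients are nonzero.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 20))
w_true = np.zeros(20)
w_true[:3] = [2.0, -1.5, 1.0]
y = X @ w_true + 0.01 * rng.standard_normal(100)
w_hat = ista(X, y, lam=0.1)
```

The non-smooth l1 term is handled entirely by `soft_threshold`, which is why the iteration stays as cheap as plain gradient descent. With a penalty at least as large as the largest absolute entry of `X.T @ y / n`, the iterate never leaves zero.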
Cover Tree Bayesian Reinforcement Learning
This paper proposes an online tree-based Bayesian approach for reinforcement
learning. For inference, we employ a generalised context tree model. This
defines a distribution on multivariate Gaussian piecewise-linear models, which
can be updated in closed form. The tree structure itself is constructed using
the cover tree method, which remains efficient in high dimensional spaces. We
combine the model with Thompson sampling and approximate dynamic programming to
obtain effective exploration policies in unknown environments. The flexibility
and computational simplicity of the model render it suitable for many
reinforcement learning problems in continuous state spaces. We demonstrate this
in an experimental comparison with least squares policy iteration.