A Unifying View of Multiple Kernel Learning
Recent research on multiple kernel learning has led to a number of
approaches for combining kernels in regularized risk minimization. The proposed
approaches include different formulations of objectives and varying
regularization strategies. In this paper we present a unifying general
optimization criterion for multiple kernel learning and show how existing
formulations are subsumed as special cases. We also derive the criterion's dual
representation, which is suitable for general smooth optimization algorithms.
Finally, we evaluate multiple kernel learning in this framework analytically
using a Rademacher complexity bound on the generalization error and empirically
in a set of experiments.
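As a rough illustration of what "combining kernels" means here, the sketch below builds a Gram matrix from a non-negative weighted sum of two base kernels. It shows only the combination step, not the paper's optimization criterion or its dual. The function names, the choice of a linear and an RBF kernel, the weights `theta`, and the toy data are all assumptions made for this example.

```python
import math

def linear_kernel(x, y):
    # Standard inner-product kernel.
    return sum(a * b for a, b in zip(x, y))

def rbf_kernel(x, y, gamma=1.0):
    # Gaussian (RBF) kernel; gamma is an illustrative bandwidth parameter.
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))

def combined_gram(X, kernels, theta):
    # Gram matrix of K = sum_m theta[m] * K_m. Non-negative weights keep
    # the combination positive semidefinite when each K_m is.
    if any(t < 0 for t in theta):
        raise ValueError("kernel weights must be non-negative")
    n = len(X)
    return [[sum(t * k(X[i], X[j]) for t, k in zip(theta, kernels))
             for j in range(n)] for i in range(n)]

# Hypothetical toy data and weights (assumed, not taken from the paper).
X = [[0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
theta = [0.3, 0.7]
K = combined_gram(X, [linear_kernel, rbf_kernel], theta)
```

In a multiple kernel learning method the weights would be learned jointly with the predictor under some regularizer; here they are fixed by hand purely to show the structure of the combined kernel.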
A Map of Update Constraints in Inductive Inference
We investigate how different learning restrictions reduce learning power and
how the different restrictions relate to one another. We give a complete map
for nine different restrictions both for the cases of complete information
learning and set-driven learning. This completes the picture for these
well-studied \emph{delayable} learning restrictions. A further insight is
gained by different characterizations of \emph{conservative} learning in terms
of variants of \emph{cautious} learning.
Our analyses greatly benefit from general theorems we give, for example
showing that learners with exclusively delayable restrictions can always be
assumed total.
Comment: fixed a mistake in Theorem 21, result is the same
A Theory of Regularized Markov Decision Processes
Many recent successful (deep) reinforcement learning algorithms make use of
regularization, generally based on entropy or Kullback-Leibler divergence. We
propose a general theory of regularized Markov Decision Processes that
generalizes these approaches in two directions: we consider a larger class of
regularizers, and we consider the general modified policy iteration approach,
encompassing both policy iteration and value iteration. The core building
blocks of this theory are a notion of regularized Bellman operator and the
Legendre-Fenchel transform, a classical tool of convex optimization. This
approach allows for error propagation analyses of general algorithmic schemes
of which (possibly variants of) classical algorithms such as Trust Region
Policy Optimization, Soft Q-learning, Stochastic Actor Critic or Dynamic Policy
Programming are special cases. This also draws connections to proximal convex
optimization, especially to Mirror Descent.
Comment: ICML 201
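To make the idea of a regularized Bellman operator concrete, here is a minimal sketch for the entropy-regularized case only. In that case the Legendre-Fenchel conjugate of the scaled negative entropy over the simplex is the log-sum-exp function, so the hard maximum over actions is replaced by a "soft" maximum. The toy MDP, the temperature `tau`, and all function names are assumptions for this example; the paper's theory covers a larger class of regularizers and the modified policy iteration scheme, neither of which is shown.

```python
import math

def soft_max(q_values, tau):
    # tau * log(sum(exp(q / tau))): the convex conjugate of tau times the
    # negative entropy on the simplex. Shifted by the max for stability.
    m = max(q_values)
    return m + tau * math.log(sum(math.exp((q - m) / tau) for q in q_values))

def regularized_bellman(V, R, P, gamma, tau):
    # One application of the entropy-regularized Bellman operator.
    # R[s][a] is the reward, P[s][a][t] the probability of moving s -> t.
    n = len(V)
    new_V = []
    for s in range(n):
        q = [R[s][a] + gamma * sum(P[s][a][t] * V[t] for t in range(n))
             for a in range(len(R[s]))]
        new_V.append(soft_max(q, tau))
    return new_V

# Hypothetical 2-state, 2-action MDP: action 0 moves to state 0,
# action 1 moves to state 1, regardless of the current state.
R = [[1.0, 0.0], [0.0, 2.0]]
P = [[[1.0, 0.0], [0.0, 1.0]],
     [[1.0, 0.0], [0.0, 1.0]]]

# Regularized value iteration: repeatedly apply the operator.
V = [0.0, 0.0]
for _ in range(200):
    V = regularized_bellman(V, R, P, gamma=0.9, tau=0.1)
```

Because the soft maximum is non-expansive in the sup norm, the operator remains a `gamma`-contraction and the iteration converges. As `tau` goes to zero the soft maximum approaches the ordinary maximum and the scheme reduces to classical value iteration.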