883 research outputs found
Online Meta-Learning for Multi-Source and Semi-Supervised Domain Adaptation
Domain adaptation (DA) is the topical problem of adapting models from
labelled source datasets so that they perform well on target datasets where
only unlabelled or partially labelled data is available. Many methods have been
proposed to address this problem through different ways to minimise the domain
shift between source and target datasets. In this paper we take an orthogonal
perspective and propose a framework to further enhance performance by
meta-learning the initial conditions of existing DA algorithms. This is
challenging compared to the more widely considered setting of few-shot
meta-learning, due to the length of the computation graph involved. Therefore
we propose an online shortest-path meta-learning framework that is both
computationally tractable and practically effective for improving DA
performance. We present variants for both multi-source unsupervised domain
adaptation (MSDA), and semi-supervised domain adaptation (SSDA). Importantly,
our approach is agnostic to the base adaptation algorithm, and can be applied
to improve many techniques. Experimentally, we demonstrate improvements on
classic (DANN) and recent (MCD and MME) techniques for MSDA and SSDA, and
ultimately achieve state of the art results on several DA benchmarks including
the largest-scale DomainNet.
Comment: ECCV 2020 CR version
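The core idea, treating the initial conditions of the adaptation algorithm as meta-parameters and keeping the inner adaptation horizon short, can be illustrated with a generic first-order (Reptile-style) scheme on toy quadratic "domain" losses. This is a hedged sketch, not the paper's algorithm; all names and hyperparameters are illustrative:

```python
import numpy as np

# Toy stand-ins for per-domain adaptation objectives: quadratic losses
# 0.5 * ||theta - opt||^2 with a different optimum per "domain".
rng = np.random.default_rng(0)
optima = [rng.normal(size=3) for _ in range(4)]

def adapt(theta0, opt, steps=5, lr=0.3):
    # Short, truncated inner adaptation loop: keeping this horizon small
    # is what makes meta-learning the initialisation tractable (the
    # "shortest-path" motivation in the abstract).
    theta = theta0.copy()
    for _ in range(steps):
        theta -= lr * (theta - opt)  # gradient of the quadratic loss
    return theta

# First-order (Reptile-style) meta-update of the initial condition theta0:
# nudge theta0 toward the post-adaptation parameters of a sampled domain.
theta0 = np.zeros(3)
for _ in range(400):
    opt = optima[rng.integers(len(optima))]
    theta0 += 0.1 * (adapt(theta0, opt) - theta0)

# A good shared initialisation sits near the mean of the domain optima.
dist = float(np.linalg.norm(theta0 - np.mean(optima, axis=0)))
print(dist)
```

On this toy problem the meta-learned initialisation drifts toward the centroid of the domain optima, so each truncated inner loop starts close to every domain's solution.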
CFT Duals for Extreme Black Holes
It is argued that the general four-dimensional extremal Kerr-Newman-AdS-dS
black hole is holographically dual to a (chiral half of a) two-dimensional CFT,
generalizing an argument given recently for the special case of extremal Kerr.
Specifically, the asymptotic symmetries of the near-horizon region of the
general extremal black hole are shown to be generated by a Virasoro algebra.
Semiclassical formulae are derived for the central charge and temperature of
the dual CFT as functions of the cosmological constant, Newton's constant and
the black hole charges and spin. We then show, assuming the Cardy formula, that
the microscopic entropy of the dual CFT precisely reproduces the macroscopic
Bekenstein-Hawking area law. This CFT description becomes singular in the
extreme Reissner-Nordstrom limit where the black hole has no spin. At this
point a second dual CFT description is proposed in which the global part of the
U(1) gauge symmetry is promoted to a Virasoro algebra. This second description
is also found to reproduce the area law. Various further generalizations
including higher dimensions are discussed.
Comment: 18 pages; v2 minor changes
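For the extremal Kerr special case that the abstract says it generalizes, the Cardy matching can be made concrete with the standard Kerr/CFT values (units G = hbar = c = 1; the general Kerr-Newman-AdS-dS expressions of the paper reduce to these):

```latex
% Extremal Kerr check (G = \hbar = c = 1): Kerr/CFT gives
c_L = 12 J, \qquad T_L = \frac{1}{2\pi},
% so the Cardy formula yields
S_{\mathrm{CFT}} = \frac{\pi^2}{3}\, c_L T_L
                 = \frac{\pi^2}{3}\cdot 12 J \cdot \frac{1}{2\pi} = 2\pi J,
% matching Bekenstein--Hawking, since A = 8\pi J at extremality (J = M^2):
S_{\mathrm{BH}} = \frac{A}{4} = 2\pi J .
```

In the general case described above, c and T become functions of the cosmological constant, Newton's constant, and the black hole charges and spin, but the same Cardy computation reproduces A/4G.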
Strings and Branes in Nonabelian Gauge Theory
It is an old speculation that SU(N) gauge theory can alternatively be
formulated as a string theory. Recently this subject has been revived, in the
wake of the discovery of D-branes. In particular, it has been argued that at
least some conformally invariant cousins of the theory have such a string
representation. This is a pedagogical introduction to these developments for
non-string theorists. Some of the existing arguments are simplified.
Comment: Reference added
An almost sure limit theorem for super-Brownian motion
We establish an almost sure scaling limit theorem for super-Brownian motion
on $\mathbb{R}^d$ associated with the semi-linear equation
$u_t = \frac{1}{2}\Delta u + \beta u - \alpha u^2$, where $\alpha$ and
$\beta$ are positive constants. In this case, the spectral theoretical
assumptions required in Chen et al. (2008) are not satisfied. An example is
given to show that the main results also hold for some sub-domains in
$\mathbb{R}^d$.
Comment: 14 pages
Hierarchical Temporal Representation in Linear Reservoir Computing
Recently, studies on deep Reservoir Computing (RC) highlighted the role of
layering in deep recurrent neural networks (RNNs). In this paper, the use of
linear recurrent units allows us to provide further evidence of the intrinsic
hierarchical temporal representation in deep RNNs, through frequency analysis
applied to the state signals. The potential of our approach is assessed on
the class of Multiple Superimposed Oscillator tasks. Furthermore, our
investigation provides useful insights to open a discussion on the main aspects
that characterize the deep learning framework in the temporal domain.
Comment: This is a pre-print of the paper submitted to the 27th Italian
Workshop on Neural Networks, WIRN 2017
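The layering effect described above can be reproduced with a minimal stacked linear (leaky) echo-state sketch driven by white noise: state signals of deeper layers evolve on slower timescales. Sizes and the leak rate are arbitrary choices here, and a simple roughness statistic stands in for the paper's frequency analysis:

```python
import numpy as np

rng = np.random.default_rng(1)
n, layers, T, leak, rho = 50, 3, 2000, 0.5, 0.9

def reservoir(n_in):
    # Random linear recurrent layer, rescaled to spectral radius rho.
    W = rng.normal(size=(n, n))
    W *= rho / np.max(np.abs(np.linalg.eigvals(W)))
    W_in = 0.1 * rng.normal(size=(n, n_in))
    return W, W_in

weights = [reservoir(1 if l == 0 else n) for l in range(layers)]

u = rng.normal(size=(T, 1))          # white-noise drive
X = np.zeros((layers, T, n))
x = [np.zeros(n) for _ in range(layers)]
for t in range(T):
    inp = u[t]
    for l, (W, W_in) in enumerate(weights):
        # Leaky linear unit: each layer is a stable linear filter, so the
        # cascade progressively attenuates high frequencies with depth.
        x[l] = (1 - leak) * x[l] + leak * (W @ x[l] + W_in @ inp)
        X[l, t] = x[l]
        inp = x[l]                    # next layer is driven by this layer

# Normalised roughness per layer: mean squared one-step difference over
# variance; smoother (slower) signals give smaller values.
rough = [float(np.mean(np.diff(X[l, 200:], axis=0) ** 2) / np.var(X[l, 200:]))
         for l in range(layers)]
print(rough)
```

The roughness statistic decreases with depth, the time-domain counterpart of the frequency-ordering the paper reports.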
Towards a Universal Theory of Artificial Intelligence based on Algorithmic Probability and Sequential Decision Theory
Decision theory formally solves the problem of rational agents in uncertain
worlds if the true environmental probability distribution is known.
Solomonoff's theory of universal induction formally solves the problem of
sequence prediction for unknown distributions. We unify both theories and
give strong arguments that the resulting universal AIXI model behaves
optimally in any
computable environment. The major drawback of the AIXI model is that it is
uncomputable. To overcome this problem, we construct a modified algorithm
AIXI^tl, which is still superior to any other time t and space l bounded agent.
The computation time of AIXI^tl is of the order t x 2^l.
Comment: 8 two-column pages, latex2e, 1 figure, submitted to ijcai
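AIXI's predictive core is a Bayes mixture over environments weighted by 2^(-program length). A finite toy analogue for sequence prediction, with deterministic periodic hypotheses and 2^(-k) priors standing in for Solomonoff's universal prior, shows the mechanism (this is an illustration, not AIXI^tl itself):

```python
import numpy as np

# Hypotheses: binary sequences repeating a fixed pattern; prior weight
# 2^(-pattern length) plays the role of 2^(-program length).
hyps = [(0,), (1,), (0, 1), (1, 1, 0)]
prior = np.array([2.0 ** -len(h) for h in hyps])
prior /= prior.sum()

def predict(history):
    # Posterior = prior * likelihood; likelihood is 0/1 because the
    # hypotheses are deterministic, so inconsistent ones are eliminated.
    post = prior.copy()
    for i, h in enumerate(hyps):
        if any(history[t] != h[t % len(h)] for t in range(len(history))):
            post[i] = 0.0
    post /= post.sum()
    # Mixture probability that the next bit is 1.
    return float(sum(w * h[len(history) % len(h)]
                     for w, h in zip(post, hyps)))

truth = (1, 1, 0)
history = [truth[t % 3] for t in range(12)]
p_next = predict(history)   # true next bit is truth[12 % 3] = 1
print(p_next)
```

After twelve observed bits, only the true pattern survives, so the mixture's prediction concentrates on the correct continuation, the finite-class shadow of Solomonoff's convergence result that AIXI inherits.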
Learning to Learn with Variational Information Bottleneck for Domain Generalization
Domain generalization models learn to generalize to previously unseen
domains, but suffer from prediction uncertainty and domain shift. In this
paper, we address both problems. We introduce a probabilistic meta-learning
model for domain generalization, in which classifier parameters shared across
domains are modeled as distributions. This enables better handling of
prediction uncertainty on unseen domains. To deal with domain shift, we learn
domain-invariant representations by the proposed principle of meta variational
information bottleneck, which we call MetaVIB. MetaVIB is derived from novel
variational bounds of mutual information, by leveraging the meta-learning
setting of domain generalization. Through episodic training, MetaVIB learns to
gradually narrow domain gaps to establish domain-invariant representations,
while simultaneously maximizing prediction accuracy. We conduct experiments on
three benchmarks for cross-domain visual recognition. Comprehensive ablation
studies validate the benefits of MetaVIB for domain generalization. The
comparison results demonstrate our method outperforms previous approaches
consistently.
Comment: 15 pages, 4 figures, ECCV 2020
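The variational information bottleneck that MetaVIB builds on trades classification loss against a rate term, a KL divergence from the stochastic encoding to a prior. A generic numpy sketch of the per-batch objective (deliberately not MetaVIB-specific; shapes and beta are illustrative):

```python
import numpy as np

def gaussian_kl(mu, logvar):
    # KL( N(mu, diag(exp(logvar))) || N(0, I) ), in closed form.
    return 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar, axis=-1)

def vib_objective(logits, labels, mu, logvar, beta=1e-2):
    # Cross-entropy of the classifier head on the sampled latent, plus a
    # beta-weighted rate term that bottlenecks I(X; Z).
    logp = logits - np.log(np.sum(np.exp(logits), axis=-1, keepdims=True))
    ce = -logp[np.arange(len(labels)), labels]
    return np.mean(ce + beta * gaussian_kl(mu, logvar))

# Tiny smoke test: 2-class head, 4-dim latent, batch of 3.
rng = np.random.default_rng(0)
logits = rng.normal(size=(3, 2))
labels = np.array([0, 1, 0])
mu = rng.normal(size=(3, 4))
logvar = np.zeros((3, 4))
loss = float(vib_objective(logits, labels, mu, logvar))
print(loss)
```

MetaVIB's contribution is to derive the bound episodically across the meta-learning split of domains; the closed-form KL above is the standard building block it shares with the original VIB.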
Hindsight policy gradients
A reinforcement learning agent that needs to pursue different goals across
episodes requires a goal-conditional policy. In addition to their potential to
generalize desirable behavior to unseen goals, such policies may also enable
higher-level planning based on subgoals. In sparse-reward environments, the
capacity to exploit information about the degree to which an arbitrary goal has
been achieved while another goal was intended appears crucial to enabling
sample-efficient learning. However, reinforcement learning agents have only
recently
been endowed with such capacity for hindsight. In this paper, we demonstrate
how hindsight can be introduced to policy gradient methods, generalizing this
idea to a broad class of successful algorithms. Our experiments on a diverse
selection of sparse-reward environments show that hindsight leads to a
remarkable increase in sample efficiency.
Comment: Accepted to ICLR 2019
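The relabeling idea can be sketched concretely: actions collected while pursuing one goal are reused to update the policy for every goal, with a likelihood ratio correcting for the sampling distribution. A toy one-step environment (reward 1 iff the action matches the goal; all names and constants are illustrative, not the paper's setup):

```python
import numpy as np

rng = np.random.default_rng(0)
n_goals = n_actions = 4
theta = np.zeros((n_goals, n_actions))   # goal-conditional softmax policy

def probs(g):
    e = np.exp(theta[g] - theta[g].max())
    return e / e.sum()

lr = 0.5
for step in range(500):
    g = rng.integers(n_goals)
    p = probs(g)
    a = rng.choice(n_actions, p=p)
    # Hindsight: reuse the same action for *every* goal g2, weighting the
    # score-function gradient by pi(a|g2)/pi(a|g) and by r(a, g2).
    for g2 in range(n_goals):
        w = probs(g2)[a] / p[a]          # importance weight
        r = 1.0 if a == g2 else 0.0      # sparse goal-reaching reward
        grad = -probs(g2)
        grad[a] += 1.0                   # grad of log pi(a|g2) wrt theta[g2]
        theta[g2] += lr * w * r * grad

acc = float(np.mean([probs(g)[g] for g in range(n_goals)]))
print(acc)
```

Without hindsight, only the episode's own goal ever receives a nonzero-reward update, so most episodes are wasted in this sparse-reward setting; the relabeled update extracts a learning signal for some goal from every episode.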
Reinforcement Learning in Sparse-Reward Environments with Hindsight Policy Gradients
A reinforcement learning agent that needs to pursue different goals across episodes requires a goal-conditional policy. In addition to their potential to generalize desirable behavior to unseen goals, such policies may also enable higher-level planning based on subgoals. In sparse-reward environments, the capacity to exploit information about the degree to which an arbitrary goal has been achieved while another goal was intended appears crucial to enabling sample-efficient learning. However, reinforcement learning agents have only recently been endowed with such capacity for hindsight. In this letter, we demonstrate how hindsight can be introduced to policy gradient methods, generalizing this idea to a broad class of successful algorithms. Our experiments on a diverse selection of sparse-reward environments show that hindsight leads to a remarkable increase in sample efficiency.
Recurrent Neural-Linear Posterior Sampling for Nonstationary Contextual Bandits
An agent in a nonstationary contextual bandit problem should balance between
exploration and the exploitation of (periodic or structured) patterns present
in its previous experiences. Handcrafting an appropriate historical context is
an attractive alternative to transform a nonstationary problem into a
stationary problem that can be solved efficiently. However, even a carefully
designed historical context may introduce spurious relationships or lack a
convenient representation of crucial information. In order to address these
issues, we propose an approach that learns to represent the relevant context
for a decision based solely on the raw history of interactions between the
agent and the environment. This approach relies on a combination of features
extracted by recurrent neural networks with a contextual linear bandit
algorithm based on posterior sampling. Our experiments on a diverse selection
of contextual and noncontextual nonstationary problems show that our recurrent
approach consistently outperforms its feedforward counterpart, which requires
handcrafted historical contexts, while being more widely applicable than
conventional nonstationary bandit algorithms. Although it is very difficult to
provide theoretical performance guarantees for our new approach, we also prove
a novel regret bound for linear posterior sampling with measurement error that
may serve as a foundation for future theoretical work.