Search CORE

707 research outputs found

A PAC-Bayesian bound for Lifelong Learning

Author: Jebara Tony
Lampert Christoph
Pentina Anastasia
Xing Eric
Publication venue
Publication date: 01/01/2014
Field of study

Transfer learning has received a lot of attention in the machine learning community over the last years, and several effective algorithms have been developed. However, relatively little is known about their theoretical properties, especially in the setting of lifelong learning, where the goal is to transfer information to tasks for which no data have been observed so far. In this work we study lifelong learning from a theoretical perspective. Our main result is a PAC-Bayesian generalization bound that offers a unified view on existing paradigms for transfer learning, such as the transfer of parameters or the transfer of low-dimensional representations. We also use the bound to derive two principled lifelong learning algorithms, and we show that these yield results comparable with existing methods.Comment: to appear at ICML 201

arXiv.org e-Print Archive

CiteSeerX

IST Austria: PubRep (Institute of Science and Technology)

Continual Reinforcement Learning in 3D Non-stationary Environments

Author: Culurciello Eugenio
Desai Karan
Lomonaco Vincenzo
Maltoni Davide
Publication venue
Publication date: 21/04/2020
Field of study

High-dimensional always-changing environments constitute a hard challenge for current reinforcement learning techniques. Artificial agents, nowadays, are often trained off-line in very static and controlled conditions in simulation such that training observations can be thought as sampled i.i.d. from the entire observations space. However, in real world settings, the environment is often non-stationary and subject to unpredictable, frequent changes. In this paper we propose and openly release CRLMaze, a new benchmark for learning continually through reinforcement in a complex 3D non-stationary task based on ViZDoom and subject to several environmental changes. Then, we introduce an end-to-end model-free continual reinforcement learning strategy showing competitive results with respect to four different baselines and not requiring any access to additional supervised signals, previously encountered environmental conditions or observations.Comment: Accepted in the CLVision Workshop at CVPR2020: 13 pages, 4 figures, 5 table

arXiv.org e-Print Archive

Crossref

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Learning Independent Causal Mechanisms

Author: Kilbertus Niki
Parascandolo Giambattista
Rojas-Carulla Mateo
Schölkopf Bernhard
Publication venue
Publication date: 01/01/2018
Field of study

Statistical learning relies upon data sampled from a distribution, and we usually do not care what actually generated it in the first place. From the point of view of causal modeling, the structure of each distribution is induced by physical mechanisms that give rise to dependences between observables. Mechanisms, however, can be meaningful autonomous modules of generative models that make sense beyond a particular entailed data distribution, lending themselves to transfer between problems. We develop an algorithm to recover a set of independent (inverse) mechanisms from a set of transformed data points. The approach is unsupervised and based on a set of experts that compete for data generated by the mechanisms, driving specialization. We analyze the proposed method in a series of experiments on image data. Each expert learns to map a subset of the transformed data back to a reference distribution. The learned mechanisms generalize to novel domains. We discuss implications for transfer learning and links to recent trends in generative modeling.Comment: ICML 201

arXiv.org e-Print Archive

MPG.PuRe

Domain Generalization by Marginal Transfer Learning

Author: Blanchard Gilles
Deshmukh Aniket Anand
Dogan Urun
Lee Gyemin
Scott Clayton
Publication venue
Publication date: 17/04/2020
Field of study

In the problem of domain generalization (DG), there are labeled training data sets from several related prediction problems, and the goal is to make accurate predictions on future unlabeled data sets that are not known to the learner. This problem arises in several applications where data distributions fluctuate because of environmental, technical, or other sources of variation. We introduce a formal framework for DG, and argue that it can be viewed as a kind of supervised learning problem by augmenting the original feature space with the marginal distribution of feature vectors. While our framework has several connections to conventional analysis of supervised learning algorithms, several unique aspects of DG require new methods of analysis. This work lays the learning theoretic foundations of domain generalization, building on our earlier conference paper where the problem of DG was introduced Blanchard et al., 2011. We present two formal models of data generation, corresponding notions of risk, and distribution-free generalization error analysis. By focusing our attention on kernel methods, we also provide more quantitative results and a universally consistent algorithm. An efficient implementation is provided for this algorithm, which is experimentally compared to a pooling strategy on one synthetic and three real-world data sets

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

HAL Descartes