
    Reinforcement learning algorithms that assimilate and accommodate skills with multiple tasks

    Children are capable of acquiring a large repertoire of motor skills and of efficiently adapting them to novel conditions. In a previous work we proposed a hierarchical modular reinforcement learning model (RANK) that can learn multiple motor skills in continuous action and state spaces. The model is an extension of the mixture-of-experts model, suitably adapted to work with reinforcement learning. In particular, the model uses a high-level gating network that assigns responsibilities for acting and for learning to a set of low-level expert networks. The model was also designed to exploit the Piagetian mechanisms of assimilation and accommodation to support learning of multiple tasks. This paper proposes a new model (TERL - Transfer Expert Reinforcement Learning) that substantially improves on RANK. The key difference with respect to the previous model is the decoupling of the mechanisms that generate the experts' responsibility signals for learning and for control. This makes it possible to satisfy different constraints for functioning and for learning. We test both the TERL and the RANK models with a two-DOF dynamic arm engaged in solving multiple reaching tasks, and compare the two with a simple, flat reinforcement learning model. The results show that both models are capable of exploiting assimilation and accommodation processes to transfer knowledge between similar tasks while avoiding catastrophic interference. Furthermore, the TERL model is shown to significantly outperform the RANK model thanks to its faster and more stable specialization of experts.
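    The central idea of the abstract above, two separate gating signals over the same pool of experts (one set of responsibilities for control, another for learning), can be sketched minimally as follows. This is an illustrative toy, not the published TERL architecture: the linear experts, the weight names, and the update rule are all assumptions made for the sketch.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    N_EXPERTS, STATE_DIM, ACTION_DIM = 3, 4, 2

    # Hypothetical linear experts and two independent gating networks
    # (names are illustrative, not from the paper).
    experts = [rng.normal(scale=0.1, size=(ACTION_DIM, STATE_DIM))
               for _ in range(N_EXPERTS)]
    W_gate_control = rng.normal(scale=0.1, size=(N_EXPERTS, STATE_DIM))
    W_gate_learn = rng.normal(scale=0.1, size=(N_EXPERTS, STATE_DIM))

    def softmax(x):
        z = np.exp(x - x.max())
        return z / z.sum()

    def act(state):
        # Control responsibilities decide how expert outputs are mixed
        # into the final action.
        resp_control = softmax(W_gate_control @ state)
        action = sum(r * (E @ state) for r, E in zip(resp_control, experts))
        return action, resp_control

    def learn(state, td_error, lr=0.01):
        # Learning responsibilities come from a *separate* gate, so the
        # credit an expert receives for learning need not match how much
        # it contributed to control -- the decoupling the abstract names.
        resp_learn = softmax(W_gate_learn @ state)
        for r, E in zip(resp_learn, experts):
            E += lr * r * td_error * np.outer(np.ones(ACTION_DIM), state)

    state = rng.normal(size=STATE_DIM)
    action, resp = act(state)
    learn(state, td_error=0.5)
    ```

    With a single shared gate (as in RANK), an expert that dominates control also absorbs most of the learning signal; splitting the two gates lets the model assign learning credit by task similarity instead, which is what supports assimilation without catastrophic interference.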

    Calliffusion: Chinese Calligraphy Generation and Style Transfer with Diffusion Modeling

    In this paper, we propose Calliffusion, a system for generating high-quality Chinese calligraphy using diffusion models. Our model architecture is based on DDPM (Denoising Diffusion Probabilistic Models), and it is capable of generating common characters in five different scripts and mimicking the styles of famous calligraphers. Experiments demonstrate that our model can generate calligraphy that is difficult to distinguish from real artworks and that our controls for characters, scripts, and styles are effective. Moreover, we demonstrate one-shot transfer learning, using LoRA (Low-Rank Adaptation) to transfer Chinese calligraphy art styles to unseen characters and even out-of-domain symbols such as English letters and digits. (Comment: 5 pages, International Conference on Computational Creativity, ICC)
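    The LoRA mechanism mentioned above adapts a frozen pretrained weight W by adding a trainable low-rank product B @ A, which is what makes one-shot style transfer cheap. A minimal numpy sketch of that parameterization (dimensions, rank, and initialization scale are illustrative assumptions, not values from the paper):

    ```python
    import numpy as np

    rng = np.random.default_rng(1)
    d_out, d_in, rank = 8, 8, 2

    W = rng.normal(size=(d_out, d_in))             # frozen pretrained weight
    A = rng.normal(scale=0.01, size=(rank, d_in))  # trainable down-projection
    B = np.zeros((d_out, rank))                    # trainable up-projection, zero-init

    def forward(x):
        # LoRA forward pass: frozen path plus the low-rank update B @ A.
        # Only A and B (rank * (d_in + d_out) parameters) are trained,
        # instead of all d_out * d_in entries of W.
        return W @ x + B @ (A @ x)

    x = rng.normal(size=d_in)
    y = forward(x)
    ```

    Because B is initialized to zero, the adapted model starts out exactly equal to the pretrained one, and fine-tuning on even a single style example only moves it within a rank-2 subspace of weight updates.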

    Frameworks for Learning from Multiple Tasks

    In this thesis we study different machine learning frameworks for learning multiple tasks together. Depending on the motivations and goals of each learning framework, we investigate their computational and statistical properties from both a theoretical and an experimental standpoint. The first problem we tackle is low-rank matrix learning, a popular model assumption in multi-task learning (MTL). Trace norm regularization is a widely used approach for learning such models. A standard optimization strategy is based on formulating the problem as one of low-rank matrix factorization, which, however, leads to a non-convex problem. We show that it is possible to characterize the critical points of the non-convex problem. This allows us to provide an efficient criterion to determine whether a critical point is also a global minimizer. We extend this analysis to the case in which the objective is nonsmooth. The goal of the second problem we worked on is to infer a learning algorithm that works well on a class of tasks sampled from an unknown meta-distribution. As an extension of MTL, our goal here is to train on a set of tasks and perform well on future, unseen tasks. We consider a scenario in which the tasks are presented sequentially, without keeping any of their information in memory. We study the statistical properties of the proposed algorithm and prove non-asymptotic bounds for the excess transfer risk. Lastly, a common practice in machine learning is to concatenate many different datasets and apply a learning algorithm to this new dataset. However, training on a collection of heterogeneous datasets can cause issues due to the presence of bias. In this thesis we derive an MTL framework that can jointly learn subcategories within a dataset and undo the inherent bias existing within each of them.
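    The trace norm regularization named in the abstract above is typically handled with proximal methods, whose key step is soft-thresholding the singular values of the task-weight matrix. A minimal sketch of that proximal operator (the matrix size and threshold are illustrative, and this is a generic textbook operator, not the thesis's specific algorithm):

    ```python
    import numpy as np

    def prox_trace_norm(W, lam):
        # Proximal operator of lam * ||W||_* (the trace/nuclear norm):
        # shrink each singular value toward zero by lam, clipping at zero.
        # This is what drives the learned task matrix toward low rank.
        U, s, Vt = np.linalg.svd(W, full_matrices=False)
        return U @ np.diag(np.maximum(s - lam, 0.0)) @ Vt

    rng = np.random.default_rng(2)
    W = rng.normal(size=(5, 4))        # rows ~ tasks, columns ~ features
    W_shrunk = prox_trace_norm(W, lam=1.0)
    ```

    The non-convexity discussed in the abstract arises when one instead writes W = U V^T with fixed inner dimension and optimizes U and V directly; the thesis's contribution is a criterion for telling which critical points of that factorized problem are global minimizers.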