
    Class-Incremental Learning with Repetition

    Real-world data streams naturally include the repetition of previous concepts. From a Continual Learning (CL) perspective, repetition is a property of the environment and, unlike replay, cannot be controlled by the agent. Nowadays, the Class-Incremental (CI) scenario represents the leading test-bed for assessing and comparing CL strategies. This scenario type is very easy to use, but it never allows revisiting previously seen classes, thus completely neglecting the role of repetition. We focus on the family of Class-Incremental with Repetition (CIR) scenarios, where repetition is embedded in the definition of the stream. We propose two stochastic stream generators that produce a wide range of CIR streams starting from a single dataset and a few interpretable control parameters. We conduct the first comprehensive evaluation of repetition in CL by studying the behavior of existing CL strategies under different CIR streams. We then present a novel replay strategy that exploits repetition and counteracts the natural imbalance present in the stream. On both CIFAR100 and TinyImageNet, our strategy outperforms other replay approaches, which are not designed for environments with repetition. Comment: Accepted to the 2nd Conference on Lifelong Learning Agents (CoLLAs), 2023. 19 pages.
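
    The abstract describes the generators only at a high level; as a rough illustration, here is a minimal sketch of a stochastic stream generator with repetition. The parameter names (classes_per_exp, p_repetition) and the slot-filling rule are hypothetical, not the paper's actual interface.

        import random

        def generate_cir_stream(class_ids, n_experiences, classes_per_exp,
                                p_repetition, seed=0):
            # Hypothetical control parameter: p_repetition is the chance that
            # a slot in an experience repeats an already-seen class rather
            # than introducing a new one.
            rng = random.Random(seed)
            unseen = list(class_ids)
            rng.shuffle(unseen)
            seen, stream = [], []
            for _ in range(n_experiences):
                exp = set()
                while len(exp) < classes_per_exp:
                    if seen and (not unseen or rng.random() < p_repetition):
                        exp.add(rng.choice(seen))   # repeat a seen class
                    else:
                        exp.add(unseen.pop())       # introduce a new class
                seen.extend(c for c in exp if c not in seen)
                stream.append(sorted(exp))
            return stream

        # e.g. 100 CIFAR100 classes, 20 experiences of 5 classes, 30% repetition
        print(generate_cir_stream(range(100), 20, 5, 0.3))

    Sweeping p_repetition from 0 (the classic repetition-free CI setting) toward 1 yields increasingly recurrent streams, which is the kind of control the paper's interpretable parameters provide.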

    Continuous Coordination As a Realistic Scenario for Lifelong Learning

    Current deep reinforcement learning (RL) algorithms are still highly task-specific and lack the ability to generalize to new environments. Lifelong learning (LLL), however, aims at solving multiple tasks sequentially by efficiently transferring and using knowledge between tasks. Despite a surge of interest in lifelong RL in recent years, the lack of a realistic testbed makes robust evaluation of LLL algorithms difficult. Multi-agent RL (MARL), on the other hand, can be seen as a natural scenario for lifelong RL due to its inherent non-stationarity, since the agents' policies change over time. In this work, we introduce a multi-agent lifelong learning testbed that supports both zero-shot and few-shot settings. Our setup is based on Hanabi -- a partially observable, fully cooperative multi-agent game that has been shown to be challenging for zero-shot coordination. Its large strategy space makes it a desirable environment for lifelong RL tasks. We evaluate several recent MARL methods, and benchmark state-of-the-art LLL algorithms in limited memory and computation regimes to shed light on their strengths and weaknesses. This continual learning paradigm also provides us with a pragmatic way of going beyond centralized training, which is the most commonly used training protocol in MARL. We empirically show that the agents trained in our setup are able to coordinate well with unseen agents, without any additional assumptions made by previous works. The code and all pre-trained models are available at https://github.com/chandar-lab/Lifelong-Hanabi. Comment: 19 pages with supplementary materials. Added results for lifelong RL methods and some future work. Accepted to ICML 2021.
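
    The continual evaluation the abstract describes amounts to training against a sequence of pre-trained partners and repeatedly cross-playing with all of them. Below is a minimal sketch of that loop; train_one_task and cross_play_score are hypothetical stand-ins, not functions from the Lifelong-Hanabi repository.

        def lifelong_hanabi_eval(learner, partners, train_one_task, cross_play_score):
            # score[i][j]: learner's cross-play score with partner j after
            # finishing task i. Drops below the diagonal reveal forgetting;
            # entries with j > i measure zero-shot coordination with
            # partners the learner has not trained with yet.
            score = []
            for partner in partners:
                train_one_task(learner, partner)
                score.append([cross_play_score(learner, p) for p in partners])
            return score

    Holding out some partners from the training sequence entirely turns the same loop into the unseen-agent coordination test reported in the paper.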

    Does Continual Learning = Catastrophic Forgetting?

    Continual learning is known for suffering from catastrophic forgetting, a phenomenon in which earlier learned concepts are forgotten as more recent samples are learned. In this work, we challenge the assumption that continual learning is inevitably associated with catastrophic forgetting by presenting a set of tasks that surprisingly do not suffer from catastrophic forgetting when learned continually. We provide evidence that these reconstruction-type tasks exhibit positive forward transfer and that single-view 3D shape reconstruction improves the performance on learned and novel categories over time. We provide a novel analysis of knowledge-transfer ability by looking at the output distribution shift across sequential learning tasks. Finally, we show that the robustness of these tasks suggests they could serve as a proxy representation-learning task for continual classification. The codebase, dataset, and pre-trained models released with this article can be found at https://github.com/rehg-lab/CLRec.
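
    Forward transfer and forgetting of the kind discussed here are usually read off a task-by-task performance matrix; the sketch below uses the standard continual-learning definitions (in the style of Lopez-Paz and Ranzato's GEM metrics), which is an assumption, not necessarily this paper's exact protocol.

        import numpy as np

        def forgetting_and_fwt(R, b):
            # R[i, j]: performance on task j after training on tasks 0..i (T x T).
            # b[j]: performance of a randomly initialized model on task j.
            T = R.shape[0]
            # Forgetting: best-ever score on a task minus its final score.
            forgetting = np.mean([R[:T - 1, j].max() - R[T - 1, j]
                                  for j in range(T - 1)])
            # Forward transfer: score on task j just before training on it,
            # relative to the random baseline.
            fwt = np.mean([R[j - 1, j] - b[j] for j in range(1, T)])
            return forgetting, fwt

    Positive forward transfer together with near-zero forgetting is the signature the abstract reports for the reconstruction tasks.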

    Development and research of a neural network alternate incremental learning algorithm

    The paper demonstrates the relevance of developing incremental methods and algorithms for training neural networks, and surveys the main families of incremental learning techniques. The feasibility of using the extreme learning machine for incremental learning is assessed. Experiments show that the extreme learning machine can be used for incremental learning, but as the number of training examples grows, the network becomes unsuitable for further training. To solve this problem, an alternate incremental learning algorithm for a neural network is proposed, based on alternating the extreme learning machine, which adjusts only the weights of the network's output layer (the operation state), with error backpropagation (deep learning), which adjusts all of the network's weights (the sleep state). It is assumed that during the operation state the network produces outputs or learns new tasks, while in the sleep state it optimizes its weight coefficients. A distinctive feature of the proposed algorithm is its ability to adapt in real time to changing external conditions during the operation stage. The effectiveness of the proposed algorithm is demonstrated on a function-approximation task: approximation results after each step of the algorithm are presented, and the mean squared error obtained when using the extreme learning machine alone for incremental learning is compared with that of the developed alternate incremental learning algorithm.
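
    As a loose illustration of the alternation described in the abstract, here is a minimal sketch for a single-hidden-layer regressor; the architecture, tanh activation, and learning-rate choices are assumptions, not the authors' implementation.

        import numpy as np

        rng = np.random.default_rng(0)

        # One hidden layer: W_in and b stay fixed during operation,
        # W_out is the only layer the ELM update touches.
        n_in, n_hidden = 1, 50
        W_in = rng.normal(size=(n_in, n_hidden))
        b = rng.normal(size=n_hidden)
        W_out = np.zeros((n_hidden, 1))

        def hidden(X):
            return np.tanh(X @ W_in + b)

        def operation_step(X, y):
            # Operation state: ELM-style update, fitting only the output
            # layer by least squares (fast enough to run while serving).
            global W_out
            W_out = np.linalg.pinv(hidden(X)) @ y

        def sleep_step(X, y, lr=1e-3, epochs=200):
            # Sleep state: backpropagation adjusts all of the weights.
            global W_in, b, W_out
            for _ in range(epochs):
                H = hidden(X)
                err = H @ W_out - y                # prediction error
                dH = (err @ W_out.T) * (1 - H**2)  # tanh derivative
                W_out -= lr * H.T @ err
                W_in -= lr * X.T @ dH
                b -= lr * dH.sum(axis=0)

        # Approximation example: learn sin(x), then refine offline.
        X = np.linspace(-3, 3, 200).reshape(-1, 1)
        y = np.sin(X)
        operation_step(X, y)   # quick ELM fit of the output layer
        sleep_step(X, y)       # full-network refinement in the sleep state
        print("MSE:", np.mean((hidden(X) @ W_out - y) ** 2))

    The split mirrors the paper's two states: cheap output-layer corrections keep the network usable online, while the periodic sleep phase re-optimizes all weights.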