413 research outputs found

    Continuous coordination as a realistic scenario for lifelong learning

    Full text link
    Current deep reinforcement learning (RL) algorithms are still highly task-specific and lack the ability to generalize to new environments. Lifelong learning (LLL), however, aims at solving multiple tasks sequentially by efficiently transferring and using knowledge between tasks. Despite a surge of interest in lifelong RL in recent years, the lack of a realistic testbed makes robust evaluation of lifelong learning algorithms difficult. Multi-agent RL (MARL), on the other hand, can be seen as a natural scenario for lifelong RL due to its inherent non-stationarity, since the agents' policies change over time. In this thesis, we introduce a multi-agent lifelong learning testbed that supports both zero-shot and few-shot settings. Our setup is based on Hanabi, a partially observable, fully cooperative multi-agent game that has been shown to be challenging for zero-shot coordination. Its large strategy space makes it a desirable environment for lifelong RL tasks. We evaluate several recent MARL methods and benchmark state-of-the-art lifelong learning algorithms in limited-memory and limited-computation regimes to shed light on their strengths and weaknesses. This continual learning paradigm also provides a pragmatic way of going beyond centralized training, the most commonly used training protocol in MARL. We empirically show that agents trained in our setup are able to coordinate well with unknown agents, without the additional assumptions made in previous work. Key words: multi-agent reinforcement learning, lifelong learning

    A review of use of antibiotics in dentistry and recommendations for rational antibiotic usage by dentists.

    Get PDF
    Dentists commonly prescribe antibiotics to control and treat dental infections, but antibiotics are widely misused in both the medical and dental fields. Inappropriate use of antibiotics leads to increased treatment costs, an increased risk of adverse events related to the antibiotic used and, most importantly, the development and propagation of antimicrobial resistance. The definitive indications for antibiotic use in dentistry are limited and specific. This review discusses the principles and rationale behind antibiotic therapy in different fields of dentistry, with an emphasis on rational antibiotic use.

    Continuous Coordination As a Realistic Scenario for Lifelong Learning

    Full text link
    Current deep reinforcement learning (RL) algorithms are still highly task-specific and lack the ability to generalize to new environments. Lifelong learning (LLL), however, aims at solving multiple tasks sequentially by efficiently transferring and using knowledge between tasks. Despite a surge of interest in lifelong RL in recent years, the lack of a realistic testbed makes robust evaluation of LLL algorithms difficult. Multi-agent RL (MARL), on the other hand, can be seen as a natural scenario for lifelong RL due to its inherent non-stationarity, since the agents' policies change over time. In this work, we introduce a multi-agent lifelong learning testbed that supports both zero-shot and few-shot settings. Our setup is based on Hanabi, a partially observable, fully cooperative multi-agent game that has been shown to be challenging for zero-shot coordination. Its large strategy space makes it a desirable environment for lifelong RL tasks. We evaluate several recent MARL methods and benchmark state-of-the-art LLL algorithms in limited-memory and limited-computation regimes to shed light on their strengths and weaknesses. This continual learning paradigm also provides a pragmatic way of going beyond centralized training, the most commonly used training protocol in MARL. We empirically show that agents trained in our setup are able to coordinate well with unseen agents, without the additional assumptions made in previous work. The code and all pre-trained models are available at https://github.com/chandar-lab/Lifelong-Hanabi.
    Comment: 19 pages with supplementary materials. Added results for lifelong RL methods and some future work. Accepted to ICML 2021.
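
    The protocol this abstract describes can be made concrete with a minimal sketch, assuming a hypothetical interface: each pre-trained partner policy is one task, the learner trains against the partners sequentially, and zero-shot coordination is then measured against held-out partners with no further updates. The names Partner, Learner, and play_episode below are illustrative stand-ins, not the actual Lifelong-Hanabi API; see the linked repository for the real code.

```python
# Minimal sketch of a lifelong coordination loop (hypothetical API,
# not the Lifelong-Hanabi interface).
import random

class Partner:
    """A fixed, pre-trained partner policy; each partner is one task."""
    def __init__(self, seed: int):
        self.rng = random.Random(seed)

    def act(self, legal_moves):
        return self.rng.choice(legal_moves)

class Learner:
    """Stand-in for the lifelong learner being benchmarked."""
    def act(self, legal_moves):
        return legal_moves[0]  # placeholder policy

    def update(self, episode_score):
        pass  # e.g. a replay- or regularization-based continual-RL update

def play_episode(learner, partner):
    """Stand-in for one two-player Hanabi episode; returns the team score."""
    legal_moves = list(range(10))
    learner.act(legal_moves)
    partner.act(legal_moves)
    return random.uniform(0, 25)  # Hanabi team scores lie in [0, 25]

train_partners = [Partner(seed=s) for s in range(5)]        # the task sequence
test_partners = [Partner(seed=s) for s in range(100, 105)]  # unseen partners

learner = Learner()
for partner in train_partners:      # tasks arrive one after another
    for _ in range(1000):           # limited per-task compute budget
        learner.update(play_episode(learner, partner))

# Zero-shot evaluation: play with held-out partners, no learning updates.
scores = [play_episode(learner, p) for p in test_partners]
print("mean zero-shot score:", sum(scores) / len(scores))
```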

    A novel scheduling algorithm to maximize the D2D spatial reuse in LTE networks

    Get PDF
    In order to offload base station (BS) traffic and to enhance spectrum efficiency, operators can activate many Device-to-Device (D2D) pairs or links in LTE networks. This increases the overall spectral efficiency because the same Resource Blocks (RBs) are used across cellular UEs (CUEs), i.e., UEs connected to the BS for both C-plane and D-plane communication, and D2D links, i.e., UEs connected to the BS only for C-plane communication. However, D2D communications can cause significant interference problems because the same RBs are shared. In our work, we address this problem by proposing a novel scheduling algorithm, Efficient Scheduling and Power control Algorithm for D2Ds (ESPAD), which reuses the same RBs and tries to maximize the overall network throughput without degrading CUE throughput. The ESPAD algorithm also ensures that the Signal to Interference plus Noise Ratio (SINR) of each D2D link is maintained above a predefined threshold. Together, these properties ensure that the CUEs do not experience very high interference from the D2D links. We observe that even when SINRdrop (the maximum permissible drop in CUE SINR) is as high as 10 dB, CUE throughput decreases only slightly (by 3.78%). We also compare our algorithm against other algorithms and show that D2D throughput improves drastically without undermining CUE throughput. A sketch of the admission rule appears below.
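
    The admission rule at the heart of this kind of scheduler can be illustrated with a short sketch: a D2D pair is allowed to reuse a CUE's RB only if the D2D link's SINR stays above a minimum threshold and the CUE's SINR drops by no more than SINRdrop. All names, power values, and thresholds below are illustrative assumptions, not taken from the ESPAD paper.

```python
# Illustrative RB-reuse admission check under the two SINR constraints
# described in the abstract. Power values are made up for the example.
import math

NOISE_MW = 1e-10       # thermal noise power (mW); illustrative value
SINR_MIN_DB = 5.0      # minimum acceptable D2D-link SINR (dB)
SINR_DROP_DB = 10.0    # maximum permissible drop in CUE SINR (dB)

def sinr_db(signal_mw, interference_mw):
    """SINR in dB for given received signal and interference powers."""
    return 10.0 * math.log10(signal_mw / (interference_mw + NOISE_MW))

def admit_d2d_on_rb(cue, d2d):
    """True if the D2D pair may reuse the CUE's RB under both constraints."""
    cue_sinr_before = sinr_db(cue["rx_power_mw"], 0.0)
    cue_sinr_after = sinr_db(cue["rx_power_mw"], d2d["interference_to_cue_mw"])
    d2d_sinr = sinr_db(d2d["rx_power_mw"], cue["interference_to_d2d_mw"])
    cue_drop_ok = (cue_sinr_before - cue_sinr_after) <= SINR_DROP_DB
    d2d_sinr_ok = d2d_sinr >= SINR_MIN_DB
    return cue_drop_ok and d2d_sinr_ok

# Example: one CUE and one candidate D2D pair sharing the same RB.
cue = {"rx_power_mw": 1e-6, "interference_to_d2d_mw": 2e-9}
d2d = {"rx_power_mw": 5e-7, "interference_to_cue_mw": 5e-10}
print(admit_d2d_on_rb(cue, d2d))  # True: both SINR constraints hold
```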