413 research outputs found

    Continuous coordination as a realistic scenario for lifelong learning

    Full text link
    Current deep reinforcement learning (RL) algorithms are still highly task-specific and lack the ability to generalize to new environments. Lifelong learning (LLL), however, aims at solving multiple tasks sequentially by efficiently transferring and using knowledge between tasks. Despite a surge of interest in lifelong RL in recent years, the lack of a realistic testbed makes robust evaluation of lifelong learning algorithms difficult. Multi-agent RL (MARL), on the other hand, can be seen as a natural scenario for lifelong RL due to its inherent non-stationarity, since the agents' policies change over time. In this thesis, we introduce a multi-agent lifelong learning testbed that supports both zero-shot and few-shot settings. Our setup is based on Hanabi, a partially observable, fully cooperative multi-agent game that has been shown to be challenging for zero-shot coordination. Its large strategy space makes it a desirable environment for lifelong RL tasks. We evaluate several recent MARL methods and benchmark state-of-the-art lifelong learning algorithms in limited-memory and limited-computation regimes to shed light on their strengths and weaknesses. This continual learning paradigm also provides a pragmatic way of going beyond centralized training, the most commonly used training protocol in MARL. We empirically show that agents trained in our setup are able to coordinate well with unknown agents, without the additional assumptions made in previous work. Key words: multi-agent reinforcement learning, lifelong learning

    A review of use of antibiotics in dentistry and recommendations for rational antibiotic usage by dentists.

    Get PDF
    Dentists commonly prescribe antibiotics to control and treat dental infections, but antibiotics are widely misused in both the medical and dental fields. Inappropriate use of antibiotics leads to increased treatment costs, an increased risk of adverse events related to the antibiotic used and, most importantly, the development and propagation of antimicrobial resistance. The definitive indications for antibiotic use in dentistry are limited and specific. This review discusses the principles and rationale behind antibiotic therapy in different fields of dentistry, with an emphasis on rational antibiotic use.

    Continuous Coordination As a Realistic Scenario for Lifelong Learning

    Full text link
    Current deep reinforcement learning (RL) algorithms are still highly task-specific and lack the ability to generalize to new environments. Lifelong learning (LLL), however, aims at solving multiple tasks sequentially by efficiently transferring and using knowledge between tasks. Despite a surge of interest in lifelong RL in recent years, the lack of a realistic testbed makes robust evaluation of LLL algorithms difficult. Multi-agent RL (MARL), on the other hand, can be seen as a natural scenario for lifelong RL due to its inherent non-stationarity, since the agents' policies change over time. In this work, we introduce a multi-agent lifelong learning testbed that supports both zero-shot and few-shot settings. Our setup is based on Hanabi, a partially observable, fully cooperative multi-agent game that has been shown to be challenging for zero-shot coordination. Its large strategy space makes it a desirable environment for lifelong RL tasks. We evaluate several recent MARL methods and benchmark state-of-the-art LLL algorithms in limited-memory and limited-computation regimes to shed light on their strengths and weaknesses. This continual learning paradigm also provides a pragmatic way of going beyond centralized training, the most commonly used training protocol in MARL. We empirically show that agents trained in our setup are able to coordinate well with unseen agents, without the additional assumptions made in previous work. The code and all pre-trained models are available at https://github.com/chandar-lab/Lifelong-Hanabi.
    Comment: 19 pages with supplementary materials. Added results for lifelong RL methods and some future work. Accepted to ICML 2021.
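
    The protocol this abstract describes can be made concrete with a minimal sketch, assuming a hypothetical interface: each pre-trained partner policy is one task, the learner trains against the partners sequentially, and zero-shot coordination is then measured against held-out partners with no further updates. The names Partner, Learner, and play_episode below are illustrative stand-ins, not the actual Lifelong-Hanabi API; see the linked repository for the real code.

```python
# Minimal sketch of a lifelong coordination loop (hypothetical API,
# not the Lifelong-Hanabi interface).
import random

class Partner:
    """A fixed, pre-trained partner policy; each partner is one task."""
    def __init__(self, seed: int):
        self.rng = random.Random(seed)

    def act(self, legal_moves):
        return self.rng.choice(legal_moves)

class Learner:
    """Stand-in for the lifelong learner being benchmarked."""
    def act(self, legal_moves):
        return legal_moves[0]  # placeholder policy

    def update(self, episode_score):
        pass  # e.g. a replay- or regularization-based continual-RL update

def play_episode(learner, partner):
    """Stand-in for one two-player Hanabi episode; returns the team score."""
    legal_moves = list(range(10))
    learner.act(legal_moves)
    partner.act(legal_moves)
    return random.uniform(0, 25)  # Hanabi team scores lie in [0, 25]

train_partners = [Partner(seed=s) for s in range(5)]        # the task sequence
test_partners = [Partner(seed=s) for s in range(100, 105)]  # unseen partners

learner = Learner()
for partner in train_partners:      # tasks arrive one after another
    for _ in range(1000):           # limited per-task compute budget
        learner.update(play_episode(learner, partner))

# Zero-shot evaluation: play with held-out partners, no learning updates.
scores = [play_episode(learner, p) for p in test_partners]
print("mean zero-shot score:", sum(scores) / len(scores))
```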

    A novel scheduling algorithm to maximize the D2D spatial reuse in LTE networks

    Get PDF
    In order to offload base station (BS) traffic and to enhance spectrum efficiency, operators can activate many Device-to-Device (D2D) pairs or links in LTE networks. This increases the overall spectral efficiency because the same Resource Blocks (RBs) are used across cellular UEs (CUEs), i.e., UEs connected to the BS for both C-plane and D-plane communication, and D2D links, i.e., UEs connected to the BS only for C-plane communication. However, D2D communications can cause significant interference problems because the same RBs are shared. In our work, we address this problem by proposing a novel scheduling algorithm, Efficient Scheduling and Power control Algorithm for D2Ds (ESPAD), which reuses the same RBs and tries to maximize the overall network throughput without degrading CUE throughput. The ESPAD algorithm also ensures that the Signal to Interference plus Noise Ratio (SINR) of each D2D link is maintained above a predefined threshold. Together, these properties ensure that the CUEs do not experience very high interference from the D2D links. We observe that even when SINRdrop (the maximum permissible drop in CUE SINR) is as high as 10 dB, CUE throughput decreases only slightly (by 3.78%). We also compare our algorithm against other algorithms and show that D2D throughput improves drastically without undermining CUE throughput. A sketch of the admission rule appears below.
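
    The admission rule at the heart of this kind of scheduler can be illustrated with a short sketch: a D2D pair is allowed to reuse a CUE's RB only if the D2D link's SINR stays above a minimum threshold and the CUE's SINR drops by no more than SINRdrop. All names, power values, and thresholds below are illustrative assumptions, not taken from the ESPAD paper.

```python
# Illustrative RB-reuse admission check under the two SINR constraints
# described in the abstract. Power values are made up for the example.
import math

NOISE_MW = 1e-10       # thermal noise power (mW); illustrative value
SINR_MIN_DB = 5.0      # minimum acceptable D2D-link SINR (dB)
SINR_DROP_DB = 10.0    # maximum permissible drop in CUE SINR (dB)

def sinr_db(signal_mw, interference_mw):
    """SINR in dB for given received signal and interference powers."""
    return 10.0 * math.log10(signal_mw / (interference_mw + NOISE_MW))

def admit_d2d_on_rb(cue, d2d):
    """True if the D2D pair may reuse the CUE's RB under both constraints."""
    cue_sinr_before = sinr_db(cue["rx_power_mw"], 0.0)
    cue_sinr_after = sinr_db(cue["rx_power_mw"], d2d["interference_to_cue_mw"])
    d2d_sinr = sinr_db(d2d["rx_power_mw"], cue["interference_to_d2d_mw"])
    cue_drop_ok = (cue_sinr_before - cue_sinr_after) <= SINR_DROP_DB
    d2d_sinr_ok = d2d_sinr >= SINR_MIN_DB
    return cue_drop_ok and d2d_sinr_ok

# Example: one CUE and one candidate D2D pair sharing the same RB.
cue = {"rx_power_mw": 1e-6, "interference_to_d2d_mw": 2e-9}
d2d = {"rx_power_mw": 5e-7, "interference_to_cue_mw": 5e-10}
print(admit_d2d_on_rb(cue, d2d))  # True: both SINR constraints hold
```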