7 research outputs found

    Many-agent Reinforcement Learning

    Multi-agent reinforcement learning (RL) addresses the problem of how each agent should behave optimally in a stochastic environment in which multiple agents are learning simultaneously. It is an interdisciplinary domain with a long history that lies at the intersection of psychology, control theory, game theory, reinforcement learning, and deep learning. Following the remarkable success of the AlphaGo series in single-agent RL, 2019 was a booming year that witnessed significant advances in multi-agent RL techniques; impressive breakthroughs have been made in developing AIs that outperform humans on many challenging tasks, especially multi-player video games. Nonetheless, one of the key challenges for multi-agent RL techniques is scalability; it is still non-trivial to design efficient learning algorithms that can solve tasks involving far more than two agents ($N \gg 2$), which I call many-agent reinforcement learning (MARL) problems. (I use the term "MARL" to denote multi-agent reinforcement learning with a particular focus on cases with many agents; otherwise, it is denoted as "multi-agent RL" by default.) In this thesis, I contribute to tackling MARL problems from four aspects. Firstly, I offer a self-contained overview of multi-agent RL techniques from a game-theoretical perspective. This overview fills a research gap: most existing work either fails to cover the advances made since 2010 or does not pay adequate attention to game theory, which I believe is the cornerstone of solving many-agent learning problems. Secondly, I develop a tractable policy evaluation algorithm, $\alpha^\alpha$-Rank, for many-agent systems. The critical advantage of $\alpha^\alpha$-Rank is that it can compute the solution concept of $\alpha$-Rank tractably in multi-player general-sum games with no need to store the entire pay-off matrix. 
This is in contrast to classic solution concepts such as Nash equilibrium, whose computation is known to be PPAD-hard even in two-player cases. $\alpha^\alpha$-Rank allows us, for the first time, to practically conduct large-scale multi-agent evaluations. Thirdly, I introduce a scalable policy learning algorithm, mean-field MARL, for many-agent systems. The mean-field MARL method takes advantage of the mean-field approximation from physics, and it is the first provably convergent algorithm that attempts to break the curse of dimensionality for MARL tasks. With the proposed algorithm, I report the first result of solving the Ising model and multi-agent battle games through a MARL approach. Fourthly, I investigate the many-agent learning problem in open-ended meta-games (i.e., the game of a game in the policy space). Specifically, I focus on modelling behavioural diversity in meta-games and on developing algorithms that are guaranteed to enlarge diversity during training. The proposed metric, based on determinantal point processes, serves as the first mathematically rigorous definition of diversity. Importantly, the diversity-aware learning algorithms beat the existing state-of-the-art game solvers in terms of exploitability by a large margin. On top of the algorithmic developments, I also contribute two real-world applications of MARL techniques. Specifically, I demonstrate the great potential of applying MARL to study emergent population dynamics in nature and to model diverse and realistic interactions in autonomous driving. Both applications embody the prospect that MARL techniques could achieve a large impact in the real physical world, beyond video games.

    LIPIcs, Volume 251, ITCS 2023, Complete Volume

    LIPIcs, Volume 251, ITCS 2023, Complete Volume

    The Evolution of Galaxies and Their Environment

    The Third Teton Summer School on Astrophysics discussed the formation of galaxies, star formation in galaxies, galaxies and quasars at high redshift, and the intergalactic and intercluster medium and cooling flows. Observational and theoretical research on these topics was presented at the meeting, and summaries of the contributed papers are included in this volume.

    Global population trajectories, life history strategies and vulnerability to fishing of scombrid species : implications for conservation and management

    [Abstract] Fishing is the primary threat to marine species and ecosystems, but the details of the extent of overfishing remain fragmentary. Here, I provide new insights into the global impacts of fishing on scombrids, which include 51 species of tunas, Spanish mackerels, bonitos and mackerels, and advance our ability to identify, a priori, the characteristics of species that render them most vulnerable to overfishing. First, through a global meta-analysis of adult biomass trends, I show that scombrid populations have declined, on average, by 60% over the last half century. The decline in the total adult biomass is lower (52%) as it is buffered by a few larger sustainably fished populations. Second, I identify major gaps in biological knowledge and prioritize life history research needs, especially for the coastal scombrids. Then, I examine the diversity in their life histories, and reveal that most life history variation in scombrids can be simplified to three dimensions (governed by size, speed of life and reproductive schedule). Finally, I show that those scombrid populations with the slowest life histories have experienced the largest declines in biomass and have a higher probability of being overfished. The speed-of-life traits, growth rate and longevity, are the best life history indicators of vulnerability to fishing. My thesis can be considered a case study in the importance of accounting for the varying life history strategies of species when planning conservation and management strategies.[Resumen] Fishing is the main threat to marine species and ecosystems; however, the scale and extent of these impacts remain uncertain. This thesis provides new knowledge on the global impacts of fishing on scombrid species (51 species of tunas, Spanish mackerels, bonitos and mackerels), and advances our ability to identify a priori which species are most vulnerable to overfishing. 
First, a global meta-analysis of trends in adult biomass shows that scombrid populations have declined, on average, by 60% over the last half century. The decline in total adult biomass is smaller (52%), as it is buffered by the most abundant and best-managed populations. Second, gaps and missing biological data are identified for the 51 scombrid species, and research priorities are set for the species that need them most. Third, the diversity of life histories in scombrids is examined, and most of the variation is shown to reduce to three dimensions (governed by maximum body size, speed of life and reproductive schedule). Finally, scombrid populations with slower life histories are shown to have experienced the largest declines in biomass and to have a higher probability of being overfished. The biological parameters measured in units of time, growth rate and longevity, are the best indicators of species' vulnerability to fishing. This thesis focuses on scombrids as a case study to highlight the importance of species' differing life history strategies when planning conservation and management strategies.[Resumo] Fishing is the main threat to marine species and ecosystems; however, the scale and extent of these impacts remain uncertain. This thesis provides new knowledge on the global impacts of fishing on scombrid species (51 species of tunas, Spanish mackerels, bonitos and mackerels), and advances our ability to identify a priori which species are most vulnerable to overfishing. First, a global meta-analysis of trends in adult biomass shows that scombrid populations have declined, on average, by 60% over the last half century. 
The decline in total adult biomass is smaller (52%), as it is buffered by the most abundant and best-managed populations. Second, gaps and missing biological data are identified for the 51 scombrid species, and research priorities are set for the species that need them most. Third, the diversity of life histories in scombrids is examined, and most of the variation is shown to reduce to three dimensions (governed by maximum body size, speed of life and reproductive schedule). Finally, scombrid populations with slower life histories are shown to have experienced the largest declines in biomass and to have a higher probability of being overfished. The biological parameters measured in units of time, growth rate and longevity, are the best indicators of species' vulnerability to fishing. This thesis highlights the importance of species' life history strategies for planning conservation and management strategies.