149 research outputs found
Grounding Artificial Intelligence in the Origins of Human Behavior
Recent advances in Artificial Intelligence (AI) have revived the quest for
agents able to acquire an open-ended repertoire of skills. However, although
this ability is fundamentally related to the characteristics of human
intelligence, research in this field rarely considers the processes that may
have guided the emergence of complex cognitive capacities during the evolution
of the species.
Research in Human Behavioral Ecology (HBE) seeks to understand how the
behaviors characterizing human nature can be conceived as adaptive responses to
major changes in the structure of our ecological niche. In this paper, we
propose a framework highlighting the role of environmental complexity in
open-ended skill acquisition, grounded in major hypotheses from HBE and recent
contributions in Reinforcement Learning (RL). We use this framework to
highlight fundamental links between the two disciplines, as well as to identify
feedback loops that bootstrap ecological complexity and create promising
research directions for AI researchers.
Learning Curricula in Open-Ended Worlds
Deep reinforcement learning (RL) provides powerful methods for training optimal sequential decision-making agents. As collecting real-world interactions can entail additional costs and safety risks, the common paradigm of sim2real conducts training in a simulator, followed by real-world deployment. Unfortunately, RL agents easily overfit to the choice of simulated training environments, and worse still, learning ends when the agent masters the specific set of simulated environments. In contrast, the real world is highly open-ended, featuring endlessly evolving environments and challenges, which makes such RL approaches unsuitable. Simply randomizing across a large space of simulated environments is insufficient, as it requires making arbitrary distributional assumptions, and as the design space grows, it can become combinatorially less likely to sample specific environment instances that are useful for learning. An ideal learning process should automatically adapt the training environment to maximize the learning potential of the agent over an open-ended task space that matches or surpasses the complexity of the real world. This thesis develops a class of methods called Unsupervised Environment Design (UED), which seeks to enable such an open-ended process via a principled approach for gradually improving the robustness and generality of the learning agent. Given a potentially open-ended environment design space, UED automatically generates an infinite sequence or curriculum of training environments at the frontier of the learning agent's capabilities. Through both extensive empirical studies and theoretical arguments founded on minimax-regret decision theory and game theory, the findings in this thesis show that UED autocurricula can produce RL agents exhibiting significantly improved robustness and generalization to previously unseen environment instances.
Such autocurricula are promising paths toward open-ended learning systems that approach general intelligence, a long sought-after ambition of artificial intelligence research, by continually generating and mastering additional challenges of their own design.
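The minimax-regret criterion mentioned in this abstract can be illustrated on a small table of returns. The sketch below is not the thesis's UED method: the policy names, the three environment instances, and all return values are hypothetical, and the "best achievable" return per environment is simply taken as the best among the listed policies. It only shows how regret is computed and how a minimax-regret choice is made.

```python
# Hypothetical returns: each list holds one policy's return on three
# environment instances. All names and numbers are illustrative assumptions.
returns = {
    "specialist": [1.0, 0.1, 0.0],
    "generalist": [0.7, 0.6, 0.5],
    "cautious": [0.4, 0.4, 0.4],
}


def regret_table(returns):
    """Regret of a policy on an environment: best listed return minus its own."""
    n_envs = len(next(iter(returns.values())))
    # Best return per environment, taken over the listed policies only.
    best = [max(r[j] for r in returns.values()) for j in range(n_envs)]
    return {
        name: [best[j] - r[j] for j in range(n_envs)]
        for name, r in returns.items()
    }


def minimax_regret_policy(returns):
    """Pick the policy whose worst-case regret across environments is smallest."""
    regrets = regret_table(returns)
    return min(regrets, key=lambda name: max(regrets[name]))


if __name__ == "__main__":
    for name, row in regret_table(returns).items():
        print(name, "worst-case regret:", round(max(row), 3))
    print("minimax-regret choice:", minimax_regret_policy(returns))
```

In this toy table the specialist is best on one environment but pays a large regret elsewhere, so the worst-case criterion favors the policy that stays close to the best on every instance.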
Considerations for comparing video-game AI agents with humans
Video games are sometimes used as environments to evaluate AI agents’ ability to develop and execute complex action sequences to maximize a defined reward. However, humans cannot match the fine precision of AI agents’ timed actions. In games such as StarCraft, build orders take the place of chess opening gambits, but unlike strategy games such as chess and Go, video games also rely heavily on sensorimotor precision. If the “finding” were merely that AI agents have superhuman reaction times and precision, no one would be surprised. The goal is rather to look at the adaptive reasoning and strategies produced by AI agents, which may replicate human approaches or even yield strategies not previously produced by humans. Here, I provide (1) an overview of observations where AI agents are perhaps not being fairly evaluated relative to humans, (2) a potential approach for making this comparison more appropriate, and (3) highlights of some important recent advances in video game play provided by AI agents.
RLupus: Cooperation through emergent communication in the Werewolf social deduction game
This paper focuses on the emergence of communication to support cooperation
in environments modeled as social deduction games (SDG), which are games where
players communicate freely to deduce each other's hidden intentions. We first
state the problem by giving a general formalization of SDG and a possible
solution framework based on reinforcement learning. Next, we focus on a
specific SDG, known as The Werewolf, and study if and how various forms of
communication influence the outcome of the game. Experimental results show that
introducing a communication signal greatly increases the winning chances of a
class of players. We also study the effect of the signal's length and range on
the overall performance, showing a non-linear relationship.
Automatic Curriculum Learning For Deep RL: A Short Survey
Automatic Curriculum Learning (ACL) has become a cornerstone of recent
successes in Deep Reinforcement Learning (DRL). These methods shape the learning
trajectories of agents by challenging them with tasks adapted to their
capacities. In recent years, they have been used to improve sample efficiency
and asymptotic performance, to organize exploration, to encourage
generalization or to solve sparse reward problems, among others. The ambition
of this work is twofold: 1) to present a compact and accessible introduction to
the Automatic Curriculum Learning literature and 2) to draw a bigger picture of
the current state of the art in ACL to encourage the cross-breeding of existing
concepts and the emergence of new ideas. Comment: Accepted at IJCAI202