NetHack is Hard to Hack
Neural policy learning methods have achieved remarkable results in various
control problems, ranging from Atari games to simulated locomotion. However,
these methods struggle in long-horizon tasks, especially in open-ended
environments with multi-modal observations, such as the popular dungeon-crawler
game, NetHack. Intriguingly, the NeurIPS 2021 NetHack Challenge revealed that
symbolic agents outperformed neural approaches by over four times in median
game score. In this paper, we delve into the reasons behind this performance
gap and present an extensive study on neural policy learning for NetHack. To
conduct this study, we analyze the winning symbolic agent, extending its
codebase to track internal strategy selection in order to generate one of the
largest available demonstration datasets. Utilizing this dataset, we examine
(i) the advantages of an action hierarchy; (ii) enhancements in neural
architecture; and (iii) the integration of reinforcement learning with
imitation learning. Our investigations produce a state-of-the-art neural agent
that surpasses previous fully neural policies by 127% in offline settings and
25% in online settings on median game score. However, we also demonstrate that
mere scaling is insufficient to bridge the performance gap with the best
symbolic models or even the top human players.
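The third ingredient the abstract names, combining reinforcement learning with imitation learning, can be illustrated with a minimal sketch. This is a toy linear softmax policy with made-up sizes and a stubbed return signal, not the paper's architecture: the behavior-cloning gradient pushes toward demonstrated actions, while a REINFORCE-style term (weighted by a hypothetical coefficient `beta`) adds the online RL signal.

```python
import numpy as np

rng = np.random.default_rng(0)
N_ACTIONS, DIM = 5, 8                             # hypothetical sizes
W = rng.normal(scale=0.1, size=(DIM, N_ACTIONS))  # linear softmax policy

def policy(obs):
    logits = obs @ W
    e = np.exp(logits - logits.max())
    return e / e.sum()

# Behavior cloning: increase log-probability of the demonstrated action.
def bc_grad(obs, demo_action):
    p = policy(obs)
    onehot = np.eye(N_ACTIONS)[demo_action]
    return np.outer(obs, onehot - p)              # d log pi(a|s) / dW

# RL (REINFORCE): same score function, scaled by the sampled action's return.
def rl_grad(obs, action, ret):
    p = policy(obs)
    onehot = np.eye(N_ACTIONS)[action]
    return ret * np.outer(obs, onehot - p)

lr, beta = 0.5, 0.1                               # beta weights the RL term
obs, demo_action = rng.normal(size=DIM), 2
for _ in range(200):
    a = rng.choice(N_ACTIONS, p=policy(obs))      # environment rollout (stubbed)
    ret = 1.0 if a == demo_action else 0.0        # toy return signal
    W += lr * (bc_grad(obs, demo_action) + beta * rl_grad(obs, a, ret))

print(np.argmax(policy(obs)))  # the policy now prefers the demonstrated action
```

In practice the two gradients come from different data (offline demonstrations vs. online rollouts) and the balance between them is a central tuning knob; the toy loop above merges them on a single state only to keep the sketch self-contained.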
Knowledge-Grounded Reinforcement Learning
Receiving knowledge, abiding by laws, and being aware of regulations are
common behaviors in human society. Bearing in mind that reinforcement learning
(RL) algorithms benefit from mimicking humanity, in this work, we propose that
an RL agent can act on external guidance in both its learning process and model
deployment, making the agent more socially acceptable. We introduce the
concept of Knowledge-Grounded RL (KGRL), formally defined as an agent that
learns to follow external guidelines while developing its own policy. Moving towards
the goal of KGRL, we propose a novel actor model with an embedding-based
attention mechanism that can attend to either a learnable internal policy or
external knowledge. The proposed method is orthogonal to training algorithms,
and the external knowledge can be flexibly recomposed, rearranged, and reused
in both training and inference stages. Through experiments on tasks with
discrete and continuous action spaces, our KGRL agent is shown to be more
sample-efficient and generalizable, with flexibly rearrangeable knowledge
embeddings and interpretable behaviors.
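The embedding-based attention idea can be sketched in miniature: keys for the internal policy and for each external knowledge item are attended over by a query, and the resulting weights mix the action distributions each source proposes. All names, sizes, and the two hand-written knowledge rules below are illustrative assumptions, not the paper's actual model.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM, N_ACTIONS = 4, 3                     # hypothetical sizes

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# One learnable key for the internal policy, plus one key per external
# knowledge item; knowledge keys can be recomposed or rearranged freely.
internal_key = rng.normal(size=DIM)
knowledge_keys = [rng.normal(size=DIM) for _ in range(2)]

def internal_policy(obs):                 # learnable head (stubbed here)
    return softmax(rng.normal(size=N_ACTIONS))

def knowledge_rule_0(obs):                # external guideline: "prefer action 0"
    return np.array([0.8, 0.1, 0.1])

def knowledge_rule_1(obs):                # external guideline: "prefer action 2"
    return np.array([0.1, 0.1, 0.8])

def act(obs, query):
    keys = np.stack([internal_key] + knowledge_keys)
    weights = softmax(keys @ query)       # attention over policy sources
    dists = np.stack([internal_policy(obs),
                      knowledge_rule_0(obs),
                      knowledge_rule_1(obs)])
    return weights @ dists                # mixture action distribution

obs, query = rng.normal(size=DIM), rng.normal(size=DIM)
p = act(obs, query)
print(p)  # a valid action distribution over N_ACTIONS
```

Because each knowledge item enters only through its key and its proposed distribution, swapping a rule in or out does not require retraining the attention mechanism, which is one way to read the abstract's claim that knowledge can be "recomposed, rearranged, and reused".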
Learning a Set of Interrelated Tasks by Using a Succession of Motor Policies for a Socially Guided Intrinsically Motivated Learner
We aim at a robot capable of learning sequences of actions to achieve a range of complex tasks. In this paper, we consider the learning of a set of interrelated complex tasks that are hierarchically organized. To learn this mapping between a continuous high-dimensional space of tasks and an infinite-dimensional space of unbounded sequences of actions, we introduce a new framework called “procedures”, which enables the autonomous discovery of how to combine previously learned skills in order to learn increasingly complex combinations of motor policies. We propose an active-learning algorithmic architecture capable of organizing its learning process so as to achieve a range of complex tasks by learning sequences of primitive motor policies. Based on heuristics of active imitation learning, goal babbling, and strategic learning using intrinsic motivation, our algorithmic architecture leverages the procedures framework to actively decide, during learning, which outcome to focus on and which exploration strategy to apply. We show in a simulated environment that our new architecture can tackle the learning of complex motor policies by adapting the complexity of its policies to the task at hand. We also show that our “procedures” enable the learning agent to discover the task hierarchy and exploit its experience of previously learned skills to learn new complex tasks.
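The strategic-learning component described above, choosing an exploration strategy by intrinsic motivation, is commonly realized as a bandit over strategies scored by recent learning progress. The sketch below is a generic toy of that idea with invented strategy names and a stubbed outcome, not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(0)
STRATEGIES = ["autonomous_exploration", "imitate_teacher"]   # hypothetical names

progress = {s: [] for s in STRATEGIES}    # recent error improvements per strategy

def learning_progress(s, window=5):
    h = progress[s][-window:]
    return np.mean(h) if h else np.inf    # untried strategies look maximally promising

def choose_strategy(eps=0.1):
    if rng.random() < eps:                # keep some exploration over strategies
        return rng.choice(STRATEGIES)
    return max(STRATEGIES, key=learning_progress)

# Toy loop: in this stub, imitation yields larger error drops than
# autonomous exploration, so intrinsic motivation should favor it.
error = 1.0
for step in range(50):
    s = choose_strategy()
    drop = 0.05 if s == "imitate_teacher" else 0.01          # stubbed outcome
    error -= drop * error
    progress[s].append(drop)

print(round(error, 3))
```

The key design choice is that the agent selects *how* to learn (which strategy) from the same empirical signal it uses to select *what* to learn (which outcome), so a strategy is pursued only while it keeps producing measurable progress.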
Management in a New and Experimentally Organized Economy
The parallel development of management theory and practice over three phases of economic development is surveyed: (1) the pre-oil-crisis experience 1969-1975, (2) the post-oil-crisis sobering up through most of the 1990s, and (3) the emergence of new global production organizations, blurring the notion of the firm to be managed. The external market circumstances of each period dictate different structures of business operations: (a) a steady-state and predictable environment, (b) crisis, inflation, and disorderly markets, and (c) new technology supporting a globally distributed production organization. As a consequence, structural learning between the periods has been of limited value and often outright misleading. The influence of management theory on management practice and its origin in the received economic equilibrium model are discussed, and an alternative management theory based on the theory of the Experimentally Organized Economy (EOE) is presented. The increased rate of failure among large firms is related to the increasing complexity of business decisions in globally distributed production and the decreased reliability of learning. It is concluded that successful management practice develops through experimentation in markets and that the best management education has been a varied career in many lines of business and in several companies.
Keywords: Competence bloc theory; Experimentally Organized Economy (EOE); Management theory; WAD theory; Firm Dynamics; Learning
Darwinism, probability and complexity: market-based organizational transformation and change explained through the theories of evolution
The study of transformation and change is one of the most important areas of social science research. This paper synthesizes and critically reviews the emerging traditions in the study of change dynamics. Three mainstream theories of evolution are introduced to explain change: the Darwinian concept of survival of the fittest, the Probability model, and the Complexity approach. The literature review provides a basis for the development of research questions that search for a more comprehensive understanding of organizational change. The paper concludes by arguing for the development of a complementary research tradition, which combines an evolutionary and organizational analysis of transformation and change.
Developmental Bootstrapping of AIs
Although some current AIs surpass human abilities in closed artificial worlds
such as board games, their abilities in the real world are limited. They make
strange mistakes and do not notice them. They cannot be instructed easily, fail
to use common sense, and lack curiosity. They do not make good collaborators.
Mainstream approaches for creating AIs are the traditional manually-constructed
symbolic AI approach and generative and deep learning AI approaches including
large language models (LLMs). These systems are not well suited for creating
robust and trustworthy AIs. Although it is outside of the mainstream, the
developmental bootstrapping approach has more potential. In developmental
bootstrapping, AIs develop competences like human children do. They start with
innate competences. They interact with the environment and learn from their
interactions. They incrementally extend their innate competences with
self-developed competences. They interact and learn from people and establish
perceptual, cognitive, and common grounding. They acquire the competences they
need through bootstrapping. However, developmental robotics has not yet
produced AIs with robust adult-level competences. Projects have typically
stopped at the Toddler Barrier corresponding to human infant development at
about two years of age, before their speech is fluent. They also do not bridge
the Reading Barrier, to skillfully and skeptically draw on the socially
developed information resources that power current LLMs. The next competences
in human cognitive development involve intrinsic motivation, imitation
learning, imagination, coordination, and communication. This position paper
lays out the logic, prospects, gaps, and challenges for extending the practice
of developmental bootstrapping to acquire further competences and create
robust, resilient, and human-compatible AIs.
Comment: 102 pages, 29 figures