Expressivity in Natural and Artificial Systems
Roboticists are trying to replicate animal behavior in artificial systems.
Yet, quantitative bounds on the capacity of a moving platform (natural or
artificial) to express information in the environment are not known. This paper
presents a measure for the capacity of motion complexity -- the expressivity --
of articulated platforms (both natural and artificial) and shows that this
measure is stagnant and unexpectedly limited in extant robotic systems. This
analysis indicates trends in increasing capacity in both internal and external
complexity for natural systems while artificial, robotic systems have increased
significantly in the capacity of computational (internal) states but remained
more or less constant in mechanical (external) state capacity. This work
presents a way to analyze trends in animal behavior and shows that robots are
not capable of the same multi-faceted behavior in rich, dynamic environments as
natural systems. Comment: Rejected from Nature, after review and appeal, July 4, 2018
(submitted May 11, 2018)
Coherent Soft Imitation Learning
Imitation learning methods seek to learn from an expert either through
behavioral cloning (BC) of the policy or inverse reinforcement learning (IRL)
of the reward. Such methods enable agents to learn complex tasks from humans
that are difficult to capture with hand-designed reward functions. Choosing BC
or IRL for imitation depends on the quality and state-action coverage of the
demonstrations, as well as additional access to the Markov decision process.
Hybrid strategies that combine BC and IRL are not common, as initial policy
optimization against inaccurate rewards diminishes the benefit of pretraining
the policy with BC. This work derives an imitation method that captures the
strengths of both BC and IRL. In the entropy-regularized ('soft') reinforcement
learning setting, we show that the behaviour-cloned policy can be used as both
a shaped reward and a critic hypothesis space by inverting the regularized
policy update. This coherency facilitates fine-tuning cloned policies using the
reward estimate and additional interactions with the environment. This approach
conveniently achieves imitation learning through initial behaviour cloning,
followed by refinement via RL with online or offline data sources. The
simplicity of the approach enables graceful scaling to high-dimensional and
vision-based tasks, with stable learning and minimal hyperparameter tuning, in
contrast to adversarial approaches. Comment: 51 pages, 47 figures. DeepMind internship report
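The inversion mentioned in this abstract can be illustrated with the standard KL-regularized ('soft') policy update. This is a sketch under assumed notation, not the paper's own derivation: the prior policy $p$, temperature $\alpha$, critic $Q$, and soft value $V$ are symbols introduced here for illustration and do not appear in the abstract.

```latex
% Soft (KL-regularized) policy update with prior p and temperature \alpha:
\pi(a \mid s) = \frac{p(a \mid s)\,\exp\!\big(Q(s,a)/\alpha\big)}
                     {\int p(a' \mid s)\,\exp\!\big(Q(s,a')/\alpha\big)\,da'},
\qquad
V(s) = \alpha \log \int p(a' \mid s)\,\exp\!\big(Q(s,a')/\alpha\big)\,da'.

% Taking logs and rearranging inverts the update:
Q(s,a) - V(s) = \alpha \log \frac{\pi(a \mid s)}{p(a \mid s)}.
```

Read this way, if $\pi$ is replaced by a behaviour-cloned policy, the log-ratio on the right-hand side behaves like an advantage-style (shaped) reward signal and also fixes the functional form of the critic, which is one plausible reading of "shaped reward and critic hypothesis space" above.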
The Impact of a Character Posture Model on the Communication of Affect in an Immersive Virtual Environment
This paper presents the quantitative and qualitative findings from an experiment designed to evaluate a developing model of affective postures for full-body virtual characters in immersive virtual environments (IVEs). Forty-nine participants were each asked to explore a virtual environment by asking two virtual characters for instructions. The participants used a CAVE-like system to explore the environment. Participant responses and their impressions of the virtual characters were evaluated through a wide variety of both quantitative and qualitative methods. Combining a controlled experimental approach with various data-collection methods provided a number of advantages, such as helping to explain the quantitative results. The quantitative results indicate that posture plays an important role in the communication of affect by virtual characters. The qualitative findings indicate that participants attribute a variety of psychological states to the behavioral cues displayed by virtual characters. In addition, participants tended to interpret the social context portrayed by the virtual characters in a holistic manner. This suggests that one aspect of the virtual scene colors the perception of the whole social context portrayed by the virtual characters. We conclude by discussing the importance of designing holistically congruent virtual characters, especially in immersive settings.
Information Theory, Developmental Psychology, and the Baldwin Effect
As part of the extended evolutionary synthesis, there has recently been a new emphasis on the effects of biological development on genetic inheritance and variation. The exciting new directions taken by those in the community have been preceded by a pre-history filled with related ideas that were never given a rigorous foundation or combined coherently. Part of the historical background of the extended synthesis is the work of James Mark Baldwin on the so-called “Baldwin Effect.” Many variant re-interpretations of his work obscure the original meaning of the Baldwin Effect. This paper emphasizes a new approach to the Baldwin Effect, focusing on his work in developmental psychology and how that would impact evolution. We propose a novel population genetics model of the Baldwin Effect. The model addresses two points: first, the impact on evolution of a kind of learning process motivated by motor babbling in the developmental psychology literature; second, that information-theoretic phenotype reshaping speeds up evolution compared to populations without this kind of learning. The basic idea behind the model is to allow the organism to apply abstraction to its initial phenotype to situate it within one of a few different classes of phenotypes in the local neighborhood of a fitness maximum. The reshaping of the phenotype space thereby allows the organism to reach a nearby fitness maximum. By so doing, valleys in the fitness landscape are leveled out, making a rugged fitness landscape into a set of mesas and plateaus with increasing height. Using this model we can show the first sizeable speed-up for the Baldwin Effect compared to ordinary population genetics. We also introduce an information-theoretic foundation for the Baldwin Effect, which may be of independent interest.
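The "leveling out" of valleys described in this abstract can be pictured with a toy example. The sketch below is not the paper's population genetics model or its information-theoretic construction. It assumes a one-dimensional integer phenotype space, a made-up fitness list `raw`, and greedy hill-climbing as a stand-in for learning; the names `climb` and `reshaped` are hypothetical.

```python
def climb(fitness, i):
    """Greedy hill-climb from index i to a local maximum of `fitness`.

    Stand-in for lifetime learning (an assumption, not the paper's mechanism).
    """
    n = len(fitness)
    while True:
        # Consider the current phenotype and its immediate neighbours.
        candidates = [j for j in (i - 1, i, i + 1) if 0 <= j < n]
        best = max(candidates, key=lambda j: fitness[j])
        if fitness[best] <= fitness[i]:
            return i  # no neighbour is strictly fitter: local maximum
        i = best


def reshaped(fitness):
    """Effective fitness: each phenotype scores as the local maximum it reaches."""
    return [fitness[climb(fitness, i)] for i in range(len(fitness))]


# Hypothetical rugged landscape with three peaks of increasing height.
raw = [1, 3, 2, 0, 4, 6, 5, 2, 7, 9, 8]
flat = reshaped(raw)
print(flat)
```

In this toy setting every phenotype inherits the fitness of the peak it can reach, so the valleys at indices 3 and 7 disappear and the landscape becomes a staircase of plateaus, which is the qualitative picture the abstract describes.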