632 research outputs found

    Expressivity in Natural and Artificial Systems

    Roboticists are trying to replicate animal behavior in artificial systems. Yet quantitative bounds on the capacity of a moving platform (natural or artificial) to express information in the environment are not known. This paper presents a measure for the capacity of motion complexity -- the expressivity -- of articulated platforms (both natural and artificial) and shows that this measure is stagnant and unexpectedly limited in extant robotic systems. The analysis indicates that natural systems have trended toward increasing capacity in both internal and external complexity, while artificial, robotic systems have increased significantly in the capacity of computational (internal) states but remained more or less constant in mechanical (external) state capacity. This work presents a way to analyze trends in animal behavior and shows that robots are not capable of the same multi-faceted behavior in rich, dynamic environments as natural systems.
    Comment: Rejected from Nature, after review and appeal, July 4, 2018 (submitted May 11, 2018)

    Coherent Soft Imitation Learning

    Imitation learning methods seek to learn from an expert either through behavioral cloning (BC) of the policy or inverse reinforcement learning (IRL) of the reward. Such methods enable agents to learn complex tasks from humans that are difficult to capture with hand-designed reward functions. Choosing BC or IRL for imitation depends on the quality and state-action coverage of the demonstrations, as well as additional access to the Markov decision process. Hybrid strategies that combine BC and IRL are not common, as initial policy optimization against inaccurate rewards diminishes the benefit of pretraining the policy with BC. This work derives an imitation method that captures the strengths of both BC and IRL. In the entropy-regularized ('soft') reinforcement learning setting, we show that the behaviour-cloned policy can be used as both a shaped reward and a critic hypothesis space by inverting the regularized policy update. This coherency facilitates fine-tuning cloned policies using the reward estimate and additional interactions with the environment. This approach conveniently achieves imitation learning through initial behaviour cloning, followed by refinement via RL with online or offline data sources. The simplicity of the approach enables graceful scaling to high-dimensional and vision-based tasks, with stable learning and minimal hyperparameter tuning, in contrast to adversarial approaches.
    Comment: 51 pages, 47 figures. DeepMind internship report

    The Impact of a Character Posture Model on the Communication of Affect in an Immersive Virtual Environment

    This paper presents the quantitative and qualitative findings from an experiment designed to evaluate a developing model of affective postures for full-body virtual characters in immersive virtual environments (IVEs). Forty-nine participants were each asked to explore a virtual environment by asking two virtual characters for instructions. The participants used a CAVE-like system to explore the environment. Participant responses and their impressions of the virtual characters were evaluated through a wide variety of both quantitative and qualitative methods. Combining a controlled experimental approach with various data-collection methods provided a number of advantages, such as helping to explain the quantitative results. The quantitative results indicate that posture plays an important role in the communication of affect by virtual characters. The qualitative findings indicate that participants attribute a variety of psychological states to the behavioral cues displayed by virtual characters. In addition, participants tended to interpret the social context portrayed by the virtual characters in a holistic manner. This suggests that one aspect of the virtual scene colors the perception of the whole social context portrayed by the virtual characters. We conclude by discussing the importance of designing holistically congruent virtual characters, especially in immersive settings.

    Information Theory, Developmental Psychology, and the Baldwin Effect

    As part of the extended evolutionary synthesis, there has recently been a new emphasis on the effects of biological development on genetic inheritance and variation. The exciting new directions taken by those in the community have been preceded by a pre-history filled with related ideas that were never given a rigorous foundation or combined coherently. Part of the historical background of the extended synthesis is the work of James Mark Baldwin on his so-called “Baldwin Effect.” Many variant re-interpretations of his work obscure the original meaning of the Baldwin Effect. This paper emphasizes a new approach to the Baldwin Effect, focusing on his work in developmental psychology and how that would impact evolution. We propose a novel population genetics model of the Baldwin Effect. The model addresses two points: first, the impact on evolution of a kind of learning process motivated by motor babbling in the developmental psychology literature; second, that information-theoretic phenotype reshaping speeds up evolution compared to populations without this kind of learning. The basic idea behind the model is to allow the organism to apply abstraction to its initial phenotype to situate it within one of a few different classes of phenotypes in the local neighborhood of a fitness maximum. The reshaping of the phenotype space thereby allows the organism to reach a nearby fitness maximum. By so doing, valleys in the fitness landscape are leveled out, turning a rugged fitness landscape into a set of mesas and plateaus of increasing height. Using this model, we can show the first sizeable speed-up for the Baldwin Effect compared to ordinary population genetics. We also introduce an information-theoretic foundation for the Baldwin Effect, which may be of independent interest.