16 research outputs found

    Self-organisation of internal models in autonomous robots

    Internal Models (IMs) play a significant role in autonomous robotics. They are mechanisms able to represent the input-output characteristics of the sensorimotor loop. In developmental robotics, open-ended learning of skills and knowledge serves to react to unexpected inputs, to explore the environment and to acquire new behaviours. The robot's development includes self-exploration of the state-action space and learning of the environmental dynamics. In this dissertation, we explore the properties and benefits of the self-organisation of robot behaviour based on the homeokinetic learning paradigm. A homeokinetic robot explores the environment in a coherent way without prior knowledge of its own configuration or of the environment itself. First, we propose a novel approach to the self-organisation of behaviour by artificial curiosity in the sensorimotor loop. Second, we study how different forward-model settings alter the behaviour of both exploratory and goal-oriented robots. Models of diverse complexity, size and learning rule are compared to assess their importance for the robot's exploratory behaviour. We define the performance of self-organised behaviour in terms of simultaneous environment coverage and best prediction of future sensory inputs. Among the findings, we observe that models with a fast response and a minimisation of the prediction error by local gradients achieve the best performance. Third, we study how self-organisation of behaviour can be exploited to learn IMs for goal-oriented tasks. An IM acquires coherent self-organised behaviours that are then used to achieve high-level goals by reinforcement learning (RL). Our results demonstrate that learning an inverse model in this context yields faster reward maximisation and a higher final reward. We show that an initial exploration of the environment in a goal-less yet coherent way improves learning. In the same context, we analyse the self-organisation of central pattern generators (CPGs) by reward maximisation. Our results show that CPGs can learn favourable reward behaviour on high-dimensional robots using the self-organised interaction between degrees of freedom. Finally, we examine an on-line dual control architecture that combines an Actor-Critic RL method with the homeokinetic controller. With this configuration, the probing signal is generated from the robot's embodied experience of interacting with the environment. This set-up solves the problem of designing task-dependent probing signals through the emergence of intrinsically motivated, comprehensible behaviour, and achieves faster improvement of the reward signal than classic RL.
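
    A recurring ingredient in this thesis is a forward model trained online by local gradient descent on the one-step prediction error. As a rough illustration of that ingredient only (not the thesis code; the linear model, toy dynamics, dimensions and learning rate below are all assumptions), a minimal sensorimotor-loop sketch in Python:

```python
# A minimal sketch of a linear forward model in a sensorimotor loop,
# trained online by local gradient descent on the one-step prediction
# error. All sizes and the toy dynamics are assumptions for illustration.
import numpy as np

rng = np.random.default_rng(0)
n_s, n_a = 4, 2                              # sensor/action dims (assumed)
A = rng.normal(scale=0.1, size=(n_s, n_s))   # sensor-to-sensor weights
B = rng.normal(scale=0.1, size=(n_s, n_a))   # action-to-sensor weights
eta = 0.01                                   # learning rate (assumed)

def true_world(s, a):
    """Stand-in environment dynamics, unknown to the learner."""
    return np.tanh(0.9 * s + 0.5 * np.concatenate([a, -a]))

s = rng.normal(size=n_s)
for t in range(10_000):
    a = rng.uniform(-1, 1, size=n_a)         # exploratory action
    s_pred = A @ s + B @ a                   # forward-model prediction
    s_next = true_world(s, a)
    err = s_pred - s_next                    # one-step prediction error
    # Local gradient step on 0.5 * ||err||^2:
    A -= eta * np.outer(err, s)
    B -= eta * np.outer(err, a)
    s = s_next

print("final squared prediction error:", float(np.mean(err ** 2)))
```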

    Higher coordination with less control - A result of information maximization in the sensorimotor loop

    This work presents a novel learning method in the context of embodied artificial intelligence and self-organization, which makes as few assumptions and restrictions as possible about the world and the underlying model. The learning rule is derived from the principle of maximizing the predictive information in the sensorimotor loop. It is evaluated on robot chains of varying length with individually controlled, non-communicating segments. The comparison of results shows that maximizing the predictive information per wheel leads to more highly coordinated behavior of the physically connected robots than maximization per robot. A second focus of this paper is the analysis of the effect of the chain length on the overall behavior of the robots. It is shown that longer chains with less capable controllers outperform shorter chains with more complex controllers. The reason is identified and discussed through the information-geometric interpretation of the learning process.
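
    The quantity being maximized is the predictive information of the sensor process, I(S_{t+1}; S_t). As a hedged illustration of the measure itself (not the paper's gradient-based learning rule; the binning estimator and example signals are assumptions), a plug-in estimate in Python:

```python
# A minimal sketch: plug-in estimate of the predictive information
# I(S_{t+1}; S_t) in bits for a 1-D sensor stream, by binning.
import numpy as np

def predictive_information(x, n_bins=16):
    """Histogram estimate of the mutual information between
    consecutive samples of a 1-D time series, in bits."""
    b = np.digitize(x, np.linspace(x.min(), x.max(), n_bins - 1))
    joint, _, _ = np.histogram2d(b[:-1], b[1:], bins=n_bins)
    p = joint / joint.sum()                  # joint distribution p(s, s')
    px = p.sum(axis=1, keepdims=True)        # marginal p(s)
    py = p.sum(axis=0, keepdims=True)        # marginal p(s')
    nz = p > 0
    return float(np.sum(p[nz] * np.log2(p[nz] / (px @ py)[nz])))

t = np.linspace(0, 100, 5000)
structured = np.sin(t) + 0.1 * np.random.default_rng(1).normal(size=t.size)
noise = np.random.default_rng(2).normal(size=t.size)
print(predictive_information(structured))   # high: the stream is predictable
print(predictive_information(noise))        # near zero: nothing to predict
```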

    The motivational role of affect in an ecological model

    Drawing from empirical literature on ecological psychology, affective neuroscience, and philosophy of mind, this article describes a model of affect-as-motivation in the intentional bond between organism and environment. An epistemological justification for the motivating role of emotions is provided by articulating the perceptual context of emotions as embodied, situated, and functional, and by positing perceptual salience as a biasing signal in an affordance competition model. The motivational role of affect is pragmatically integrated into discussions of action selection in the neurosciences.
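
    As a toy illustration of the mechanism described in prose, the sketch below (entirely an assumption, not the article's formal model) biases a leaky-accumulator race between candidate affordances with an affective salience signal, so the motivationally relevant option tends to win the competition sooner:

```python
# A toy affordance-competition race: salience acts only as a bias on the
# evidence-driven accumulation, it does not decide by itself. All numbers
# are assumptions for illustration.
import numpy as np

rng = np.random.default_rng(3)
evidence = np.array([0.50, 0.52, 0.48])    # sensory support per affordance
salience = np.array([0.00, 0.20, 0.00])    # affective bias (assumed values)
act = np.zeros(3)                           # accumulator state
threshold, leak, noise = 5.0, 0.05, 0.1

t = 0
while act.max() < threshold:
    drift = evidence + salience             # affect biases the race
    act += drift - leak * act + noise * rng.normal(size=3)
    act = np.maximum(act, 0.0)              # activations stay non-negative
    t += 1

print(f"affordance {act.argmax()} selected after {t} steps")
```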

    Counterfactual Explanation and Causal Inference in Service of Robustness in Robot Control

    Cognitive Dynamics: From Attractors to Active Inference

    Using MapReduce Streaming for Distributed Life Simulation on the Cloud

    Distributed software simulations are indispensable in the study of large-scale life models but often require the use of technically complex lower-level distributed computing frameworks, such as MPI. We propose to overcome the complexity challenge by applying the emerging MapReduce (MR) model to distributed life simulations and by running such simulations on the cloud. Technically, we design optimized MR streaming algorithms for discrete and continuous versions of Conway's Game of Life according to a general MR streaming pattern. We chose Life because it is simple enough to serve as a testbed for MR's applicability to a-life simulations and general enough to make our results applicable to various lattice-based a-life models. We implement and empirically evaluate our algorithms' performance on Amazon's Elastic MR cloud. Our experiments demonstrate that a single MR optimization technique called strip partitioning can reduce the execution time of continuous life simulations by 64%. To the best of our knowledge, we are the first to propose and evaluate MR streaming algorithms for lattice-based simulations. Our algorithms can serve as prototypes in the development of novel MR simulation algorithms for large-scale lattice-based a-life models.
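
    To make the MR streaming pattern concrete, here is a single-process sketch of one Life generation expressed as a map phase (each live cell emits a vote for each of its neighbours) and a reduce phase (votes are grouped per cell and Conway's rules applied). This illustrates the pattern only, not the authors' optimized Elastic MR code; the glider board is an assumed example:

```python
# One generation of Conway's Life in the MapReduce pattern, run in-process.
from collections import defaultdict

def map_phase(live_cells):
    """Each live cell emits itself ('A') and one vote ('N') per neighbour."""
    for x, y in live_cells:
        yield (x, y), "A"
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                if dx or dy:
                    yield (x + dx, y + dy), "N"

def reduce_phase(pairs):
    """Group emissions by cell, then apply Conway's rules to the votes."""
    groups = defaultdict(list)
    for key, tag in pairs:
        groups[key].append(tag)         # stands in for the shuffle/sort step
    next_gen = set()
    for key, tags in groups.items():
        alive, votes = "A" in tags, tags.count("N")
        if votes == 3 or (alive and votes == 2):
            next_gen.add(key)
    return next_gen

glider = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}
print(sorted(reduce_phase(map_phase(glider))))   # the glider, one step on
```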

    Self-Motivated Composition of Strategic Action Policies

    In the last 50 years computers have made dramatic progress in their capabilities, but at the same time their failings have demonstrated that we, as designers, do not yet understand the nature of intelligence. Chess playing, for example, was long offered up as an example of the unassailability of the human mind to Artificial Intelligence, but now a chess engine on a smartphone can beat a grandmaster. Yet, at the same time, computers struggle to beat amateur players in simpler games, such as Stratego, where sheer processing power cannot substitute for a lack of deeper understanding. The task of developing that deeper understanding is overwhelming, and has previously been underestimated. There are many threads and all must be investigated. This dissertation explores one of those threads, namely asking the question "How might an artificial agent decide on a sensible course of action, without being told what to do?". To this end, this research builds upon empowerment, a universal utility which provides an entirely general method for allowing an agent to measure the preferability of one state over another. Empowerment requires no explicit goals, and instead favours states that maximise an agent's control over its environment. Several extensions to the empowerment framework are proposed, which drastically increase the array of scenarios to which it can be applied, and allow it to evaluate actions in addition to states. These extensions are motivated by concepts such as bounded rationality, sub-goals, and anticipated future utility. In addition, the novel concept of strategic affinity is proposed as a general method for measuring the strategic similarity between two (or more) potential sequences of actions. It does this in a general fashion, by examining how similar the distribution of future possible states would be when enacting either sequence. This allows an agent to group action sequences, even in an unknown task space, into 'strategies'. Strategic affinity is combined with the empowerment extensions to form soft-horizon empowerment, which is capable of composing action policies in a variety of unknown scenarios. A Pac-Man-inspired prey game and the Gambler's Problem are used to demonstrate this self-motivated action selection, and a Sokoban-inspired box-pushing scenario is used to highlight the capability to pick strategically diverse actions. The culmination of this is that soft-horizon empowerment demonstrates a variety of 'intuitive' behaviours, not dissimilar to what we might expect a human to try. This line of thinking demonstrates compelling results, and a couple of avenues for immediate further research are suggested. One of the most promising would be applying the self-motivated methodology and strategic affinity method to a wider range of scenarios, with a view to developing improved heuristic approximations that generate similar results. The goal of replicating similar results while reducing the computational overhead could help drive an improved understanding of how we may get closer to replicating a human-like approach.
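
    For readers unfamiliar with empowerment, a minimal sketch may help: in a deterministic world the n-step empowerment of a state reduces to log2 of the number of distinct states reachable by n-step action sequences, since the action-to-outcome channel is noiseless. The gridworld, obstacle layout and horizon below are assumptions for illustration, not the dissertation's soft-horizon method:

```python
# n-step empowerment in a small deterministic gridworld, by exhaustive
# enumeration of action sequences. All layout parameters are assumed.
from itertools import product
from math import log2

MOVES = {"N": (0, 1), "S": (0, -1), "E": (1, 0), "W": (-1, 0)}
WALLS = {(1, 1), (1, 2)}                     # assumed obstacle layout
SIZE = 5

def step(state, action):
    """Deterministic transition: bump into walls/edges and stay put."""
    dx, dy = MOVES[action]
    nxt = (state[0] + dx, state[1] + dy)
    in_bounds = 0 <= nxt[0] < SIZE and 0 <= nxt[1] < SIZE
    return nxt if in_bounds and nxt not in WALLS else state

def empowerment(state, n):
    """log2 |{final states over all n-step action sequences}| in bits."""
    finals = set()
    for seq in product(MOVES, repeat=n):
        s = state
        for a in seq:
            s = step(s, a)
        finals.add(s)
    return log2(len(finals))

# A corner is less empowered than the open middle: fewer futures to choose.
print(empowerment((0, 0), 3), empowerment((2, 3), 3))
```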