25 research outputs found

    L’auto-exploration des espaces sensorimoteurs chez les robots

    Over the last fifteen years, developmental robotics has begun to study robots that have a childhood (crawling before trying to run, playing before being useful) and that base their decisions on a lifelong, embodied experience of the real world. In this context, this thesis studies sensorimotor exploration, the discovery of a robot’s own body and proximal environment, during the early developmental stages, when no prior experience of the world is available. Specifically, we investigate how to generate a diversity of effects in an unknown environment. This approach is distinguished by its lack of a user-defined reward or fitness function, which makes it especially suited to integration in self-sufficient platforms. In the first part, we motivate our approach, formalize the exploration problem, define quantitative measures to assess performance, and propose an architectural framework for devising algorithms. Through the extensive examination of a multi-joint arm example, we explore some of the fundamental challenges that sensorimotor exploration faces, such as high dimensionality and sensorimotor redundancy, in particular through a comparison between motor and goal babbling exploration strategies. We propose several algorithms and empirically study their behaviour, investigating the interactions with developmental constraints, external demonstrations and biologically inspired motor synergies. Furthermore, because even efficient algorithms can perform disastrously when their learning abilities do not align with the environment’s characteristics, we propose an architecture that can dynamically discriminate among a set of exploration strategies. Even with good algorithms, sensorimotor exploration remains expensive, which is a problem because robots inherently face constraints on the amount of data they can gather: each observation takes a non-negligible time to collect. [...]
Throughout this thesis, our core contributions are algorithm descriptions and empirical results. To allow unrestricted examination and reproduction of all our results, the entire code is made available. Sensorimotor exploration is a fundamental developmental mechanism of biological systems. By decoupling it from learning and studying it in its own right in this thesis, we take an approach that sheds light on important problems facing robots developing on their own.
Over the last fifteen years, developmental robotics has set out to study developmental processes in robots, similar to those of biological systems. The goal is to create robots that have a childhood (that crawl before trying to run, that play before working) and that base their decisions on the experience of a lifetime, embodied in the real world. In this context, this thesis studies sensorimotor exploration, the discovery by a robot of its own body and of its nearby environment, during the first stages of development, when no prior experience of the world is available. More specifically, this thesis looks at how to generate a diversity of effects in an unknown environment. This approach is distinguished by its lack of an expert-defined reward or fitness function, which makes it particularly suited to integration on self-sufficient robots. In a first part, the approach is motivated and the exploration problem is formalized, with the definition of quantitative measures to evaluate the behaviour of algorithms and of an architectural framework for creating them. Through the detailed examination of the example of a robot arm with multiple degrees of freedom, the thesis explores some of the fundamental issues that sensorimotor exploration raises, such as high dimensionality and sensorimotor redundancy. This is done in particular through the comparison of two exploration strategies: motor babbling and goal babbling. Several algorithms are proposed in turn and their behaviour is evaluated empirically, studying the interactions that arise with developmental constraints, external demonstrations and motor synergies. Moreover, because even efficient algorithms can prove terribly inefficient when their learning abilities are not suited to the characteristics of their environment, an architecture is proposed that can dynamically choose the most suitable exploration strategy among a set of strategies. But even with good algorithms, sensorimotor exploration remains a costly undertaking, an important problem given that robots face strong constraints on the amount of data they can extract from their environment, with each observation taking a non-negligible time to collect. [...] Throughout this thesis, the most important contributions are the algorithmic descriptions and the experimental results. To allow unconstrained reproduction and re-examination of all the results, the entire code is made available. Sensorimotor exploration is a fundamental mechanism of the development of biological systems. Deliberately separating it from learning mechanisms and studying it for its own sake in this thesis sheds light on important problems that robots developing on their own will have to face.
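
The contrast between motor babbling and goal babbling mentioned in this abstract can be illustrated with a small sketch. It is not the thesis code: the planar-arm kinematics, the nearest-neighbour inverse model, the perturbation width `sigma`, and the grid-based `coverage` measure are all assumptions chosen for illustration.

```python
import math
import random

def arm_tip(angles):
    """Forward kinematics of a planar arm of total length 1: joint angles -> tip (x, y)."""
    link = 1.0 / len(angles)
    x = y = total = 0.0
    for a in angles:
        total += a
        x += link * math.cos(total)
        y += link * math.sin(total)
    return (x, y)

def motor_babbling(rng, n_joints, steps, limit=math.pi / 2):
    """Sample motor commands uniformly at random and record their effects."""
    history = []
    for _ in range(steps):
        m = [rng.uniform(-limit, limit) for _ in range(n_joints)]
        history.append((m, arm_tip(m)))
    return history

def goal_babbling(rng, n_joints, steps, bootstrap=10, sigma=0.05, limit=math.pi / 2):
    """Sample goals in effect space, then perturb the command whose effect was closest."""
    bootstrap = min(bootstrap, steps)
    history = motor_babbling(rng, n_joints, bootstrap, limit)
    for _ in range(steps - bootstrap):
        goal = (rng.uniform(-1, 1), rng.uniform(-1, 1))
        nearest_m, _ = min(history, key=lambda h: math.dist(h[1], goal))
        # Assumed inverse model: nearest neighbour plus Gaussian noise, clamped to joint limits.
        m = [max(-limit, min(limit, a + rng.gauss(0, sigma))) for a in nearest_m]
        history.append((m, arm_tip(m)))
    return history

def coverage(history, cell=0.1):
    """Assumed diversity measure: number of distinct grid cells reached by the tip."""
    return len({(math.floor(x / cell), math.floor(y / cell)) for _, (x, y) in history})

if __name__ == "__main__":
    print("motor babbling:", coverage(motor_babbling(random.Random(0), 20, 1000)))
    print("goal babbling: ", coverage(goal_babbling(random.Random(0), 20, 1000)))
```

Running the script prints the number of cells each strategy reaches for a 20-joint arm; the relative ranking depends on the number of joints, the noise level and the sampling budget.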

    Learning Timescales in MTRNNs

    We test the viability of learnable timescales in multiple-timescale recurrent neural networks (MTRNNs).
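
For context, the leaky-integrator update commonly used to describe MTRNNs is sketched below; it is not necessarily the formulation used in this work. Each unit $i$ has an internal state $u_i$, an output $y_i$, and a timescale $\tau_i$ that is usually fixed by hand. The last expression, which makes $\tau_i$ a function of a trainable parameter $\theta_i$ so that $\tau_i > 1$, is an assumption added here to show one way a timescale could be learned by gradient descent.

```latex
u_i(t+1) = \left(1 - \frac{1}{\tau_i}\right) u_i(t) + \frac{1}{\tau_i} \sum_j w_{ij}\, y_j(t),
\qquad y_i(t) = \tanh\bigl(u_i(t)\bigr),
\qquad \tau_i = 1 + e^{\theta_i}
```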

    Diversity-driven selection of exploration strategies in multi-armed bandits

    We consider a scenario where an agent has multiple strategies available to explore an unknown environment. For each new interaction with the environment, the agent must select which exploration strategy to use. We provide a new strategy-agnostic method that treats the situation as a multi-armed bandit problem in which the reward signal is the diversity of effects that each strategy produces. We test the method empirically on a simulated planar robotic arm, and establish that it can discriminate between strategies of dissimilar quality, even when the differences are tenuous, and that the resulting performance is competitive with the best fixed mixture of strategies.
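
The idea of a bandit whose reward is diversity can be sketched as follows. This is an illustrative assumption rather than the method of the paper: the class name `DiversityBandit`, the epsilon-greedy selection rule, the sliding window, and the reward (1 when an effect lands in a grid cell not visited before, 0 otherwise) are all hypothetical choices.

```python
import math
import random
from collections import deque

class DiversityBandit:
    """Epsilon-greedy bandit over exploration strategies, rewarded by new grid cells."""

    def __init__(self, strategies, window=50, epsilon=0.1, cell=0.05, rng=None):
        self.strategies = strategies          # callables: rng -> effect (tuple of floats)
        self.recent = [deque(maxlen=window) for _ in strategies]
        self.epsilon = epsilon
        self.cell = cell
        self.visited = set()
        self.rng = rng or random.Random()

    def select(self):
        """Pick an untried arm first, then explore with probability epsilon, else exploit."""
        untried = [i for i, r in enumerate(self.recent) if not r]
        if untried:
            return untried[0]
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.strategies))
        means = [sum(r) / len(r) for r in self.recent]
        return max(range(len(means)), key=means.__getitem__)

    def step(self):
        """Run one interaction and update the chosen arm's recent diversity rewards."""
        arm = self.select()
        effect = self.strategies[arm](self.rng)
        key = tuple(math.floor(v / self.cell) for v in effect)
        reward = 0.0 if key in self.visited else 1.0
        self.visited.add(key)
        self.recent[arm].append(reward)
        return arm, effect, reward
```

A strategy that always produces the same effect stops earning reward after its first pull, so the bandit shifts its budget toward strategies that keep reaching new cells. The sliding window lets the estimate track the fact that a strategy's usefulness changes as the space fills up.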

    Autonomous Reuse of Motor Exploration Trajectories

    We present an algorithm for transferring exploration strategies between tasks that share a common motor space, in the context of lifelong autonomous learning in robotics. The algorithm does not transfer observations or make assumptions about how the learning is conducted. Instead, only selected motor commands are transferred between tasks, chosen autonomously according to an empirical measure of learning progress. We show that across a wide range of variations of a source task, such as changing the object the robot is interacting with or altering the morphology of the robot, this simple and flexible transfer method significantly increases early performance in the new task. We also provide examples of situations where the transfer is not helpful.

    Re-run, Repeat, Reproduce, Reuse, Replicate: Transforming Code into Scientific Contributions

    Scientific code is not production software. Scientific code participates in the evaluation of a scientific hypothesis. This imposes specific constraints on the code that are often overlooked in practice. We articulate, with a small example, five characteristics that code in computational science should possess: it should be re-runnable, repeatable, reproducible, reusable and replicable.
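
As an illustration of one of these characteristics, the sketch below shows a computation that is repeatable in the sense of returning the same result on successive runs. The abstract does not describe its example, so the toy random walk and the function name `random_walk` are assumptions.

```python
import random

def random_walk(n_steps, seed):
    """Return the positions of a 1-D random walk; the seed fully determines the result."""
    rng = random.Random(seed)  # local generator: no dependence on hidden global state
    position, path = 0, [0]
    for _ in range(n_steps):
        position += rng.choice((-1, 1))
        path.append(position)
    return path

if __name__ == "__main__":
    # Two runs with the same seed give identical paths.
    print(random_walk(10, seed=1) == random_walk(10, seed=1))
```

Passing the seed explicitly and recording it with the results is what makes a later run comparable to the original one.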

    Intrisically Motivated Goal Space Creation for Autonomous Goal-Directed Exploration in High-Dimensional Unbounded Sensorimotor Spaces

    Today's robotic systems are given increasingly complex tasks in an increasing variety of situations, such as object manipulation or social interaction. Many of those situations cannot be anticipated at design time: autonomous learning capacities are needed to adapt to novel, unexpected conditions. Yet, because of their complex bodies and multiple sensors, robots face high-dimensional, unbounded, continuous sensorimotor spaces whose semantics are often unknown. Such spaces are too large to be explored exhaustively, an issue even more crucial in robotics given the expensive and slow nature of the physical interactions needed to gather training data. Learning in those spaces also raises other challenges, because robots' sensorimotor spaces are highly heterogeneous and multi-modal, with areas that are unreachable because of physical constraints, areas that are unlearnable because the actions of the agent have no influence on the sensor values, and yet other areas where learning is made difficult by very high noise-to-signal ratios or requires the previous acquisition of other skills (e.g. learning reaching before grasping). This is why efficient exploration techniques are needed, in which each interaction maximizes the knowledge or competence gained. To address this issue, statistical learning techniques have focused on optimizing exploration policies to maximize various criteria, in particular through active learning [1]-[3]. Another approach has stemmed from the field of developmental robotics, where inspiration from psychology and neuroscience research on animal and infant learning [4] [5] [6] has highlighted the importance of curiosity in skill acquisition. Several intrinsically motivated learning techniques have been proposed [10] [11] [12].
In this article, we build on a particular intrinsically motivated, goal-oriented technique initiated by Baranes and Oudeyer [7], which defines the interest of an area of the sensorimotor space as the progress of the competence in reaching self-assigned goals in this area. This method has yielded excellent results in experiments with motor spaces of high dimension. Yet sensory spaces have remained limited to 2 or 3 dimensions, and the robot had only one type of action to consider. Moreover, the goal space was predefined by hand. We propose a broad expansion of the previous architecture, in which the sensory space has 10+ dimensions, and relevant goal spaces are created and their interest evaluated by the algorithms through novel techniques. Additionally, we consider robotic agents that have several different actions at their disposal and can combine them temporally. To our knowledge, no existing work addresses both of those challenges.

    Habits That Contradict Rewards

    Decision-making is a critical skill for animals and autonomous robots alike. Whether you are a rabbit or a driverless car, you constantly need to make appropriate decisions. This work stresses the importance of taking habit formation into account in decision-making and goal-directed behaviors such as intrinsic motivation, especially as it pertains to sensorimotor learning.

    Reusing Motor Commands to Learn Object Interaction

    We propose the Reuse algorithm, which exploits data produced during the exploration of a first environment to efficiently bootstrap the exploration of a second, different but related environment. The effect of the Reuse algorithm is to produce a high diversity of effects early during exploration. The algorithm only constrains the environments to share the same motor space, and makes no assumptions about learning algorithms or sensory modalities. We illustrate our algorithm on a 6-joint robotic arm interacting with a virtual object, and show that our algorithm is robust to dissimilar environments and significantly improves the early exploration of similar ones.
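
The general idea of reusing motor commands across environments that share a motor space can be sketched as below. The abstract does not state how commands are chosen, so the selection rule (keep the first command seen for each occupied effect cell) and the names `select_diverse_commands` and `bootstrap_exploration` are hypothetical and should not be read as the Reuse algorithm itself.

```python
import math

def select_diverse_commands(source_history, cell=0.1):
    """Assumed selection rule: keep the first motor command seen in each effect cell."""
    chosen = {}
    for command, effect in source_history:
        key = tuple(math.floor(v / cell) for v in effect)
        chosen.setdefault(key, command)
    return list(chosen.values())

def bootstrap_exploration(source_history, target_env, cell=0.1):
    """Replay the selected source commands in the target environment.

    `target_env` is any callable mapping a motor command to an effect; only the
    motor space is shared, so the effects may be of a different kind entirely.
    """
    return [(m, target_env(m)) for m in select_diverse_commands(source_history, cell)]
```

Only motor commands cross from one environment to the other; the source effects are used solely to decide which commands to keep, which is why the sketch needs no assumption about the target's sensory modalities.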