3,797 research outputs found
Chaotic exploration and learning of locomotion behaviours
We present a general and fully dynamic neural system, which exploits intrinsic chaotic dynamics, for the real-time goal-directed exploration and learning of the possible locomotion patterns of an articulated robot of an arbitrary morphology in an unknown environment. The controller is modeled as a network of neural oscillators that are initially coupled only through physical embodiment, and goal-directed exploration of coordinated motor patterns is achieved by chaotic search using adaptive bifurcation. The phase space of the indirectly coupled neural-body-environment system contains multiple transient or permanent self-organized dynamics, each of which is a candidate for a locomotion behavior. The adaptive bifurcation enables the system orbit to wander through various phase-coordinated states, using its intrinsic chaotic dynamics as a driving force, and stabilizes onto one of the states matching the given goal criteria. In order to improve the sustainability of useful transient patterns, sensory homeostasis has been introduced, which results in an increased diversity of motor outputs, thus achieving multiscale exploration. A rhythmic pattern discovered by this process is memorized and sustained by changing the wiring between initially disconnected oscillators using an adaptive synchronization method. Our results show that the novel neurorobotic system is able to create and learn multiple locomotion behaviors for a wide range of body configurations and physical environments and can readapt in real time after sustaining damage.
Proximodistal Exploration in Motor Learning as an Emergent Property of Optimization
To harness the complexity of their high-dimensional bodies during sensorimotor development, infants are guided by patterns of freezing and freeing of degrees of freedom. For instance, when learning to reach, infants free the degrees of freedom in their arm proximodistally, i.e. from joints that are closer to the body to those that are more distant. Here, we formulate and study computationally the hypothesis that such patterns can emerge spontaneously as the result of a family of stochastic optimization processes (evolution strategies with covariance-matrix adaptation), without an innate encoding of a maturational schedule. In particular, we present simulated experiments with an arm where a computational learner progressively acquires reaching skills through adaptive exploration, and we show that a proximodistal organization appears spontaneously, which we denote PDFF (ProximoDistal Freezing and Freeing of degrees of freedom). We also compare this emergent organization between different arm morphologies – from human-like to quite unnatural ones – to study the effect of different kinematic structures on the emergence of PDFF. Research highlights. • We propose a general, domain-independent hypothesis for the developmental organization of freezing and freeing of degrees of freedom observed both in infant development and adult skill acquisition, such as proximo-distal exploration in learning to reach.
Robot Fast Adaptation to Changes in Human Engagement During Simulated Dynamic Social Interaction With Active Exploration in Parameterized Reinforcement Learning
Dynamic uncontrolled human-robot interactions (HRIs) require robots to be able to adapt to changes in the human's behavior and intentions. Among relevant signals, non-verbal cues such as the human's gaze can provide the robot with important information about the human's current engagement in the task, and whether the robot should continue its current behavior or not. However, robot reinforcement learning (RL) abilities to adapt to these nonverbal cues are still underdeveloped. Here, we propose an active exploration algorithm for RL during HRI where the reward function is the weighted sum of the human's current engagement and variations of this engagement. We use a parameterized action space where a meta-learning algorithm is applied to simultaneously tune the exploration in discrete action space (e.g., moving an object) and in the space of continuous characteristics of movement (e.g., velocity, direction, strength, and expressivity). We first show that this algorithm reaches state-of-the-art performance in the nonstationary multiarmed bandit paradigm. We then apply it to a simulated HRI task, and show that it outperforms continuous parameterized RL with either passive or active exploration based on different existing methods. We finally test the performance in a more realistic test of the same HRI task, where a practical approach is followed to estimate human engagement through visual cues of the head pose. The algorithm can detect and adapt to perturbations in human engagement with different durations. Altogether, these results suggest a novel efficient and robust framework for robot learning during dynamic HRI scenarios.
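The reward described in the abstract above, a weighted sum of current engagement and its variation, could be sketched as follows. This is only an illustration: the function name, the single mixing weight, and the use of a one-step difference as the "variation" are assumptions, not details given in the abstract.

```python
def engagement_reward(engagement, previous_engagement, weight=0.5):
    """Weighted sum of current engagement and its change since the last step.

    `weight` (hypothetical) balances the engagement level against its variation;
    the abstract does not state how the two terms are weighted.
    """
    variation = engagement - previous_engagement  # assumed: one-step difference
    return weight * engagement + (1.0 - weight) * variation


# Rising engagement yields a higher reward than falling engagement at the same level.
rising = engagement_reward(0.8, 0.6)
falling = engagement_reward(0.8, 1.0)
```

With this form, a robot action that raises engagement is rewarded more than one that merely holds it at the same level, which is one plausible reading of why the variation term is included.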
Adaptive robot body learning and estimation through predictive coding
The predictive functions that permit humans to infer their body state by
sensorimotor integration are critical to perform safe interaction in complex
environments. These functions are adaptive and robust to non-linear actuators
and noisy sensory information. This paper introduces a computational perceptual
model based on predictive processing that enables any multisensory robot to
learn, infer and update its body configuration when using arbitrary sensors
with Gaussian additive noise. The proposed method integrates different sources
of information (tactile, visual and proprioceptive) to drive the robot belief
to its current body configuration. The motivation is to enable robots with the
embodied perception needed for self-calibration and safe physical human-robot
interaction.
We formulate body learning as obtaining the forward model that encodes the
sensor values depending on the body variables, and we solve it by Gaussian
process regression. We model body estimation as minimizing the discrepancy
between the robot body configuration belief and the observed posterior. We
minimize the variational free energy using the sensory prediction errors
(sensed vs expected).
In order to evaluate the model we test it on a real multisensory robotic arm.
We show how the contributions of different sensor modalities, included as additive
errors, improve the refinement of the body estimation and how the system adapts
itself to provide the most plausible solution even when injecting strong
sensory visuo-tactile perturbations. We further analyse the reliability of the
model when different sensor modalities are disabled. This provides grounded
evidence about the correctness of the perceptual model and shows how the robot
estimates and adjusts its body configuration just by means of sensory
information.
Comment: Accepted for IEEE International Conference on Intelligent Robots and
Systems (IROS 2018).
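The estimation step in the abstract above can be illustrated with a minimal sketch. Under Gaussian additive noise and a point estimate of the body state, minimizing free energy reduces to gradient descent on precision-weighted squared prediction errors (sensed vs expected). Everything concrete here is an assumption for illustration: the one-dimensional body variable, the hand-written linear forward models (the paper learns them by Gaussian process regression), the function name, and the step size.

```python
def estimate_body_state(observations, forward_models, variances,
                        mu=0.0, lr=0.05, steps=500):
    """Gradient descent on F(mu) = sum_i (s_i - g_i(mu))^2 / (2 * var_i).

    observations   -- sensed values s_i, one per modality
    forward_models -- pairs (g_i, dg_i): predicted sensor value and its derivative
    variances      -- assumed Gaussian noise variance of each modality
    """
    for _ in range(steps):
        grad = 0.0
        for s, (g, dg), var in zip(observations, forward_models, variances):
            error = s - g(mu)               # sensory prediction error
            grad += -error * dg(mu) / var   # dF/dmu contribution of this modality
        mu -= lr * grad                     # move the belief to reduce the error
    return mu


# Hypothetical setup: a joint angle sensed directly and through a "visual" gain of 2.
models = [(lambda m: m, lambda m: 1.0), (lambda m: 2.0 * m, lambda m: 2.0)]
belief = estimate_body_state([0.7, 1.4], models, [1.0, 1.0])
```

Each modality enters as an additive error term, so disabling a sensor simply drops its term from the sum, and a noisier sensor (larger variance) pulls the belief less.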
In silico case studies of compliant robots: AMARSI deliverable 3.3
In deliverable 3.2 we presented how the morphological computing approach
can significantly facilitate the control strategy in several scenarios,
e.g. quadruped locomotion, bipedal locomotion and reaching. In particular,
the Kitty experimental platform is an example of the use of morphological
computation to allow quadruped locomotion. In this deliverable we continue
with the simulation studies on the application of the different morphological
computation strategies to control a robotic system.
Born to learn: The inspiration, progress, and future of evolved plastic artificial neural networks
Biological plastic neural networks are systems of extraordinary computational
capabilities shaped by evolution, development, and lifetime learning. The
interplay of these elements leads to the emergence of adaptive behavior and
intelligence. Inspired by such intricate natural phenomena, Evolved Plastic
Artificial Neural Networks (EPANNs) use simulated evolution in-silico to breed
plastic neural networks with a large variety of dynamics, architectures, and
plasticity rules: these artificial systems are composed of inputs, outputs, and
plastic components that change in response to experiences in an environment.
These systems may autonomously discover novel adaptive algorithms, and lead to
hypotheses on the emergence of biological adaptation. EPANNs have seen
considerable progress over the last two decades. Current scientific and
technological advances in artificial neural networks are now setting the
conditions for radically new approaches and results. In particular, the
limitations of hand-designed networks could be overcome by more flexible and
innovative solutions. This paper brings together a variety of inspiring ideas
that define the field of EPANNs. The main methods and results are reviewed.
Finally, new opportunities and developments are presented.
Slowness learning for curiosity-driven agents
In the absence of external guidance, how can a robot learn to map the many raw pixels of high-dimensional visual inputs to useful action sequences? I study methods that achieve this by making robots self-motivated (curious) to continually build compact representations of sensory inputs that encode different aspects of the changing environment. Previous curiosity-based agents acquired skills by associating intrinsic rewards with world model improvements, and used reinforcement learning (RL) to learn how to get these intrinsic rewards. But unlike in previous implementations, I consider streams of high-dimensional visual inputs, where the world model is a set of compact low-dimensional representations of the high-dimensional inputs. To learn these representations, I use the slowness learning principle, which states that the underlying causes of the changing sensory inputs vary on a much slower time scale than the observed sensory inputs. The representations learned through the slowness learning principle are called slow features (SFs). Slow features have been shown to be useful for RL, since they capture the underlying transition process by extracting spatio-temporal regularities in the raw sensory inputs. However, existing techniques that learn slow features are not readily applicable to curiosity-driven online learning agents, as they estimate computationally expensive covariance matrices from the data via batch processing. The first contribution, called incremental SFA (IncSFA), is a low-complexity, online algorithm that extracts slow features without storing any input data or estimating costly covariance matrices, thereby making it suitable for several online learning applications. However, IncSFA gradually forgets previously learned representations whenever the statistics of the input change. In open-ended online learning, it becomes essential to store learned representations to avoid re-learning previously learned inputs.
The second contribution is an online active modular IncSFA algorithm called the curiosity-driven modular incremental slow feature analysis (Curious Dr. MISFA). Curious Dr. MISFA addresses the forgetting problem faced by IncSFA and learns expert slow feature abstractions in order from least to most costly, with theoretical guarantees. The third contribution uses the Curious Dr. MISFA algorithm in a continual curiosity-driven skill acquisition framework that enables robots to acquire, store, and re-use both abstractions and skills in an online and continual manner. I (a) provide a formal analysis of how the proposed algorithms work, (b) compare them to existing methods, and (c) use the iCub humanoid robot to demonstrate their application in real-world environments. These contributions together demonstrate that the online implementations of slowness learning make it suitable for an open-ended curiosity-driven RL agent to acquire a repertoire of skills that map the many raw pixels of high-dimensional images to multiple sets of action sequences.
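The slowness principle mentioned in the abstract above can be made concrete with the quantity that slow feature analysis minimizes: the mean squared temporal difference of a signal after it has been normalized to zero mean and unit variance. The sketch below only evaluates that objective on two hand-made signals; it is not IncSFA or Curious Dr. MISFA, and the function name and test signals are assumptions chosen for illustration.

```python
import math


def slowness(signal):
    """Mean squared one-step difference of the zero-mean, unit-variance signal.

    Lower values mean the signal varies more slowly over time.
    """
    n = len(signal)
    mean = sum(signal) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in signal) / n)
    z = [(x - mean) / std for x in signal]  # normalize so amplitude cannot cheat
    diffs = [(b - a) ** 2 for a, b in zip(z, z[1:])]
    return sum(diffs) / len(diffs)


# Two hypothetical sensory channels sampled every 0.01 s: 0.5 Hz and 5 Hz sinusoids.
t = [i * 0.01 for i in range(1000)]
slow_signal = [math.sin(2 * math.pi * 0.5 * x) for x in t]
fast_signal = [math.sin(2 * math.pi * 5.0 * x) for x in t]
slow_is_slower = slowness(slow_signal) < slowness(fast_signal)
```

The normalization matters: without the unit-variance constraint, a constant or heavily scaled-down signal would trivially minimize the objective while carrying no information.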
Ecological psychology, radical enactivism and behavior: an evolutionary perspective
Ecological psychology and enactivism are close relatives in that they share an interest in positioning the behaving organism as an active agent and in interpreting this with reference to ecological and evolutionary ideas. But they also differ in their uses of biology and the concept of information. I review these uses, relate them to ideas in behaviorism, and conclude that a version of enactivism, championed by Daniel Hutto and colleagues, is the more viable hypothesis. I extend this radical enactivist effort into evolutionary enactivism as an exercise in parsimonious theory building that aims to avoid essentialism.