124 research outputs found

A novel plasticity rule can explain the development of sensorimotor intelligence

Author: Der Ralf
Martius Georg
Publication venue: 'Proceedings of the National Academy of Sciences'
Publication date: 01/01/2015

Grounding autonomous behavior in the nervous system is a fundamental challenge for neuroscience. In particular, self-organized behavioral development raises more questions than answers. Are there special functional units for curiosity, motivation, and creativity? This paper argues that these features can be grounded in synaptic plasticity itself, without requiring any higher-level constructs. We propose differential extrinsic plasticity (DEP) as a new synaptic rule for self-learning systems and apply it to a number of complex robotic systems as a test case. Without any purpose or goal being specified, seemingly purposeful and adaptive behavior develops, displaying a certain level of sensorimotor intelligence. These surprising results require no system-specific modifications of the DEP rule; rather, they arise from the underlying mechanism of spontaneous symmetry breaking due to the tight brain-body-environment coupling. The new synaptic rule is biologically plausible and would be an interesting target for neurobiological investigation. We also argue that this neuronal mechanism may have been a catalyst in natural evolution.
Comment: 18 pages, 5 figures, 7 video

arXiv.org e-Print Archive

PubMed Central

IST Austria: PubRep (Institute of Science and Technology)

Fast Non-Parametric Learning to Accelerate Mixed-Integer Programming for Online Hybrid Model Predictive Control

Author: Martius Georg
Zhu Jia-Jie
Publication date: 07/05/2020

Today's fast linear algebra and numerical optimization tools have pushed the frontier of model predictive control (MPC) forward, to the efficient control of highly nonlinear and hybrid systems. The field of hybrid MPC has demonstrated that exact optimal control laws can be computed, e.g., by mixed-integer programming (MIP) under piecewise-affine (PWA) system models. Despite the elegant theory, solving hybrid MPC online is still out of reach for many applications. We aim to speed up MIP by combining geometric insights from hybrid MPC, a simple-yet-effective learning algorithm, and MIP warm-start techniques. Following a line of work in approximate explicit MPC, the proposed learning-control algorithm, LNMS, gains a computational advantage over MIP at little cost and is straightforward for practitioners to implement.

arXiv.org e-Print Archive

MPG.PuRe

Linear combination of one-step predictive information with an external reward in an episodic policy gradient setting: a critical analysis

Author: Ay Nihat
Martius Georg
Zahedi Keyan
Publication date: 01/01/2013

One of the main challenges in the field of embodied artificial intelligence is the open-ended autonomous learning of complex behaviours. Our approach is to use task-independent, information-driven intrinsic motivation(s) to support task-dependent learning. The work presented here is a preliminary step in which we investigate the predictive information (the mutual information of the past and future of the sensor stream) as an intrinsic drive, ideally supporting any kind of task acquisition. Previous experiments have shown that the predictive information (PI) is a good candidate to support autonomous, open-ended learning of complex behaviours, because a maximisation of the PI corresponds to an exploration of morphology- and environment-dependent behavioural regularities. The idea is that these regularities can then be exploited in order to solve any given task. Three different experiments are presented, and their results lead to the conclusion that the linear combination of the one-step PI with an external reward function is not generally recommended in an episodic policy gradient setting. Only for hard tasks can a great speed-up be achieved, at the cost of a loss in asymptotic performance.
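As an illustration of the quantity named in this abstract, the one-step predictive information can be read as the mutual information between consecutive sensor values. The sketch below is not the authors' estimator: it assumes a discrete (already binned) sensor stream and uses a plain empirical-frequency estimate, and the function name `one_step_predictive_information` is hypothetical.

```python
from collections import Counter
from math import log2


def one_step_predictive_information(symbols):
    """Empirical mutual information I(S_t; S_{t+1}) in bits.

    Assumption: `symbols` is a list of discrete, hashable sensor values.
    """
    pairs = list(zip(symbols[:-1], symbols[1:]))
    n = len(pairs)
    if n == 0:
        return 0.0
    joint = Counter(pairs)                    # counts of (past, future) pairs
    past = Counter(p for p, _ in pairs)       # marginal counts of S_t
    future = Counter(f for _, f in pairs)     # marginal counts of S_{t+1}
    # sum over observed pairs of p(x, y) * log2( p(x, y) / (p(x) p(y)) )
    return sum(
        (c / n) * log2(c * n / (past[p] * future[f]))
        for (p, f), c in joint.items()
    )


# A perfectly predictable alternating stream versus a constant one.
alternating = [0, 1] * 50 + [0]
constant = [0] * 101
pi_alternating = one_step_predictive_information(alternating)
pi_constant = one_step_predictive_information(constant)
```

A constant stream carries no predictive information because its future has no entropy to explain, whereas the alternating stream is both varied and fully predictable, which is the kind of behavioural regularity the abstract associates with high PI.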

arXiv.org e-Print Archive

Frontiers - Publisher Connector

Information driven self-organization of complex robotic behaviors

Author: Ay Nihat
Der Ralf
Martius Georg
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 27/03/2013

Information theory is a powerful tool to express principles to drive autonomous systems because it is domain invariant and allows for an intuitive interpretation. This paper studies the use of the predictive information (PI), also called excess entropy or effective measure complexity, of the sensorimotor process as a driving force to generate behavior. We study nonlinear and nonstationary systems and introduce the time-local predicting information (TiPI), which allows us to derive exact results together with explicit update rules for the parameters of the controller in the dynamical systems framework. In this way the information principle, formulated at the level of behavior, is translated to the dynamics of the synapses. We underpin our results with a number of case studies with high-dimensional robotic systems. We show the spontaneous cooperativity in a complex physical system with decentralized control. Moreover, a jointly controlled humanoid robot develops a high behavioral variety depending on its physics and the environment it is dynamically embedded into. The behavior can be decomposed into a succession of low-dimensional modes that increasingly explore the behavior space. This is a promising way to avoid the curse of dimensionality, which hinders learning systems from scaling well.
Comment: 29 pages, 12 figure

arXiv.org e-Print Archive

Directory of Open Access Journals

PubMed Central

FigShare

Active Learning in the Sensorimotor Loop

Author: Martius Georg
Publication date: 20/10/2017

In this thesis we study a novel approach to on-line learning of artificial neural networks, called backward modelling, and apply it to active learning in the sensorimotor loop. First, the mathematical foundations of this approach are elaborated. We observe effects like spontaneous symmetry breaking, response increasing, and generalisation improvement at a theoretical level. We then justify the theory with experimental results on some synthetic problems, in order to understand the phenomena clearly. Finally we consider a simple robot with an adaptive world model. In the case where the controller of the robot covers only a subspace of the actuator space, we observe degenerate world representations in the world model under passive learning with standard learning algorithms. We show that backward modelling and active learning point out degeneracies in the world model and correct them with direct exploration. A special kind of active learning evolves from the use of backward modelling, which directly queries patterns on the fly. Additionally, different strategies are investigated in order to control the interplay of controller-based and active-learning-based behaviour.

Qucosa - Publikationsserver der Universität Leipzig

Learning Equations for Extrapolation and Control

Author: Lampert Christoph H.
Martius Georg
Sahoo Subham S.
Publication date: 01/01/2018

We present an approach to identify concise equations from data using a shallow neural network approach. In contrast to ordinary black-box regression, this approach allows understanding functional relations and generalizing them from observed data to unseen parts of the parameter space. We show how to extend the class of learnable equations for a recently proposed equation learning network to include divisions, and we improve the learning and model selection strategy to be useful for challenging real-world data. For systems governed by analytical expressions, our method can in many cases identify the true underlying equation and extrapolate to unseen domains. We demonstrate its effectiveness by experiments on a cart-pendulum system, where only 2 random rollouts are required to learn the forward dynamics and successfully achieve the swing-up task.
Comment: 9 pages, 9 figures, ICML 201

arXiv.org e-Print Archive

IST Austria: PubRep (Institute of Science and Technology)

MPG.PuRe

Deep Reinforcement Learning for Event-Triggered Control

Author: Baumann Dominik
Martius Georg
Trimpe Sebastian
Zhu Jia-Jie
Publication date: 01/01/2018

Event-triggered control (ETC) methods can achieve high-performance control with a significantly lower number of samples compared to usual, time-triggered methods. These frameworks are often based on a mathematical model of the system and specific designs of controller and event trigger. In this paper, we show how deep reinforcement learning (DRL) algorithms can be leveraged to simultaneously learn control and communication behavior from scratch, and present a DRL approach that is particularly suitable for ETC. To our knowledge, this is the first work to apply DRL to ETC. We validate the approach on multiple control tasks and compare it to model-based event-triggering frameworks. In particular, we demonstrate that, unlike many model-based ETC designs, it can be straightforwardly applied to nonlinear systems.
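To make the notion of event-triggered control concrete, the sketch below simulates a hand-designed threshold trigger on a scalar linear system. It is not the DRL approach of the paper but an example of the model-based style of design the abstract compares against. The system `x[t+1] = a*x[t] + b*u[t]`, the gain `k`, the threshold, and the function name are all assumptions chosen for illustration.

```python
def simulate_event_triggered(a=1.1, b=1.0, k=0.6, threshold=0.1,
                             x0=1.0, steps=200):
    """Simulate a scalar system whose controller only sees transmitted states.

    The sensor transmits the state only when it has drifted more than
    `threshold` away from the last transmitted value.
    Returns the final state and the number of transmissions.
    """
    x = x0
    x_sent = x0          # last state value known to the controller
    transmissions = 1    # the initial state is transmitted once
    for _ in range(steps):
        if abs(x - x_sent) > threshold:   # event trigger fires
            x_sent = x
            transmissions += 1
        u = -k * x_sent                   # control uses the (possibly stale) state
        x = a * x + b * u
    return x, transmissions


final_state, transmissions = simulate_event_triggered()
print(f"final state {final_state:.4f} after {transmissions} transmissions in 200 steps")
```

With these illustrative values the open-loop system is unstable (`a > 1`), yet the state stays in a small neighbourhood of the origin while only some of the time steps require communication, which is the trade-off that ETC targets.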

arXiv.org e-Print Archive

Crossref

MPG.PuRe

L4: Practical loss-based stepsize adaptation for deep learning

Author: Martius Georg
Rolinek Michal
Publication date: 30/11/2018

We propose a stepsize adaptation scheme for stochastic gradient descent. It operates directly with the loss function and rescales the gradient in order to make fixed predicted progress on the loss. We demonstrate its capabilities by conclusively improving the performance of Adam and Momentum optimizers. The enhanced optimizers with default hyperparameters consistently outperform their constant stepsize counterparts, even the best ones, without a measurable increase in computational cost. The performance is validated on multiple architectures including dense nets, CNNs, ResNets, and the recurrent Differential Neural Computer on classical datasets MNIST, fashion MNIST, CIFAR10 and others.
Comment: NeurIPS, 201
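One way to read "fixed predicted progress on the loss" is to pick the step size so that the first-order (linearised) prediction of the loss decreases by a fixed fraction of the gap to a target loss. The sketch below follows that reading on a toy quadratic. The parameter names `alpha` and `loss_min`, their default values, and the function names are assumptions for illustration and are not taken from the abstract.

```python
def loss_based_stepsize(loss, grad, direction, loss_min=0.0, alpha=0.15,
                        eps=1e-12):
    """Step size for which the linearised loss drops by alpha * (loss - loss_min).

    Linearisation: loss(theta - eta * v) ~ loss(theta) - eta * <grad, v>.
    """
    slope = sum(g * v for g, v in zip(grad, direction))
    return alpha * (loss - loss_min) / (slope + eps)


def quadratic(theta):
    """Toy objective 0.5 * ||theta||^2 and its gradient."""
    return 0.5 * sum(t * t for t in theta), list(theta)


def run(theta, steps=50):
    """Plain gradient descent using the adaptive step size above."""
    losses = []
    for _ in range(steps):
        loss, grad = quadratic(theta)
        losses.append(loss)
        eta = loss_based_stepsize(loss, grad, grad)  # direction = raw gradient
        theta = [t - eta * g for t, g in zip(theta, grad)]
    return theta, losses


theta_final, losses = run([3.0, 4.0])
```

In this sketch the update direction is the raw gradient; the abstract indicates that the scheme is used on top of Adam and Momentum, in which case `direction` would be the update those optimizers propose.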

arXiv.org e-Print Archive

MPG.PuRe

Goal-conditioned Offline Planning from Curious Exploration

Author: Bagatella Marco
Martius Georg
Publication date: 28/11/2023

Curiosity has established itself as a powerful exploration strategy in deep reinforcement learning. Notably, leveraging expected future novelty as intrinsic motivation has been shown to efficiently generate exploratory trajectories, as well as a robust dynamics model. We consider the challenge of extracting goal-conditioned behavior from the products of such unsupervised exploration techniques, without any additional environment interaction. We find that conventional goal-conditioned reinforcement learning approaches for extracting a value function and policy fall short in this difficult offline setting. By analyzing the geometry of optimal goal-conditioned value functions, we relate this issue to a specific class of estimation artifacts in learned values. In order to mitigate their occurrence, we propose to combine model-based planning over learned value landscapes with a graph-based value aggregation scheme. We show how this combination can correct both local and global artifacts, obtaining significant improvements in zero-shot goal-reaching performance across diverse simulated environments.

arXiv.org e-Print Archive