37 research outputs found

    Aktionenlernen mit Selbstorganisierenden Karten und Reinforcement Learning

    No full text
    This doctoral thesis deals with the development of a function approximator and its application to methods for learning discrete and continuous actions: 1. A general function approximator – Locally Weighted Interpolating Growing Neural Gas (LWIGNG) – is developed from Growing Neural Gas (GNG). The topological neighbourhood structure is used for calculating interpolations between neighbouring neurons and for applying a local weighting scheme. The capabilities of this method are shown in several experiments, with special consideration given to changing target functions and changing input distributions. 2. To learn discrete actions, LWIGNG is combined with Q-Learning, forming the Q-LWIGNG method. The underlying GNG algorithm has to be modified to account for the particular ordering of the input data in action learning. Q-LWIGNG achieves very good results in experiments with the pole-balancing and mountain-car problems, and good results with the acrobot problem. 3. To learn continuous actions, a REINFORCE algorithm is combined with LWIGNG, forming the ReinforceGNG method. An actor-critic architecture is used for learning from delayed rewards. LWIGNG approximates both the state-value function and the policy. The policy is given by the situation-dependent parameters of a normal distribution. ReinforceGNG is applied successfully to learn continuous actions of a simulated two-wheeled robot which has to intercept a rolling ball under certain conditions.
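
    A minimal sketch of the locally weighted interpolation idea described in the abstract, assuming a fixed set of GNG units with stored output values and an inverse-distance weighting over the best-matching unit and its topological neighbours; the function name, data layout and weighting scheme are illustrative assumptions, not the exact LWIGNG method from the thesis.

```python
import numpy as np

def lwigng_predict(x, units, values, edges, eps=1e-9):
    """Approximate f(x) by weighting the best-matching unit and its
    topological neighbours by their inverse distance to x."""
    x = np.asarray(x, dtype=float)
    dists = np.linalg.norm(units - x, axis=1)
    bmu = int(np.argmin(dists))                        # best-matching unit
    neigh = {bmu}
    for i, j in edges:                                 # units connected to the BMU
        if i == bmu:
            neigh.add(j)
        elif j == bmu:
            neigh.add(i)
    idx = np.array(sorted(neigh))
    w = 1.0 / (dists[idx] + eps)                       # local inverse-distance weights
    return float(np.dot(w, values[idx]) / w.sum())

# Toy usage: four units on a line approximating f(x) = x^2.
units  = np.array([[0.0], [1.0], [2.0], [3.0]])
values = np.array([0.0, 1.0, 4.0, 9.0])
edges  = [(0, 1), (1, 2), (2, 3)]                      # GNG-style topology
print(lwigng_predict([1.4], units, values, edges))     # roughly 1.9
```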

    Agentenbasierte Simulation der Entstehung von Eigentumsnormen

    No full text

    Online Coach Description Mainz Rolling Brains 2002

    No full text

    Learning to Approach a Moving Ball with a Simulated Two-Wheeled Robot

    No full text
    We show how a two-wheeled robot can learn to approach a moving ball using Reinforcement Learning. The robot is controlled by setting the velocities of its two wheels. It has to reach the ball under certain conditions to be able to kick it towards a given target. In order to kick, the ball has to be in front of the robot. The robot also has to reach the ball at a certain angle relative to the target, because the ball is always kicked in the direction from the center of the robot to the ball. The robot learns which velocity differences should be applied to the wheels: one of the wheels is set to the maximum velocity, the other one according to this difference. We apply a REINFORCE algorithm [1] in combination with an extended Growing Neural Gas (GNG) [2] to learn these continuous actions. The resulting algorithm, called ReinforceGNG, is tested in a simulated environment with and without noise.
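
    A small illustrative sketch of the action encoding described in the abstract: one wheel is driven at the maximum velocity and the other is set according to the learned velocity difference. The value of v_max, the clipping and the sign convention are assumptions for illustration, not the exact controller from the paper.

```python
def wheel_velocities(diff, v_max=1.0):
    """Map a (possibly negative) velocity difference to (left, right) wheel speeds."""
    diff = max(-2.0 * v_max, min(2.0 * v_max, diff))   # keep the difference feasible
    if diff >= 0.0:        # turn towards the right: the right wheel is slowed down
        return v_max, v_max - diff
    else:                  # turn towards the left: the left wheel is slowed down
        return v_max + diff, v_max

print(wheel_velocities(0.3))    # (1.0, 0.7)
print(wheel_velocities(-0.5))   # (0.5, 1.0)
```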

    Implementation -- Service -- Effect: The ISE Metamodel of Critical Infrastructures

    No full text
    The ISE (Implementation - Service - Effect) metamodel is a general modelling framework for systems of critical infrastructures that takes the various viewpoints of different sectors and professions into account. While not neglecting the technical basis, it provides the abstractions needed for risk or emergency management of critical infrastructures in a complex environment. ISE supports an iterative modelling approach that allows ongoing refinement steps based on the analysis of the current model. This iterative approach helps to minimise some of the problems commonly found in critical infrastructure modelling and simulation. By focusing on the services provided by critical infrastructures, it is possible to bridge the gap between the business view and the engineering view of critical infrastructures. The technical realisation of services is described in the implementation layer; the effects of the successful or unsuccessful delivery of services are described in the effect layer. A sound mathematical foundation provides the basis for a range of analyses, from topological analysis of the dependency structures to statistical analysis of results obtained by simulating complex agent-based models.
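
    A hypothetical, minimal sketch of the service-oriented view described above: services depend on other services, and a reachability analysis over the dependency structure shows which services are affected when one fails. The class and method names are illustrative and not part of the formal ISE metamodel.

```python
from collections import defaultdict, deque

class ServiceModel:
    def __init__(self):
        self.depends_on = defaultdict(set)   # service -> services it needs

    def add_dependency(self, service, needed):
        self.depends_on[service].add(needed)

    def affected_by(self, failed):
        """Reachability analysis over the dependency structure: services that
        (transitively) depend on the failed service."""
        reverse = defaultdict(set)
        for s, needs in self.depends_on.items():
            for n in needs:
                reverse[n].add(s)
        seen, queue = set(), deque([failed])
        while queue:
            cur = queue.popleft()
            for dep in reverse[cur]:
                if dep not in seen:
                    seen.add(dep)
                    queue.append(dep)
        return seen

m = ServiceModel()
m.add_dependency("water supply", "power grid")
m.add_dependency("hospital care", "water supply")
m.add_dependency("hospital care", "power grid")
print(m.affected_by("power grid"))   # {'water supply', 'hospital care'}
```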

    Aktionenlernen mit Selbstorganisierenden Karten und Reinforcement Learning

    No full text
    This doctoral thesis deals with the development of a function approximator and its application to methods for learning discrete and continuous actions: 1. A general function approximator – Locally Weighted Interpolating Growing Neural Gas (LWIGNG) – is developed from Growing Neural Gas (GNG). The topological neighbourhood structure is used for calculating interpolations between neighbouring neurons and for applying a local weighting scheme. The capabilities of this method are shown in several experiments, with special consideration given to changing target functions and changing input distributions. 2. To learn discrete actions, LWIGNG is combined with Q-Learning, forming the Q-LWIGNG method. The underlying GNG algorithm has to be modified to account for the particular ordering of the input data in action learning. Q-LWIGNG achieves very good results in experiments with the pole-balancing and mountain-car problems, and good results with the acrobot problem. 3. To learn continuous actions, a REINFORCE algorithm is combined with LWIGNG, forming the ReinforceGNG method. An actor-critic architecture is used for learning from delayed rewards. LWIGNG approximates both the state-value function and the policy. The policy is given by the situation-dependent parameters of a normal distribution. ReinforceGNG is applied successfully to learn continuous actions of a simulated two-wheeled robot which has to intercept a rolling ball under certain conditions.
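
    A hedged sketch of the continuous-action part described above: the policy is a normal distribution whose parameters depend on the state, an action is sampled from it, and a REINFORCE-style update weights the log-likelihood gradient by the critic's error signal. The linear parameterisation, the scalar log-sigma and the learning rate are assumptions for illustration, not the exact ReinforceGNG update from the thesis.

```python
import numpy as np

rng = np.random.default_rng(0)

def gaussian_policy_step(state, theta_mu, log_sigma, td_error, lr=0.01):
    """One REINFORCE-style update for a linear-Gaussian policy."""
    mu = float(np.dot(theta_mu, state))
    sigma = float(np.exp(log_sigma))
    action = rng.normal(mu, sigma)                      # sample a continuous action
    # score function: gradient of log N(action | mu, sigma) w.r.t. the parameters
    grad_mu = (action - mu) / sigma**2 * np.asarray(state, dtype=float)
    grad_log_sigma = ((action - mu) ** 2 / sigma**2) - 1.0
    theta_mu = theta_mu + lr * td_error * grad_mu       # move parameters along the
    log_sigma = log_sigma + lr * td_error * grad_log_sigma  # weighted score direction
    return action, theta_mu, log_sigma

state = np.array([0.5, -0.2])
theta_mu, log_sigma = np.zeros(2), 0.0
action, theta_mu, log_sigma = gaussian_policy_step(state, theta_mu, log_sigma, td_error=1.0)
print(action, theta_mu, log_sigma)
```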

    Towards a Holistic Metamodel for Systems of Critical Infrastructures

    No full text
    The Implementation-Service-Effect (ISE) metamodel describes critical infrastructures from different perspectives in a well-defined way to provide a sound basis for the analysis of their dependencies and interdependencies.

    Implementation - Service - Effect: The ISE Metamodel of Critical Infrastructures

    No full text
    The ISE (Implementation - Service - Effect) metamodel is a general modelling framework for systems of critical infrastructures that takes the various viewpoints of different sectors and professions into account. While not neglecting the technical basis, it provides the abstractions needed for risk or emergency management of critical infrastructures in a complex environment. ISE supports an iterative modelling approach that allows ongoing refinement steps based on the analysis of the current model. This iterative approach helps to minimise some of the problems commonly found in critical infrastructure modelling and simulation. By focusing on the services provided by critical infrastructures, it is possible to bridge the gap between the business view and the engineering view of critical infrastructures. The technical realisation of services is described in the implementation layer; the effects of the successful or unsuccessful delivery of services are described in the effect layer. A sound mathematical foundation provides the basis for a range of analyses, from topological analysis of the dependency structures to statistical analysis of results obtained by simulating complex agent-based models.