292 research outputs found

    Viability in State-Action Space: Connecting Morphology, Control, and Learning

    How can we enable robots to learn control model-free and directly on hardware? Machine learning is taking its place as a standard tool in the roboticist’s arsenal. However, there are several open questions on how to learn control for physical systems. This thesis provides two answers to this motivating question. The first is a formal means to quantify the inherent robustness of a given system design, prior to designing the controller or learning agent. This emphasizes the need to consider both the hardware and software design of a robot, which are inseparably intertwined in the system dynamics. The second is the formalization of a safety measure, which can be learned model-free. Intuitively, this measure indicates how easily a robot can avoid failure, and enables robots to explore unknown environments while avoiding failures. The main contributions of this dissertation are based on viability theory. Viability theory provides a slightly unconventional view of dynamical systems: instead of focusing on a system’s convergence properties towards equilibria, the focus is shifted towards sets of failure states and the system’s ability to avoid these sets. This view is particularly well suited to studying learning control in robots, since stability in the sense of convergence can rarely be guaranteed during the learning process. The notion of viability is formally extended to state-action space, with viable sets of state-action pairs. A measure defined over these sets allows a quantified evaluation of robustness valid for the family of all failure-avoiding control policies, and also paves the way for enabling safe model-free learning. The thesis also includes two minor contributions. The first minor contribution is an empirical demonstration of shaping by exclusively modifying the system dynamics. This demonstration highlights the importance of robustness to failures for learning control: not only can failures cause damage, but they typically do not provide useful gradient information for the learning process. The second minor contribution is a study on the choice of state initializations. Counter to intuition and common practice, this study shows it can be more reliable to occasionally initialize the system from a state that is known to be uncontrollable.

    Using MapReduce Streaming for Distributed Life Simulation on the Cloud

    Distributed software simulations are indispensable in the study of large-scale life models but often require the use of technically complex lower-level distributed computing frameworks, such as MPI. We propose to overcome the complexity challenge by applying the emerging MapReduce (MR) model to distributed life simulations and by running such simulations on the cloud. Technically, we design optimized MR streaming algorithms for discrete and continuous versions of Conway’s life according to a general MR streaming pattern. We chose life because it is simple enough as a testbed for MR’s applicability to a-life simulations and general enough to make our results applicable to various lattice-based a-life models. We implement and empirically evaluate our algorithms’ performance on Amazon’s Elastic MR cloud. Our experiments demonstrate that a single MR optimization technique called strip partitioning can reduce the execution time of continuous life simulations by 64%. To the best of our knowledge, we are the first to propose and evaluate MR streaming algorithms for lattice-based simulations. Our algorithms can serve as prototypes in the development of novel MR simulation algorithms for large-scale lattice-based a-life models.

    Sample-Efficient Reinforcement Learning of Robot Control Policies in the Real World

    The goal of reinforcement learning is to enable systems to autonomously solve tasks in the real world, even in the absence of prior data. To succeed in such situations, reinforcement learning algorithms collect new experience through interactions with the environment to further the learning process. The behaviour is optimized by maximizing a reward function, which assigns high numerical values to desired behaviours. Especially in robotics, such interactions with the environment are expensive in terms of the required execution time, human involvement, and mechanical degradation of the system itself. Therefore, this thesis aims to introduce sample-efficient reinforcement learning methods which are applicable to real-world settings and control tasks such as bimanual manipulation and locomotion. Sample efficiency is achieved through directed exploration, either by using dimensionality reduction or trajectory optimization methods. Finally, it is demonstrated how data-efficient reinforcement learning methods can be used to optimize the behaviour and morphology of robots at the same time. Dissertation/Thesis. Doctoral Dissertation, Computer Science 201

    Bio-inspired Dynamic Control Systems with Time Delays

    The world around us exhibits a rich and ever-changing environment of startling, bewildering and fascinating complexity. Almost nothing is as simple as it seems, but through the chaos we may catch fleeting glimpses of the mechanisms within. Throughout the history of human endeavour we have mimicked nature to harness it for our own ends. Our attempts to develop truly autonomous and intelligent machines have, however, struggled with the limitations of our human ability. This has encouraged some to shirk this responsibility and instead model biological processes and systems to do it for us. This thesis explores the introduction of continuous time delays into biologically inspired dynamic control systems. We seek to exploit rich temporal dynamics found in physical and biological systems for modelling complex or adaptive behaviour through the artificial evolution of networks to control robots. Throughout, arguments are presented for the modelling of delays not only to better represent key facets of physical and biological systems, but to increase the computational potential of such systems for the synthesis of control. A thorough investigation of the dynamics of small delayed networks with a wide range of time delays has been undertaken, with a detailed mathematical description of the fixed points of the system and possible oscillatory modes developed to fully describe the behaviour of a single node. Exploration of the behaviour of even small delayed networks illustrates the range of complex behaviour possible and guides the development of interesting solutions. To further exploit the potential of the rich dynamics in such systems, a novel approach to the 3D simulation of locomotory robots has been developed, focussing on minimising the computational cost. To verify this simulation tool, a simple quadruped robot was developed and the motion of the robot when undergoing a manually designed gait was evaluated. The results displayed a high degree of agreement between the simulation and laser tracker data, verifying the accuracy of the model developed. A new model of a dynamic system which includes continuous time delays has been introduced, and its utility demonstrated in the evolution of networks for the solution of simple learning behaviours. A range of methods has been developed for determining the time delays, including the novel concept of representing the time delays as related to the distance between nodes in a spatial representation of the network. The application of these tools to a range of examples has been explored, from Gene Regulatory Networks (GRNs) to robot control and neural networks. The performance of these systems has been compared and contrasted with the efficacy of evolutionary runs for the same task over the whole range of network and delay types. It has been shown that delayed dynamic neural systems are at least as capable as traditional Continuous Time Recurrent Neural Networks (CTRNNs) and show significant performance improvements in the control of robot gaits. Experiments in adaptive behaviour, where there is not such a direct link between the enhanced system dynamics and performance, showed no such discernible improvement. Whilst we hypothesise that the ability of such delayed networks to generate switched pattern generating nodes may be useful in Evolutionary Robotics (ER), this was not borne out here. The spatial representation of delays was shown to be more efficient for larger networks; however, these techniques restricted the search to lower complexity solutions or led to a significant falloff as the network structure becomes more complex. This would suggest that for anything other than a simple genotype, the direct method for encoding delays is likely most appropriate. With proven benefits for robot locomotion and open potential for adaptive behaviour, delayed dynamic systems for evolved control remain an interesting and promising field in complex systems research.

    Natural Selection, Adaptive Evolution and Diversity in Computational Ecosystems

    The central goal of this thesis is to provide additional criteria towards implementing open-ended evolution in an artificial system. Methods inspired by biological evolution are frequently applied to generate autonomous agents too complex to design by hand. Despite substantial progress in the area of evolutionary computation, additional efforts are needed to identify a coherent set of requirements for a system capable of exhibiting open-ended evolutionary dynamics. The thesis provides an extensive discussion of existing models and of the major considerations for designing a computational model of evolution by natural selection. Thus, the work in this thesis constitutes a further step towards determining the requirements for such a system and introduces a concrete implementation of an artificial evolution system to evaluate the developed suggestions. The proposed system improves upon existing models with respect to easy interpretability of agent behaviour, high structural freedom, and a low-level sensor and effector model to allow numerous long-term evolutionary gradients. In a series of experiments, the evolutionary dynamics of the system are examined against the set objectives and, where appropriate, compared with existing systems. Typical agent behaviours are introduced to convey a general overview of the system dynamics. These behaviours are related to properties of the respective agent populations and their evolved morphologies. It is shown that an intuitive classification of observed behaviours coincides with a more formal classification based on morphology. The evolutionary dynamics of the system are evaluated and shown to be unbounded according to the classification provided by Bedau and Packard’s measures of evolutionary activity. Further, it is analysed how observed behavioural complexity relates to the complexity of the agent-side mechanisms subserving these behaviours. 
It is shown that for the concrete definition of complexity applied, the average complexity continually increases for extended periods of evolutionary time. In combination, these two findings show how the observed behaviours are the result of an ongoing and lasting adaptive evolutionary process as opposed to being artifacts of the seeding process. Finally, the effect of variation in the system on the diversity of evolved behaviour is investigated. It is shown that coupling individual survival and reproductive success can restrict the available evolutionary trajectories in more than the trivial sense of removing another dimension, and conversely, decoupling individual survival from reproductive success can increase the number of evolutionary trajectories. The effect of different reproductive mechanisms is contrasted with that of variation in environmental conditions. The diversity of evolved strategies turns out to be sensitive to the reproductive mechanism while being remarkably robust to the variation of environmental conditions. These findings emphasize the importance of being explicit about the abstractions and assumptions underlying an artificial evolution system, particularly if the system is intended to model aspects of biological evolution.

    Co-evolution of morphology and control in developing structures

    The continuous need to increase the efficiency of technical systems requires the use of complex adaptive systems that operate in environments which are not completely predictable. The reasons are often the random nature of the environment and the fact that not all phenomena which influence the performance of the system can be explained in full detail. As a consequence, the developer is often confronted with the task of designing an adaptive system without prior knowledge about the problem at hand. The design of adaptive systems, which react autonomously to changes in their environment, requires the coordinated generation of sensors, which provide information about the environment, actuators, which change the current state of the system, and signal processing structures, which generate suitable reactions to changed conditions. Within the scope of the thesis, a new system growth method is introduced. It is based on evolutionary design optimization and can automatically produce efficient systems with an optimal configuration that is not defined in advance. The final solutions produced by the novel growth method have low-dimensional perception, actuation and signal processing structures that are optimally adjusted to each other during a combined evolutionary optimization process. The co-evolutionary system design approach is realized through the concurrent development and gradual complexification of the sensory, actuation and corresponding signal processing systems throughout the optimization. The evolution of a flexible system configuration is performed with standard evolutionary strategies by means of an adaptable, variable-length representation, so that the complexity of the system it can represent also varies as the optimization progresses. The co-evolution of morphology and control of complex adaptive systems has been successfully performed for two examples: a complex aerodynamic problem involving a morphing wing, and a virtual, intelligent, autonomously driving vehicle. The thesis demonstrates the applicability of the concurrent evolutionary design of the optimal morphological configuration, represented by the sensory and actuation systems, and the corresponding optimal system controller. It also underlines the potential of direct genotype-phenotype encodings for the design of complex real-world engineering applications. The thesis argues that better, cheaper, more robust and more adaptive systems can often be developed if the entire system is the design target rather than its separate functional parts, such as sensors, actuators or controller structure. The simulation results demonstrate that co-evolutionary methods are able to generate systems which can optimally adapt to unpredicted environmental conditions, while at the same time shedding light on the precise synchronization of all functional system parts during the co-developmental process.

    Genetic terrain programming

    Dissertation presented to the Universidad de Extremadura to obtain the Diploma de Estudios Avanzados, supervised by Francisco Fernandéz de Vega and Carlos Cotta. Nowadays there is a wide range of techniques for terrain generation, but all of them are focused on providing realistic terrains, often neglecting other aspects (e.g., aesthetic appeal or presence of desired features). This thesis presents a new technique, GTP (Genetic Terrain Programming), based on evolutionary design with Genetic Programming. The GTP technique consists of a guided evolution, by means of Interactive Evolution, according to a specific desired terrain feature or aesthetic appeal. This technique can yield both aesthetic and real TPs (Terrain Programmes) which are capable of generating different terrains, but consistently with the same features. TPs are also scale invariant, meaning that terrain features will be preserved across different LODs (Levels Of Detail), which allows the use of low LODs during the evolutionary phase without compromising results. Additionally, the resulting TPs can be incorporated in video games, like any other procedural technique, to generate terrains. Furthermore, by resorting to several TPs to compose the full landscape, it is possible to control some localised terrain features, thus eliminating the main drawback of traditional procedural techniques. The combination of GP with evolutionary art systems also diminishes the effort and time required to create complex terrains when compared to modeling techniques. Moreover, the results are not dependent on the designer's skills.

    Evolutionary Legged Robotics

    Due to technological advances, robotic systems are becoming more and more interesting for industrial and home applications. Popular examples are robotic lawn mowers, robot vacuum cleaners, and package drones. Outside the toy industry, legged robots are not as popular, although they have some clear advantages compared to wheeled systems. With their flexible locomotion, they are able to adapt their walking pattern to different environments. For instance, they can walk over obstacles and gaps or climb over rubble and stairs. Another possible advantage could be redundancy in locomotion: a faulty motor in one limb could be compensated by other motors in the kinematic chain, and multiple failing legs can be compensated by an adapted walking pattern. On the other hand, the more complex mechatronic systems represent a major challenge for construction and control. This thesis is dedicated to the control of complex walking robots. Genetic algorithms are applied to generate walking patterns for different robots. The evolutionary development of walking patterns is done in simulation software. Results of various approaches are transferred to and tested on existing systems which have been developed at RIC/DFKI. Different robotic systems are used to evaluate the generality of the applied methods. Eventually, a method is developed that can be utilized, with a few system-specific modifications, for a variety of legged robots. As a basis for the development and investigation of several methods, software tools are designed to generalize the application of genetic algorithms to legged locomotion. These tools include a simulation environment, a behavior representation, a genetic algorithm and a learning and benchmark framework. The simulation environment is adapted to the behavior of real robotic systems via reference experiments. In addition, the simulation is extended by a foot contact model for loose surfaces. The genetic algorithm is evaluated on several benchmark problems and compared to three existing algorithms. This thesis contributes to the state of the art in many areas. The developed methodology can easily be applied to several complex robotic systems due to its transferability. The genetic algorithm and the hierarchical behavior representation provide a new opportunity to control the generation of the offspring in an evolutionary process. In addition, the developed software tools are an important contribution to their respective research fields.

    The Evolution of Complexity in Autonomous Robots

    Evolutionary robotics–the use of evolutionary algorithms to automate the production of autonomous robots–has been an active area of research for two decades. However, previous work in this domain has been limited by the simplicity of the evolved robots and the task environments within which they are able to succeed. This dissertation aims to address these challenges by developing techniques for evolving more complex robots. Particular focus is given to methods which evolve not just the control policies of manually-designed robots, but both the control policy and the physical form of the robot. These techniques are presented along with their application to investigating previously unexplored relationships between the complexity of evolving robots and the task environments within which they evolve.

    Evolutionary robotics in high altitude wind energy applications

    Recent years have seen the development of wind energy conversion systems that can exploit the superior wind resource that exists at altitudes above the reach of current wind turbine technology. One class of these systems incorporates a flying wing tethered to the ground which drives a winch at ground level. The wings often resemble sports kites, being composed of a combination of fabric and stiffening elements. Such wings are subject to load-dependent deformation, which makes them particularly difficult to model and control. Here we apply the techniques of evolutionary robotics, i.e. the evolution of neural network controllers using genetic algorithms, to the task of controlling a steerable kite. We introduce a multibody kite simulation that is used in an evolutionary process in which the kite is subject to deformation. We demonstrate how discrete time recurrent neural networks that are evolved to maximise line tension fly the kite in repeated looping trajectories similar to those seen using other methods. We show that these controllers are robust to limited environmental variation but show poor generalisation and occasional failure even after extended evolution. We show that continuous time recurrent neural networks (CTRNNs) can be evolved that are capable of flying appropriate repeated trajectories even when the length of the flying lines is changing. We also show that CTRNNs can be evolved that stabilise kites with a wide range of physical attributes at a given position in the sky, and systematically add noise to the simulated task in order to maximise the transferability of the behaviour to a real world system. We demonstrate how the difficulty of the task must be increased in small increments during the evolutionary process to deal with this extreme variability. We describe the development of a real world testing platform on which the evolved neurocontrollers can be tested.