
    Adaptive and learning-based formation control of swarm robots

    Autonomous aerial and wheeled mobile robots play a major role in tasks such as search and rescue, transportation, monitoring, and inspection. However, these operations face several open challenges, including robust autonomy and adaptive coordination with the environment and operating conditions, particularly for swarm robots with limited communication and perception capabilities. Furthermore, the computational complexity increases exponentially with the number of robots in the swarm. This thesis examines two different aspects of the formation control problem. On the one hand, we investigate how formation control can be performed by swarm robots with limited communication and perception (e.g., the Crazyflie nano quadrotor). On the other hand, we explore human-swarm interaction (HSI) and different shared-control mechanisms between humans and swarm robots (e.g., the BristleBot) for artistic creation. In particular, we combine bio-inspired techniques (i.e., flocking, foraging) with learning-based control strategies (using artificial neural networks) for adaptive control of multi-robot systems. We first review how learning-based control and networked dynamical systems can be used to assign distributed and decentralized policies to individual robots such that the desired formation emerges from their collective behavior. We proceed by presenting a novel flocking control for UAV swarms using deep reinforcement learning. We formulate the flocking formation problem as a partially observable Markov decision process (POMDP) and consider a leader-follower configuration, where consensus among all UAVs is used to train a shared control policy, and each UAV performs actions based on the local information it collects. In addition, to avoid collisions among UAVs and to guarantee flocking and navigation, a reward function is defined that combines a global flocking-maintenance term, a mutual reward, and a collision penalty. We adapt deep deterministic policy gradient (DDPG) with centralized training and decentralized execution to obtain the flocking control policy using actor-critic networks and a global state space matrix. In the context of swarm robotics in the arts, we investigate how the formation paradigm can serve as an interaction modality for artists to aesthetically utilize swarms. In particular, we explore particle swarm optimization (PSO) and random walk to control the communication between a team of robots with swarming behavior for musical creation.
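    The reward shaping described in this abstract can be made concrete with a small sketch. The snippet below is a minimal illustration only; the specific terms, distances, and weights are placeholder assumptions, not the thesis' actual formulation.

```python
import numpy as np

def flocking_reward(pos_i, neighbor_pos, leader_pos,
                    d_ref=1.0, d_col=0.3, w_flock=1.0, w_nav=0.5, w_col=10.0):
    """Per-UAV reward combining flocking maintenance, navigation toward the
    leader, and a collision penalty (all weights and distances illustrative)."""
    dists = np.linalg.norm(neighbor_pos - pos_i, axis=1)
    # Flocking maintenance: keep neighbours near the reference spacing d_ref.
    r_flock = -w_flock * np.mean((dists - d_ref) ** 2)
    # Navigation / mutual reward: progress of the follower toward the leader.
    r_nav = -w_nav * np.linalg.norm(leader_pos - pos_i)
    # Collision penalty: large negative reward if any neighbour is too close.
    r_col = -w_col if np.any(dists < d_col) else 0.0
    return r_flock + r_nav + r_col
```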

    Bio-inspired Dynamic Control Systems with Time Delays

    The world around us exhibits a rich and ever-changing environment of startling, bewildering and fascinating complexity. Almost nothing is as simple as it seems, but through the chaos we may catch fleeting glimpses of the mechanisms within. Throughout the history of human endeavour we have mimicked nature to harness it for our own ends. Our attempts to develop truly autonomous and intelligent machines have however struggled with the limitations of our human ability. This has encouraged some to shirk this responsibility and instead model biological processes and systems to do it for us. This thesis explores the introduction of continuous time delays into biologically inspired dynamic control systems. We seek to exploit the rich temporal dynamics found in physical and biological systems for modelling complex or adaptive behaviour through the artificial evolution of networks to control robots. Throughout, arguments have been presented for the modelling of delays not only to better represent key facets of physical and biological systems, but also to increase the computational potential of such systems for the synthesis of control. A thorough investigation of the dynamics of small delayed networks with a wide range of time delays has been undertaken, and a detailed mathematical description of the fixed points of the system and its possible oscillatory modes has been developed to fully describe the behaviour of a single node. Exploration of the behaviour of even small delayed networks illustrates the range of complex behaviour possible and guides the development of interesting solutions. To further exploit the potential of the rich dynamics in such systems, a novel approach to the 3D simulation of locomotory robots has been developed, focussing on minimising computational cost. To verify this simulation tool, a simple quadruped robot was developed and its motion under a manually designed gait was evaluated. The results displayed a high degree of agreement between the simulation and laser tracker data, verifying the accuracy of the model developed. A new model of a dynamic system which includes continuous time delays has been introduced, and its utility demonstrated in the evolution of networks for the solution of simple learning behaviours. A range of methods has been developed for determining the time delays, including the novel concept of representing the time delays as related to the distance between nodes in a spatial representation of the network. The application of these tools to a range of examples has been explored, from Gene Regulatory Networks (GRNs) to robot control and neural networks. The performance of these systems has been compared and contrasted with the efficacy of evolutionary runs for the same task over the whole range of network and delay types. It has been shown that delayed dynamic neural systems are at least as capable as traditional Continuous Time Recurrent Neural Networks (CTRNNs) and show significant performance improvements in the control of robot gaits. Experiments in adaptive behaviour, where there is not such a direct link between the enhanced system dynamics and performance, showed no such discernible improvement. Whilst we hypothesise that the ability of such delayed networks to generate switched pattern-generating nodes may be useful in Evolutionary Robotics (ER), this was not borne out here. The spatial representation of delays was shown to be more efficient for larger networks; however, these techniques restricted the search to lower-complexity solutions or led to a significant fall-off as the network structure became more complex. This would suggest that for anything other than a simple genotype, the direct method for encoding delays is likely most appropriate. With proven benefits for robot locomotion and open potential for adaptive behaviour, delayed dynamic systems for evolved control remain an interesting and promising field in complex systems research.
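    As a rough illustration of the kind of model explored here, the sketch below (a minimal example of mine, not the thesis code) Euler-integrates a small CTRNN in which each connection carries its own transmission delay, implemented with a history buffer; the weights, delays, biases and time constants are placeholder values.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def simulate_delayed_ctrnn(w, delays, tau, theta, t_end=10.0, dt=0.01):
    """Euler-integrate a CTRNN whose connection (i, j) is delayed by delays[i, j]:
        tau_i * dy_i/dt = -y_i + sum_j w_ij * sigmoid(y_j(t - d_ij) + theta_j)
    """
    n = len(tau)
    steps = int(t_end / dt)
    max_lag = int(np.ceil(np.max(delays) / dt))
    hist = np.zeros((max_lag + steps + 1, n))      # node states, oldest first
    lags = np.round(delays / dt).astype(int)       # per-connection lag in steps

    for k in range(steps):
        idx = max_lag + k                          # index of the current state y(t)
        y = hist[idx]
        drive = np.array([
            sum(w[i, j] * sigmoid(hist[idx - lags[i, j], j] + theta[j]) for j in range(n))
            for i in range(n)
        ])
        hist[idx + 1] = y + dt * (-y + drive) / tau   # Euler step to y(t + dt)
    return hist[max_lag:]

# Example: two mutually coupled nodes with asymmetric delays (values illustrative).
w = np.array([[0.0, 8.0], [-8.0, 0.0]])
delays = np.array([[0.0, 0.3], [0.1, 0.0]])
traj = simulate_delayed_ctrnn(w, delays, tau=np.array([1.0, 1.0]),
                              theta=np.array([-4.0, -4.0]))
```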

    Neural dynamics of social behavior: An evolutionary and mechanistic perspective on communication, cooperation, and competition among situated agents

    Social behavior can be found on almost every level of life, ranging from microorganisms to human societies. However, explaining the evolutionary emergence of cooperation, communication, or competition still challenges modern biology. The most common approaches to this problem are based on game-theoretic models. The problem is that these models often assume fixed and limited rules and actions that individual agents can choose from, which excludes the dynamical nature of the mechanisms that underlie the behavior of living systems. So far, there exists a lack of convincing modeling approaches to investigate the emergence of social behavior from a mechanistic and evolutionary perspective. Instead of studying animals, the methodology employed in this thesis combines several aspects from alternative approaches to study behavior in a rather novel way. Robotic models are considered as individual agents which are controlled by recurrent neural networks representing non-linear dynamical systems. The topology and parameters of these networks are evolved following an open-ended evolution approach; that is, individuals are not evaluated on high-level goals or optimized for specific functions. Instead, agents compete for limited resources to enhance their chance of survival. Further, there is no restriction with respect to how individuals interact with their environment or with each other. As its main objective, this thesis aims at a complementary approach for studying not only the evolution, but also the mechanisms of basic forms of communication. For this purpose, it can be shown that a robot does not necessarily have to be as complex as a human, not even as complex as a bacterium. The strength of this approach is that it deals with rather simple, yet complete and situated systems, facing similar real-world problems as animals do, such as sensory noise or dynamically changing environments. The experimental part of this thesis is substantiated in a five-part examination. First, self-organized aggregation patterns are discussed. Second, the advantages of evolving decentralized control with respect to behavioral robustness and flexibility are demonstrated. Third, it is shown that only minimalistic local acoustic communication is required to coordinate the behavior of large groups. This is followed by investigations of the evolutionary emergence of communication. Finally, it is shown how already evolved communicative behavior changes during further evolution when a population is confronted with competition over limited environmental resources. All presented experiments entail thorough analysis of the dynamical mechanisms that underlie evolved communication systems, which has not been done so far in the context of cooperative behavior. This framework leads to a better understanding of the relation between intrinsic neurodynamics and observable agent-environment interactions. The results discussed here provide a new perspective on the evolution of cooperation because they deal with aspects largely neglected in traditional approaches, such as embodiment, situatedness, and the dynamical nature of the mechanisms that underlie behavior. For the first time, it can be demonstrated how noise influences specific signaling strategies and that the versatile dynamics of very small-scale neural networks embedded in sensory-motor feedback loops give rise to sophisticated forms of communication such as signal coordination, cooperative intraspecific communication, and, most intriguingly, aggressive interspecific signaling. Further, the results demonstrate the development of counteractive niche construction based on a modification of communication strategies, which generates an evolutionary feedback that actively reduces selection pressure; this has not been shown before. Thus, the novel findings presented here strongly support the complementary nature of robotic experiments to study the evolution and mechanisms of communication and cooperation.

    Goal-Based Control and Planning in Biped Locomotion Using Computational Intelligence Methods

    This work explores the application of neural fields to dynamical control tasks in the domain of biped walking. In a first approximation, a controller architecture that uses 1D neural fields is proposed. This controller architecture is evaluated on the stability problem for the cart-and-pole inverted pendulum, used as a simplified biped walking model. The neural field controller, parameterized both manually and using an evolutionary algorithm (EA), is compared to a controller architecture based on a recurrent neural network (RNN), also parameterized by an EA. The non-evolved neural field controller performs better than the RNN controller. Also, the evolved neural field controller performs better than the non-evolved one and is able to recover quickly from worst-case initial conditions. Then, an extended control and planning architecture using 2D neural fields is developed and applied to the simple biped walking (SBW) problem. A set of optimal parameter values, previously found using an EA, is used as parameters for the neural field controller. The optimal neural field controller is compared to the linear controller proposed by Wisse et al. and to a table-lookup controller using the same optimal parameters. While being an active control strategy, the controllers proposed here for the SBW problem approach Passive Dynamic Walking (PDW) more closely than previous works, by diminishing the cumulative control action.
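    For reference, a 1D neural field of the kind used in the first controller architecture can be simulated in a few lines. The sketch below uses an Amari-style field with a difference-of-Gaussians interaction kernel; the kernel shape, resting level and all other parameters are purely illustrative, not the thesis' tuned values.

```python
import numpy as np

def simulate_neural_field(stimulus, n=101, tau=10.0, h=-2.0, dt=1.0, steps=200):
    """Euler-integrate a 1D Amari neural field
        tau * du/dt = -u + h + S(x) + (w * f(u))(x),
    with f a step firing function and w a difference-of-Gaussians kernel."""
    x = np.linspace(-10, 10, n)
    dx = x[1] - x[0]
    dists = x[:, None] - x[None, :]
    # Lateral interaction: local excitation, broader inhibition.
    w = 4.0 * np.exp(-dists**2 / 2.0) - 2.0 * np.exp(-dists**2 / (2 * 4.0**2))
    u = np.full(n, h)
    for _ in range(steps):
        f = (u > 0).astype(float)                  # firing rate (Heaviside threshold)
        u += dt / tau * (-u + h + stimulus + w @ f * dx)
    return x, u

# Example: a localised input around x = 2 induces a self-stabilised activity peak.
x = np.linspace(-10, 10, 101)
stim = 3.0 * np.exp(-(x - 2.0)**2 / 2.0)
_, u = simulate_neural_field(stim)
```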

    Reservoir Computing: computation with dynamical systems

    In the research field of Machine Learning, systems are studied that can learn from examples. Within this field, recurrent neural networks form an important subgroup. These networks are abstract models of the operation of parts of the brain. They are able to solve very complex temporal problems, but are generally very difficult to train. Recently, a number of similar methods have been proposed that eliminate this training problem. These methods are referred to as Reservoir Computing. Reservoir Computing combines the impressive computational power of recurrent neural networks with a simple training method. Moreover, these training methods turn out not to be limited to neural networks, but can be applied to generic dynamical systems. Why these systems work well and which properties determine their performance is, however, not yet clear. For this dissertation, the dynamical properties of generic Reservoir Computing systems were investigated. It was shown experimentally that the idea of Reservoir Computing is also applicable to non-neural networks of dynamical nodes. Furthermore, a measure was proposed that can be used to quantify the dynamical regime of a reservoir. Finally, an adaptation rule was introduced that can tune the dynamics of a broad range of reservoir types toward the desired dynamical regime. The techniques described in this dissertation are demonstrated on several academic and engineering applications.
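    As a minimal illustration of the Reservoir Computing idea (a fixed random recurrent network of which only the readout is trained), the sketch below implements an echo state network with a ridge-regression readout; the reservoir size, spectral radius and regularisation strength are illustrative choices, not values from this dissertation.

```python
import numpy as np

def train_esn(inputs, targets, n_res=200, spectral_radius=0.9, ridge=1e-6, seed=0):
    """Drive a fixed random reservoir with the input sequence and train only the
    linear readout by ridge regression."""
    rng = np.random.default_rng(seed)
    w_in = rng.uniform(-0.5, 0.5, (n_res, inputs.shape[1]))
    w = rng.standard_normal((n_res, n_res))
    w *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w)))  # scale dynamics

    states = np.zeros((len(inputs), n_res))
    x = np.zeros(n_res)
    for t, u in enumerate(inputs):
        x = np.tanh(w_in @ u + w @ x)              # reservoir update (never trained)
        states[t] = x

    # Readout: solve (S^T S + ridge*I) w_out = S^T y
    w_out = np.linalg.solve(states.T @ states + ridge * np.eye(n_res),
                            states.T @ targets)
    return w_in, w, w_out

def run_esn(inputs, w_in, w, w_out):
    x = np.zeros(w.shape[0])
    outputs = []
    for u in inputs:
        x = np.tanh(w_in @ u + w @ x)
        outputs.append(x @ w_out)
    return np.array(outputs)
```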

    Annotated Bibliography: Anticipation


    Advances in Reinforcement Learning

    Reinforcement Learning (RL) is a very dynamic area in terms of theory and application. This book brings together many different aspects of current research on several fields associated with RL, which has been growing rapidly, producing a wide variety of learning algorithms for different applications. Comprising 24 chapters, it covers a very broad variety of topics in RL and their application in autonomous systems. A set of chapters in this book provides a general overview of RL, while other chapters focus mostly on applications of RL paradigms: Game Theory, Multi-Agent Theory, Robotics, Networking Technologies, Vehicular Navigation, Medicine and Industrial Logistics.

    Adaptive control of compliant robots with Reservoir Computing

    In modern society, robots are increasingly used to handle dangerous, repetitive and/or heavy tasks with high precision. Because of the nature of these tasks, either being dangerous, requiring high precision or simply being repetitive, robots are usually constructed with high-torque motors and sturdy materials, which makes them dangerous for humans to handle. In a car-manufacturing company, for example, a large cage is placed around the robot’s workspace to prevent humans from entering its vicinity. In the last few decades, efforts have been made to improve human-robot interaction. The movement of robots is often characterized as not being smooth and clearly dividable into sub-movements, which makes it rather unpredictable for humans. So there exists an opportunity to improve the motion generation of robots to enhance human-robot interaction. One interesting research direction is that of imitation learning, where human motions are recorded and demonstrated to the robot. Although the robot is able to reproduce such movements, they cannot be generalized to other situations. Therefore, a dynamical systems approach is proposed in which the recorded motions are embedded into the dynamics of the system. Shaping these nonlinear dynamics according to the recorded motions allows the dynamical system to generalize beyond the demonstrations. As a result, the robot can generate motions for situations not included in the recorded human demonstrations. In this dissertation, a Reservoir Computing approach is used to create a dynamical system in which such demonstrations are embedded. Reservoir Computing systems are Recurrent Neural Network-based approaches that are trained efficiently by training only the readout connections while keeping all other connections of the network fixed at their initial, randomly chosen values. Although they had been used to embed periodic motions before, here they are extended to embed discrete motions, or both. This work describes how such a motion pattern-generating system is built, investigates the nature of the underlying dynamics and evaluates its robustness in the face of perturbations. Additionally, a dynamical systems approach to obstacle avoidance is proposed that is based on vector fields in the presence of repellers. This technique can be used to extend the motion abilities of the robot without the need to change the trained Motion Pattern Generator (MPG). Therefore, this approach can be applied in real time to any system that generates a movement trajectory. Assume that the MPG system is implemented on an industrial robotic arm, similar to the ones used in a car factory. Even though the obstacle avoidance strategy presented here is able to modify the generated motion of the robot’s gripper in such a way that it avoids obstacles, it does not guarantee that other parts of the robot cannot collide with a human. To prevent this, engineers have started to use advanced control algorithms that measure the amount of torque applied to the robot. This allows the robot to be aware of external perturbations. However, it turns out that, even with fast control loops, adaptation to compensate for a sudden perturbation is too slow to prevent high interaction forces. To reduce such forces, researchers have started to use mechanical elements that are passively compliant (e.g., springs) and lightweight flexible materials to construct robots.
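    The vector-field obstacle avoidance mentioned above can be illustrated with a brief sketch: a goal-directed velocity field modulated by repulsive contributions from nearby obstacles. The attractor/repeller form and all gains are my illustrative assumptions, not the dissertation's exact formulation.

```python
import numpy as np

def avoid_velocity(pos, target, obstacles, k_att=1.0, k_rep=0.5, influence=1.0):
    """Desired end-effector velocity: attraction toward the target plus repulsive
    terms from obstacles within an influence radius (all gains illustrative)."""
    v = k_att * (target - pos)                       # attractor toward the goal
    for obs in obstacles:
        diff = pos - obs
        d = np.linalg.norm(diff)
        if 1e-6 < d < influence:                     # only nearby obstacles repel
            v += k_rep * (1.0 / d - 1.0 / influence) * diff / d**2
    return v

# Example: steer around a single obstacle while approaching the target.
vel = avoid_velocity(np.array([0.0, 0.0]), target=np.array([2.0, 0.0]),
                     obstacles=[np.array([1.0, 0.05])])
```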
    Although such compliant robots are much safer and inherently more energy efficient, their control becomes much harder. Most control approaches use model information about the robot (e.g., weight distribution and shape). However, when constructing a compliant robot it is hard to determine the dynamics of these materials. Therefore, a model-free adaptive control framework is proposed that assumes no prior knowledge about the robot. By interacting with the robot, it learns an inverse robot model that is used as the controller; the more it interacts, the better the control becomes. Appropriately, this framework is called the Inverse Modeling Adaptive (IMA) control framework. I have evaluated the IMA controller’s tracking ability on several tasks, investigating its model independence and stability. Furthermore, I have shown its fast learning ability and performance comparable to task-specific designed controllers. Given both the MPG and IMA controllers, it is possible to improve the interactability of a compliant robot in a human-friendly environment. When the robot is to perform human-like motions for a large set of tasks, we would need to demonstrate motion examples of all these tasks. However, biological research concerning the motion generation of animals and humans has revealed that a limited set of motion patterns, called motion primitives, is modulated and combined to generate the advanced motor/motion skills that humans and animals exhibit. Inspired by these findings, I investigate whether a single motion primitive can indeed be modulated to achieve a desired motion behavior. A proof of concept is presented through elementary experiments in which an MPG is controlled by an IMA controller. Furthermore, a general hierarchy is introduced that describes how a robot can be controlled in a biology-inspired manner. I also investigated how motion primitives can be combined to produce a desired motion; however, I was unable to get more advanced implementations to work, and the results of some simple experiments are presented in the appendix. Another approach I investigated assumes that the primitives themselves are undefined. Instead, only a high-level description is given, which states that every primitive should on average contribute equally, while still allowing a single primitive to specialize in a part of the motion generation. Without defining the behavior of a primitive, only a set of untrained IMA controllers is used, each of which represents a single primitive. As a result of the high-level heuristic description, the task space is tiled into sub-regions in an unsupervised manner, resulting in controllers that indeed each represent a part of the motion generation. I have applied this Modular Architecture with Control Primitives (MACOP) to an inverse kinematics learning task and investigated the emerged primitives. Thanks to the tiling of the task space, it becomes possible to control redundant systems, because redundant solutions can be spread over several control primitives. Within each sub-region of the task space, a specific control primitive is more accurate than in other regions, allowing the task complexity to be distributed over several less complex tasks. Finally, I extend the use of an IMA controller, which is a tracking controller, to the control of under-actuated systems. By using a sample-based planning algorithm, it becomes possible to explore the system dynamics and plan a path to a desired state.
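    A very reduced sketch of the inverse-modeling idea behind the IMA controller is given below: an inverse model mapping (current state, desired next state) to a command, learned online from observed transitions and used directly as the controller. The linear model and normalized gradient update are simplifying assumptions of mine; the actual framework is reservoir-based.

```python
import numpy as np

class OnlineInverseModelController:
    """Learn command = W @ [state, desired_next_state, 1] from observed transitions
    and use the learned inverse model as the controller (minimal linear sketch)."""

    def __init__(self, state_dim, command_dim, lr=0.05):
        self.w = np.zeros((command_dim, 2 * state_dim + 1))
        self.lr = lr

    def _features(self, state, desired_next):
        return np.concatenate([state, desired_next, [1.0]])

    def command(self, state, desired_next):
        """Command predicted by the current inverse model."""
        return self.w @ self._features(state, desired_next)

    def update(self, state, next_state, applied_command):
        """Having observed that `applied_command` moved `state` to `next_state`,
        nudge the inverse model toward predicting that command (normalized LMS)."""
        phi = self._features(state, next_state)
        err = applied_command - self.w @ phi
        self.w += self.lr * np.outer(err, phi) / (phi @ phi)
```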
    Afterwards, MACOP is used to incorporate feedback and to learn the necessary control commands corresponding to the planned state-space trajectory, even if it contains errors. As a result, under-actuated control of a cart-pole system was achieved. Furthermore, I present the concept of a simulation-based control framework that allows the system dynamics, planning and feedback control to be learned iteratively and simultaneously.