71 research outputs found

    Adaptive and learning-based formation control of swarm robots

    Get PDF
    Autonomous aerial and wheeled mobile robots play a major role in tasks such as search and rescue, transportation, monitoring, and inspection. However, these operations are faced with a few open challenges including robust autonomy, and adaptive coordination based on the environment and operating conditions, particularly in swarm robots with limited communication and perception capabilities. Furthermore, the computational complexity increases exponentially with the number of robots in the swarm. This thesis examines two different aspects of the formation control problem. On the one hand, we investigate how formation could be performed by swarm robots with limited communication and perception (e.g., Crazyflie nano quadrotor). On the other hand, we explore human-swarm interaction (HSI) and different shared-control mechanisms between human and swarm robots (e.g., BristleBot) for artistic creation. In particular, we combine bio-inspired (i.e., flocking, foraging) techniques with learning-based control strategies (using artificial neural networks) for adaptive control of multi- robots. We first review how learning-based control and networked dynamical systems can be used to assign distributed and decentralized policies to individual robots such that the desired formation emerges from their collective behavior. We proceed by presenting a novel flocking control for UAV swarm using deep reinforcement learning. We formulate the flocking formation problem as a partially observable Markov decision process (POMDP), and consider a leader-follower configuration, where consensus among all UAVs is used to train a shared control policy, and each UAV performs actions based on the local information it collects. In addition, to avoid collision among UAVs and guarantee flocking and navigation, a reward function is added with the global flocking maintenance, mutual reward, and a collision penalty. We adapt deep deterministic policy gradient (DDPG) with centralized training and decentralized execution to obtain the flocking control policy using actor-critic networks and a global state space matrix. In the context of swarm robotics in arts, we investigate how the formation paradigm can serve as an interaction modality for artists to aesthetically utilize swarms. In particular, we explore particle swarm optimization (PSO) and random walk to control the communication between a team of robots with swarming behavior for musical creation

    Towards Agility: Definition, Benchmark and Design Considerations for Small, Quadrupedal Robots

    Get PDF
    Agile quadrupedal locomotion in animals and robots is yet to be fully understood, quantified or achieved. An intuitive notion of agility exists, but neither a concise definition nor a common benchmark can be found. Further, it is unclear, what minimal level of mechatronic complexity is needed for this particular aspect of locomotion. In this thesis we address and partially answer two primary questions: (Q1) What is agile legged locomotion (agility) and how can wemeasure it? (Q2) How can wemake agile legged locomotion with a robot a reality? To answer our first question, we define agility for robot and animal alike, building a common ground for this particular component of locomotion and introduce quantitative measures to enhance robot evaluation and comparison. The definition is based on and inspired by features of agility observed in nature, sports, and suggested in robotics related publications. Using the results of this observational and literature review, we build a novel and extendable benchmark of thirteen different tasks that implement our vision of quantitatively classifying agility. All scores are calculated from simple measures, such as time, distance, angles and characteristic geometric values for robot scaling. We normalize all unit-less scores to reach comparability between different systems. An initial implementation with available robots and real agility-dogs as baseline finalize our effort of answering the first question. Bio-inspired designs introducing and benefiting from morphological aspects present in nature allowed the generation of fast, robust and energy efficient locomotion. We use engineering tools and interdisciplinary knowledge transferred from biology to build low-cost robots able to achieve a certain level of agility and as a result of this addressing our second question. This iterative process led to a series of robots from Lynx over Cheetah-Cub-S, Cheetah-Cub-AL, and Oncilla to Serval, a compliant robot with actuated spine, high range of motion in all joints. Serval presents a high level of mobility at medium speeds. With many successfully implemented skills, using a basic kinematics-duplication from dogs (copying the foot-trajectories of real animals and replaying themotion on the robot using a mathematical interpretation), we found strengths to emphasize, weaknesses to correct and made Serval ready for future attempts to achieve even more agile locomotion. We calculated Servalâs agility scores with the result of it performing better than any of its predecessors. Our small, safe and low-cost robot is able to execute up to 6 agility tasks out of 13 with the potential to reachmore after extended development. Concluding, we like to mention that Serval is able to cope with step-downs, smooth, bumpy terrain and falling orthogonally to the ground

    Learning dynamic motor skills for terrestrial locomotion

    Get PDF
    The use of Deep Reinforcement Learning (DRL) has received significantly increased attention from researchers within the robotics field following the success of AlphaGo, which demonstrated the superhuman capabilities of deep reinforcement algorithms in terms of solving complex tasks by beating professional GO players. Since then, an increasing number of researchers have investigated the potential of using DRL to solve complex high-dimensional robotic tasks, such as legged locomotion, arm manipulation, and grasping, which are difficult tasks to solve using conventional optimization approaches. Understanding and recreating various modes of terrestrial locomotion has been of long-standing interest to roboticists. A large variety of applications, such as rescue missions, disaster responses and science expeditions, strongly demand mobility and versatility in legged locomotion to enable task completion. In order to create useful physical robots, it is necessary to design controllers to synthesize the complex locomotion behaviours observed in humans and other animals. In the past, legged locomotion was mainly achieved via analytical engineering approaches. However, conventional analytical approaches have their limitations, as they require relatively large amounts of human effort and knowledge. Machine learning approaches, such as DRL, require less human effort compared to analytical approaches. The project conducted for this thesis explores the feasibility of using DRL to acquire control policies comparable to, or better than, those acquired through analytical approaches while requiring less human effort. In this doctoral thesis, we developed a Multi-Expert Learning Architecture (MELA) that uses DRL to learn multi-skill control policies capable of synthesizing a diverse set of dynamic locomotion behaviours for legged robots. We first proposed a novel DRL framework for the locomotion of humanoid robots. The proposed learning framework is capable of acquiring robust and dynamic motor skills for humanoids, including balancing, walking, standing-up fall recovery. We subsequently improved upon the learning framework and design a novel multi-expert learning architecture that is capable of fusing multiple motor skills together in a seamless fashion and ultimately deploy this framework on a real quadrupedal robot. The successful deployment of learned control policies on a real quadrupedal robot demonstrates the feasibility of using an Artificial Intelligence (AI) based approach for real robot motion control

    Opinions and Outlooks on Morphological Computation

    Get PDF
    Morphological Computation is based on the observation that biological systems seem to carry out relevant computations with their morphology (physical body) in order to successfully interact with their environments. This can be observed in a whole range of systems and at many different scales. It has been studied in animals – e.g., while running, the functionality of coping with impact and slight unevenness in the ground is "delivered" by the shape of the legs and the damped elasticity of the muscle-tendon system – and plants, but it has also been observed at the cellular and even at the molecular level – as seen, for example, in spontaneous self-assembly. The concept of morphological computation has served as an inspirational resource to build bio-inspired robots, design novel approaches for support systems in health care, implement computation with natural systems, but also in art and architecture. As a consequence, the field is highly interdisciplinary, which is also nicely reflected in the wide range of authors that are featured in this e-book. We have contributions from robotics, mechanical engineering, health, architecture, biology, philosophy, and others

    From visuomotor control to latent space planning for robot manipulation

    Get PDF
    Deep visuomotor control is emerging as an active research area for robot manipulation. Recent advances in learning sensory and motor systems in an end-to-end manner have achieved remarkable performance across a range of complex tasks. Nevertheless, a few limitations restrict visuomotor control from being more widely adopted as the de facto choice when facing a manipulation task on a real robotic platform. First, imitation learning-based visuomotor control approaches tend to suffer from the inability to recover from an out-of-distribution state caused by compounding errors. Second, the lack of versatility in task definition limits skill generalisability. Finally, the training data acquisition process and domain transfer are often impractical. In this thesis, individual solutions are proposed to address each of these issues. In the first part, we find policy uncertainty to be an effective indicator of potential failure cases, in which the robot is stuck in out-of-distribution states. On this basis, we introduce a novel uncertainty-based approach to detect potential failure cases and a recovery strategy based on action-conditioned uncertainty predictions. Then, we propose to employ visual dynamics approximation to our model architecture to capture the motion of the robot arm instead of the static scene background, making it possible to learn versatile skill primitives. In the second part, taking inspiration from the recent progress in latent space planning, we propose a gradient-based optimisation method operating within the latent space of a deep generative model for motion planning. Our approach bypasses the traditional computational challenges encountered by established planning algorithms, and has the capability to specify novel constraints easily and handle multiple constraints simultaneously. Moreover, the training data comes from simple random motor-babbling of kinematically feasible robot states. Our real-world experiments further illustrate that our latent space planning approach can handle both open and closed-loop planning in challenging environments such as heavily cluttered or dynamic scenes. This leads to the first, to our knowledge, closed-loop motion planning algorithm that can incorporate novel custom constraints, and lays the foundation for more complex manipulation tasks

    In-silico behavior discovery system

    Get PDF

    On discovering and learning structure under limited supervision

    Full text link
    Les formes, les surfaces, les événements et les objets (vivants et non vivants) constituent le monde. L'intelligence des agents naturels, tels que les humains, va au-delà de la simple reconnaissance de formes. Nous excellons à construire des représentations et à distiller des connaissances pour comprendre et déduire la structure du monde. Spécifiquement, le développement de telles capacités de raisonnement peut se produire même avec une supervision limitée. D'autre part, malgré son développement phénoménal, les succès majeurs de l'apprentissage automatique, en particulier des modèles d'apprentissage profond, se situent principalement dans les tâches qui ont accès à de grands ensembles de données annotées. Dans cette thèse, nous proposons de nouvelles solutions pour aider à combler cette lacune en permettant aux modèles d'apprentissage automatique d'apprendre la structure et de permettre un raisonnement efficace en présence de tâches faiblement supervisés. Le thème récurrent de la thèse tente de s'articuler autour de la question « Comment un système perceptif peut-il apprendre à organiser des informations sensorielles en connaissances utiles sous une supervision limitée ? » Et il aborde les thèmes de la géométrie, de la composition et des associations dans quatre articles distincts avec des applications à la vision par ordinateur (CV) et à l'apprentissage par renforcement (RL). Notre première contribution ---Pix2Shape---présente une approche basée sur l'analyse par synthèse pour la perception. Pix2Shape exploite des modèles génératifs probabilistes pour apprendre des représentations 3D à partir d'images 2D uniques. Le formalisme qui en résulte nous offre une nouvelle façon de distiller l'information d'une scène ainsi qu'une représentation puissantes des images. Nous y parvenons en augmentant l'apprentissage profond non supervisé avec des biais inductifs basés sur la physique pour décomposer la structure causale des images en géométrie, orientation, pose, réflectance et éclairage. Notre deuxième contribution ---MILe--- aborde les problèmes d'ambiguïté dans les ensembles de données à label unique tels que ImageNet. Il est souvent inapproprié de décrire une image avec un seul label lorsqu'il est composé de plus d'un objet proéminent. Nous montrons que l'intégration d'idées issues de la littérature linguistique cognitive et l'imposition de biais inductifs appropriés aident à distiller de multiples descriptions possibles à l'aide d'ensembles de données aussi faiblement étiquetés. Ensuite, nous passons au paradigme d'apprentissage par renforcement, et considérons un agent interagissant avec son environnement sans signal de récompense. Notre troisième contribution ---HaC--- est une approche non supervisée basée sur la curiosité pour apprendre les associations entre les modalités visuelles et tactiles. Cela aide l'agent à explorer l'environnement de manière autonome et à utiliser davantage ses connaissances pour s'adapter aux tâches en aval. La supervision dense des récompenses n'est pas toujours disponible (ou n'est pas facile à concevoir), dans de tels cas, une exploration efficace est utile pour générer un comportement significatif de manière auto-supervisée. Pour notre contribution finale, nous abordons l'information limitée contenue dans les représentations obtenues par des agents RL non supervisés. Ceci peut avoir un effet néfaste sur la performance des agents lorsque leur perception est basée sur des images de haute dimension. Notre approche a base de modèles combine l'exploration et la planification sans récompense pour affiner efficacement les modèles pré-formés non supervisés, obtenant des résultats comparables à un agent entraîné spécifiquement sur ces tâches. Il s'agit d'une étape vers la création d'agents capables de généraliser rapidement à plusieurs tâches en utilisant uniquement des images comme perception.Shapes, surfaces, events, and objects (living and non-living) constitute the world. The intelligence of natural agents, such as humans is beyond pattern recognition. We excel at building representations and distilling knowledge to understand and infer the structure of the world. Critically, the development of such reasoning capabilities can occur even with limited supervision. On the other hand, despite its phenomenal development, the major successes of machine learning, in particular, deep learning models are primarily in tasks that have access to large annotated datasets. In this dissertation, we propose novel solutions to help address this gap by enabling machine learning models to learn the structure and enable effective reasoning in the presence of weakly supervised settings. The recurring theme of the thesis tries to revolve around the question of "How can a perceptual system learn to organize sensory information into useful knowledge under limited supervision?" And it discusses the themes of geometry, compositions, and associations in four separate articles with applications to computer vision (CV) and reinforcement learning (RL). Our first contribution ---Pix2Shape---presents an analysis-by-synthesis based approach(also referred to as inverse graphics) for perception. Pix2Shape leverages probabilistic generative models to learn 3D-aware representations from single 2D images. The resulting formalism allows us to perform a novel view synthesis of a scene and produce powerful representations of images. We achieve this by augmenting unsupervised learning with physically based inductive biases to decompose a scene structure into geometry, pose, reflectance and lighting. Our Second contribution ---MILe--- addresses the ambiguity issues in single-labeled datasets such as ImageNet. It is often inappropriate to describe an image with a single label when it is composed of more than one prominent object. We show that integrating ideas from Cognitive linguistic literature and imposing appropriate inductive biases helps in distilling multiple possible descriptions using such weakly labeled datasets. Next, moving into the RL setting, we consider an agent interacting with its environment without a reward signal. Our third Contribution ---HaC--- is a curiosity based unsupervised approach to learning associations between visual and tactile modalities. This aids the agent to explore the environment in an analogous self-guided fashion and further use this knowledge to adapt to downstream tasks. In the absence of reward supervision, intrinsic movitivation is useful to generate meaningful behavior in a self-supervised manner. In our final contribution, we address the representation learning bottleneck in unsupervised RL agents that has detrimental effect on the performance on high-dimensional pixel based inputs. Our model-based approach combines reward-free exploration and planning to efficiently fine-tune unsupervised pre-trained models, achieving comparable results to task-specific baselines. This is a step towards building agents that can generalize quickly on more than a single task using image inputs alone

    Wildland fire management. Volume 2: Wildland fire control 1985-1995

    Get PDF
    The preliminary design of a satellite plus computer earth resources information system is proposed for potential uses in fire prevention and control in the wildland fire community. Suggested are satellite characteristics, sensor characteristics, discrimination algorithms, data communication techniques, data processing requirements, display characteristics, and costs in achieving the integrated wildland fire information system

    Effective offline training and efficient online adaptation

    Get PDF
    Developing agents that behave intelligently in the world is an open challenge in machine learning. Desiderata for such agents are efficient exploration, maximizing long term utility, and the ability to effectively leverage prior data to solve new tasks. Reinforcement learning (RL) is an approach that is predicated on learning by directly interacting with an environment through trial-and-error, and presents a way for us to train and deploy such agents. Moreover, combining RL with powerful neural network function approximators – a sub-field known as “deep RL” – has shown evidence towards achieving this goal. For instance, deep RL has yielded agents that can play Go at superhuman levels, improve the efficiency of microchip designs, and learn complex novel strategies for controlling nuclear fusion reactions. A key issue that stands in the way of deploying deep RL is poor sample efficiency. Concretely, while it is possible to train effective agents using deep RL, the key successes have largely been in environments where we have access to large amounts of online interaction, often through the use of simulators. However, in many real-world problems, we are confronted with scenarios where samples are expensive to obtain. As has been alluded to, one way to alleviate this issue is through accessing some prior data, often termed “offline data”, which can accelerate how quickly we learn such agents, such as leveraging exploratory data to prevent redundant deployments, or using human-expert data to quickly guide agents towards promising behaviors and beyond. However, the best way to incorporate this data into existing deep RL algorithms is not straightforward; naïvely pre-training using RL algorithms on this offline data, a paradigm called “offline RL” as a starting point for subsequent learning is often detrimental. Moreover, it is unclear how to explicitly derive useful behaviors online that are positively influenced by this offline pre-training. With these factors in mind, this thesis follows a 3-pronged strategy towards improving sample-efficiency in deep RL. First, we investigate effective pre-training on offline data. Then, we tackle the online problem, looking at efficient adaptation to environments when operating purely online. Finally, we conclude with hybrid strategies that use offline data to explicitly augment policies when acting online
    corecore