22,353 research outputs found

    Learning for Multi-robot Cooperation in Partially Observable Stochastic Environments with Macro-actions

    This paper presents a data-driven approach for multi-robot coordination in partially observable domains based on Decentralized Partially Observable Markov Decision Processes (Dec-POMDPs) and macro-actions (MAs). Dec-POMDPs provide a general framework for cooperative sequential decision making under uncertainty, and MAs allow temporally extended and asynchronous action execution. To date, most methods assume the underlying Dec-POMDP model is known a priori or that a full simulator is available at planning time. Previous methods that aim to address these issues suffer from local optimality and sensitivity to initial conditions. Additionally, few hardware demonstrations involving a large team of heterogeneous robots with long planning horizons exist. This work addresses these gaps by proposing an iterative sampling-based Expectation-Maximization algorithm (iSEM) to learn policies using only trajectory data containing observations, MAs, and rewards. Our experiments show the algorithm is able to achieve better solution quality than state-of-the-art learning-based methods. We implement two variants of multi-robot Search and Rescue (SAR) domains (with and without obstacles) on hardware to demonstrate that the learned policies can effectively control a team of distributed robots to cooperate in a partially observable stochastic environment. Comment: Accepted to the 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2017).

    GUARDIANS final report

    Emergencies in industrial warehouses are a major concern for firefighters. The large dimensions, together with the development of dense smoke that drastically reduces visibility, represent major challenges. The Guardians robot swarm is designed to assist firefighters in searching a large warehouse. In this report we discuss the technology developed for a swarm of robots searching and assisting firefighters. We explain the swarming algorithms, which provide the functionality by which the robots react to and follow humans while requiring no communication. Next we discuss the wireless communication system, which is a so-called mobile ad-hoc network. The communication network also provides one of the means to locate the robots and humans. Thus the robot swarm is able to locate itself and provide guidance information to the humans. Together with the firefighters we explored how the robot swarm should feed information back to the human firefighter. We have designed and experimented with interfaces for presenting swarm-based information to human beings.

    Evolutionary strategies in swarm robotics controllers

    Nowadays, Unmanned Vehicles (UV) are widespread around the world. Most of these vehicles require a high level of human control, and mission success depends directly on it. Therefore, it is important to use machine learning techniques that train the robotic controllers to automate the control, making the process more efficient. Evolutionary strategies may be the key to robust and adaptive learning in robotic systems. Many studies involving UV systems and evolutionary strategies have been conducted in recent years; however, there are still research gaps that need to be addressed, such as the reality gap. The reality gap occurs when controllers trained in simulated environments fail to transfer to real robots. This work proposes an approach for solving robotic tasks using realistic simulation and evolutionary strategies to train controllers. The chosen setup is easily scalable to multi-robot systems or robot swarms. In this thesis, the simulation architecture and setup are presented, including the drone simulation model and software. The drone model chosen for the simulations is available in the real world and widely used, as are the software and flight control unit. This makes the transition to reality smoother and easier. Controllers using behavior trees were evolved with a purpose-built evolutionary algorithm, and several experiments were conducted. Results demonstrated that it is possible to evolve a robotic controller in realistic simulation environments, using a simulated drone model that exists in the real world together with the same flight control unit and operating system that are generally used in real-world experiments.
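
    The abstract above does not give the details of its evolutionary algorithm, and its controllers are behavior trees. As a rough illustration of the general idea only, the sketch below runs a generic (mu + lambda) evolution strategy on a toy numeric objective. The function names, the `sphere` objective, and all parameter values are assumptions made for this example and are not taken from the thesis.

```python
import random


def sphere(x):
    """Toy objective (assumed for illustration): sum of squares, minimum at 0."""
    return sum(v * v for v in x)


def evolve(fitness, dim=3, parents=5, offspring=20, sigma=0.3,
           generations=60, seed=0):
    """Generic (mu + lambda) evolution strategy that minimises `fitness`.

    Returns the best individual and the best fitness after each generation.
    """
    rng = random.Random(seed)
    # Random initial parent population.
    population = [[rng.uniform(-2.0, 2.0) for _ in range(dim)]
                  for _ in range(parents)]
    history = []
    for _ in range(generations):
        children = []
        for _ in range(offspring):
            base = rng.choice(population)
            # Mutation: add Gaussian noise to every gene of a random parent.
            children.append([v + rng.gauss(0.0, sigma) for v in base])
        # Plus-selection: parents compete with children, so the best
        # fitness can never get worse from one generation to the next.
        population = sorted(population + children, key=fitness)[:parents]
        history.append(fitness(population[0]))
    return population[0], history


if __name__ == "__main__":
    best, history = evolve(sphere)
    print("best fitness:", history[-1])
```

    In a setting like the one described, the list of numbers would be replaced by an encoding of a controller and `fitness` by a score obtained from a simulation run; both substitutions are outside the scope of this sketch.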

    Advanced EVA system design requirements study

    The results are presented of a study to identify specific criteria regarding space station extravehicular activity system (EVAS) hardware requirements. Key EVA design issues include maintainability, technology readiness, LSS volume vs. EVA time available, suit pressure/cabin pressure relationship and productivity effects, crew autonomy, integration of EVA as a program resource, and standardization of task interfaces. A variety of DOD EVA systems issues were taken into consideration. Recommendations include: (1) crew limitations, not hardware limitations; (2) capability to perform all of 15 generic missions; (3) 90 days on-orbit maintainability with 50 percent duty cycle as minimum; and (4) use by payload sponsors of JSC document 10615A plus a Generic Tool Kit and Specialized Tool Kit description. EVA baseline design requirements and criteria, including requirements of various subsystems, are outlined. Space station/EVA system interface requirements and EVA accommodations are discussed in the areas of atmosphere composition and pressure, communications, data management, logistics, safe haven, SS exterior and interior requirements, and SS airlock

    MODELING OF INNOVATIVE LIGHTER-THAN-AIR UAV FOR LOGISTICS, SURVEILLANCE AND RESCUE OPERATIONS

    An unmanned aerial vehicle (UAV) is an aircraft that can operate without a pilot on board, either through remote control or automated systems. The first part of the dissertation provides an overview of the various types of UAVs and their design features. The second section delves into specific experiences using UAVs as part of an automated monitoring system to identify potential problems such as pipeline leaks or equipment damage by conducting airborne surveys. Lighter-than-air UAVs, such as airships, can be used for various applications, from aerial photography, including surveying terrain, monitoring an area for security purposes and gathering information about weather patterns, to surveillance. The third part reveals the applications of UAVs for assisting in search and rescue operations in disaster situations and transporting natural gas. Using PowerSim software, a model of airship behaviour was created to analyze the sprint-and-drift concept and study methods of increasing the operational time of airships while having a lower environmental impact when compared to a constantly switched-on engine. The analysis provided a reliable percentage of finding the victim during patrolling operations, although it did not account for victim behaviour. The study has also shown that airships may serve as a viable alternative to pipeline transportation for natural gas. The technology has the potential to revolutionize natural gas transportation, optimizing efficiency and reducing environmental impact. Additionally, airships have a unique advantage in accessing remote and otherwise inaccessible areas, providing significant benefits in the energy sector. The technology was found to be effective in specific scenarios, and it is worth continuing to study for its potential positive impact on society and the environment.

    802.11 Payload Iterative decoding between multiple transmission attempts

    Abstract. The Institute of Electrical and Electronics Engineers (IEEE) 802.11 standard specifies widely used technology for wireless local area networks (WLAN). The standard specifies high-performance physical and media access control (MAC) layers for a distributed network but lacks an effective hybrid automatic repeat request (HARQ). Currently, the standard specifies forward error correction (FEC), error detection (ED), and automatic repeat request (ARQ), but in case of decoding errors, the previously transmitted information is not used when decoding the retransmitted packet. This is called Type 1 HARQ. Type 1 HARQ uses received energy inefficiently, but the simple implementation makes it an attractive solution. Unfortunately, research applying more sophisticated HARQ schemes on top of IEEE 802.11 is limited. In this Master's Thesis, a novel HARQ technology is proposed, based on packet retransmissions that can be decoded in a turbo-like manner while keeping as much compatibility as possible with vanilla 802.11. The proposed technology is simulated with both the IEEE 802.11 code and the robust, efficient and smart communication in unpredictable environments (RESCUE) code. An additional interleaver is added before the convolutional encoder, interleaving either the whole frame or only the payload to enable effective iterative decoding. For received frames, turbo-like iterations are done between the initially transmitted packet copy and its retransmissions. Results are compared against maximum ratio combining (MRC), a non-iterative combining method that maximizes signal-to-noise ratio (SNR). The main design goal for this technology is to maintain compatibility with the 802.11 standard while allowing efficient HARQ. Other design goals are range extension, higher throughput, and better performance in terms of bit error rate (BER) and frame error rate (FER). This technology can be used for range extension in the low SNR range and may provide up to 4 dB gain in the medium SNR range compared to MRC. At high SNR, the technology can reduce the penalty from retransmission, allowing a higher average modulation and coding scheme (MCS). However, these gains come at the cost of computational complexity from the iterative decoding. The main limiting factors of the proposed technology are decoding errors in the header and the scrambler area, and resource-hungry processing. In simulations, perfect synchronization and packet detection are assumed, but in reality, especially at low SNR, packet detection and synchronization would be challenging.
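
    The abstract above uses maximum ratio combining (MRC) as its non-iterative baseline. As a minimal sketch of that baseline only (not of the proposed turbo-like decoder), the code below combines several received copies of the same frame by weighting each copy with its channel gain divided by its noise variance. BPSK symbols, real-valued gains, and all function names are assumptions made for this illustration.

```python
import random


def mrc_combine(copies, gains, noise_vars):
    """Combine received copies of one BPSK frame by maximum ratio combining.

    copies[k][i] is sample i of copy k, modelled (by assumption) as
    gains[k] * symbol + Gaussian noise with variance noise_vars[k].
    """
    combined = [0.0] * len(copies[0])
    for samples, gain, var in zip(copies, gains, noise_vars):
        weight = gain / var  # MRC weight: channel gain over noise variance
        for i, y in enumerate(samples):
            combined[i] += weight * y
    return combined


def combined_snr(gains, noise_vars):
    """After MRC, the SNR is the sum of the per-copy SNRs (linear scale)."""
    return sum(g * g / var for g, var in zip(gains, noise_vars))


def transmit(bits, gains, noise_vars, seed=0):
    """Simulate one transmission attempt per (gain, variance) pair."""
    rng = random.Random(seed)
    symbols = [1.0 if b else -1.0 for b in bits]
    return [[g * s + rng.gauss(0.0, var ** 0.5) for s in symbols]
            for g, var in zip(gains, noise_vars)]


def decide(soft_values):
    """Hard decision on combined soft values: positive means bit 1."""
    return [1 if v > 0 else 0 for v in soft_values]


if __name__ == "__main__":
    bits = [1, 0, 1, 1, 0, 0, 1, 0]
    gains, noise_vars = [1.0, 0.7], [0.5, 0.5]
    copies = transmit(bits, gains, noise_vars)
    print(decide(mrc_combine(copies, gains, noise_vars)))
    print("combined SNR (linear):", combined_snr(gains, noise_vars))
```

    Because the combined SNR is the sum of the individual SNRs, two equally good copies give a gain of about 3 dB over a single copy; the gain of up to 4 dB quoted in the abstract is measured relative to this baseline.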

    A deep reinforcement learning based homeostatic system for unmanned position control

    Deep Reinforcement Learning (DRL) has been proven to be capable of designing optimal control by minimising the error in dynamic systems. However, in many real-world operations, the exact behaviour of the environment is unknown. In such environments, random changes cause the system to reach different states for the same action. Hence, applying DRL to unpredictable environments is difficult, as the states of the world cannot be known for non-stationary transition and reward functions. In this paper, a mechanism to encapsulate the randomness of the environment is suggested using a novel bio-inspired homeostatic approach based on a hybrid of the Receptor Density Algorithm (an artificial immune system based anomaly detection application) and a Plastic Spiking Neuronal model. DRL is then introduced to run in conjunction with the above hybrid model. The system is tested on a vehicle that autonomously re-positions itself in an unpredictable environment. Our results show that the DRL-based process control raised the accuracy of the hybrid model by 32%.