4,581 research outputs found

    Bi-Manual Block Assembly via Sim-to-Real Reinforcement Learning

    Full text link
    Most successes in robotic manipulation have been restricted to single-arm gripper robots, whose low dexterity limits the range of solvable tasks to pick-and-place, inser-tion, and object rearrangement. More complex tasks such as assembly require dual and multi-arm platforms, but entail a suite of unique challenges such as bi-arm coordination and collision avoidance, robust grasping, and long-horizon planning. In this work we investigate the feasibility of training deep reinforcement learning (RL) policies in simulation and transferring them to the real world (Sim2Real) as a generic methodology for obtaining performant controllers for real-world bi-manual robotic manipulation tasks. As a testbed for bi-manual manipulation, we develop the U-Shape Magnetic BlockAssembly Task, wherein two robots with parallel grippers must connect 3 magnetic blocks to form a U-shape. Without manually-designed controller nor human demonstrations, we demonstrate that with careful Sim2Real considerations, our policies trained with RL in simulation enable two xArm6 robots to solve the U-shape assembly task with a success rate of above90% in simulation, and 50% on real hardware without any additional real-world fine-tuning. Through careful ablations,we highlight how each component of the system is critical for such simple and successful policy learning and transfer,including task specification, learning algorithm, direct joint-space control, behavior constraints, perception and actuation noises, action delays and action interpolation. Our results present a significant step forward for bi-arm capability on real hardware, and we hope our system can inspire future research on deep RL and Sim2Real transfer of bi-manualpolicies, drastically scaling up the capability of real-world robot manipulators.Comment: Our accompanying project webpage can be found at: https://sites.google.com/view/u-shape-block-assembly. arXiv admin note: substantial text overlap with arXiv:2203.0827

    How to Certify Machine Learning Based Safety-critical Systems? A Systematic Literature Review

    Full text link
    Context: Machine Learning (ML) has been at the heart of many innovations over the past years. However, including it in so-called 'safety-critical' systems such as automotive or aeronautic has proven to be very challenging, since the shift in paradigm that ML brings completely changes traditional certification approaches. Objective: This paper aims to elucidate challenges related to the certification of ML-based safety-critical systems, as well as the solutions that are proposed in the literature to tackle them, answering the question 'How to Certify Machine Learning Based Safety-critical Systems?'. Method: We conduct a Systematic Literature Review (SLR) of research papers published between 2015 to 2020, covering topics related to the certification of ML systems. In total, we identified 217 papers covering topics considered to be the main pillars of ML certification: Robustness, Uncertainty, Explainability, Verification, Safe Reinforcement Learning, and Direct Certification. We analyzed the main trends and problems of each sub-field and provided summaries of the papers extracted. Results: The SLR results highlighted the enthusiasm of the community for this subject, as well as the lack of diversity in terms of datasets and type of models. It also emphasized the need to further develop connections between academia and industries to deepen the domain study. Finally, it also illustrated the necessity to build connections between the above mention main pillars that are for now mainly studied separately. Conclusion: We highlighted current efforts deployed to enable the certification of ML based software systems, and discuss some future research directions.Comment: 60 pages (92 pages with references and complements), submitted to a journal (Automated Software Engineering). Changes: Emphasizing difference traditional software engineering / ML approach. Adding Related Works, Threats to Validity and Complementary Materials. Adding a table listing papers reference for each section/subsection

    Fault Recovery in Swarm Robotics Systems using Learning Algorithms

    Get PDF
    When faults occur in swarm robotic systems they can have a detrimental effect on collective behaviours, to the point that failed individuals may jeopardise the swarm's ability to complete its task. Although fault tolerance is a desirable property of swarm robotic systems, fault recovery mechanisms have not yet been thoroughly explored. Individual robots may suffer a variety of faults, which will affect collective behaviours in different ways, therefore a recovery process is required that can cope with many different failure scenarios. In this thesis, we propose a novel approach for fault recovery in robot swarms that uses Reinforcement Learning and Self-Organising Maps to select the most appropriate recovery strategy for any given scenario. The learning process is evaluated in both centralised and distributed settings. Additionally, we experimentally evaluate the performance of this approach in comparison to random selection of fault recovery strategies, using simulated collective phototaxis, aggregation and foraging tasks as case studies. Our results show that this machine learning approach outperforms random selection, and allows swarm robotic systems to recover from faults that would otherwise prevent the swarm from completing its mission. This work builds upon existing research in fault detection and diagnosis in robot swarms, with the aim of creating a fully fault-tolerant swarm capable of long-term autonomy

    UAV swarm path planning with reinforcement learning for field prospecting

    Get PDF
    [Abstract] There has been steady growth in the adoption of Unmanned Aerial Vehicle (UAV) swarms by operators due to their time and cost benefits. However, this kind of system faces an important problem, which is the calculation of many optimal paths for each UAV. Solving this problem would allow control of many UAVs without human intervention while saving battery between recharges and performing several tasks simultaneously. The main aim is to develop a Reinforcement Learning based system capable of calculating the optimal flight path for a UAV swarm. This method stands out for its ability to learn through trial and error, allowing the model to adjust itself. The aim of these paths is to achieve full coverage of an overflight area for tasks such as field prospection, regardless of map size and the number of UAVs in the swarm. It is not necessary to establish targets or to have any previous knowledge other than the given map. Experiments have been conducted to determine whether it is optimal to establish a single control for all UAVs in the swarm or a control for each UAV. The results show that it is better to use one control for all UAVs because of the shorter flight time. In addition, the flight time is greatly affected by the size of the map. The results give starting points for future research, such as finding the optimal map size for each situation

    Self-Evolving Integrated Vertical Heterogeneous Networks

    Full text link
    6G and beyond networks tend towards fully intelligent and adaptive design in order to provide better operational agility in maintaining universal wireless access and supporting a wide range of services and use cases while dealing with network complexity efficiently. Such enhanced network agility will require developing a self-evolving capability in designing both the network architecture and resource management to intelligently utilize resources, reduce operational costs, and achieve the coveted quality of service (QoS). To enable this capability, the necessity of considering an integrated vertical heterogeneous network (VHetNet) architecture appears to be inevitable due to its high inherent agility. Moreover, employing an intelligent framework is another crucial requirement for self-evolving networks to deal with real-time network optimization problems. Hence, in this work, to provide a better insight on network architecture design in support of self-evolving networks, we highlight the merits of integrated VHetNet architecture while proposing an intelligent framework for self-evolving integrated vertical heterogeneous networks (SEI-VHetNets). The impact of the challenges associated with SEI-VHetNet architecture, on network management is also studied considering a generalized network model. Furthermore, the current literature on network management of integrated VHetNets along with the recent advancements in artificial intelligence (AI)/machine learning (ML) solutions are discussed. Accordingly, the core challenges of integrating AI/ML in SEI-VHetNets are identified. Finally, the potential future research directions for advancing the autonomous and self-evolving capabilities of SEI-VHetNets are discussed.Comment: 25 pages, 5 figures, 2 table

    Q-learning Based System for Path Planning with UAV Swarms in Obstacle Environments

    Get PDF
    Path Planning methods for autonomous control of Unmanned Aerial Vehicle (UAV) swarms are on the rise because of all the advantages they bring. There are more and more scenarios where autonomous control of multiple UAVs is required. Most of these scenarios present a large number of obstacles, such as power lines or trees. If all UAVs can be operated autonomously, personnel expenses can be decreased. In addition, if their flight paths are optimal, energy consumption is reduced. This ensures that more battery time is left for other operations. In this paper, a Reinforcement Learning based system is proposed for solving this problem in environments with obstacles by making use of Q-Learning. This method allows a model, in this particular case an Artificial Neural Network, to self-adjust by learning from its mistakes and achievements. Regardless of the size of the map or the number of UAVs in the swarm, the goal of these paths is to ensure complete coverage of an area with fixed obstacles for tasks, like field prospecting. Setting goals or having any prior information aside from the provided map is not required. For experimentation, five maps of different sizes with different obstacles were used. The experiments were performed with different number of UAVs. For the calculation of the results, the number of actions taken by all UAVs to complete the task in each experiment is taken into account. The lower the number of actions, the shorter the path and the lower the energy consumption. The results are satisfactory, showing that the system obtains solutions in fewer movements the more UAVs there are. For a better presentation, these results have been compared to another state-of-the-art approach

    Coordination and Self-Adaptive Communication Primitives for Low-Power Wireless Networks

    Get PDF
    The Internet of Things (IoT) is a recent trend where objects are augmented with computing and communication capabilities, often via low-power wireless radios. The Internet of Things is an enabler for a connected and more sustainable modern society: smart grids are deployed to improve energy production and consumption, wireless monitoring systems allow smart factories to detect faults early and reduce waste, while connected vehicles coordinate on the road to ensure our safety and save fuel. Many recent IoT applications have stringent requirements for their wireless communication substrate: devices must cooperate and coordinate, must perform efficiently under varying and sometimes extreme environments, while strict deadlines must be met. Current distributed coordination algorithms have high overheads and are unfit to meet the requirements of today\u27s wireless applications, while current wireless protocols are often best-effort and lack the guarantees provided by well-studied coordination solutions. Further, many communication primitives available today lack the ability to adapt to dynamic environments, and are often tuned during their design phase to reach a target performance, rather than be continuously updated at runtime to adapt to reality.In this thesis, we study the problem of efficient and low-latency consensus in the context of low-power wireless networks, where communication is unreliable and nodes can fail, and we investigate the design of a self-adaptive wireless stack, where the communication substrate is able to adapt to changes to its environment. We propose three new communication primitives: Wireless Paxos brings fault-tolerant consensus to low-power wireless networking, STARC is a middleware for safe vehicular coordination at intersections, while Dimmer builds on reinforcement learning to provide adaptivity to low-power wireless networks. We evaluate in-depth each primitive on testbed deployments and we provide an open-source implementation to enable their use and improvement by the community
    corecore