
    Scalable Multi-Agent Reinforcement Learning for Warehouse Logistics with Robotic and Human Co-Workers

    We envision a warehouse in which dozens of mobile robots and human pickers work together to collect and deliver items within the warehouse. The fundamental problem we tackle, called the order-picking problem, is how these worker agents must coordinate their movement and actions in the warehouse to maximise performance (e.g. order throughput). Established industry methods using heuristic approaches require large engineering efforts to optimise for innately variable warehouse configurations. In contrast, multi-agent reinforcement learning (MARL) can be flexibly applied to diverse warehouse configurations (e.g. size, layout, number/types of workers, item replenishment frequency), as the agents learn through experience how to optimally cooperate with one another. We develop hierarchical MARL algorithms in which a manager assigns goals to worker agents, and the policies of the manager and workers are co-trained toward maximising a global objective (e.g. pick rate). Our hierarchical algorithms achieve significant gains in sample efficiency and overall pick rates over baseline MARL algorithms in diverse warehouse configurations, and substantially outperform two established industry heuristics for order-picking systems.
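    The manager/worker decomposition described in this abstract can be illustrated with a minimal toy sketch: a tabular Q-learning manager assigns shelf goals to scripted workers and is rewarded by the picks its assignments produce. Everything below (the order model, the Q-learning update, all names) is an illustrative assumption, not the paper's algorithm.

```python
# Minimal sketch (not the paper's implementation) of a hierarchical
# manager/worker decomposition for order picking. The order model and
# tabular Q-learning update are illustrative assumptions.
import random
from collections import defaultdict

NUM_WORKERS = 3
NUM_SHELVES = 5          # goal space: which shelf a worker is sent to
EPISODES = 200
ALPHA, GAMMA, EPS = 0.1, 0.95, 0.1

# Manager's Q-table: state is the tuple of outstanding orders per shelf,
# action is a goal assignment (shelf index) for one worker.
manager_q = defaultdict(lambda: [0.0] * NUM_SHELVES)

def pick_goal(state):
    """Epsilon-greedy goal assignment by the manager."""
    if random.random() < EPS:
        return random.randrange(NUM_SHELVES)
    qs = manager_q[state]
    return qs.index(max(qs))

for _ in range(EPISODES):
    # Outstanding orders per shelf (toy order stream).
    orders = [random.randint(0, 3) for _ in range(NUM_SHELVES)]
    for _worker in range(NUM_WORKERS):
        state = tuple(orders)
        goal = pick_goal(state)                 # manager assigns a shelf
        # Worker "executes" the goal: a pick succeeds if the shelf has orders.
        reward = 1.0 if orders[goal] > 0 else -0.1
        if orders[goal] > 0:
            orders[goal] -= 1
        next_state = tuple(orders)
        # Shared objective: the manager is trained on the pick reward its
        # assignments produce.
        best_next = max(manager_q[next_state])
        manager_q[state][goal] += ALPHA * (
            reward + GAMMA * best_next - manager_q[state][goal])
```

    In the paper's setting the workers are learning agents as well and are co-trained with the manager toward the global objective; this sketch fixes the workers to a scripted behaviour only to keep the example short.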

    Towards Cooperative MARL in Industrial Domains


    Heterogeneous Multi-Robot Collaboration for Coverage Path Planning in Partially Known Dynamic Environments

    This research presents a cooperation strategy for a heterogeneous group of robots comprising two Unmanned Aerial Vehicles (UAVs) and one Unmanned Ground Vehicle (UGV) performing tasks in dynamic scenarios. The paper defines specific roles for the UAVs and the UGV within the framework to address challenges such as partially known terrain and dynamic obstacles. The UAVs focus on aerial inspections and mapping, while the UGV conducts ground-level inspections. In addition, a UAV with a low battery level can return and land at the UGV base for battery hot swapping, so the inspection process is not interrupted. The research mainly emphasizes developing a robust Coverage Path Planning (CPP) algorithm that dynamically adapts paths to avoid collisions and ensure efficient coverage. The Wavefront algorithm was selected for the two-dimensional offline CPP, and all robots follow the predefined paths it generates. The study also integrates Neural Networks (NN) and Deep Reinforcement Learning (DRL) for adaptive path planning on both robot types, enabling real-time responses to dynamic obstacles. Extensive simulations using the Robot Operating System (ROS) and Gazebo were conducted to validate the approach in a specific real-world scenario, an electrical substation, demonstrating its functionality in dynamic environments and its contribution to the field of autonomous robots. The authors thank their home institute, CEFET/RJ, the Brazilian federal research agencies CAPES (code 001) and CNPq, and the Rio de Janeiro research agency FAPERJ for supporting this work.
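    The Wavefront algorithm mentioned for the two-dimensional offline CPP can be sketched on a small occupancy grid: a breadth-first wave from the goal labels every free cell with its hop distance, and a coverage walk from the start repeatedly steps to the unvisited neighbour with the highest label. The grid, the start/goal cells, and the omission of backtracking are simplifying assumptions for illustration, not the paper's implementation.

```python
# Simplified sketch of Wavefront-based coverage on a 2D occupancy grid
# (illustrative only; the grid, start/goal cells, and the lack of
# backtracking are assumptions, not the paper's implementation).
from collections import deque

FREE, OBST = 0, 1
grid = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
rows, cols = len(grid), len(grid[0])
start, goal = (0, 0), (2, 3)

def neighbors(r, c):
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == FREE:
            yield nr, nc

# 1) Wave propagation: BFS from the goal assigns each free cell its
#    hop distance to the goal.
wave = {goal: 0}
queue = deque([goal])
while queue:
    r, c = queue.popleft()
    for n in neighbors(r, c):
        if n not in wave:
            wave[n] = wave[(r, c)] + 1
            queue.append(n)

# 2) Coverage walk: from the start, always step to the unvisited neighbour
#    with the highest wave value, sweeping the free space toward the goal.
path, visited, cur = [start], {start}, start
while True:
    options = [n for n in neighbors(*cur) if n not in visited]
    if not options:
        break
    cur = max(options, key=wave.get)
    visited.add(cur)
    path.append(cur)

print(path)  # visiting order over the free cells
```

    A full coverage planner would backtrack to the nearest cell that still has unvisited neighbours whenever the walk gets stuck; that step is omitted here for brevity.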