19 research outputs found

    DACOOP-A: Decentralized Adaptive Cooperative Pursuit via Attention

    Integrating rule-based policies into reinforcement learning promises to improve data efficiency and generalization in cooperative pursuit problems. However, most implementations do not properly distinguish the influence of neighboring robots in observation embedding or inter-robot interaction rules, leading to information loss and inefficient cooperation. This paper proposes a cooperative pursuit algorithm named Decentralized Adaptive COOperative Pursuit via Attention (DACOOP-A), which augments reinforcement learning with artificial potential fields and attention mechanisms. An attention-based framework is developed to emphasize important neighbors by concurrently integrating the learned attention scores into observation embedding and inter-robot interaction rules. A KL divergence regularization is introduced to alleviate the resulting learning stability issue. Improvements in data efficiency and generalization are demonstrated through numerical simulations. Extensive quantitative analysis and ablation studies illustrate the advantages of the proposed modules. Real-world experiments demonstrate the feasibility of deploying DACOOP-A in physical systems. Comment: 8 pages; this manuscript has been accepted by IEEE Robotics and Automation Letters.
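
    The idea of reusing one set of attention weights in both the observation embedding and the interaction rule can be sketched roughly as follows. This is a minimal illustration, not the DACOOP-A implementation: the scores, features, positions, the repulsion formula (a textbook potential-field form), and the uniform reference distribution in the KL term are all assumptions, since the abstract does not specify them.

```python
import math

def softmax(scores):
    """Turn raw attention scores into weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_sum(weights, vectors):
    """Attention-weighted sum of equally sized vectors."""
    return [sum(w * v[i] for w, v in zip(weights, vectors))
            for i in range(len(vectors[0]))]

def repulsion(own, other, radius=2.0):
    """Textbook potential-field repulsion pushing `own` away from `other`."""
    dx, dy = own[0] - other[0], own[1] - other[1]
    dist = math.hypot(dx, dy)
    if dist == 0.0 or dist >= radius:
        return [0.0, 0.0]  # outside the influence radius
    gain = (1.0 / dist - 1.0 / radius) / dist ** 2
    return [gain * dx / dist, gain * dy / dist]

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

# Hypothetical values for one robot with three neighbors.
scores = [2.0, 0.5, -1.0]
neighbor_features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
neighbor_positions = [(0.5, 0.0), (0.0, 1.0), (-1.5, -1.0)]
own_position = (0.0, 0.0)

weights = softmax(scores)
# The same weights are used twice: in the embedding and in the rule.
embedding = weighted_sum(weights, neighbor_features)
forces = [repulsion(own_position, p) for p in neighbor_positions]
interaction = weighted_sum(weights, forces)
# Placeholder regularizer: divergence from a uniform reference (assumption).
uniform = [1.0 / len(weights)] * len(weights)
penalty = kl_divergence(weights, uniform)
```

    Sharing `weights` between `embedding` and `interaction` is the point of the sketch: a neighbor that the attention module ranks as important dominates both what the robot perceives and how strongly it reacts.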

    EASpace: Enhanced Action Space for Policy Transfer

    Formulating expert policies as macro actions promises to alleviate the long-horizon issue via structured exploration and efficient credit assignment. However, traditional option-based multi-policy transfer methods suffer from inefficient exploration of macro action lengths and insufficient exploitation of useful long-duration macro actions. In this paper, a novel algorithm named EASpace (Enhanced Action Space) is proposed, which formulates macro actions in an alternative form to accelerate the learning process using multiple available sub-optimal expert policies. Specifically, EASpace formulates each expert policy into multiple macro actions with different execution times. All the macro actions are then integrated directly into the primitive action space. An intrinsic reward, proportional to the execution time of macro actions, is introduced to encourage the exploitation of useful macro actions. A corresponding learning rule, similar to Intra-option Q-learning, is employed to improve data efficiency. Theoretical analysis is presented to show the convergence of the proposed learning rule. The efficiency of EASpace is illustrated by a grid-based game and a multi-agent pursuit problem. The proposed algorithm is also implemented in physical systems to validate its effectiveness. Comment: 15 pages.
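
    The enhanced action space described above can be sketched as a flat list that holds the primitive actions plus one macro action per (expert, execution time) pair. This is an illustrative sketch only: the `Action` type, the duration set, and the reward scale are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass(frozen=True)
class Action:
    """Either a primitive action or 'follow expert k for `duration` steps'."""
    primitive: Optional[int] = None
    expert: Optional[int] = None
    duration: int = 1

def build_enhanced_action_space(n_primitive: int, n_experts: int,
                                durations: List[int]) -> List[Action]:
    """Primitive actions first, then every (expert, duration) macro action."""
    actions = [Action(primitive=a) for a in range(n_primitive)]
    actions += [Action(expert=e, duration=d)
                for e in range(n_experts) for d in durations]
    return actions

def intrinsic_reward(action: Action, scale: float = 0.01) -> float:
    """Bonus proportional to macro-action execution time (scale is assumed)."""
    if action.expert is None:
        return 0.0  # primitive actions receive no bonus
    return scale * action.duration

# Hypothetical setup: 4 primitive actions, 2 experts, 3 candidate durations.
action_space = build_enhanced_action_space(4, 2, [2, 4, 8])
bonuses = [intrinsic_reward(a) for a in action_space]
```

    Because every macro action is an ordinary entry of `action_space`, a standard value-based learner can select among them without a separate option-termination mechanism; the learning rule itself is not shown here.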

    Design of an Omnidirectional Mobile Robot Based on Decoupled Powered Caster Wheels

    A novel decoupled powered caster wheel was designed so that the wheel's steering and rolling motions can be controlled independently. Based on the new wheel design, an omnidirectional mobile robot was developed for indoor applications. A kinematic model of the robot was formulated and verified by simulation in MATLAB. Simulation results show that the omnidirectional mobile robot can perform sideways and spinning motions.

    Step-by-step pipeline processing approach for line segment detection

    This study proposes a line segment detection method that can efficiently and effectively handle non-linear uniform intensity changes. The presented sketching algorithm applies the resistant to affine transformation and monotonic intensity change (RATMIC) descriptor to conduct binary translation in the image pre-processing step, which can remove the unwanted smoothing of the Canny detector used in most line detectors. The Harris corner detector is applied to capture regions of line segments, simulating the composition of a sketch and achieving a sense of unity within the picture. Furthermore, the RATMIC descriptor is employed to obtain binary images of the regions of interest (ROIs). Finally, small eigenvalue analysis is implemented to detect straight lines in the ROIs. Experiments conducted on various images with image rotation, scaling, and translation validate the effectiveness of the proposed method. The experimental results also show increases of about 30% in the overall coverage of major lines and about 20% in the coverage per major line compared with state-of-the-art line detectors. Moreover, the proposed method achieves a combined advantage of approximately 17% in the coverage of line segments over the line segment detector on noisy images.
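
    The small eigenvalue test mentioned above can be sketched in a few lines: for a set of edge pixels, the smaller eigenvalue of the 2x2 coordinate covariance matrix is close to zero when the pixels are collinear. The function names, the threshold, and the sample pixels are illustrative assumptions, not values from the paper.

```python
import math

def smallest_eigenvalue(points):
    """Smaller eigenvalue of the 2x2 covariance matrix of (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / n
    syy = sum((y - mean_y) ** 2 for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / n
    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    gap = math.sqrt((sxx - syy) ** 2 + 4.0 * sxy ** 2)
    return (sxx + syy - gap) / 2.0

def is_straight(points, threshold=0.1):
    """Treat the pixels as a straight segment if the spread across it is tiny."""
    return smallest_eigenvalue(points) < threshold

# Hypothetical pixel sets: a perfect line and an L-shaped corner.
line_pixels = [(x, 2 * x + 1) for x in range(10)]
corner_pixels = [(x, 0) for x in range(5)] + [(4, y) for y in range(1, 5)]

line_is_straight = is_straight(line_pixels)
corner_is_straight = is_straight(corner_pixels)
```

    The smaller eigenvalue measures variance perpendicular to the dominant direction, so it separates the line from the corner even though both pixel sets have a similar extent.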

    Feature Extraction Method Based on 2.5-Dimensions Lidar Platform for Indoor Mobile Robots Localization

    laser direct joining; thermal contact model; numerical simulation; CFRTP; stainless steel

    Design Analysis of a 3-DOF Cable-driven Variable-stiffness Joint Module

    Variable-stiffness manipulators can produce intrinsically safe motions, which are essential for next-generation service robots. In this paper, the design analysis of a 3-DOF cable-driven joint module with variable stiffness is presented. To achieve a significant change in stiffness, a flexure-based variable-stiffness device is serially connected to each of the cables. Owing to the redundant actuation, the stiffness of the joint module is controlled by regulating the cable tensions. To this end, the relationship between the stiffness matrix of the joint module and the cable tensions is formulated and analyzed. Simulation examples are provided to illustrate the effectiveness of the proposed stiffness evaluation algorithm.

    Regional Consensus Control for Multi-Agent Systems with Actuator Saturation

    This paper considers the regional consensus problem for multi-agent systems with actuator saturation. By utilizing convex set theory, a novel multiple nonlinear feedback control protocol is presented, which can effectively reduce the conservatism in dealing with saturated nonlinear input. To obtain a larger estimate of the domain of consensus, a composite Laplacian quadratic function is constructed to derive sufficient conditions for the consensus of multi-agent systems. In addition, an alternative convex hull representation is employed to further enlarge this domain of consensus. Finally, a numerical simulation case study illustrates the validity and the superiority of the proposed approaches.
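
    To make the problem setting concrete, the sketch below simulates consensus among single-integrator agents whose inputs are clipped by a saturation limit. It uses a plain linear neighbor-feedback law, not the multiple nonlinear feedback protocol of the paper, and the graph, initial states, saturation limit, and step size are all assumptions.

```python
def saturate(value, limit):
    """Clip a control input to the actuator range [-limit, limit]."""
    return max(-limit, min(limit, value))

def simulate(states, adjacency, limit=1.0, dt=0.05, steps=2000):
    """Euler simulation of x_i' = sat(sum_j a_ij * (x_j - x_i))."""
    states = list(states)
    for _ in range(steps):
        inputs = [
            saturate(sum(a * (xj - xi) for a, xj in zip(row, states)), limit)
            for xi, row in zip(states, adjacency)
        ]
        states = [xi + dt * ui for xi, ui in zip(states, inputs)]
    return states

# Hypothetical undirected path graph 0 - 1 - 2.
path_graph = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
initial_states = [5.0, 0.0, -3.0]
final_states = simulate(initial_states, path_graph)
spread = max(final_states) - min(final_states)
```

    With these assumed values the agents start with saturated inputs and still end up with a negligible `spread`; for more general agent dynamics, saturation can restrict the set of initial conditions from which consensus is reached, which is the domain of consensus the abstract refers to.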