DACOOP-A: Decentralized Adaptive Cooperative Pursuit via Attention
Integrating rule-based policies into reinforcement learning promises to
improve data efficiency and generalization in cooperative pursuit problems.
However, most implementations do not properly distinguish the influence of
neighboring robots in observation embedding or inter-robot interaction rules,
leading to information loss and inefficient cooperation. This paper proposes a
cooperative pursuit algorithm named Decentralized Adaptive COOperative Pursuit
via Attention (DACOOP-A) by empowering reinforcement learning with artificial
potential field and attention mechanisms. An attention-based framework is
developed to emphasize important neighbors by concurrently integrating the
learned attention scores into observation embedding and inter-robot interaction
rules. A KL divergence regularization is introduced to alleviate the resultant
learning stability issue. Improvements in data efficiency and generalization
are demonstrated through numerical simulations. Extensive quantitative analysis
and ablation studies are performed to illustrate the advantages of the proposed
modules. Real-world experiments are performed to justify the feasibility of
deploying DACOOP-A in physical systems.
Comment: 8 pages; this manuscript has been accepted by IEEE Robotics and
Automation Letters
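
As a rough illustration of how attention scores could weight an inter-robot interaction rule, the sketch below scales artificial-potential-field repulsion from each neighbor by a softmax score. It is not the DACOOP-A implementation: the function names, the inverse-distance repulsion, and the hand-picked logits (standing in for learned attention outputs) are all assumptions made for this example.

```python
import math

def softmax(logits):
    """Turn raw (here hand-picked, in practice learned) logits into scores that sum to 1."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_repulsion(self_pos, neighbor_positions, scores):
    """Sum inverse-distance repulsive vectors, each scaled by its neighbor's score."""
    fx = fy = 0.0
    for (nx, ny), weight in zip(neighbor_positions, scores):
        dx, dy = self_pos[0] - nx, self_pos[1] - ny
        dist = math.hypot(dx, dy)
        if dist == 0.0:
            continue  # skip a coincident neighbor to avoid division by zero
        # (dx, dy) / dist is the unit vector away from the neighbor; 1 / dist is the magnitude
        fx += weight * dx / dist ** 2
        fy += weight * dy / dist ** 2
    return fx, fy

# Hypothetical example: the first neighbor gets the larger logit, so it dominates the force.
scores = softmax([2.0, 0.0])
force = weighted_repulsion((0.0, 0.0), [(1.0, 0.0), (0.0, 2.0)], scores)
```

In this toy setup the same `scores` list could also weight a sum of neighbor embeddings, which is the sense in which one set of scores serves both the observation embedding and the interaction rule.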
EASpace: Enhanced Action Space for Policy Transfer
Formulating expert policies as macro actions promises to alleviate the
long-horizon issue via structured exploration and efficient credit assignment.
However, traditional option-based multi-policy transfer methods suffer from
inefficient exploration of macro-action length and insufficient exploitation
of useful long-duration macro actions. In this paper, a novel algorithm named
EASpace (Enhanced Action Space) is proposed, which formulates macro actions in
an alternative form to accelerate the learning process using multiple available
sub-optimal expert policies. Specifically, EASpace formulates each expert
policy into multiple macro actions with different execution times. All the
macro actions are then integrated into the primitive action space directly. An
intrinsic reward, which is proportional to the execution time of macro actions,
is introduced to encourage the exploitation of useful macro actions. The
corresponding learning rule that is similar to Intra-option Q-learning is
employed to improve the data efficiency. Theoretical analysis is presented to
show the convergence of the proposed learning rule. The efficiency of EASpace
is illustrated by a grid-based game and a multi-agent pursuit problem. The
proposed algorithm is also implemented in physical systems to validate its
effectiveness.
Comment: 15 pages
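
The action-space construction described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the tuple layout, the function names, the candidate durations, and the `scale` coefficient are all hypothetical.

```python
def build_enhanced_action_space(primitive_actions, expert_names, durations):
    """Append one macro action per (expert policy, execution time) pair to the primitive actions."""
    actions = [("primitive", name, 1) for name in primitive_actions]
    for expert in expert_names:
        for duration in durations:
            actions.append(("macro", expert, duration))
    return actions

def intrinsic_reward(action, scale=0.01):
    """Bonus proportional to a macro action's execution time; primitive actions get none."""
    kind, _, duration = action
    return scale * duration if kind == "macro" else 0.0

# Hypothetical setup: 4 primitive moves, 2 sub-optimal experts, 3 candidate execution times.
space = build_enhanced_action_space(
    ["up", "down", "left", "right"],
    ["expert_a", "expert_b"],
    [2, 4, 8],
)
```

A learner would then select from `space` exactly as it selects from a flat discrete action set, executing the chosen expert for `duration` steps whenever a macro action is picked.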
Design of an Omnidirectional Mobile Robot Based on Decoupled Powered Caster Wheels
A novel decoupled powered caster wheel was designed so that the wheel's steering and rolling motions can be controlled independently. Based on the new wheel design, an omnidirectional mobile robot was developed for indoor applications. A kinematic model of the robot was formulated and verified through simulation in MATLAB. Simulation results show that the omnidirectional mobile robot is able to perform sideways and spinning motions.
Step-by-step pipeline processing approach for line segment detection
This study proposes a line segment detection method that can efficiently and effectively handle non-linear uniform intensity changes. The presented sketching algorithm applies the resistant to affine transformation and monotonic intensity change (RATMIC) descriptor to conduct binary translation in the image pre-processing step, which can remove the unwanted smoothing that the Canny detector introduces in most line detection methods. The Harris corner detector is applied to locate regions containing line segments, with the aim of simulating the composition of sketching and achieving a sense of unity within the picture. Furthermore, the RATMIC descriptor is employed to obtain binary images of the regions of interest (ROIs). Finally, small eigenvalue analysis is implemented to detect straight lines in the ROIs. Experiments conducted on various images with rotation, scaling, and translation validate the effectiveness of the proposed method. The experimental results also show increases of about 30% in the overall coverage of major lines and 20% in the coverage per major line compared with state-of-the-art line detectors. Moreover, the proposed method achieves a combined advantage of approximately 17% in the coverage of line segments over the line segment detector on noisy images.
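
The idea behind small eigenvalue analysis can be sketched generically: for a set of edge pixels, the smaller eigenvalue of the 2x2 coordinate covariance matrix is close to zero when the pixels are collinear. The code below is an assumed, simplified illustration of that test, not the paper's pipeline; the function names and the `threshold` value are hypothetical.

```python
import math

def small_eigenvalue(points):
    """Smaller eigenvalue of the 2x2 covariance matrix of a list of (x, y) points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points) / n
    syy = sum((y - mean_y) ** 2 for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points) / n
    # Closed-form eigenvalues of a symmetric 2x2 matrix: half_trace -/+ radius
    half_trace = (sxx + syy) / 2.0
    radius = math.hypot((sxx - syy) / 2.0, sxy)
    return half_trace - radius

def is_straight(points, threshold=0.05):
    """Treat the points as one straight segment when the small eigenvalue is below a threshold."""
    return small_eigenvalue(points) < threshold

line_pixels = [(x, 2 * x + 1) for x in range(6)]           # collinear pixels
corner_pixels = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]   # L-shaped corner
```

With these inputs `is_straight(line_pixels)` holds and `is_straight(corner_pixels)` does not. In practice the test would be applied to a sliding window of connected edge pixels inside each ROI.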
Feature Extraction Method Based on 2.5-Dimensions Lidar Platform for Indoor Mobile Robots Localization
laser direct joining; thermal contact model; numerical simulation; CFRTP; stainless steel
Design Analysis of a 3-DOF Cable-driven Variable-stiffness Joint Module
Variable-stiffness manipulators can produce intrinsically safe motions, which are essential for next-generation service robots. In this paper, the design analysis of a 3-DOF cable-driven joint module with variable stiffness is presented. To achieve a significant change in stiffness, a flexure-based variable-stiffness device is serially connected to each of the cables. Owing to the redundant actuation, the stiffness of the joint module is controlled by regulating the cable tensions. To this end, the relationship between the stiffness matrix of the joint module and the cable tensions has been formulated and analyzed. Simulation examples are provided to illustrate the effectiveness of the proposed stiffness evaluation algorithm.
Regional Consensus Control for Multi-Agent Systems with Actuator Saturation
This paper considers the regional consensus problem for multi-agent systems with actuator saturation. By utilizing convex set theory, a novel multiple nonlinear feedback control protocol is presented, which can effectively reduce the conservatism in dealing with saturated nonlinear input. To obtain a larger estimate of the domain of consensus, a composite Laplacian quadratic function is constructed to derive sufficient conditions for the consensus of multi-agent systems. In addition, an alternative convex hull representation is employed to further enlarge this domain of consensus. Finally, a numerical simulation case study illustrates the validity and the superiority of the proposed approaches.
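
To make the setting concrete, the sketch below simulates a much simpler case than the one the paper treats: single-integrator agents on an undirected path graph running a standard linear consensus law whose output is clipped by the actuator limit. It does not implement the multiple nonlinear feedback protocol or the domain-of-consensus estimates; the dynamics, gain, graph, and all names are assumptions chosen for illustration.

```python
def saturate(value, limit):
    """Clip a control input to the actuator range [-limit, limit]."""
    return max(-limit, min(limit, value))

def simulate_consensus(states, edges, limit=1.0, gain=1.0, dt=0.01, steps=5000):
    """Forward-Euler simulation of x_i' = sat(-gain * sum_j (x_i - x_j)) over graph neighbors j."""
    states = list(states)
    neighbors = {i: [] for i in range(len(states))}
    for i, j in edges:
        neighbors[i].append(j)
        neighbors[j].append(i)
    for _ in range(steps):
        inputs = [
            saturate(-gain * sum(states[i] - states[j] for j in neighbors[i]), limit)
            for i in range(len(states))
        ]
        states = [x + dt * u for x, u in zip(states, inputs)]
    return states

# Hypothetical example: three agents on a path graph 0 - 1 - 2 with far-apart initial states.
final_states = simulate_consensus([0.0, 5.0, 10.0], [(0, 1), (1, 2)])
spread = max(final_states) - min(final_states)
```

In this example the outer agents start saturated and move at the maximum rate until their disagreement with the middle agent drops below the limit, after which the usual exponential convergence takes over and `spread` shrinks toward zero.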