
    A Regularized Opponent Model with Maximum Entropy Objective

    In a single-agent setting, reinforcement learning (RL) tasks can be cast as an inference problem by introducing a binary random variable o, which stands for "optimality". In this paper, we redefine the binary random variable o in the multi-agent setting and formalize multi-agent reinforcement learning (MARL) as probabilistic inference. We derive a variational lower bound on the likelihood of achieving optimality and name it the Regularized Opponent Model with Maximum Entropy Objective (ROMMEO). From ROMMEO, we present a novel perspective on opponent modeling and show, theoretically and empirically, how it can improve the performance of training agents in cooperative games. To optimize ROMMEO, we first introduce a tabular Q-iteration method, ROMMEO-Q, with a proof of convergence. We extend the exact algorithm to complex environments by proposing an approximate version, ROMMEO-AC. We evaluate these two algorithms on a challenging iterated matrix game and a differential game, respectively, and show that they can outperform strong MARL baselines. Comment: Accepted to International Joint Conference on Artificial Intelligence (IJCA2019

    A novel actuator-internal micro/nano positioning stage with an arch-shape bridge type amplifier

    This paper presents a novel actuator-internal two degree-of-freedom (2-DOF) micro/nano positioning stage actuated by piezoelectric (PZT) actuators, which can be used as the fine actuation part of a dual-stage system. To compensate for the positioning error of the coarse stage and achieve a large motion stroke, a symmetrical structure with an arch-shape bridge type amplifier based on single-notch circular flexure hinges is proposed and used in the positioning stage. Owing to the compound bridge arm configuration and the compact flexure hinge structure, the amplification mechanism achieves high lateral stiffness and a compact structure simultaneously, which is of great importance for protecting the PZT actuators. The amplification mechanism is integrated into the decoupling mechanism to improve compactness and to produce decoupled motion along the X- and Y-axes. An analytical model is established to explore the static and dynamic characteristics, and the geometric parameters are optimized. The performance of the positioning stage is evaluated through finite element analysis (FEA) and experimental tests. The results indicate that the stage can implement 2-DOF decoupled motion with a travel range of 55.4 × 53.2 μm², and the motion resolution is 8 nm. The stage can be used in probe tip-based micro/nano scratching.

    Contact force sensing and control for inserting operation during precise assembly using a micromanipulator integrated with force sensors

    This paper proposes a novel contact force sensing and control method for the inserting operation during the precise assembly process, based on a micromanipulator integrated with force sensors. First, theoretical analysis is carried out to calculate the admissible contact force between the gripped holes and the pegs. Contact force thresholds smaller than the admissible contact forces are adopted in the control algorithm to avoid rotation of the gripped holes during the assembly process. The force sensors are calibrated using an ATI force sensor, and the conversion coefficients are calculated. The admissible contact forces are tested for different contact distances and preload forces. The performance of the proposed contact force sensing and control method is verified by applying contact force on the surface of the gripped holes at different contacting speeds. The results indicate that the contact force can be adjusted to be smaller than the threshold 1 and the peg-in-hole assembly can be completed successfully. Note to Practitioners: This paper proposes a novel contact force sensing method for the inserting operation. Compared with the traditional contact force sensing method, this paper uses the force sensor integrated into the micromanipulator, instead of a commercial force sensor, to detect the contact force between two parts. To ensure the assembly precision, theoretical analysis is conducted to calculate the admissible contact force that avoids sliding and rotation of the gripped micro part during assembly. This work simplifies the contact force sensing and control process, since no complex calibration process need be carried out to eliminate the influence of the mass of the micromanipulator on the test results. In addition, the assembly costs are reduced by replacing commercial force sensors with strain gauges.

    An improved positioning algorithm in a long-range asymmetric perimeter security system

    In this paper, an improved positioning algorithm is proposed for a long-range asymmetric perimeter security system. The algorithm uses the zero-crossing rate to detect the starting point of the disturbance, and then applies an improved empirical mode decomposition to obtain the effective time-frequency distribution of the extracted signal. Finally, cross-correlation is used to estimate the time delay of the effective extracted signal. The scheme is also verified and analyzed experimentally. The field test results demonstrate that, at a sensing length of 75 km, 96.60% of the positioning errors fall within the range of 0 to ±20 m, which significantly improves the positioning accuracy for the long-range asymmetric fence perimeter application.
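
    The zero-crossing-rate and cross-correlation steps named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the synthetic burst, and the 7-sample delay are assumptions made for illustration, and the improved empirical mode decomposition step is omitted entirely.

```python
import math

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ (hypothetical helper)."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def estimate_delay(x, y, max_lag):
    """Return the lag (in samples) that maximizes the cross-correlation of x and y.

    A positive result means y lags x by that many samples.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n, xn in enumerate(x):
            m = n + lag
            if 0 <= m < len(y):
                score += xn * y[m]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

# Synthetic data (an assumption): a decaying burst and a copy delayed by 7 samples.
burst = [math.sin(2 * math.pi * 0.1 * n) * math.exp(-0.05 * n) for n in range(60)]
x = [0.0] * 20 + burst + [0.0] * 40
y = [0.0] * 27 + burst + [0.0] * 33

delay = estimate_delay(x, y, max_lag=20)
print("estimated delay in samples:", delay)
```

    In a real system the zero-crossing rate would be computed over sliding frames to flag where the disturbance begins, and the estimated delay would then be converted to a position along the sensing fiber; that conversion depends on the system geometry and is not given in the abstract.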

    Sub-second periodic radio oscillations in a microquasar

    Powerful relativistic jets are one of the ubiquitous features of accreting black holes on all scales. GRS 1915+105 is a well-known fast-spinning black-hole X-ray binary with a relativistic jet, termed a "microquasar", as indicated by the superluminal motion of its radio emission. It has exhibited persistent X-ray activity over the last 30 years, with quasi-periodic oscillations of ∼1–10 Hz and of 34 and 67 Hz in the X-ray band. These oscillations likely originate in the inner accretion disk, but other origins have been considered. Radio observations found variable light curves with quasi-periodic flares or oscillations with periods of ∼20–50 minutes. Here we report two instances of ∼5 Hz transient periodic oscillation features from the source, detected in the 1.05–1.45 GHz radio band, that occurred in January 2021 and June 2022, respectively. Circular polarization was also observed during the oscillation phase. Comment: The author version of the article which will appear in Nature on 26 July 2023, 32 pages including the extended data. The online publication version can be found at the following URL: https://www.nature.com/articles/s41586-023-06336-

    A dual-driven high precision rotary platform based on stick-slip principle

    To achieve high-precision angle adjustment, this article proposes a rotary platform based on the stick-slip principle, which adopts a dual-driven working mode to realize a large circular motion stroke and high loading capacity. Based on flexure hinges, a symmetrical flexible mechanism with two driving feet was designed to generate coupled driving displacement. The proposed dual-driven working principle, in which the piezoelectric actuators are actuated alternately, is described in detail; it can effectively suppress the back-off phenomenon and improve loading capacity. Theoretical analysis and finite element simulation were conducted to investigate the characteristics of the flexible driving unit. The dynamic model of the dual-driven working mode was established and simulated in MATLAB/Simulink, and the influence of the preloading coefficient and initial preloading force was investigated, which provides guidance for the design and optimization of stick-slip actuators. Additionally, a prototype was fabricated, and a series of experiments was conducted. The results indicated that the maximum rotary speed and loading capacity of the rotary platform were 48.3 mrad/s and 98.8 mN·m, respectively.