2,285 research outputs found

    A Regularized Opponent Model with Maximum Entropy Objective

    In a single-agent setting, reinforcement learning (RL) tasks can be cast as an inference problem by introducing a binary random variable o, which stands for "optimality". In this paper, we redefine the binary random variable o in the multi-agent setting and formalize multi-agent reinforcement learning (MARL) as probabilistic inference. We derive a variational lower bound on the likelihood of achieving optimality and name it the Regularized Opponent Model with Maximum Entropy Objective (ROMMEO). From ROMMEO, we present a novel perspective on opponent modeling and show, theoretically and empirically, how it can improve the performance of training agents in cooperative games. To optimize ROMMEO, we first introduce a tabular Q-iteration method, ROMMEO-Q, with a proof of convergence. We extend the exact algorithm to complex environments by proposing an approximate version, ROMMEO-AC. We evaluate these two algorithms on a challenging iterated matrix game and a differential game, respectively, and show that they can outperform strong MARL baselines. Comment: Accepted to International Joint Conference on Artificial Intelligence (IJCAI 2019)

    A preliminary study on the monitoring of mixed venous oxygen saturation through the left main bronchus

    INTRODUCTION: The study sought to assess the feasibility and accuracy of measuring mixed venous oxygen saturation (SvO2) through the left main bronchus (SpO2(trachea)). METHODS: Twenty hybrid pigs of each sex were studied. After anesthesia, a Robertshaw double-lumen tracheal tube with a single-use pediatric pulse oximeter attached to the left lateral surface was introduced toward the left main bronchus of the pig by means of a fibrobronchoscope. Measurements of SpO2(trachea) and oxygen saturation from pulmonary artery samples (SvO2(blood)) were performed with an intracuff pressure of 0 to 60 cmH2O. After equilibration, hemorrhagic shock was induced in these pigs by bleeding to a mean arterial blood pressure of 40 mmHg. With the intracuff pressure maintained at 60 cmH2O, SpO2(trachea) and SvO2(blood) were obtained during the pre-shock period, immediately after the onset of shock, 15 and 30 minutes after shock, and 15, 30, and 60 minutes after resuscitation. RESULTS: In hemodynamically stable states, SpO2(trachea) was the same as SvO2(blood) at intracuff pressures of 10, 20, 40, and 60 cmH2O, but was reduced when the intracuff pressure was zero (p < 0.001 compared with SvO2(blood)). Changes in SpO2(trachea) and SvO2(blood) corresponded with variations in cardiac output during the hemorrhagic shock period. There was a significant correlation between the two methods at different time points. CONCLUSION: Measurement of left main bronchus SpO2 is feasible and provides readings similar to SvO2(blood) in hemodynamically stable or low-saturation states. Tracheal oximetry readings are not primarily derived from the tracheal mucosa. The technique merits further evaluation.

    Non-Abelian Chiral Spin Liquid on the Kagome Lattice

    We study $S=1$ spin liquid states on the kagome lattice constructed by Gutzwiller-projected $p_x+ip_y$ superconductors. We show that the obtained spin liquids are either non-Abelian or Abelian topological phases, depending on the topology of the fermionic mean-field state. By calculating the modular matrices $S$ and $T$, we confirm that projected topological superconductors are non-Abelian chiral spin liquids (NACSL). The chiral central charge and the spin Hall conductance we obtained agree very well with the $SO(3)_1$ (or, equivalently, $SU(2)_2$) field theory predictions. We propose a local Hamiltonian which may stabilize the NACSL. From a variational study we observe a topological phase transition from the NACSL to the $Z_2$ Abelian spin liquid. Comment: 12 pages, 7 figures, 1 table

    DNMT3a in the hippocampal CA1 is crucial in the acquisition of morphine self‐administration in rats

    Drug‐reinforced excessive operant responding is one fundamental feature of long‐lasting addiction‐like behaviors and relapse in animals. However, the transcriptional regulatory mechanisms responsible for the persistent drug‐specific (not natural rewards) operant behavior are not entirely clear. In this study, we demonstrate a key role for one of the de novo DNA methyltransferases, DNMT3a, in the acquisition of morphine self‐administration (SA) in rats. The expression of DNMT3a in the hippocampal CA1 region, but not in the nucleus accumbens shell, was significantly up‐regulated after 1‐ and 7‐day morphine SA (0.3 mg/kg/infusion) but not after yoked morphine injection. In contrast, saccharin SA did not affect the expression of DNMT3a or DNMT3b. The DNMT inhibitor 5‐aza‐2‐deoxycytidine (5‐aza), microinjected into the hippocampal CA1, significantly attenuated the acquisition of morphine SA. Knockdown of DNMT3a also impaired the ability to acquire morphine SA. Overall, these findings suggest that DNMT3a in the hippocampus plays an important role in the acquisition of morphine SA and may be a valid target to prevent the development of morphine addiction. Includes Supplemental information

    Three-photon absorption in water-soluble ZnS nanocrystals

    We report on large three-photon absorption (3PA) in glutathione-capped ZnS semiconductor nanocrystals (NCs), determined by both Z-scan and transient transmission techniques with 120-fs laser pulses. The monodispersed, water-soluble ZnS NCs are synthesized by a modified protocol with a mean diameter of 2.5 nm. Their 3PA cross-section is determined to be around 2.7 × 10^-78 cm^6 s^2 photon^-2 at an optimal wavelength of commercial Ti:sapphire femtosecond lasers. This value is nearly one order of magnitude greater than that of CdS NCs, and four to five orders of magnitude higher than those of the previously reported common UV fluorescent dyes. Comment: 15 pages, 4 figures