A Regularized Opponent Model with Maximum Entropy Objective
In a single-agent setting, reinforcement learning (RL) tasks can be cast as
an inference problem by introducing a binary random variable o, which stands
for "optimality". In this paper, we redefine the binary random variable o
in the multi-agent setting and formalize multi-agent reinforcement learning (MARL)
as probabilistic inference. We derive a variational lower bound on the
likelihood of achieving optimality and name it the Regularized Opponent
Model with Maximum Entropy Objective (ROMMEO). From ROMMEO, we present a novel
perspective on opponent modeling and show how it can improve the performance of
training agents theoretically and empirically in cooperative games. To optimize
ROMMEO, we first introduce a tabular Q-iteration method, ROMMEO-Q, with a proof of
convergence. We extend the exact algorithm to complex environments by proposing
an approximate version, ROMMEO-AC. We evaluate these two algorithms on a
challenging iterated matrix game and a differential game, respectively, and show
that they can outperform strong MARL baselines.
Comment: Accepted to International Joint Conference on Artificial Intelligence
(IJCAI 2019)
Response of Soil Microbial Biomass and Enzyme Activities Under Three Temperate Tree Species to Elevated CO2 in Changbai Mountain, Northeastern China
A preliminary study on the monitoring of mixed venous oxygen saturation through the left main bronchus
INTRODUCTION: The study sought to assess the feasibility and accuracy of measuring mixed venous oxygen saturation (SvO2) through the left main bronchus (SpO2(trachea)). METHODS: Twenty hybrid pigs of each sex were studied. After anesthesia, a Robertshaw double-lumen tracheal tube with a single-use pediatric pulse oximeter attached to the left lateral surface was introduced toward the left main bronchus of the pig by means of a fibrobronchoscope. Measurements of SpO2(trachea) and oxygen saturation from pulmonary artery samples (SvO2(blood)) were performed with an intracuff pressure of 0 to 60 cmH2O. After equilibration, hemorrhagic shock was induced in these pigs by bleeding to a mean arterial blood pressure of 40 mmHg. With the intracuff pressure maintained at 60 cmH2O, SpO2(trachea) and SvO2(blood) were obtained during the pre-shock period, immediately after the onset of shock, 15 and 30 minutes after shock, and 15, 30, and 60 minutes after resuscitation. RESULTS: SpO2(trachea) was the same as SvO2(blood) at an intracuff pressure of 10, 20, 40, and 60 cmH2O, but was reduced when the intracuff pressure was zero (p < 0.001 compared with SvO2(blood)) in hemodynamically stable states. Changes in SpO2(trachea) and SvO2(blood) corresponded with variations in cardiac output during the hemorrhagic shock period. There was a significant correlation between the two methods at different time points. CONCLUSION: Measurement of left main bronchus SpO2 is feasible and provides readings similar to SvO2(blood) in hemodynamically stable or low-saturation states. Tracheal oximetry readings are not primarily derived from the tracheal mucosa. The technique merits further evaluation.
Non-Abelian Chiral Spin Liquid on the Kagome Lattice
We study spin liquid states on the kagome lattice constructed by
Gutzwiller-projected superconductors. We show that the obtained spin
liquids are either non-Abelian or Abelian topological phases, depending on the
topology of the fermionic mean-field state. By calculating the modular matrices
and , we confirm that projected topological superconductors are
non-Abelian chiral spin liquids (NACSL). The chiral central charge and the spin
Hall conductance we obtained agree very well with the (or,
equivalently, ) field theory predictions. We propose a local
Hamiltonian which may stabilize the NACSL. From a variational study we observe
a topological phase transition from the NACSL to the Abelian spin liquid.
Comment: 12 pages, 7 figures, 1 table
DNMT3a in the hippocampal CA1 is crucial in the acquisition of morphine self‐administration in rats
Drug‐reinforced excessive operant responding is one fundamental feature of long-lasting addiction‐like behaviors and relapse in animals. However, the transcriptional regulatory mechanisms responsible for the persistent drug‐specific (not natural rewards) operant behavior are not entirely clear. In this study, we demonstrate a key role for one of the de novo DNA methyltransferases, DNMT3a, in the acquisition of morphine self‐administration (SA) in rats. The expression of DNMT3a in the hippocampal CA1 region but not in the nucleus accumbens shell was significantly up‐regulated after 1‐ and 7‐day morphine SA (0.3 mg/kg/infusion) but not after the yoked morphine injection. On the other hand, saccharin SA did not affect the expression of DNMT3a or DNMT3b. DNMT inhibitor 5‐aza‐2‐deoxycytidine (5‐aza) microinjected into the hippocampal CA1 significantly attenuated the acquisition of morphine SA. Knockdown of DNMT3a also impaired the ability to acquire the morphine SA. Overall, these findings suggest that DNMT3a in the hippocampus plays an important role in the acquisition of morphine SA and may be a valid target to prevent the development of morphine addiction.
Includes supplemental information
Three-photon absorption in water-soluble ZnS nanocrystals
We report on large three-photon absorption (3PA) in glutathione-capped ZnS
semiconductor nanocrystals (NCs), determined by both Z-scan and transient
transmission techniques with 120-fs laser pulses. The monodispersed,
water-soluble ZnS NCs are synthesized by a modified protocol with a mean
diameter of 2.5 nm. Their 3PA cross-section is determined to be around
2.7 × 10^-78 cm^6 s^2 photon^-2 at an optimal wavelength of commercial Ti:sapphire
femtosecond lasers. This value is nearly one order of magnitude greater than
that of CdS NCs, and four to five orders of magnitude higher than those of the
previously reported common UV fluorescent dyes.
Comment: 15 pages, 4 figures