9 research outputs found

    Offline Multi-Agent Reinforcement Learning with Implicit Global-to-Local Value Regularization

    Full text link
    Offline reinforcement learning (RL) has received considerable attention in recent years because it can learn policies from offline datasets without environmental interaction. Despite some success in the single-agent setting, offline multi-agent RL (MARL) remains a challenge. The large joint state-action space and the coupled multi-agent behaviors pose extra complexities for offline policy optimization. Most existing offline MARL studies simply apply offline data-related regularizations to individual agents, without fully considering the multi-agent system at the global level. In this work, we present OMIGA, a new offline multi-agent RL algorithm with implicit global-to-local value regularization. OMIGA provides a principled framework to convert global-level value regularization into equivalent implicit local value regularizations and simultaneously enables in-sample learning, thus bridging multi-agent value decomposition and policy learning with offline regularizations. Based on comprehensive experiments on the offline multi-agent MuJoCo and StarCraft II micro-management tasks, we show that OMIGA achieves superior performance over state-of-the-art offline MARL methods in almost all tasks.

    The safe screw path along inferior border of the arcuate line at acetabular area: an anatomical study based on CT scans

    No full text
    Background: A misplaced screw during the internal fixation of acetabular fractures may penetrate the hip joint, which might later cause chondrolysis and traumatic osteoarthritis. This study aims to identify the safe path for screw insertion along the fixation route at the inferior border of the arcuate line in the acetabular area. Methods: Computed tomography (CT) scans of 98 patients without pelvic trauma were reconstructed into three-dimensional models of the pelvis. After the fixation route curve was depicted, five cross-sections perpendicular to the curve were established from the anterior to the posterior of the pelvis along the inferior border of the arcuate line. The safe screw lengths for sections 1 and 5 were measured from the computer models. In sections 2, 3, and 4, a line from the screw entry point tangent to the inferior edge of the acetabulum was drawn, and the minimum safe direction of screw insertion was measured and recorded as angle θ. Results: The safe screw lengths for sections 1 and 5 were 22.29 ± 4.41 mm and 32.64 ± 4.70 mm (n = 98). The minimum safe angles of screw insertion for the middle three sections 2, 3, and 4 were 65.38 ± 10.23°, 74.20 ± 10.20°, and 57.88 ± 11.11° (n = 98), respectively. The results for the male group (n = 98) indicated smaller minimum safe angles in these three sections compared with the female group (n = 98). Conclusions: Compared with males, the minimum safe angles of screw placement in the acetabular area for females should be directed further away from the inferior edge of the acetabulum and tilted toward the bottom of the pelvis along the inferior border fixation route in the surgical management of acetabular fractures.

    Text Sentiment Classification Based on BERT Embedding and Sliced Multi-Head Self-Attention Bi-GRU

    No full text
    In text sentiment analysis, the main problems are that traditional word vector representations cannot capture polysemy, that recurrent neural networks cannot be trained in parallel, and that classification accuracy is not high. We propose a sentiment classification model based on a Sliced Bidirectional Gated Recurrent Unit (Sliced Bi-GRU), a Multi-head Self-Attention mechanism, and Bidirectional Encoder Representations from Transformers (BERT) embeddings. First, the word vector representation obtained from the BERT pre-trained language model is used as the embedding layer of the neural network. Then the input sequence is sliced into subsequences of equal length, and the Bi-GRU is applied to extract feature information from each subsequence. The relationships between words are then learned via the Multi-head Self-Attention mechanism. Finally, the sentiment polarity of the text is output by the Softmax function. Experiments show that the classification accuracy of this model on the Yelp 2015 dataset and the Amazon dataset is 74.37% and 62.57%, respectively. The model also trains faster than most existing models, which verifies its effectiveness.
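The slicing step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `slice_sequence`, the slice length, and the choice to pad the last slice with a `[PAD]` token are all assumptions made for illustration, since the abstract does not say how a remainder is handled.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def slice_sequence(tokens: Sequence[T], slice_len: int, pad: T) -> List[List[T]]:
    """Split a token sequence into consecutive slices of equal length.

    Hypothetical helper: the final slice is right-padded with `pad`
    so that every slice has exactly `slice_len` items (an assumption).
    """
    if slice_len <= 0:
        raise ValueError("slice_len must be positive")
    slices: List[List[T]] = []
    for start in range(0, len(tokens), slice_len):
        chunk = list(tokens[start:start + slice_len])
        # Pad only the last, possibly shorter, chunk.
        chunk.extend([pad] * (slice_len - len(chunk)))
        slices.append(chunk)
    return slices


if __name__ == "__main__":
    words = ["the", "food", "was", "great", "today"]
    for piece in slice_sequence(words, 2, "[PAD]"):
        print(piece)
```

Because each slice is independent of the others at this stage, a recurrent layer could in principle process the slices side by side rather than stepping through the full sequence, which is the parallelism the abstract alludes to.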

    Polyphenylene Oxide Film Sandwiched between SiO2 Layers for High-Temperature Dielectric Energy Storage

    No full text
    Commercial capacitors that use biaxially oriented polypropylene (BOPP) as the dielectric work effectively only at low temperatures (below 105 °C). Polyphenylene oxide (PPO), with better heat resistance and a higher dielectric constant, is promising for capacitors operating at elevated temperatures, but its charge–discharge efficiency (η) degrades greatly under high fields at 125 °C. Here, SiO2 layers are magnetron sputtered on both sides of the PPO film, forming a SiO2/PPO/SiO2 composite. Due to the wide bandgap and high Young’s modulus of SiO2, the breakdown strength (Eb) of this composite reaches 552 MV/m at 125 °C (PPO: 534 MV/m), and the discharged energy density (Ue) at Eb improves to 3.5 J/cm³ (PPO: 2.5 J/cm³), with a significantly enhanced η of 89% (PPO: 70%). Furthermore, SiO2/PPO/SiO2 can discharge a Ue of 0.45 J/cm³ with an η of 97% at 125 °C under 200 MV/m (the working condition in hybrid electric vehicles) for 20,000 cycles, and this value is higher than the energy density of BOPP at room temperature (∼0.39 J/cm³ under 200 MV/m). Interestingly, the metalized SiO2/PPO/SiO2 film exhibits valuable self-healing behavior. These results make PPO-based dielectrics promising for high-temperature capacitor applications.
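As a rough sanity check on the BOPP figure quoted in this abstract, the stored energy density of an ideal linear dielectric is U = ½ ε0 εr E². The sketch below evaluates it at 200 MV/m. The relative permittivity of 2.2 for BOPP is an assumed typical value, not a number from the abstract, and the linear-dielectric formula ignores losses, so the result is only indicative.

```python
EPSILON_0 = 8.8541878128e-12  # vacuum permittivity, F/m


def linear_energy_density(eps_r: float, field_mv_per_m: float) -> float:
    """Stored energy density of an ideal linear dielectric, in J/cm^3.

    Uses U = 0.5 * eps0 * eps_r * E^2; losses and nonlinearity are ignored.
    """
    field_v_per_m = field_mv_per_m * 1e6
    joules_per_m3 = 0.5 * EPSILON_0 * eps_r * field_v_per_m ** 2
    return joules_per_m3 / 1e6  # 1 m^3 = 1e6 cm^3


if __name__ == "__main__":
    # eps_r = 2.2 is an assumed typical value for BOPP (not from the abstract).
    bopp = linear_energy_density(eps_r=2.2, field_mv_per_m=200.0)
    print(f"BOPP at 200 MV/m: {bopp:.2f} J/cm^3")
```

Under these assumptions the estimate comes out close to the ∼0.39 J/cm³ quoted for BOPP at room temperature.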