8 research outputs found
Seizing Serendipity: Exploiting the Value of Past Success in Off-Policy Actor-Critic
Learning high-quality Q-value functions plays a key role in the success of
many modern off-policy deep reinforcement learning (RL) algorithms. Previous
works focus on addressing the value overestimation issue, an outcome of
adopting function approximators and off-policy learning. Deviating from the
common viewpoint, we observe that Q-values are in fact underestimated in the
latter stage of RL training, primarily because Bellman updates bootstrap from
inferior actions drawn from the current policy rather than from the better
action samples already stored in the replay buffer. We hypothesize that this
long-neglected phenomenon potentially hinders policy learning and reduces
sample efficiency. Our insight to address this issue is to incorporate
sufficient exploitation of past successes while maintaining exploration
optimism. We propose the Blended Exploitation and Exploration (BEE) operator, a
simple yet effective approach that updates the Q-value using both historical
best-performing actions and the current policy. Instantiations of our method
in both model-free and model-based settings outperform state-of-the-art
methods on various continuous control tasks and achieve strong performance in
failure-prone scenarios and real-world robot tasks.
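The abstract does not spell out the exact update rule, so the following is only a minimal sketch of the blended idea: a TD target that mixes an exploitation term (bootstrapping from the best action stored in the replay buffer for the next state) with the usual exploration term (bootstrapping from the current policy's action). The blending weight `lam` and the `best_buffer_action` lookup are assumptions introduced for illustration, not the paper's interface.

```python
def blended_td_target(q, policy, best_buffer_action,
                      reward, next_state, done,
                      gamma=0.99, lam=0.5):
    """Mix an exploitation TD target (best action seen in the replay buffer
    for next_state) with an exploration TD target (action from the current
    policy). `lam` trades off the two terms."""
    # Exploitation: bootstrap from a historically best-performing action.
    a_exploit = best_buffer_action(next_state)
    target_exploit = reward + gamma * (1.0 - done) * q(next_state, a_exploit)

    # Exploration: bootstrap from the current policy's action (standard target).
    a_explore = policy(next_state)
    target_explore = reward + gamma * (1.0 - done) * q(next_state, a_explore)

    return lam * target_exploit + (1.0 - lam) * target_explore


# Toy usage with stand-in functions on a 1-D state/action space.
if __name__ == "__main__":
    q = lambda s, a: -((s - a) ** 2)          # stand-in Q-function
    policy = lambda s: s + 0.3                # stand-in current policy
    best_buffer_action = lambda s: s + 0.05   # stand-in buffer lookup
    print(blended_td_target(q, policy, best_buffer_action,
                            reward=1.0, next_state=0.2, done=0.0))
```

In a real agent the exploitation target would come from buffer statistics rather than a simple lookup; the point of the sketch is only the weighted combination of two bootstrapped targets.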
H2O+: An Improved Framework for Hybrid Offline-and-Online RL with Dynamics Gaps
Solving real-world complex tasks using reinforcement learning (RL) without
high-fidelity simulation environments or large amounts of offline data can be
quite challenging. Online RL agents trained in imperfect simulation
environments can suffer from severe sim-to-real issues. Offline RL approaches,
although they bypass the need for simulators, often impose demanding
requirements on the size and quality of offline datasets. The recently emerged
hybrid offline-and-online RL paradigm provides an attractive framework that
enables the joint use of limited offline data and an imperfect simulator for
transferable policy learning. In this paper, we develop a new algorithm, called
H2O+, which offers great flexibility to bridge various choices of offline and
online learning methods while also accounting for the dynamics gap between the
real and simulated environments. Through extensive simulation and real-world
robotics experiments, we demonstrate superior performance and flexibility over
advanced cross-domain online and offline RL algorithms.
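The abstract does not describe how the dynamics gap is handled. Purely as an illustration of one generic option, the sketch below down-weights simulated transitions in a TD loss by an estimated probability that each transition is consistent with real dynamics; the `real_dynamics_prob` score and the weighting scheme are assumptions made for this example and are not claimed to be the H2O+ mechanism.

```python
import numpy as np

def gap_weighted_td_loss(td_errors, is_sim, real_dynamics_prob):
    """Mean squared TD error in which simulated transitions are down-weighted
    by an estimated probability that they match real-world dynamics.

    td_errors          : TD error per transition
    is_sim             : 1.0 for simulator transitions, 0.0 for real ones
    real_dynamics_prob : score in [0, 1] per transition (e.g. a classifier)
    """
    td_errors = np.asarray(td_errors, dtype=float)
    is_sim = np.asarray(is_sim, dtype=float)
    w = np.asarray(real_dynamics_prob, dtype=float)

    # Real transitions keep weight 1; simulated ones are scaled by their score.
    weights = (1.0 - is_sim) + is_sim * w
    return float(np.mean(weights * td_errors ** 2))


# Toy usage: two real transitions and two simulated ones with different gaps.
print(gap_weighted_td_loss(td_errors=[0.5, -0.2, 0.8, 1.0],
                           is_sim=[0.0, 0.0, 1.0, 1.0],
                           real_dynamics_prob=[1.0, 1.0, 0.9, 0.2]))
```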
DrM: Mastering Visual Reinforcement Learning through Dormant Ratio Minimization
Visual reinforcement learning (RL) has shown promise in continuous control
tasks. Despite this progress, current algorithms remain unsatisfactory in
virtually every aspect of performance, including sample efficiency, asymptotic
performance, and robustness to the choice of random seeds. In this paper,
we identify a major shortcoming of existing visual RL methods: agents often
exhibit sustained inactivity during early training, which limits their
ability to explore effectively. Expanding upon this crucial
observation, we additionally unveil a significant correlation between the
agents' inclination towards motorically inactive exploration and the absence of
neuronal activity within their policy networks. To quantify it, we adopt the
dormant ratio as a metric of inactivity in the RL agent's network (a minimal
computation sketch follows this abstract). Empirically, we also observe that
the dormant ratio can act as a
standalone indicator of an agent's activity level, regardless of the received
reward signals. Leveraging the aforementioned insights, we introduce DrM, a
method that uses three core mechanisms to guide agents'
exploration-exploitation trade-offs by actively minimizing the dormant ratio.
Experiments demonstrate that DrM achieves significant improvements in sample
efficiency and asymptotic performance with no broken seeds (76 seeds in total)
across three continuous control benchmark environments, including DeepMind
Control Suite, MetaWorld, and Adroit. Most importantly, DrM is the first
model-free algorithm that consistently solves tasks in both the Dog and
Manipulator domains from the DeepMind Control Suite as well as three dexterous
hand manipulation tasks without demonstrations in Adroit, all based on pixel
observations.
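For concreteness, here is a minimal sketch of how a dormant ratio can be computed for one layer, assuming the common definition in which a unit counts as dormant when its average absolute activation over a batch falls below a fraction `tau` of the layer-wide mean; the threshold value and single-layer scope are assumptions for this example, not necessarily the exact definition used by DrM.

```python
import numpy as np

def dormant_ratio(activations, tau=0.025):
    """Fraction of units whose average absolute activation over a batch is
    below `tau` times the layer-wide mean activation.

    activations : array of shape (batch_size, num_units) for one layer
    """
    acts = np.abs(np.asarray(activations, dtype=float))
    per_unit = acts.mean(axis=0)             # mean |activation| per unit
    layer_mean = per_unit.mean() + 1e-8      # avoid division by zero
    dormant = per_unit / layer_mean < tau
    return float(dormant.mean())


# Toy usage: a layer where two units barely fire on a random batch.
rng = np.random.default_rng(0)
acts = rng.normal(size=(64, 8))
acts[:, :2] *= 1e-4                          # force two near-dormant units
print(dormant_ratio(acts))                   # ~0.25
```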
Border Control - A Membrane-Linked Interactome of Arabidopsis
Cellular membranes act as signaling platforms and control solute transport. Membrane receptors, transporters, and enzymes communicate with intracellular processes through protein-protein interactions. Using a split-ubiquitin yeast two-hybrid screen that covers a test-space of 6.4 × 10^6 pairs, we identified 12,102 membrane/signaling protein interactions from Arabidopsis. Besides confirmation of expected interactions such as heterotrimeric G protein subunit interactions and aquaporin oligomerization, >99% of the interactions were previously unknown. Interactions were confirmed at a rate of 32% in orthogonal in planta split-green fluorescent protein interaction assays, which was statistically indistinguishable from the confirmation rate for known interactions collected from the literature (38%). Regulatory associations in membrane protein trafficking, turnover, and phosphorylation include regulation of potassium channel activity through abscisic acid signaling, transporter activity by a WNK kinase, and a brassinolide receptor kinase by trafficking-related proteins. These examples underscore the utility of the membrane/signaling protein interaction network for gene discovery and hypothesis generation in plants and other organisms.