
    Offline Experience Replay for Continual Offline Reinforcement Learning

    The capability of continuously learning new skills via a sequence of pre-collected offline datasets is desirable for an agent. However, consecutively learning a sequence of offline tasks is likely to cause catastrophic forgetting under resource-limited scenarios. In this paper, we formulate a new setting, continual offline reinforcement learning (CORL), where an agent learns a sequence of offline reinforcement learning tasks and pursues good performance on all learned tasks with a small replay buffer, without exploring any of the environments of the sequential tasks. To learn consistently on all sequential tasks, an agent needs to acquire new knowledge while preserving old knowledge in an offline manner. To this end, we introduced continual learning algorithms and experimentally found experience replay (ER) to be the most suitable algorithm for the CORL problem. However, we observe that introducing ER into CORL encounters a new distribution shift problem: the mismatch between the experiences in the replay buffer and trajectories from the learned policy. To address this issue, we propose a new model-based experience selection (MBES) scheme to build the replay buffer, where a transition model is learned to approximate the state distribution. This model is used to bridge the distribution bias between the replay buffer and the learned model by selecting, from the offline data, the data that most closely resembles the learned model for storage. Moreover, to enhance the ability to learn new tasks, we retrofit the experience replay method with a new dual behavior cloning (DBC) architecture to avoid the disturbance of the behavior-cloning loss on the Q-learning process. Overall, we call our algorithm offline experience replay (OER). Extensive experiments demonstrate that our OER method outperforms SOTA baselines in widely used MuJoCo environments. Comment: 9 pages, 4 figures

    Lugrandoside attenuates spinal cord injury by targeting peli1 and TLR4/NF-κB pathway to exert anti-inflammatory and anti-apoptotic effects

    Purpose: To investigate the curative effect and mechanism of lugrandoside (LG) on spinal cord injury (SCI). Methods: We probed the expression of Pellino1 (peli1) in microglia and spinal cord tissues with different treatments of LG. Lipopolysaccharide (LPS) was used to activate the microglia. Furthermore, rats were used to establish an SCI model, and LG, at low and high concentrations, was administered to injured animals to ascertain whether LG exerted a therapeutic effect on SCI. Results: LG inhibited the activation and recruitment of glial cells by acting as a negative regulator of glial inflammation, and this reversed the targeting of peli1 and the TLR4/NF-κB pathway. Furthermore, the in vivo data showed that LG exerted a neuroprotective effect following SCI via anti-inflammatory and anti-apoptotic effects. Furthermore, peli1 and TLR4/NF-κB were suppressed by LG stimuli. Conclusion: These results suggest that LG protects neural tissue against neuroinflammation and apoptosis by suppressing the TLR4/NF-κB pathway and negatively targeting peli1. The findings may provide new insights into the treatment of spinal cord injury.

    Time-Sensitive Collaborative Filtering Algorithm with Feature Stability

    Collaborative filtering algorithms are widely used in recommendation systems. However, many problems in the recommendation field remain to be solved, such as low precision and the long tail of items. In this paper, we design an algorithm called FSTS to address low precision and the long tail. We adopt stability variables and time-sensitive factors to handle the drift of users' interests and to improve prediction accuracy. Experiments show that, compared with Item-CF, the FSTS algorithm significantly improves precision, recall, coverage and popularity. At the same time, it can mine long-tail items and alleviate the long-tail phenomenon.

    A fuzzy-clustering based approach for MADM handover in 5G ultra-dense networks

    As global data traffic has increased significantly in recent years, the ultra-dense deployment of cellular networks (UDN) is being proposed as one of the key technologies in the fifth-generation mobile communications system (5G) to provide a much higher density of radio resource. The densification of small base stations could introduce much higher inter-cell interference and lead users to reach the edge of coverage more frequently. As the current handover scheme was originally proposed for macro BS, it could cause serious handover issues in UDN, i.e., ping-pong handover, handover failures and frequent handover. In order to address these handover challenges and provide a high quality of service (QoS) to the user in UDN, this paper proposes a novel handover scheme, which integrates the advantages of both fuzzy logic and multiple attributes decision algorithms (MADM) to ensure that the handover process is triggered at the right time and the connection is switched to the optimal neighbouring BS. To further enhance the performance of the proposed scheme, this paper also adopts the subtractive clustering technique, using historical data to define the optimal membership functions within the fuzzy system. Performance results show that the proposed handover scheme outperforms traditional approaches and can significantly reduce the number of handovers and ping-pong handovers while maintaining QoS at a relatively high level. © 2019, Springer Science+Business Media, LLC, part of Springer Nature.

    What influenced the lesion patterns and hemodynamic characteristics in patients with internal carotid artery stenosis? A retrospective study

    • Blood perfusion influences ischemic lesions in patients with ICAS.
    • Communicating arteries influence intracranial blood flow.
    • TCD was a convenient and rapid tool to assess cerebral blood flow.