202 research outputs found
Offline Experience Replay for Continual Offline Reinforcement Learning
The capability of continuously learning new skills via a sequence of
pre-collected offline datasets is desired for an agent. However, consecutively
learning a sequence of offline tasks likely leads to the catastrophic
forgetting issue under resource-limited scenarios. In this paper, we formulate
a new setting, continual offline reinforcement learning (CORL), where an agent
learns a sequence of offline reinforcement learning tasks and pursues good
performance on all learned tasks with a small replay buffer without exploring
any of the environments of all the sequential tasks. For consistently learning
on all sequential tasks, an agent requires acquiring new knowledge and
meanwhile preserving old knowledge in an offline manner. To this end, we
introduced continual learning algorithms and experimentally found experience
replay (ER) to be the most suitable algorithm for the CORL problem. However, we
observe that introducing ER into CORL encounters a new distribution shift
problem: the mismatch between the experiences in the replay buffer and
trajectories from the learned policy. To address such an issue, we propose a
new model-based experience selection (MBES) scheme to build the replay buffer,
where a transition model is learned to approximate the state distribution. This
model is used to bridge the distribution bias between the replay buffer and the
learned model by filtering the data from offline data that most closely
resembles the learned model for storage. Moreover, in order to enhance the
ability on learning new tasks, we retrofit the experience replay method with a
new dual behavior cloning (DBC) architecture to avoid the disturbance of
behavior-cloning loss on the Q-learning process. In general, we call our
algorithm offline experience replay (OER). Extensive experiments demonstrate
that our OER method outperforms SOTA baselines in widely-used MuJoCo
environments. Comment: 9 pages, 4 figures
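The selection idea in the abstract above (keep the offline transitions that most closely resemble what a learned transition model predicts) can be illustrated with a minimal sketch. This is not the paper's MBES procedure: the linear least-squares model, the Euclidean prediction error used as the selection criterion, and the function names `fit_linear_transition_model` and `select_experiences` are all assumptions made for illustration.

```python
import numpy as np

def fit_linear_transition_model(states, actions, next_states):
    """Least-squares fit of next_state ~ [state, action, 1] @ weights.

    A linear model is only a stand-in for a learned transition model.
    """
    inputs = np.hstack([states, actions, np.ones((len(states), 1))])
    weights, *_ = np.linalg.lstsq(inputs, next_states, rcond=None)
    return weights

def select_experiences(states, actions, next_states, weights, buffer_size):
    """Return indices of the buffer_size transitions whose observed next
    state is closest (Euclidean distance) to the model's prediction."""
    inputs = np.hstack([states, actions, np.ones((len(states), 1))])
    predicted = inputs @ weights
    errors = np.linalg.norm(predicted - next_states, axis=1)
    return np.argsort(errors)[:buffer_size]

# Synthetic offline dataset (hypothetical): linear dynamics plus some outliers.
rng = np.random.default_rng(0)
states = rng.normal(size=(200, 3))
actions = rng.normal(size=(200, 1))
next_states = 0.9 * states + 0.1 * actions
next_states[150:] += rng.normal(scale=5.0, size=(50, 3))

weights = fit_linear_transition_model(states, actions, next_states)
kept = select_experiences(states, actions, next_states, weights, buffer_size=20)
print(len(kept), "transitions selected for the replay buffer")
```

Ranking by prediction error is only one plausible reading of "most closely resembles the learned model"; the abstract does not specify the distance measure or how the policy enters the selection.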
Lugrandoside attenuates spinal cord injury by targeting peli1 and TLR4/NF-κB pathway to exert anti-inflammatory and anti-apoptotic effects
Purpose: To investigate the curative effect and mechanism of lugrandoside (LG) on spinal cord injury (SCI). Methods: We probed the expression of Pellino1 (peli1) in microglia and spinal cord tissues with different treatments of LG. Lipopolysaccharide (LPS) was used to activate the microglia. Furthermore, rats were used to establish an SCI model, and LG, at low and high concentrations, was administered to injured animals to ascertain whether LG exerted a therapeutic effect on SCI. Results: LG inhibited the activation and recruitment of glial cells by acting as a negative regulator of glial inflammation, and this reversed the targeting of peli1 and the TLR4/NF-κB pathway. Furthermore, the in vivo data showed that LG exerted a neuroprotective effect following SCI via anti-inflammatory and anti-apoptotic effects. Moreover, peli1 and TLR4/NF-κB were suppressed by LG stimuli. Conclusion: These results suggest that LG protects neural tissue against neuroinflammation and apoptosis by suppressing the TLR4/NF-κB pathway and negatively targeting peli1. The findings may provide new insights into the treatment of spinal cord injury.
Time-Sensitive Collaborative Filtering Algorithm with Feature Stability
In recommendation systems, the collaborative filtering algorithm is widely used. However, many problems in the recommendation field remain to be solved, such as low precision and the long tail of items. In this paper, we design an algorithm called FSTS to address low precision and the long tail. We adopt stability variables and time-sensitive factors to handle drift in users' interests and to improve prediction accuracy. Experiments show that, compared with Item-CF, the FSTS algorithm significantly improves precision, recall, coverage and popularity. At the same time, it can mine long-tail items and alleviate the long-tail phenomenon.
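The abstract above does not give the FSTS formulas, so the following is only a generic sketch of how a time-sensitive factor can be combined with item-based collaborative filtering. The exponential decay, the `decay_rate` parameter, the toy similarity table and the function name `time_decayed_scores` are assumptions for illustration, not the authors' method.

```python
import math
from collections import defaultdict

def time_decayed_scores(history, similarity, now, decay_rate=0.1):
    """Score unseen items for one user.

    Each past interaction contributes similarity * exp(-decay_rate * age),
    so recent interactions count more than old ones.
    """
    seen = {item for item, _ in history}
    scores = defaultdict(float)
    for item, timestamp in history:
        weight = math.exp(-decay_rate * (now - timestamp))
        for other, sim in similarity.get(item, {}).items():
            if other not in seen:
                scores[other] += sim * weight
    return dict(scores)

# Hypothetical item-item similarities and one user's (item, timestamp) history.
similarity = {"A": {"C": 0.9}, "B": {"D": 0.5}}
history = [("A", 0.0), ("B", 10.0)]  # A was seen long ago, B just now
scores = time_decayed_scores(history, similarity, now=10.0)
print(sorted(scores, key=scores.get, reverse=True))
```

In this toy case the recently triggered item D outranks C even though C has the higher raw similarity, which is the kind of interest-drift effect a time-sensitive factor is meant to capture.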
A fuzzy-clustering based approach for MADM handover in 5G ultra-dense networks
As global data traffic has increased significantly in recent years, the ultra-dense deployment of cellular networks (UDN) is being proposed as one of the key technologies in the fifth-generation mobile communications system (5G) to provide a much higher density of radio resources. The densification of small base stations (BSs) could introduce much higher inter-cell interference and lead users to reach the edge of coverage more frequently. As the current handover scheme was originally proposed for macro BSs, it could cause serious handover issues in UDN, i.e. ping-pong handover, handover failures and frequent handover. In order to address these handover challenges and provide a high quality of service (QoS) to the user in UDN, this paper proposes a novel handover scheme that integrates the advantages of both fuzzy logic and multiple attribute decision-making (MADM) algorithms to ensure that the handover process is triggered at the right time and the connection is switched to the optimal neighbouring BS. To further enhance the performance of the proposed scheme, this paper also adopts the subtractive clustering technique, using historical data to define the optimal membership functions within the fuzzy system. Performance results show that the proposed handover scheme outperforms traditional approaches and can significantly reduce the number of handovers and ping-pong handovers while maintaining QoS at a relatively high level. © 2019, Springer Science+Business Media, LLC, part of Springer Nature.
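The MADM step mentioned in the abstract above (choosing the optimal neighbouring BS from several attributes) can be sketched with simple additive weighting, a standard MADM method. The abstract does not say which MADM algorithm the paper uses, and it omits the fuzzy-logic stage entirely, so the attributes (signal strength, cell load), the weights and the function name `rank_candidates` are hypothetical.

```python
import numpy as np

def rank_candidates(attributes, weights, benefit):
    """Simple additive weighting over candidate base stations.

    attributes: one row per candidate, one column per attribute.
    weights:    importance of each attribute (assumed to sum to 1).
    benefit:    True where larger is better, False where smaller is better.
    """
    attributes = np.asarray(attributes, dtype=float)
    benefit = np.asarray(benefit, dtype=bool)
    lo = attributes.min(axis=0)
    hi = attributes.max(axis=0)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero
    normalised = (attributes - lo) / span   # min-max scale to [0, 1]
    normalised = np.where(benefit, normalised, 1.0 - normalised)  # flip costs
    return normalised @ np.asarray(weights, dtype=float)

# Hypothetical neighbours: [signal strength in dBm, cell load as a fraction].
candidates = [[-80.0, 0.9],
              [-85.0, 0.2],
              [-100.0, 0.1]]
scores = rank_candidates(candidates, weights=[0.6, 0.4], benefit=[True, False])
best = int(np.argmax(scores))
print("handover target: BS", best)
```

Here the second candidate wins because it combines a strong signal with a light load; a scheme like the one described would additionally decide when to trigger the handover at all.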
What influenced the lesion patterns and hemodynamic characteristics in patients with internal carotid artery stenosis? A retrospective study
• Blood perfusion influences ischemic lesions in patients with ICAS.
• Communicating arteries influence intracranial blood flow.
• TCD was a convenient and rapid tool to assess cerebral blood flow.