Imitation Learning from Observation with Automatic Discount Scheduling
Humans often acquire new skills through observation and imitation. For
robotic agents, learning from the plethora of unlabeled video demonstration
data available on the Internet necessitates imitating the expert without access
to its action, presenting a challenge known as Imitation Learning from
Observations (ILfO). A common approach to tackle ILfO problems is to convert
them into inverse reinforcement learning problems, utilizing a proxy reward
computed from the agent's and the expert's observations. Nonetheless, we
identify that tasks characterized by a progress dependency property pose
significant challenges for such approaches; in these tasks, the agent needs to
initially learn the expert's preceding behaviors before mastering the
subsequent ones. Our investigation reveals that the main cause is that the
reward signals assigned to later steps hinder the learning of initial
behaviors. To address this challenge, we present a novel ILfO framework that
enables the agent to master earlier behaviors before advancing to later ones.
We introduce an Automatic Discount Scheduling (ADS) mechanism that adaptively
alters the discount factor in reinforcement learning during the training phase,
prioritizing earlier rewards initially and gradually engaging later rewards
only when the earlier behaviors have been mastered. Our experiments, conducted
on nine Meta-World tasks, demonstrate that our method significantly outperforms
state-of-the-art methods across all tasks, including those that are unsolvable
by them. Comment: Accepted by ICLR 202
Krüppel-like factors in tumors: Key regulators and therapeutic avenues
Krüppel-like factors (KLFs) are a group of DNA-binding transcriptional regulators with multiple essential functions in various cellular processes, including proliferation, migration, inflammation, and angiogenesis. The aberrant expression of KLFs is often found in tumor tissues and is essential for tumor development. At the molecular level, KLFs regulate multiple signaling pathways and mediate crosstalk among them. Some KLFs may also be molecular switches for specific biological signals, driving their transition from tumor suppressors to promoters. At the histological level, the abnormal expression of KLFs is closely associated with tumor cell stemness, proliferation, apoptosis, and alterations in the tumor microenvironment. Notably, the role of each KLF in tumors varies according to tumor type and different stages of tumor development rather than being invariant. In this review, we focus on the advances in the molecular biology of KLFs, particularly the regulation of several classical signaling pathways by these factors, and the critical role of KLFs in tumor development. We also highlight their strong potential as molecular targets in tumor therapy and suggest potential directions for clinical translational research.
Thrombospondin-1 Gene Deficiency Worsens the Neurological Outcomes of Traumatic Brain Injury in Mice
Background: Thrombospondin-1 (TSP-1) is an extracellular matrix protein that plays multiple physiological and pathophysiological roles in the brain. Experimental reports suggest that TSP-1 may have an adverse role in neuronal function recovery under certain injury conditions. However, the roles of TSP-1 in traumatic brain injury (TBI) have not been elucidated. In this study we investigated for the first time the roles of TSP-1 in a controlled cortical impact (CCI) model of TBI in TSP-1 knockout (TSP-1 KO) and wild type (WT) mice. Methods: We examined blood-brain barrier (BBB) damage at 1 day post-TBI by measuring Evans Blue leakage, and neurological functional recovery at 3 weeks post-TBI by measuring neurological severity score (NSS), wire gripping, corner test and Morris Water Maze (MWM). Mechanistically, we quantified pro-angiogenic biomarkers including cerebral vessel density, vascular endothelial growth factor (VEGF) and angiopoietin-1 (Ang-1) protein expression, synaptic biomarker synaptophysin, and synaptogenesis marker brain-derived neurotrophic factor (BDNF) protein expression in contralateral and ipsilateral (peri-lesion) cortex at 21 days after TBI using immunohistochemistry and Western Blot. Results: TSP-1 is upregulated in the early phase of TBI in WT mice. Compared to WT mice, TSP-1 KO mice (1) showed significantly worse TBI-induced BBB leakage at 1 day after TBI; (2) had similar lesion size to WT mice at 3 weeks after TBI; (3) exhibited significantly worse neurological deficits in motor and cognitive functions; (4) had no significant difference in cerebral vessel density, but a significant increase in VEGF and Ang-1 protein expression in peri-lesion cortex; (5) had significantly increased BDNF but not synaptophysin protein levels in peri-lesion cortex compared to sham, but both synaptophysin and BDNF expression were significantly decreased in contralateral cortex compared to WT.
Conclusion: Our results suggest that TSP-1 may be beneficial for maintaining BBB integrity in the early phase and functional recovery in the late phase after TBI. The molecular mechanisms of TSP-1 in early BBB pathophysiology and long-term neurological function recovery after TBI need to be further investigated.
On the Role of Discount Factor in Offline Reinforcement Learning
Offline reinforcement learning (RL) enables effective learning from
previously collected data without exploration, which shows great promise in
real-world applications when exploration is expensive or even infeasible. The
discount factor, γ, plays a vital role in improving online RL sample
efficiency and estimation accuracy, but the role of the discount factor in
offline RL is not well explored. This paper examines two distinct effects of
γ in offline RL with theoretical analysis, namely the regularization
effect and the pessimism effect. On the one hand, γ is a regulator to
trade-off optimality with sample efficiency upon existing offline techniques.
On the other hand, lower guidance γ can also be seen as a way of
pessimism where we optimize the policy's performance in the worst possible
models. We empirically verify the above theoretical observation with tabular
MDPs and standard D4RL tasks. The results show that the discount factor plays
an essential role in the performance of offline RL algorithms, both under small
data regimes upon existing offline methods and in large data regimes without
other conservative methods. Comment: Thirty-ninth International Conference on Machine Learning
Modeling based-on semi-tensor product for the start-up stage of the radiant cooling system
For nonlinear systems with multiple outputs and multiple inputs which cannot be decoupled, we make use of the sampling data of the real system to obtain a fuzzy relation matrix model via the semi-tensor product (STP) operation of matrices, and establish the mathematical model for a complicated system based on STP. This method has been applied to analyze the dynamic performance of the floor in a radiant floor system. In this paper, a radiant floor cooling system based on concrete core radiant floors is examined. To analyze the dynamic behaviour of floors during non-working-time operation, a fuzzy relation matrix model based on STP is established. The model is used to quantitatively estimate related parameters in the start-up period of the floor systems under the impact of the outdoor environment, such as temperature and humidity, during the actual cool-down times. The results show that the time delay of the floor surface temperature and indoor air temperature due to thermal inertia is not evident, while the amplitude of the outdoor air temperature fluctuation is reduced significantly.
Flow to Control: Offline Reinforcement Learning with Lossless Primitive Discovery
Offline reinforcement learning (RL) enables the agent to effectively learn from logged data, which significantly extends the applicability of RL algorithms in real-world scenarios where exploration can be expensive or unsafe. Previous works have shown that extracting primitive skills from the recurring and temporally extended structures in the logged data yields better learning. However, these methods suffer greatly when the primitives have limited representation ability to recover the original policy space, especially in offline settings. In this paper, we give a quantitative characterization of the performance of offline hierarchical learning and highlight the importance of learning lossless primitives. To this end, we propose to use a flow-based structure as the representation for low-level policies. This allows us to represent the behaviors in the dataset faithfully while keeping the expression ability to recover the whole policy space. We show that such lossless primitives can drastically improve the performance of hierarchical policies. The experimental results and extensive ablation studies on the standard D4RL benchmark show that our method has a good representation ability for policies and achieves superior performance in most tasks.
Protective Effects and Mechanism of a Novel Probiotic Strain Ligilactobacillus salivarius YL20 against Cronobacter sakazakii-Induced Necrotizing Enterocolitis In Vitro and In Vivo
Exposure to probiotics in early life contributes to host intestinal development and prevention of necrotizing enterocolitis (NEC). Cronobacter sakazakii (C. sakazakii), an opportunistic pathogen, can cause NEC, bacteremia, and meningitis in neonates, but research on probiotics against C. sakazakii is limited relative to other enteropathogens. Here, the protective effect and mechanism of a novel probiotic Ligilactobacillus salivarius (L. salivarius) YL20 isolated from breast milk on C. sakazakii-induced intestinal injury were explored by using two in vitro models, including a C. sakazakii-infected intestinal organoid model and an intestinal barrier model, as well as an in vivo experimental animal model. Our results revealed that L. salivarius YL20 could promote epithelial cell proliferation in intestinal organoids, rescue budding-impaired organoids, prevent the decrease of mRNA levels of leucine-rich repeat containing G protein-coupled receptor 5 (Lgr5), zonula occludens-1 (Zo-1) and Occludin, and reverse the C. sakazakii-induced low level of Mucin 2 (MUC2) in intestinal organoids. Additionally, YL20 could inhibit C. sakazakii invasion, increase the expression of ZO-1 and occludin in C. sakazakii-infected HT-29 cells, and reverse the TEER decrease and corresponding permeability increase across C. sakazakii-infected Caco-2 monolayers. Furthermore, YL20 administration could alleviate NEC in C. sakazakii-infected neonatal mice by increasing the mouse survival ratio, decreasing pathology scores, and downregulating pro-inflammatory cytokines. Meanwhile, YL20 could also enhance intestinal barrier function in vivo by increasing the number of goblet cells, the level of MUC2 and the expression of ZO-1. Our overall findings demonstrated for the first time the beneficial effects of L. salivarius YL20 against C. sakazakii-induced NEC by improving intestinal stem cell function and enhancing intestinal barrier integrity.