89 research outputs found
On the Robustness of Safe Reinforcement Learning under Observational Perturbations
Safe reinforcement learning (RL) trains a policy to maximize the task reward
while satisfying safety constraints. While prior works focus on the performance
optimality, we find that the optimal solutions of many safe RL problems are not
robust and safe against carefully designed observational perturbations. We
formally analyze the unique properties of designing effective state adversarial
attackers in the safe RL setting. We show that baseline adversarial attack
techniques for standard RL tasks are not always effective for safe RL and
propose two new approaches: one maximizes the cost and the other maximizes
the reward. One interesting and counter-intuitive finding is that the maximum
reward attack is strong, as it can both induce unsafe behaviors and make the
attack stealthy by maintaining the reward. We further propose a more effective
adversarial training framework for safe RL and evaluate it via comprehensive
experiments. This paper provides pioneering work investigating the safety and
robustness of RL under observational attacks for future safe RL studies.
Comment: 30 pages, 4 figures, 8 tables
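The two attack objectives described in this abstract can be illustrated with a deliberately tiny sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the one-dimensional state (distance to a hazard), the hand-written `policy`, the closed-form `reward_critic` and `cost_critic`, and the grid search in `attack`. It only shows how a bounded perturbation of the observation, chosen to maximize either the cost or the reward, can push a policy that is safe on clean observations into unsafe behavior.

```python
def clip(x, lo=0.0, hi=1.0):
    return max(lo, min(hi, x))

def policy(observed_distance):
    # Hypothetical policy: move faster when the hazard looks farther away.
    return clip(observed_distance)

def reward_critic(true_distance, speed):
    # Hypothetical reward estimate: faster movement earns more task reward.
    return speed

def cost_critic(true_distance, speed):
    # Hypothetical cost estimate: cost accrues when the speed exceeds
    # what the true distance to the hazard allows.
    return max(0.0, speed - true_distance)

def attack(true_distance, epsilon, critic, steps=21):
    """Grid-search the perturbation in [-epsilon, epsilon] that maximizes
    the critic value of the action taken on the perturbed observation."""
    deltas = [-epsilon + 2 * epsilon * i / (steps - 1) for i in range(steps)]
    return max(deltas, key=lambda d: critic(true_distance, policy(true_distance + d)))

s, eps = 0.3, 0.2
for name, critic in [("max-cost", cost_critic), ("max-reward", reward_critic)]:
    d = attack(s, eps, critic)
    a = policy(s + d)
    print(name, "delta:", d, "reward:", reward_critic(s, a), "cost:", cost_critic(s, a))
```

In this toy setting both objectives select the same perturbation, so the reward-maximizing attack raises the cost while the reward stays high, which is the kind of stealthy behavior the abstract describes.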
Modeling and analysis of a stochastic giving-up-smoking model with quit smoking duration
Smoking has gradually become a very common behavior, and its pattern differs across population groups. Because individuals differ in how long they stay quit and environmental factors interfere with the spread of smoking behavior, we establish a stochastic giving-up-smoking model with quit-smoking duration. We also consider the saturated incidence rate. The total population is composed of potential smokers, smokers, quitters and removed individuals. By using Itô's formula and constructing appropriate Lyapunov functions, we first ensure the existence of a unique global positive solution of the stochastic model. In addition, a threshold condition for extinction and permanence of smoking behavior is deduced. If the intensity of white noise is small and $\widetilde{\mathcal{R}}_0 < 1$, smokers will eventually become extinct. If $\widetilde{\mathcal{R}}_0 > 1$, smoking will persist. Then, the sufficient condition for the existence of a unique stationary distribution of the smoking phenomenon is obtained as $R_0^s > 1$. Finally, the conclusions are illustrated by numerical simulations.
Threshold dynamics of a stochastic mathematical model for Wolbachia infections
A stochastic mathematical model is proposed to study how environmental heterogeneity and the augmentation of mosquitoes with Wolbachia bacteria affect the outcomes of dengue disease. The existence and uniqueness of the positive solutions of the system are studied. Then the V-geometric ergodicity and stochastic ultimate boundedness are investigated. Further, threshold conditions for successful population replacement are derived and the existence of a unique ergodic steady-state distribution of the system is explored. The results show that the ratio of infected to uninfected mosquitoes has a great influence on population replacement. Moreover, environmental noise plays a significant role in the control of dengue fever.
Bifurcation analysis of a tumour-immune model with nonlinear killing rate as state-dependent feedback control
Impulsive control strategies have been widely used in cancer treatment, and previous studies have always considered linear impulsive control. We propose a novel tumour-immune model with nonlinear killing rate as state-dependent feedback control, which can better reflect the saturation effects of the tumour and immune cell mortalities due to chemotherapy, and its dynamic behaviors are investigated. The paper aims to discuss the transcritical and subcritical bifurcations of the model. To begin with, the threshold conditions for tumour eradication and tumour persistence in the model without pulse interventions are provided. We define the Poincaré map of the proposed model and then address the existence and orbital asymptotic stability of the model’s tumour-free periodic solution. Furthermore, by using the bifurcation theory of the discrete one-parameter family of maps, which is determined by the Poincaré map, we investigate the model’s transcritical and subcritical pitchfork bifurcations with respect to the key parameter.
Threshold dynamics of a stochastic model of intermittent androgen deprivation therapy for prostate cancer
Intermittent androgen deprivation therapy is often used to treat prostate cancer, but there are few mathematical modelling studies of it. To explore the mechanisms of such therapy, we describe intermittent therapy with impulsive differential equations, then we propose a novel mathematical model of intermittent androgen deprivation therapy with white noise. We first study the model’s basic properties, including the existence and uniqueness of the solution. By using the theory of stochastic differential equations, we investigate the thresholds for the extinction and persistence of prostate cancer cells, which are markedly affected by the antigenicity of tumours and the noise parameters. Moreover, sufficient conditions for the stationary distribution and ergodicity of the system are provided. The results show that reducing the period of pulsed interventions or increasing the dosages (or frequencies) of the therapy will be helpful for curing prostate cancer.
Toward improving control performance of myoelectric arm prosthesis by adding wrist position feedback
Wearing a myoelectric prosthesis is a basic way for limb amputees to restore their lost limb functions in the activities of daily living (ADLs). However, it is estimated that around 40% of amputees refuse the prosthesis. One primary reason may be that current prostheses lack appropriate sensory feedback. Currently, amputees depend only on visual feedback (Vis-FB) when using their arm prostheses, which makes it difficult for them to accurately control the wrist position, a capability that is vital for flexible manipulation in ADLs. This manuscript presents a myoelectric arm prosthesis with wrist position feedback (WP-FB). To study how strongly position feedback affects prosthetic control, two tests were performed. The vibrotactile perception range test analyzes the sensitivity of human vibration perception and obtains the optimal perception range used in the sensory feedback test. The sensory feedback test analyzes the effectiveness of the position feedback by comparing three feedback methods: Vis-FB, WP-FB, and a combination of Vis-FB and WP-FB (VP-FB). These tests were conducted by asking six able-bodied subjects to perform 20 movement combinations of five target positions. The WP-FB was transiently activated with five vibrating motors embedded in an armband to stimulate the arm stump when the prosthetic wrist rotates to the target positions. Our experimental results showed that when WP-FB was added to the prosthetic control, the absolute angular error (AAE) of the prosthetic wrist declined from 4.50° to 1.08° while the success rate 3 (SR3) increased from 0.34 to 0.84. This study demonstrates the importance of WP-FB to the effective control of the arm prosthesis.
CrossCodeEval: A Diverse and Multilingual Benchmark for Cross-File Code Completion
Code completion models have made significant progress in recent years, yet
current popular evaluation datasets, such as HumanEval and MBPP, predominantly
focus on code completion tasks within a single file. This over-simplified
setting falls short of representing the real-world software development
scenario where repositories span multiple files with numerous cross-file
dependencies, and accessing and understanding cross-file context is often
required to complete the code correctly.
To fill in this gap, we propose CrossCodeEval, a diverse and multilingual
code completion benchmark that necessitates an in-depth cross-file contextual
understanding to complete the code accurately. CrossCodeEval is built on a
diverse set of real-world, open-sourced, permissively-licensed repositories in
four popular programming languages: Python, Java, TypeScript, and C#. To create
examples that strictly require cross-file context for accurate completion, we
propose a straightforward yet efficient static-analysis-based approach to
pinpoint the use of cross-file context within the current file.
Extensive experiments on state-of-the-art code language models like CodeGen
and StarCoder demonstrate that CrossCodeEval is extremely challenging when the
relevant cross-file context is absent, and we see clear improvements when
adding this context to the prompt. However, despite such improvements, the
pinnacle of performance remains notably unattained even with the
highest-performing model, indicating that CrossCodeEval is also capable of
assessing a model's capability to leverage extensive context for better
code completion. Finally, we benchmark various methods for retrieving
cross-file context, and show that CrossCodeEval can also be used to measure the
capability of code retrievers.
Comment: To appear at NeurIPS 2023 (Datasets and Benchmarks Track)
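As a rough illustration of what pinpointing cross-file usage within a file can look like, the sketch below uses Python's standard `ast` module to list the lines where a name imported from a project-local module is referenced. It is a simplified stand-in, not the benchmark's actual pipeline: the function name `cross_file_uses`, the `local_modules` argument, and the example module names are all hypothetical.

```python
import ast

def cross_file_uses(source, local_modules):
    """Return (line, name, module) for each use of a name that was
    imported from a project-local module (hypothetical helper)."""
    tree = ast.parse(source)
    imported = {}  # bound name -> module it came from
    for node in ast.walk(tree):
        if isinstance(node, ast.ImportFrom):
            module = node.module or ""
            # Relative imports (level > 0) always point inside the project.
            if node.level > 0 or module.split(".")[0] in local_modules:
                for alias in node.names:
                    imported[alias.asname or alias.name] = "." * node.level + module
        elif isinstance(node, ast.Import):
            for alias in node.names:
                root = alias.name.split(".")[0]
                if root in local_modules:
                    imported[alias.asname or root] = alias.name
    uses = []
    for node in ast.walk(tree):
        # Only count reads of the imported name, not assignments to it.
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Load):
            if node.id in imported:
                uses.append((node.lineno, node.id, imported[node.id]))
    return sorted(uses)

example = "\n".join([
    "import os",
    "from utils.io import load_config",
    "import models as m",
    "",
    "cfg = load_config(os.getcwd())",
    "net = m.Net(cfg)",
])
print(cross_file_uses(example, {"utils", "models"}))
```

Lines flagged this way are candidates for completion examples that cannot be solved from the current file alone; a real pipeline would also need to handle star imports, re-exports, and languages other than Python.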
ContraGen: Effective Contrastive Learning For Causal Language Model
Despite exciting progress in large-scale language generation, the
expressiveness of its representations is severely limited by the
anisotropy issue where the hidden representations are distributed into
a narrow cone in the vector space. To address this issue, we present ContraGen,
a novel contrastive learning framework to improve the representation with
better uniformity and discrimination. We assess ContraGen on a wide range of
downstream tasks in natural and programming languages. We show that ContraGen
can effectively enhance both uniformity and discrimination of the
representations and lead to the desired improvement on various language
understanding tasks where discriminative representations are crucial for
attaining good performance. Specifically, we attain relative improvement
on the Semantic Textual Similarity tasks and on Code-to-Code Search
tasks. Furthermore, by improving the expressiveness of the representations,
ContraGen also boosts the source code generation capability with relative
improvement on execution accuracy on the HumanEval benchmark.
Comment: 10 pages
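For readers unfamiliar with contrastive objectives, the following is a minimal sketch of a generic InfoNCE-style loss over sequence embeddings. It is not ContraGen's exact objective, which the abstract does not specify; the function names, the cosine similarity, and the temperature value are assumptions chosen for illustration. Each anchor is pulled toward its own positive and pushed away from the other positives in the batch.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(anchors, positives, temperature=0.05):
    """Mean InfoNCE loss: positives[i] is the positive for anchors[i];
    every other positives[j] in the batch acts as a negative."""
    n = len(anchors)
    total = 0.0
    for i in range(n):
        logits = [cosine(anchors[i], positives[j]) / temperature for j in range(n)]
        # Log-sum-exp with the maximum subtracted for numerical stability.
        m = max(logits)
        log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_denominator - logits[i]
    return total / n

anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.0], [0.0, 1.0]]
swapped = [[0.0, 1.0], [1.0, 0.0]]
print(info_nce(anchors, aligned), info_nce(anchors, swapped))
```

The loss is near zero when each anchor is closest to its own positive and grows when the pairs are mismatched, which is the sense in which such objectives encourage discriminative representations.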
Effects of Different Exogenous Substances on the Protein Conformation and in Vitro Digestion Characteristics of Low-salt Tilapia Surimi
The effects of transglutaminase (TGase), hydroxypropyl distarch phosphate (HDP), gellan gum and their complex (THG) on the water distribution and protein conformation of low-salt tilapia surimi gel prepared with microwave and ultrasound were analyzed. In addition, the effects of different exogenous substances on the characteristics of low-salt tilapia fish cake were explored through in vitro digestion experiments. The results showed that compared with the control group, THG increased the bound water and immovable water of surimi to 98.71% and 14.75%, respectively, and significantly decreased the free water content (P<0.05). Moreover, THG promoted the transformation of α-helix to β-sheet, β-turn and random coil structures. TGase and THG (0.4%) played important roles in the gastric emptying rate, protein digestibility and degree of protein hydrolysis of low-salt tilapia cake. THG significantly promoted protein decomposition into aggregates with smaller particle size (P<0.05). After gastric and duodenal digestion, the THG group products were more transparent and clear in color. Laser confocal microscopy showed that the red fluorescence highlights of the THG group were significantly reduced, indicating that proteins had been fully digested. Hence, compared with a single exogenous substance, THG not only promoted the binding of water molecules and proteins and induced the change of protein conformation, but also facilitated the exposure of hydrophobic protein groups and the interaction between proteins, and promoted the digestion and absorption of surimi products in the stomach and duodenum. This project provides a theoretical reference for research on the gel properties of tilapia surimi and the development and application of tilapia fish cake.
Ligand-Specific Factors Influencing GLP-1 Receptor Post-Endocytic Trafficking and Degradation in Pancreatic Beta Cells.
The glucagon-like peptide-1 receptor (GLP-1R) is an important regulator of blood glucose homeostasis. Ligand-specific differences in membrane trafficking of the GLP-1R influence its signalling properties and therapeutic potential in type 2 diabetes. Here, we have evaluated how different factors combine to control the post-endocytic trafficking of GLP-1R to recycling versus degradative pathways. Experiments were performed in primary islet cells, INS-1 832/3 clonal beta cells and HEK293 cells, using bioorthogonal labelling of GLP-1R to determine its localisation and degradation after treatment with GLP-1, exendin-4 and several further GLP-1R agonist peptides. We also characterised the effect of a rare GLP1R coding variant, T149M, and the role of endosomal peptidase endothelin-converting enzyme-1 (ECE-1), in GLP-1R trafficking. Our data reveal how treatment with GLP-1 versus exendin-4 is associated with preferential GLP-1R targeting towards a recycling pathway. GLP-1, but not exendin-4, is a substrate for ECE-1, and the resultant propensity to intra-endosomal degradation, in conjunction with differences in binding affinity, contributes to alterations in GLP-1R trafficking behaviours and degradation. The T149M GLP-1R variant shows reduced signalling and internalisation responses, which is likely to be due to disruption of the cytoplasmic region that couples to intracellular effectors. These observations provide insights into how ligand- and genotype-specific factors can influence GLP-1R trafficking.