89 research outputs found

On the Robustness of Safe Reinforcement Learning under Observational Perturbations

Author: Cen Zhepeng
Guo Zijian
Li Bo
Liu Zuxin
Tan Jie
Zhang Huan
Zhao Ding
Publication date: 03/10/2022

Safe reinforcement learning (RL) trains a policy to maximize the task reward while satisfying safety constraints. While prior works focus on performance optimality, we find that the optimal solutions of many safe RL problems are not robust and safe against carefully designed observational perturbations. We formally analyze the unique properties of designing effective state adversarial attackers in the safe RL setting. We show that baseline adversarial attack techniques for standard RL tasks are not always effective for safe RL and propose two new approaches: one maximizes the cost and the other maximizes the reward. One interesting and counter-intuitive finding is that the maximum reward attack is strong, as it can both induce unsafe behaviors and make the attack stealthy by maintaining the reward. We further propose a more effective adversarial training framework for safe RL and evaluate it via comprehensive experiments. This paper provides pioneering work on the safety and robustness of RL under observational attacks for future safe RL studies. Comment: 30 pages, 4 figures, 8 tables

arXiv.org e-Print Archive

Modeling and analysis of a stochastic giving-up-smoking model with quit smoking duration

Author: Yajuan Guo
Yawei Liu
Yuanshun Tan
Zijian Liu
Publication venue: AIMS Press
Publication date: 01/11/2023

Smoking has gradually become a very common behavior, and it takes different forms in different groups. Because individual smoking cessation times differ and environmental factors interfere with the spread of smoking behavior, we establish a stochastic giving-up-smoking model with quit-smoking duration. We also consider the saturated incidence rate. The total population is composed of potential smokers, smokers, quitters and removed individuals. By using Itô's formula and constructing appropriate Lyapunov functions, we first ensure the existence of a unique global positive solution of the stochastic model. In addition, a threshold condition for extinction and permanence of smoking behavior is deduced. If the intensity of white noise is small and $\widetilde{\mathcal{R}}_0 < 1$, smokers will eventually become extinct. If $\widetilde{\mathcal{R}}_0 > 1$, smoking will persist. Then, the sufficient condition for the existence of a unique stationary distribution of the smoking phenomenon is studied as $R_0^s > 1$. Finally, the conclusions are illustrated by numerical simulations.

Directory of Open Access Journals

Threshold dynamics of a stochastic mathematical model for Wolbachia infections

Author: Cheke Robert
Chen Zhou
Liu Zijian
Tan Yuanshun
Yang Jin
Publication venue: Informa UK Limited
Publication date: 07/07/2023

A stochastic mathematical model is proposed to study how environmental heterogeneity and the augmentation of mosquitoes with Wolbachia bacteria affect the outcomes of dengue disease. The existence and uniqueness of the positive solutions of the system are studied. Then the V-geometric ergodicity and stochastic ultimate boundedness are investigated. Further, threshold conditions for successful population replacement are derived and the existence of a unique ergodic steady-state distribution of the system is explored. The results show that the ratio of infected to uninfected mosquitoes has a great influence on population replacement. Moreover, environmental noise plays a significant role in the control of dengue fever.

Greenwich Academic Literature Archive

Bifurcation analysis of a tumour-immune model with nonlinear killing rate as state-dependent feedback control

Author: Cheke Robert
Guan Likud
Liu Zijian
Tan Yuanshun
Yang Jin
Publication venue: World Scientific Pub Co Pte Lt
Publication date: 22/08/2022

Impulsive control strategies have been widely used in cancer treatment, and previous studies have always considered linear impulsive control. We propose a novel tumour-immune model with nonlinear killing rate as state-dependent feedback control, which can better reflect the saturation effects of the tumour and immune cell mortalities due to chemotherapy, and its dynamic behaviors are investigated. The paper aims to discuss the transcritical and subcritical bifurcations of the model. To begin with, the threshold conditions for tumour eradication and tumour persistence in the model without pulse interventions are provided. We define the Poincaré map of the proposed model and then address the existence and orbital asymptotic stability of the model’s tumour-free periodic solution. Furthermore, by using the bifurcation theory of the discrete one-parameter family of maps, which is determined by the Poincaré map, we investigate the model’s transcritical and subcritical pitchfork bifurcations with respect to the key parameter.

Greenwich Academic Literature Archive

Threshold dynamics of a stochastic model of intermittent androgen deprivation therapy for prostate cancer

Author: Cheke Robert A.
Chen Lin
Liu Zijian
Tan Yuanshun
Yang Jin
Publication venue: Elsevier BV
Publication date: 01/09/2021

Intermittent androgen deprivation therapy is often used to treat prostate cancer, but there are few mathematical modelling studies of it. To explore the mechanisms of such therapy, we describe intermittent therapy with impulsive differential equations and then propose a novel mathematical model of intermittent androgen deprivation therapy with white noise. We first study the model’s basic properties, including the existence and uniqueness of the solution. By using the theory of stochastic differential equations, we investigate the thresholds for the extinction and persistence of prostate cancer cells, which are markedly affected by the antigenicity of tumours and noise parameters. Moreover, sufficient conditions for the stationary distribution and ergodicity of the system are provided. The results show that reducing the period of pulsed interventions or increasing the dosages (or frequencies) of the therapy will be helpful for curing prostate cancer.

Greenwich Academic Literature Archive

Toward improving control performance of myoelectric arm prosthesis by adding wrist position feedback

Author: Guanglin Li
Lan Tian
Xiangxin Li
Yingxiao Tan
Yue Zheng
Zijian Yang
Publication venue: Frontiers Media SA
Publication date: 01/07/2022

Wearing a myoelectric prosthesis is a basic way for limb amputees to restore their lost limb functions in the activities of daily living (ADLs). However, it is estimated that around 40% of amputees refuse the prosthesis. One of the primary reasons may be that current prostheses lack appropriate sensory feedback. Currently, amputees depend only on visual feedback (Vis-FB) when using their arm prostheses. This makes it difficult for them to accurately control the wrist position, which is vital for flexible manipulation in ADLs. In this manuscript, we design a myoelectric arm prosthesis with wrist position feedback (WP-FB). To study the effect of position feedback on prosthetic control, two tests were performed. The vibrotactile perception range test analyzes the perception sensitivity of vibration in humans and obtains the optimal perception range used in the sensory feedback test. The sensory feedback test analyzes the effectiveness of the position feedback by comparing three feedback methods: Vis-FB, WP-FB, and a combination of Vis-FB and WP-FB (VP-FB). These tests were conducted by asking six able-bodied subjects to perform 20 movement combinations of five target positions. The WP-FB was transiently activated with five vibrating motors embedded in an armband to stimulate the arm stump when the prosthetic wrist rotated to the target positions. Our experimental results showed that when WP-FB was added to the prosthetic control, the absolute angular error (AAE) of the prosthetic wrist declined from 4.50° to 1.08° while the success rate 3 (SR3) increased from 0.34 to 0.84. This study demonstrates the importance of WP-FB to the effective control of the arm prosthesis.

Directory of Open Access Journals

CrossCodeEval: A Diverse and Multilingual Benchmark for Cross-File Code Completion

Author: Ahmad Wasi Uddin
Bhatia Parminder
Ding Hantian
Ding Yangruibo
Jain Nihal
Nallapati Ramesh
Ramanathan Murali Krishna
Roth Dan
Tan Ming
Wang Zijian
Xiang Bing
Publication date: 16/11/2023

Code completion models have made significant progress in recent years, yet current popular evaluation datasets, such as HumanEval and MBPP, predominantly focus on code completion tasks within a single file. This over-simplified setting falls short of representing the real-world software development scenario where repositories span multiple files with numerous cross-file dependencies, and accessing and understanding cross-file context is often required to complete the code correctly. To fill this gap, we propose CrossCodeEval, a diverse and multilingual code completion benchmark that necessitates an in-depth cross-file contextual understanding to complete the code accurately. CrossCodeEval is built on a diverse set of real-world, open-sourced, permissively-licensed repositories in four popular programming languages: Python, Java, TypeScript, and C#. To create examples that strictly require cross-file context for accurate completion, we propose a straightforward yet efficient static-analysis-based approach to pinpoint the use of cross-file context within the current file. Extensive experiments on state-of-the-art code language models like CodeGen and StarCoder demonstrate that CrossCodeEval is extremely challenging when the relevant cross-file context is absent, and we see clear improvements when adding this context to the prompt. However, despite such improvements, the pinnacle of performance remains notably unattained even with the highest-performing model, indicating that CrossCodeEval is also capable of assessing a model's capability in leveraging extensive context to make better code completions. Finally, we benchmark various methods for retrieving cross-file context, and show that CrossCodeEval can also be used to measure the capability of code retrievers. Comment: To appear at NeurIPS 2023 (Datasets and Benchmarks Track)

arXiv.org e-Print Archive

ContraGen: Effective Contrastive Learning For Causal Language Model

Author: Ahmad Wasi Uddin
Bhatia Parminder
Jain Nihal
Li Xiaopeng
Ma Xiaofei
Nallapati Ramesh
Nan Feng
Ray Baishakhi
Tan Ming
Wang Zijian
Xiang Bing
Zhang Dejiao
Publication date: 03/10/2022

Despite exciting progress in large-scale language generation, the expressiveness of its representations is severely limited by the anisotropy issue where the hidden representations are distributed into a narrow cone in the vector space. To address this issue, we present ContraGen, a novel contrastive learning framework to improve the representation with better uniformity and discrimination. We assess ContraGen on a wide range of downstream tasks in natural and programming languages. We show that ContraGen can effectively enhance both uniformity and discrimination of the representations and lead to the desired improvement on various language understanding tasks where discriminative representations are crucial for attaining good performance. Specifically, we attain 44% relative improvement on the Semantic Textual Similarity tasks and 34% on Code-to-Code Search tasks. Furthermore, by improving the expressiveness of the representations, ContraGen also boosts the source code generation capability with 9% relative improvement on execution accuracy on the HumanEval benchmark. Comment: 10 pages

arXiv.org e-Print Archive

Effects of Different Exogenous Substances on the Protein Conformation and in Vitro Digestion Characteristics of Low-salt Tilapia Surimi

Author: Baiqiao OU
Min QIAN
Qiaoyu LIU
Weidong BAI
Xiaoyan LIU
Yuehua YE
Zijian TAN
Publication venue: The editorial department of Science and Technology of Food Industry
Publication date: 01/01/2024

The effects of transglutaminase (TGase), hydroxypropyl distarch phosphate (HDP), gellan gum and their complex (THG) on the water distribution and protein conformation of low-salt tilapia surimi gel prepared with microwave and ultrasound were analyzed. In addition, the effects of different exogenous substances on the characteristics of low-salt tilapia fish cake were explored through in vitro digestion experiments. The results showed that, compared with the control group, THG increased the bound water and immobilized water of surimi to 98.71% and 14.75%, respectively, and significantly decreased the free water content (P<0.05). Moreover, THG promoted the transformation of α-helix to β-sheet, β-turn and random coil structures. TGase and THG (0.4%) played important roles in the gastric emptying rate, protein digestibility and degree of protein hydrolysis of low-salt tilapia cake. THG significantly promoted protein decomposition into aggregates with smaller particle size (P<0.05). After digestion in the stomach and duodenum, the THG group products were more transparent and clear in color. Laser confocal microscopy showed that the red fluorescence highlights of the THG group were significantly reduced, indicating that the proteins had been fully digested. Hence, compared with a single exogenous substance, THG not only promoted the binding of water molecules and proteins and induced changes in protein conformation, but also facilitated the exposure of hydrophobic protein groups and the interaction between proteins, and promoted the digestion and absorption of surimi products in the stomach and duodenum. This project provides a theoretical reference for research on the gel properties of tilapia surimi and the development and application of tilapia fish cake.

Directory of Open Access Journals

Ligand-Specific Factors Influencing GLP-1 Receptor Post-Endocytic Trafficking and Degradation in Pancreatic Beta Cells.

Author: Bitsi Stavroula
Bloom Stephen R
Broichhagen Johannes
Chen Shiqian
Corrêa Ivan R
David Alessia
Fang Zijian
Hodson David J
Jones Ben
Manchanda Yusman
Pickford Philip
Reimann Frank
Rutter Guy A
Salem Victoria
Shchepinova Maria M
Tan Tricia
Tate Edward W
Tomas Alejandra
Publication venue: Int J Mol Sci
Publication date: 01/01/2020

The glucagon-like peptide-1 receptor (GLP-1R) is an important regulator of blood glucose homeostasis. Ligand-specific differences in membrane trafficking of the GLP-1R influence its signalling properties and therapeutic potential in type 2 diabetes. Here, we have evaluated how different factors combine to control the post-endocytic trafficking of GLP-1R to recycling versus degradative pathways. Experiments were performed in primary islet cells, INS-1 832/3 clonal beta cells and HEK293 cells, using bioorthogonal labelling of GLP-1R to determine its localisation and degradation after treatment with GLP-1, exendin-4 and several further GLP-1R agonist peptides. We also characterised the effect of a rare GLP1R coding variant, T149M, and the role of endosomal peptidase endothelin-converting enzyme-1 (ECE-1), in GLP-1R trafficking. Our data reveal how treatment with GLP-1 versus exendin-4 is associated with preferential GLP-1R targeting towards a recycling pathway. GLP-1, but not exendin-4, is a substrate for ECE-1, and the resultant propensity to intra-endosomal degradation, in conjunction with differences in binding affinity, contributes to alterations in GLP-1R trafficking behaviours and degradation. The T149M GLP-1R variant shows reduced signalling and internalisation responses, which is likely to be due to disruption of the cytoplasmic region that couples to intracellular effectors. These observations provide insights into how ligand- and genotype-specific factors can influence GLP-1R trafficking.

University of Birmingham Research Portal

Spiral - Imperial College Digital Repository

Apollo (Cambridge)

DR-NTU (Digital Repository of NTU)

MPG.PuRe