    Defense Against Reward Poisoning Attacks in Reinforcement Learning

    We study defense strategies against reward poisoning attacks in reinforcement learning. As a threat model, we consider attacks that minimally alter rewards to make the attacker's target policy uniquely optimal under the poisoned rewards, with the optimality gap specified by an attack parameter. Our goal is to design agents that are robust against such attacks in terms of the worst-case utility w.r.t. the true, unpoisoned, rewards while computing their policies under the poisoned rewards. We propose an optimization framework for deriving optimal defense policies, both when the attack parameter is known and unknown. Moreover, we show that defense policies that are solutions to the proposed optimization problems have provable performance guarantees. In particular, we provide the following bounds with respect to the true, unpoisoned, rewards: a) lower bounds on the expected return of the defense policies, and b) upper bounds on how suboptimal these defense policies are compared to the attacker's target policy. We conclude the paper by illustrating the intuitions behind our formal results, and showing that the derived bounds are non-trivial.
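
    The threat model in the abstract above can be illustrated with a minimal sketch. It is not the paper's formulation: it assumes a tabular, discounted MDP and reads "uniquely optimal with an optimality gap" as a per-state Q-value gap, which is one simple way to formalize the idea. The two-state MDP, the reward tables, and the names `policy_q_values` and `optimal_with_gap` are all hypothetical.

```python
# Illustrative sketch only; the gap condition below is an assumed
# simplification, not the definition used in the paper.

def policy_q_values(P, R, gamma, policy, iters=2000):
    """Q-values of a fixed deterministic policy (iterative policy evaluation).

    P[s][a][t] is the probability of moving from state s to t under action a;
    R[s][a] is the reward for taking action a in state s.
    """
    n_states, n_actions = len(R), len(R[0])
    V = [0.0] * n_states
    for _ in range(iters):
        V = [
            R[s][policy[s]]
            + gamma * sum(P[s][policy[s]][t] * V[t] for t in range(n_states))
            for s in range(n_states)
        ]
    return [
        [
            R[s][a] + gamma * sum(P[s][a][t] * V[t] for t in range(n_states))
            for a in range(n_actions)
        ]
        for s in range(n_states)
    ]


def optimal_with_gap(P, R, gamma, policy, eps):
    """True if the policy's action beats every other action by at least eps
    in every state, measured by the policy's own Q-values."""
    Q = policy_q_values(P, R, gamma, policy)
    return all(
        Q[s][policy[s]] >= Q[s][a] + eps - 1e-9
        for s in range(len(R))
        for a in range(len(R[0]))
        if a != policy[s]
    )


# Hypothetical 2-state MDP: action 0 stays put, action 1 switches state.
P = [[[1.0, 0.0], [0.0, 1.0]],
     [[0.0, 1.0], [1.0, 0.0]]]
R_true = [[1.0, 0.0], [1.0, 0.0]]      # true rewards favour "stay"
R_poisoned = [[0.0, 1.0], [0.0, 1.0]]  # poisoned rewards favour "switch"
target = [1, 1]                        # attacker's target policy: always switch

print(optimal_with_gap(P, R_poisoned, 0.9, target, 0.5))  # True
print(optimal_with_gap(P, R_true, 0.9, target, 0.5))      # False
```

    In this toy case the target policy clears a gap of 0.5 only under the poisoned rewards, which is the situation a defending agent would have to reason about when it sees only those rewards.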

    Admissible Policy Teaching through Reward Design

    We study reward design strategies for incentivizing a reinforcement learning agent to adopt a policy from a set of admissible policies. The goal of the reward designer is to modify the underlying reward function cost-efficiently while ensuring that any approximately optimal deterministic policy under the new reward function is admissible and performs well under the original reward function. This problem can be viewed as a dual to the problem of optimal reward poisoning attacks: instead of forcing an agent to adopt a specific policy, the reward designer incentivizes an agent to avoid taking actions that are inadmissible in certain states. Perhaps surprisingly, and in contrast to the problem of optimal reward poisoning attacks, we first show that the reward design problem for admissible policy teaching is computationally challenging, and it is NP-hard to find an approximately optimal reward modification. We then proceed by formulating a surrogate problem whose optimal solution approximates the optimal solution to the reward design problem in our setting, but is more amenable to optimization techniques and analysis. For this surrogate problem, we present characterization results that provide bounds on the value of the optimal solution. Finally, we design a local search algorithm to solve the surrogate problem and showcase its utility using simulation-based experiments.
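
    The admissibility requirement described above can be checked by brute force on a toy problem. The sketch below is not the paper's surrogate problem or its local search algorithm. It assumes a discounted return averaged over a uniform initial state distribution, and the MDP, reward tables, forbidden state-action pair, and function names are all hypothetical.

```python
# Illustrative sketch only: enumerate every deterministic policy of a tiny
# MDP and test whether all approximately optimal ones are admissible.
from itertools import product


def policy_value(P, R, gamma, policy, iters=2000):
    """Discounted return of a deterministic policy, averaged over a uniform
    initial state distribution (an assumption of this sketch)."""
    n = len(R)
    V = [0.0] * n
    for _ in range(iters):
        V = [
            R[s][policy[s]]
            + gamma * sum(P[s][policy[s]][t] * V[t] for t in range(n))
            for s in range(n)
        ]
    return sum(V) / n


def near_optimal_policies(P, R, gamma, eps):
    """All deterministic policies whose value is within eps of the best."""
    n_states, n_actions = len(R), len(R[0])
    scored = [
        (pi, policy_value(P, R, gamma, pi))
        for pi in product(range(n_actions), repeat=n_states)
    ]
    best = max(value for _, value in scored)
    return [pi for pi, value in scored if value >= best - eps]


def all_admissible(policies, forbidden):
    """True if no policy takes a forbidden (state, action) pair."""
    return all(
        (s, a) not in forbidden for pi in policies for s, a in enumerate(pi)
    )


# Hypothetical 2-state MDP: action 0 stays put, action 1 switches state.
P = [[[1.0, 0.0], [0.0, 1.0]],
     [[0.0, 1.0], [1.0, 0.0]]]
R_original = [[0.0, 0.2], [1.0, 0.0]]
R_modified = [[0.0, -10.0], [1.0, 0.0]]  # designer penalises (state 0, action 1)
forbidden = {(0, 1)}                     # inadmissible state-action pair

print(all_admissible(near_optimal_policies(P, R_original, 0.9, 0.1), forbidden))
print(all_admissible(near_optimal_policies(P, R_modified, 0.9, 0.1), forbidden))
```

    Enumeration grows exponentially with the number of states, so this is useful only as a sanity check on very small instances.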

    The prevalence of metabolic syndrome in the north of Iran. An epidemiologic comparative study

    Background and Objective: Metabolic syndrome (MetS) increases the risk of cardiovascular disease. The main aim of this study was to explore its prevalence in the north of Iran in 2012, comparing Turkman and non-Turkman ethnic groups. Material and Methods: This cross-sectional study was conducted on 248 subjects aged 25-70 years (Turkman = 88, non-Turkman = 160). Individuals were chosen randomly from 25 clusters. Waist circumference was measured with the subject standing at the end of normal breathing; blood pressure was measured three times; and 5 ml of venous blood was drawn in the morning after an 8-12 h fast for laboratory testing. Biochemical analyses, including fasting blood glucose, triglyceride and high-density lipoprotein (HDL) cholesterol, were performed using a commercial kit (Pars Azmoon, Karaj, Iran). The ATP-III method was used to diagnose MetS, and SPSS 16.0 software (Chicago, IL, USA) was used for statistical analysis. A P-value < 0.05 was considered statistically significant. Results: Compared with the Turkman group, the mean fasting blood glucose (FBG), triglyceride and waist circumference were 15.9 mg/dl, 30.2 mg/dl and 6.5 cm higher in the non-Turkman group, respectively (P < 0.05 for all). The Pearson correlation coefficient between age and MetS was positive (r = 0.287, P = 0.01). Overall, MetS was present in 37.9% of subjects, and it was 14.7% higher in non-Turkman than in Turkman people (P = 0.015). The prevalence of MetS in men and women was 29.7% and 43.5%, respectively (P = 0.001). Conclusion: In the north of Iran, the prevalence of MetS is high. It is higher in the non-Turkman ethnic group than in the Turkman group and higher in women than in men, although the gender difference was seen only in the non-Turkman ethnic group.

    The association of fasting blood glucose (FBG) and waist circumference in northern adults in Iran: A population based study

    Objectives: The aim of this study was to evaluate the association between fasting blood glucose (FBG) level and waist circumference (WC) in men and women aged 25-65 years in the north of Iran. Material and methods: This cross-sectional, analytical study was carried out on 1797 subjects (941 males and 856 females) aged 25-65 years, selected using a multistage cluster sampling technique. FBG was measured in the morning after a 12-hour fast using laboratory kits (enzymatic methods) and spectrophotometry. Central obesity was defined based on World Health Organization criteria: waist circumference ≥102 cm in men and ≥88 cm in women. SPSS 16 software was used for statistical analysis. Results: Overall, the mean FBG was higher in women (98.3 ± 40.1 mg/dl) than in men (94.6 ± 32.2 mg/dl). The mean WC in men was 4.5 cm lower than in women. In men, the mean FBG differed significantly between normal and centrally obese subjects in both the 35-45 year age group (P = 0.001) and the 45-55 year age group (P = 0.042). Overall, in men, the FBG level increased by 2.82 mg/dl for each 10 cm of WC, with the highest rate in the 35-45 year age group. In women, the FBG level increased by 3.48 mg/dl for each 10 cm of WC, and the increase was higher in the 25-35 year age group than in other age groups. The regression coefficients relating FBG to WC remained constant with increasing age in men, whereas they decreased in women. The WC cut-off point for detecting diabetes was 89 cm in men and 107 cm in women. Conclusion: A positive correlation was seen between WC and FBG level, and it declined with age in women. The cut-off point for detecting diabetes was lower in men than in women. WC can be used as a predictor of type 2 diabetes mellitus risk among adults in the north of Iran.
© 2014 Veghari et al.; licensee BioMed Central Ltd

    The role of motivation in MOOCs’ retention rates: a systematic literature review

    Although MOOC platforms offer a unique way to provide information to a large cohort of participants, only a small percentage of participants complete MOOCs. The high number of dropouts in MOOCs is a key challenge, and the literature suggests that it can be affected by participants' motivation. However, it is not known how and to what extent motivation influences participants’ dropout in MOOCs. There is a need to provide an overview of the role of motivation in MOOCs’ retention. In this study, we aimed to identify motivational factors and theories that affect participants’ retention in MOOCs and to explain how motivation supports participants in completing MOOCs. To do so, a systematic review was conducted using specific inclusion and exclusion criteria and a set of relevant keywords and databases, which resulted in 50 relevant publications. Our analysis led us to identify six main motivational factors that influence participants’ MOOC completion: academic, social, course, personal, professional, and technological motives. These factors were divided into two main categories: need-based motivation and interest-based motivation. The results showed that academic motives play the most important role in participants’ MOOC retention compared to the other factors. It was also found that self-determination theory was the most commonly used theory to support participants’ motivation for MOOC completion. In addition, the results revealed that the motivational factors not only impact participants’ MOOC retention directly, but that this impact is also mediated by participant satisfaction, self-regulation, attitude toward using MOOCs, performance, engagement, and level of participation. Based on the results, further implications for practice and future research are provided.