Smoothing Policies and Safe Policy Gradients
Policy gradient algorithms are among the best candidates for the much
anticipated application of reinforcement learning to real-world control tasks,
such as the ones arising in robotics. However, the trial-and-error nature of
these methods introduces safety issues whenever the learning phase itself must
be performed on a physical system. In this paper, we address a specific safety
formulation, where danger is encoded in the reward signal and the learning
agent is constrained to never worsen its performance. By studying actor-only
policy gradient from a stochastic optimization perspective, we establish
improvement guarantees for a wide class of parametric policies, generalizing
existing results on Gaussian policies. This, together with novel upper bounds
on the variance of policy gradient estimators, allows us to identify those
meta-parameter schedules that guarantee monotonic improvement with high
probability. The two key meta-parameters are the step size of the parameter
updates and the batch size of the gradient estimators. By a joint, adaptive
selection of these meta-parameters, we obtain a safe policy gradient algorithm.
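The interplay between step size and batch size can be made concrete with a toy sketch. The Python snippet below illustrates the general idea only, not the paper's actual algorithm: it grows the batch until the averaged gradient estimate dominates a rough confidence half-width, then takes a conservative 1/L step under an assumed smoothness constant. The toy objective, the constant L, and all names are assumptions.

import numpy as np

rng = np.random.default_rng(0)

def sample_gradients(theta, batch_size):
    # Stand-in for per-trajectory policy gradient estimates: the true
    # gradient of a toy objective J(theta) = -theta**2, plus Gaussian
    # noise modeling the variance of the estimator.
    return -2.0 * theta + rng.normal(0.0, 1.0, size=batch_size)

def safe_update(theta, smoothness=2.0, batch_size=16, max_batch=4096):
    # One improvement step under an assumed smoothness constant: grow the
    # batch until the averaged gradient estimate dominates a rough 95%
    # confidence half-width, then take the classical 1/L ascent step.
    while batch_size <= max_batch:
        grads = sample_gradients(theta, batch_size)
        g_hat = grads.mean()
        half_width = 1.96 * grads.std(ddof=1) / np.sqrt(batch_size)
        if abs(g_hat) > 2.0 * half_width:   # signal dominates estimation error
            return theta + g_hat / smoothness, batch_size
        batch_size *= 2                     # not confident yet: collect more
    return theta, batch_size                # give up: keep the current policy

theta, bs = 3.0, 16
for _ in range(20):
    theta, bs = safe_update(theta, batch_size=bs)
print(f"final theta ~ {theta:.3f} (optimum at 0), final batch size {bs}")

In the paper's framing, both the confidence width and the admissible step size would come from the variance upper bounds and smoothness properties of the policy class, not from the generic values used here.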
Achieving the Millennium Development Goals in Sub-Saharan Africa : a macroeconomic monitoring framework
The authors present an integrated macroeconomic approach to monitoring progress toward achieving the Millennium Development Goals (MDGs) in Sub-Saharan Africa. At the heart of their approach is a macroeconomic model that captures key linkages between foreign aid, public investment (disaggregated into education, infrastructure, and health), the supply side, and poverty. The model is linked through cross-section regressions to indicators of malnutrition, infant mortality, life expectancy, and access to safe water. A composite MDG indicator is also calculated. The functioning of the framework is illustrated by simulating the impact of an increase in aid and a debt write-off for Niger at the MDG horizon of 2015, under alternative assumptions about the degree of efficiency of public investment. The authors' approach can serve as the building block of Strategy Papers for Human Development (SPAHD), a more encompassing concept than the current "Poverty Reduction" Strategy Papers.
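As a rough illustration of how a composite indicator of this kind can be assembled (the paper's actual aggregation scheme is not reproduced here), the Python sketch below min-max normalizes each indicator so that 1 is the best outcome and then averages the results. Every value and bound in it is made up for illustration.

import numpy as np

# Hypothetical indicator values with illustrative worst/best bounds,
# oriented so that a higher normalized score always means improvement.
indicators = {
    # name: (value, worst, best)
    "malnutrition_prevalence": (40.0, 60.0, 0.0),   # % of children; lower is better
    "infant_mortality":        (80.0, 150.0, 0.0),  # per 1,000 births; lower is better
    "life_expectancy":         (50.0, 30.0, 80.0),  # years; higher is better
    "access_to_safe_water":    (45.0, 0.0, 100.0),  # % of population; higher is better
}

def normalize(value, worst, best):
    # Map a raw indicator onto [0, 1], with 1 = best outcome.
    return float(np.clip((value - worst) / (best - worst), 0.0, 1.0))

scores = {name: normalize(*bounds) for name, bounds in indicators.items()}
composite = float(np.mean(list(scores.values())))
print(scores)
print(f"composite MDG indicator: {composite:.3f}")

An unweighted average is only one possible aggregation; the relative weighting of the underlying indicators is a modeling choice the abstract does not specify.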
The Impact of After-School Programs: Interpreting the Results of Four Recent Evaluations
Within the last decade, after-school programs have moved from the periphery to the center of the national education policy debate. The demand for after-school care by working parents and a new focus on test-based accountability are the two primary reasons. Reflecting these pressures, federal funding for after-school programs has grown dramatically over the last half-decade. Between 1998 and 2002, federal funding for the 21st Century Community Learning Centers program grew from $40 million to $1 billion. State and local governments have also increased their funding, with California committing itself to a six-fold increase in funding for after-school programs over the next few years. As a wave of evaluation results has recently become available, policymakers are understandably eager to see evidence that these investments are paying off. The purpose of this review is to summarize the results of four recent evaluations, to draw the lessons we have learned so far, and to identify the unanswered questions.
Understanding Model-Based Reinforcement Learning and its Application in Safe Reinforcement Learning
Model-based reinforcement learning algorithms have been shown to achieve successful results on various continuous control benchmarks, but the understanding of model-based methods remains limited. We interpret how model-based methods work through novel experiments on state-of-the-art algorithms, with an emphasis on the model learning component. We evaluate the role of model learning in policy optimization and propose methods to learn a more accurate model. With a better understanding of model-based reinforcement learning, we then apply model-based methods to solve safe reinforcement learning (RL) problems with near-zero violation of hard constraints throughout training. Drawing an analogy with how humans and animals learn to perform safe actions, we break down the safe RL problem into three stages. First, we train agents in a constraint-free environment to learn a performant policy for reaching high rewards, and simultaneously learn a model of the dynamics. Second, we use model-based methods to plan safe actions and train a safeguarding policy from these actions through imitation. Finally, we propose a factored framework to train an overall policy that mixes the performant policy and the safeguarding policy. This three-stage curriculum ensures near-zero violation of safety constraints at all times. As an advantage of the model-based approach, the sample complexity required at the second and third stages is significantly lower than that of model-free methods, which can enable online safe learning. We demonstrate the effectiveness of our methods on various continuous control problems and analyze the advantages over state-of-the-art approaches.
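The three-stage structure lends itself to a compact sketch. The Python snippet below is an illustrative stand-in, not the thesis's implementation: stubbed performant and safeguarding policies are mixed by a gate that queries a (stubbed) learned dynamics model and defers to the safeguard whenever the performant action is predicted to violate the constraint. All function names and the constraint |s| <= 1 are assumptions.

import numpy as np

rng = np.random.default_rng(0)

def performant_policy(state):
    # Stage 1: reward-seeking policy trained without constraints (stub).
    return np.tanh(state)

def safeguard_policy(state):
    # Stage 2: cautious policy imitated from model-based safe planning (stub).
    return 0.1 * np.tanh(state)

def dynamics_model(state, action):
    # Learned dynamics s' = f(s, a); a linear stub here.
    return 0.9 * state + 0.5 * action

def is_safe(state):
    # Hard constraint, chosen arbitrarily for this example: stay in |s| <= 1.
    return abs(state) <= 1.0

def mixed_policy(state):
    # Stage 3: take the performant action only when the model predicts the
    # resulting state satisfies the constraint; otherwise fall back.
    action = performant_policy(state)
    if is_safe(dynamics_model(state, action)):
        return action
    return safeguard_policy(state)

state = 0.8
for t in range(8):
    state = dynamics_model(state, mixed_policy(state)) + rng.normal(0.0, 0.02)
    print(f"t={t} state={state:+.3f} safe={is_safe(state)}")

A real system would add a safety margin to absorb model error and keep refining the safeguarding policy online; the stubs above only illustrate the gating structure.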