28,649 research outputs found

Reinforcement Learning Based on Real-Time Iteration NMPC

Author: Gros Sébastien
Kungurtsev Vyacheslav
Zanon Mario
Publication date: 01/01/2020

Reinforcement Learning (RL) has demonstrated a striking ability to learn optimal policies from data without any prior knowledge of the process. The main drawback of RL is that it is typically very difficult to guarantee stability and safety. On the other hand, Nonlinear Model Predictive Control (NMPC) is an advanced model-based control technique which does guarantee safety and stability, but only yields optimality for the nominal model. Therefore, it has recently been proposed to use NMPC as a function approximator within RL. While the ability of this approach to yield good performance has been demonstrated, the main drawback hindering its applicability is the computational burden of NMPC, which has to be solved to full convergence. In practice, however, computationally efficient algorithms such as the Real-Time Iteration (RTI) scheme are deployed in order to return an approximate NMPC solution in very short time. In this paper we bridge this gap by extending the existing theoretical framework to also cover RL based on RTI NMPC. We demonstrate the effectiveness of this new RL approach with a nontrivial example modeling a challenging nonlinear system subject to stochastic perturbations with the objective of optimizing an economic cost. Comment: accepted for the IFAC World Congress 202

arXiv.org e-Print Archive

Archivio della ricerca della Scuola IMT Alti Studi Lucca

Shape-constrained Estimation of Value Functions

Author: Glynn Peter W.
Mousavi Mohammad
Publication date: 01/01/2013

We present a fully nonparametric method to estimate the value function, via simulation, in the context of expected infinite-horizon discounted rewards for Markov chains. Estimating such value functions plays an important role in approximate dynamic programming and applied probability in general. We incorporate "soft information" into the estimation algorithm, such as knowledge of convexity, monotonicity, or Lipschitz constants. In the presence of such information, a nonparametric estimator for the value function can be computed that is provably consistent as the simulated time horizon tends to infinity. As an application, we implement our method on price tolling agreement contracts in energy markets.
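To make the idea of shape-constrained value estimation concrete, here is a minimal sketch and not the estimator from the paper. It assumes a hypothetical five-state reflecting random walk with reward equal to the state index, estimates discounted values by plain Monte Carlo over a truncated horizon, and then imposes monotonicity with isotonic regression (pool adjacent violators) as a stand-in for the "soft information" step. All function names and constants are illustrative assumptions.

```python
import random

GAMMA = 0.9      # discount factor (assumed for illustration)
N_STATES = 5     # hypothetical toy chain with states 0..4


def step(state, rng):
    """Reflecting random walk: move +1 or -1 with equal probability, clamped to the state space."""
    nxt = state + (1 if rng.random() < 0.5 else -1)
    return min(max(nxt, 0), N_STATES - 1)


def reward(state):
    """Reward increases with the state, so the true value function is nondecreasing."""
    return float(state)


def simulate_return(start, horizon, rng):
    """Discounted reward along one simulated path, truncated at `horizon` steps."""
    total, discount, state = 0.0, 1.0, start
    for _ in range(horizon):
        total += discount * reward(state)
        discount *= GAMMA
        state = step(state, rng)
    return total


def monte_carlo_values(n_paths, horizon, seed=0):
    """Unconstrained Monte Carlo estimate of the value at every start state."""
    rng = random.Random(seed)
    return [
        sum(simulate_return(s, horizon, rng) for _ in range(n_paths)) / n_paths
        for s in range(N_STATES)
    ]


def isotonic_nondecreasing(values):
    """Least-squares projection onto nondecreasing sequences (pool adjacent violators)."""
    blocks = []  # each block stores [sum, count]
    for v in values:
        blocks.append([v, 1])
        # Merge neighbouring blocks while their means violate the ordering.
        while len(blocks) > 1 and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]:
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fitted = []
    for s, c in blocks:
        fitted.extend([s / c] * c)
    return fitted


def exact_values(sweeps=2000):
    """Reference values from repeated Bellman updates on the known toy chain."""
    v = [0.0] * N_STATES
    for _ in range(sweeps):
        v = [
            reward(s) + GAMMA * 0.5 * (v[min(s + 1, N_STATES - 1)] + v[max(s - 1, 0)])
            for s in range(N_STATES)
        ]
    return v


if __name__ == "__main__":
    raw = monte_carlo_values(n_paths=500, horizon=100)
    print("raw      :", raw)
    print("monotone :", isotonic_nondecreasing(raw))
    print("exact    :", exact_values())
```

Because the true value function of this toy chain is nondecreasing, projecting the raw estimate onto nondecreasing sequences cannot increase its squared error, which is the intuition behind using shape knowledge in the first place.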

arXiv.org e-Print Archive

CiteSeerX

Operational Research in Education

Author: Abbott
Abdullah
Abramson
Aigner
Aitken
Al-Betar
Al-Yakoob
Aladag
Alexander
Alper
Alvarez-Valdes
Armitage
Atkinson
Atteberry
Baker
Balakrishnan
Banker
Barham
Battese
Bayraktar
Beasley
Bektaş
Belford
Beligiannis
BenDavid-Hadar
Berkner
Bessent
Beyrouthy
Birbas
Blaug
Bodin
Bovet
Bradley
Brailsford
Brandeau
Burke
Burney
Caballero
Carter
Casu
Ceylan
Charnes
Cherchye
Chizmar
Clarke
Clough
Cobacho
Colbert
Cook
Corberán
Cordero-Ferrera
Costa
Cutshall
da Fonseca
De Causmaecker
de Werra
De Witte
Dean
Dehnokhalaji
Deris
Dhar
Dickmeyer
Diminnie
Dimitrov
Dimopoulou
Dowsland
Doyle
Duh
Dyson
Eglese
Emrouznejad
Epstein
Essid
Fan
Fandel
Ferland
Finlay
Fukuyama
Gac
Gallego
Gani
Gass
Geiger
Geoffrion
Giménez
Gochenour
Golany
Goldstein
Gosselin
Goyal
Gray
Griffin
Grosskopf
Gyimah-Brempong
Haelermans
Harden
Hartl
Hemaida
Hertz
Hinchliffe
Holder
Holloway
Hopkins
Horvath
Howard
Huédé
Jamison
Jauch
Jennergren
Jessop
Jill Johnes
Johnes
Johnson
Jondrow
Kao
Kheiri
Kirby
Kirjavainen
Knutson
Kodama
Kwak
Ladley
Lara-Velázquez
Lee
Lenzner
Letchford
Lewin
Lewis
Li
Liu
Lofti
Lovell
Mancebón
Mar Molinero
Massy
Mayston
McCollum
McKeown
McMillan
Meeusen
Mirrazavi
Miyaji
Moreno
Moscato
Moura
Muller
Mumford
Mühlenthaler
Nicholls
O'Brien
Ochoa
OECD
Oliver
Ondrich
Ouellette
Ozdemir
Pais
Pantelous
Papoutsis
Paucar-Caceres
Petrovic
Pillay
Platt
Podinovski
Politis
Portela
Post
Powell
Qing
Qu
Ramanathan
Rassouli-Currier
Rath
Raveh
Ray
Rayeni
Reeves
Ritzen
Romero
Ruggiero
Russell
Saltzman
Sammons
Sampson
Santos
Sarrico
Schniederjans
Schoepfle
Schroeder
Schultz
Scott
Shah
Shepherd
Simon
Sinuany-Stern
Smith
Soria-Alcaraz
Southwick
Soyibo
Spada
Stevens
Stone
Sutcliffe
Tadisina
Tavares
Taylor
Thanassoulis
Thompson
Tripathy
Turabieh
Van Dusseldorp
Waldo
Walters
Ward
Wehrung
Weir
Weisbrod
Weitz
Welsh
White
Wood
Woodhouse
Wright
Zhang
Zoghbi
Özcan
Publication venue: Elsevier BV
Publication date: 29/10/2014

Operational Research (OR) techniques have been applied, from the early stages of the discipline, to a wide variety of issues in education. At the government level, these include questions of what resources should be allocated to education as a whole and how these should be divided amongst the individual sectors of education and the institutions within the sectors. Another pertinent issue concerns the efficient operation of institutions, how to measure it, and whether resource allocation can be used to incentivise efficiency savings. Local governments, as well as being concerned with issues of resource allocation, may also need to make decisions regarding, for example, the creation and location of new institutions or closure of existing ones, as well as the day-to-day logistics of getting pupils to schools. Issues of concern for managers within schools and colleges include allocating budgets, scheduling lessons and assigning students to courses. This survey provides an overview of the diverse problems faced by government, managers and consumers of education, and the OR techniques which have typically been applied in an effort to improve operations and provide solutions.

Crossref

Lancaster E-Prints

University of Huddersfield Repository

The flexible coefficient multinomial logit (FC-MNL) model of demand for differentiated products

Author: Davis Peter
Schiraldi Pasquale
Publication venue: The London School of Economics and Political Science
Publication date: 01/01/2013

We show FC-MNL is flexible in the sense of Diewert (1974); thus its parameters can be chosen to match a well-defined class of possible own- and cross-price elasticities of demand. In contrast to models such as Probit and Random Coefficient-MNL models, FC-MNL does not require estimation via simulation; it is fully analytic. Under well-defined and testable parameter restrictions, FC-MNL is shown to be an unexplored member of McFadden’s class of Multivariate Extreme Value discrete-choice models. Therefore, FC-MNL is fully consistent with an underlying structural model of heterogeneous, utility-maximizing consumers. We provide a Monte-Carlo study to establish its properties and we illustrate its use by estimating the demand for new automobiles in Italy.
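For context on why flexibility in elasticities matters, the sketch below computes shares and price elasticities for the plain multinomial logit, not FC-MNL itself. It assumes a hypothetical linear utility `intercept + beta * price` with an outside good normalised to utility 0; the function names and parameter values are illustrative only. In this plain model the cross-price elasticity with respect to product k is the same for every other product, which is the kind of restriction a flexible specification is meant to relax.

```python
import math


def mnl_shares(intercepts, prices, beta):
    """Plain MNL choice shares with an outside good whose utility is normalised to 0."""
    utils = [a + beta * p for a, p in zip(intercepts, prices)]
    shift = max(0.0, max(utils))  # subtract the largest utility for numerical stability
    expu = [math.exp(u - shift) for u in utils]
    denom = math.exp(-shift) + sum(expu)  # first term is the outside good
    return [e / denom for e in expu]


def mnl_elasticities(intercepts, prices, beta):
    """Matrix E[j][k] = d log(share_j) / d log(price_k) under plain MNL."""
    shares = mnl_shares(intercepts, prices, beta)
    n = len(prices)
    return [
        [beta * prices[k] * ((1.0 if j == k else 0.0) - shares[k]) for k in range(n)]
        for j in range(n)
    ]


if __name__ == "__main__":
    # Hypothetical three-product market.
    intercepts, prices, beta = [1.0, 0.5, 0.2], [2.0, 1.5, 1.0], -0.8
    for row in mnl_elasticities(intercepts, prices, beta):
        print(["%.3f" % e for e in row])
```

Each column of the printed matrix has identical off-diagonal entries, so substitution patterns are fully determined by shares and prices rather than by the data.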

CiteSeerX

Crossref

LSE Research Online

Recommended from our members

Econometrics: A bird's eye view

Author: Geweke J
Horowitz JL
Pesaran MH
Publication venue: Macmillan
Publication date: 01/01/2008

As a unified discipline, econometrics is still relatively young and has been transforming and expanding very rapidly over the past few decades. Major advances have taken place in the analysis of cross sectional data by means of semi-parametric and non-parametric techniques. Heterogeneity of economic relations across individuals, firms and industries is increasingly acknowledged, and attempts have been made to take it into account either by integrating out its effects or by modeling the sources of heterogeneity when suitable panel data exist. The counterfactual considerations that underlie policy analysis and treatment evaluation have been given a more satisfactory foundation. New time series econometric techniques have been developed and employed extensively in the areas of macroeconometrics and finance. Non-linear econometric techniques are used increasingly in the analysis of cross section and time series observations. Applications of Bayesian techniques to econometric problems have been given new impetus largely thanks to advances in computer power and computational techniques. The use of Bayesian techniques has in turn provided investigators with a unifying framework where the tasks of forecasting, decision making, model evaluation and learning can be considered as parts of the same interactive and iterative process, thus paving the way for establishing the foundation of "real time econometrics". This paper attempts to provide an overview of some of these developments.

Apollo (Cambridge)

Have Econometric Analyses of Happiness Data Been Futile? A Simple Truth About Happiness Scales

Author: Chen Le-Yu
Oparina Ekaterina
Powdthavee Nattavudh
Srisuma Sorawoot
Publication date: 20/02/2019

Econometric analyses in the happiness literature typically use subjective well-being (SWB) data to compare the mean of observed or latent happiness across samples. Recent critiques show that comparing the mean of ordinal data is only valid under strong assumptions that are usually rejected by SWB data. This raises the open question of whether many of the empirical studies in the economics of happiness literature have been futile. In order to salvage some of the prior results and avoid future issues, we suggest that regression analysis of SWB (and other ordinal data) should focus on the median rather than the mean. Median comparisons using parametric models such as the ordered probit and logit can be readily carried out using familiar statistical software such as Stata. We also show that estimating a semiparametric median ordered-response model, a task previously assumed impractical, is possible using a novel constrained mixed integer optimization technique. We use GSS data to show that the famous Easterlin Paradox from the happiness literature holds for the US independent of any parametric assumption.
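The contrast between mean and median comparisons on ordinal data can be illustrated with a small sketch. The two groups and the relabelling below are made-up numbers, not data from the paper: an order-preserving relabelling of a three-point scale reverses the ranking of the group means while leaving the ranking of the medians unchanged.

```python
from statistics import mean, median

# Hypothetical responses on a 1-3 happiness scale (illustrative only).
group_a = [2, 2, 2, 2, 2]
group_b = [1, 1, 1, 3, 3]

# An order-preserving relabelling of the same categories: 1 < 2 < 10.
RELABEL = {1: 1, 2: 2, 3: 10}


def relabel(responses):
    """Apply the increasing relabelling to each response."""
    return [RELABEL[r] for r in responses]


if __name__ == "__main__":
    print("original  means  :", mean(group_a), mean(group_b))
    print("relabelled means :", mean(relabel(group_a)), mean(relabel(group_b)))
    print("original  medians:", median(group_a), median(group_b))
    print("relabelled medians:", median(relabel(group_a)), median(relabel(group_b)))
```

Since the numeric labels of an ordinal scale are arbitrary up to an increasing transformation, a comparison that survives every such relabelling, as the median ranking does here, is the more defensible one.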

arXiv.org e-Print Archive

LSE Research Online

Warwick Research Archives Portal Repository

DR-NTU (Digital Repository of NTU)