Reinforcement Learning Based on Real-Time Iteration NMPC
Reinforcement Learning (RL) has demonstrated a stunning ability to learn optimal
policies from data without any prior knowledge of the process. The main
drawback of RL is that it is typically very difficult to guarantee stability
and safety. On the other hand, Nonlinear Model Predictive Control (NMPC) is an
advanced model-based control technique which does guarantee safety and
stability, but only yields optimality for the nominal model. Therefore, it has
been recently proposed to use NMPC as a function approximator within RL. While
the ability of this approach to yield good performance has been demonstrated,
the main drawback hindering its applicability is related to the computational
burden of NMPC, which has to be solved to full convergence. In practice,
however, computationally efficient algorithms such as the Real-Time Iteration
(RTI) scheme are deployed in order to return an approximate NMPC solution in
very short time. In this paper we bridge this gap by extending the existing
theoretical framework to also cover RL based on RTI NMPC. We demonstrate the
effectiveness of this new RL approach with a nontrivial example modeling a
challenging nonlinear system subject to stochastic perturbations with the
objective of optimizing an economic cost.
Comment: accepted for the IFAC World Congress 202
Shape-constrained Estimation of Value Functions
We present a fully nonparametric method to estimate the value function, via
simulation, in the context of expected infinite-horizon discounted rewards for
Markov chains. Estimating such value functions plays an important role in
approximate dynamic programming and applied probability in general. We
incorporate "soft information" into the estimation algorithm, such as knowledge
of convexity, monotonicity, or Lipschitz constants. In the presence of such
information, a nonparametric estimator for the value function can be computed
that is provably consistent as the simulated time horizon tends to infinity. As
an application, we implement our method on price tolling agreement contracts in
energy markets
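
To make the setting concrete, here is a minimal sketch of estimating an infinite-horizon discounted value function by simulation and then imposing a monotonicity constraint. It is not the estimator from the paper: the four-state chain, the rewards, the truncation horizon, and all function names are hypothetical, and the shape constraint is imposed with ordinary isotonic regression (pool-adjacent-violators), which is only a stand-in for the idea of using "soft information".

```python
import random


def simulate_discounted_return(P, r, start, gamma, horizon, rng):
    """One truncated rollout: sum over t < horizon of gamma^t * r[s_t]."""
    state, total, discount = start, 0.0, 1.0
    for _ in range(horizon):
        total += discount * r[state]
        discount *= gamma
        # Draw the next state from row `state` of the transition matrix.
        state = rng.choices(range(len(P)), weights=P[state])[0]
    return total


def monte_carlo_values(P, r, gamma, horizon, n_paths, seed=0):
    """Average truncated discounted returns over n_paths rollouts per start state."""
    rng = random.Random(seed)
    return [
        sum(simulate_discounted_return(P, r, s, gamma, horizon, rng)
            for _ in range(n_paths)) / n_paths
        for s in range(len(P))
    ]


def isotonic_nondecreasing(values):
    """Equal-weight least-squares projection onto nondecreasing sequences (PAV)."""
    blocks = []  # each block holds [sum of values, number of values]
    for v in values:
        blocks.append([v, 1])
        # Merge adjacent blocks while their means violate the ordering.
        while len(blocks) > 1 and (
            blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]
        ):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    out = []
    for s, c in blocks:
        out.extend([s / c] * c)
    return out


# Hypothetical reflecting random walk with rewards increasing in the state,
# so the true value function is nondecreasing in the state index.
P = [
    [0.5, 0.5, 0.0, 0.0],
    [0.5, 0.0, 0.5, 0.0],
    [0.0, 0.5, 0.0, 0.5],
    [0.0, 0.0, 0.5, 0.5],
]
r = [0.0, 1.0, 2.0, 3.0]

raw = monte_carlo_values(P, r, gamma=0.9, horizon=50, n_paths=200)
constrained = isotonic_nondecreasing(raw)
```

With few rollouts the raw estimates can violate the known ordering; the projection removes such violations without changing estimates that already respect it.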
Operational Research in Education
Operational Research (OR) techniques have been applied, from the early stages of the discipline, to a wide variety of issues in education. At the government level, these include questions of what resources should be allocated to education as a whole and how these should be divided amongst the individual sectors of education and the institutions within the sectors. Another pertinent issue concerns the efficient operation of institutions, how to measure it, and whether resource allocation can be used to incentivise efficiency savings. Local governments, as well as being concerned with issues of resource allocation, may also need to make decisions regarding, for example, the creation and location of new institutions or closure of existing ones, as well as the day-to-day logistics of getting pupils to schools. Issues of concern for managers within schools and colleges include allocating the budgets, scheduling lessons and the assignment of students to courses. This survey provides an overview of the diverse problems faced by government, managers and consumers of education, and the OR techniques which have typically been applied in an effort to improve operations and provide solutions
The flexible coefficient multinomial logit (FC-MNL) model of demand for differentiated products
We show FC-MNL is flexible in the sense of Diewert (1974), thus its parameters can be chosen to match a well-defined class of possible own- and cross-price elasticities of demand. In contrast to models such as Probit and Random Coefficient-MNL models, FC-MNL does not require estimation via simulation; it is fully analytic. Under well-defined and testable parameter restrictions, FC-MNL is shown to be an unexplored member of McFadden's class of Multivariate Extreme Value discrete-choice models. Therefore, FC-MNL is fully consistent with an underlying structural model of heterogeneous, utility-maximizing consumers. We provide a Monte-Carlo study to establish its properties and we illustrate the use by estimating the demand for new automobiles in Italy
Econometrics: A bird's eye view
As a unified discipline, econometrics is still relatively young and has been transforming and expanding very rapidly over the past few decades. Major advances have taken place in the analysis of cross sectional data by means of semi-parametric and non-parametric techniques. Heterogeneity of economic relations across individuals, firms and industries is increasingly acknowledged, and attempts have been made to take it into account either by integrating out its effects or by modeling the sources of heterogeneity when suitable panel data exists. The counterfactual considerations that underlie policy analysis and treatment evaluation have been given a more satisfactory foundation. New time series econometric techniques have been developed and employed extensively in the areas of macroeconometrics and finance. Non-linear econometric techniques are used increasingly in the analysis of cross section and time series observations. Applications of Bayesian techniques to econometric problems have been given new impetus largely thanks to advances in computer power and computational techniques. The use of Bayesian techniques has in turn provided investigators with a unifying framework where the tasks of forecasting, decision making, model evaluation and learning can be considered as parts of the same interactive and iterative process, thus paving the way for establishing the foundation of "real time econometrics". This paper attempts to provide an overview of some of these developments
Have Econometric Analyses of Happiness Data Been Futile? A Simple Truth About Happiness Scales
Econometric analyses in the happiness literature typically use subjective
well-being (SWB) data to compare the mean of observed or latent happiness
across samples. Recent critiques show that comparing the mean of ordinal data
is only valid under strong assumptions that are usually rejected by SWB data.
This raises the open question of whether many of the empirical studies in the
economics of happiness literature have been futile. In order to salvage some of
the prior results and avoid future issues, we suggest regression analysis of
SWB (and other ordinal data) should focus on the median rather than the mean.
Median comparisons using parametric models such as the ordered probit and logit
can be readily carried out using familiar statistical software such as Stata. We
also show that estimating a semiparametric median ordered-response model, a task
previously assumed impractical, is possible using a novel constrained
mixed integer optimization technique. We use GSS data to show that the famous
Easterlin Paradox from the happiness literature holds for the US independent of
any parametric assumption
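
One way to see why mean comparisons of ordinal data are fragile is the small sketch below. The two groups and both numeric labelings are hypothetical (they are not GSS data and are not taken from the paper). The code only illustrates the general point that an order-preserving relabeling of the scale can reverse a ranking by means while leaving the ranking by medians unchanged.

```python
from statistics import mean, median

# Hypothetical responses on a three-point ordinal happiness scale.
group_a = [1, 1, 3, 3, 3]
group_b = [2, 2, 2, 2, 2]

# Two order-preserving numeric labelings of the same three categories.
equal_spacing = {1: 1, 2: 2, 3: 3}
stretched = {1: 0, 2: 9, 3: 10}


def ranks_above(stat, a, b, labels):
    """Return True if group a scores strictly above group b on `stat` after relabeling."""
    return stat([labels[x] for x in a]) > stat([labels[x] for x in b])


for name, labels in [("equal spacing", equal_spacing), ("stretched", stretched)]:
    print(
        name,
        "| mean says A > B:", ranks_above(mean, group_a, group_b, labels),
        "| median says A > B:", ranks_above(median, group_a, group_b, labels),
    )
```

Both labelings respect the ordering of the categories, so nothing in the ordinal data favors one over the other; only the median comparison gives the same answer under both.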