
    Optimizing the CVaR via Sampling

    Conditional Value at Risk (CVaR) is a prominent risk measure that is used extensively in various domains. We develop a new formula for the gradient of the CVaR in the form of a conditional expectation. Based on this formula, we propose a novel sampling-based estimator for the CVaR gradient, in the spirit of the likelihood-ratio method. We analyze the bias of the estimator, and prove the convergence of a corresponding stochastic gradient descent algorithm to a local CVaR optimum. Our method makes it possible to consider CVaR optimization in new domains. As an example, we consider a reinforcement learning application, and learn a risk-sensitive controller for the game of Tetris. Comment: To appear in AAAI 201
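    To make the idea of a sampling-based, likelihood-ratio-style CVaR gradient estimator concrete, here is a minimal sketch. It is not the paper's algorithm: it assumes a toy model in which returns are Gaussian with mean `theta` and unit variance, so the score function is simply `z - theta`, and it takes CVaR over the lower `alpha` tail. All function names are hypothetical.

```python
import math
import random


def empirical_var(samples, alpha):
    """Empirical alpha-quantile of the samples (lower tail)."""
    ordered = sorted(samples)
    k = max(1, math.ceil(alpha * len(ordered)))
    return ordered[k - 1]


def empirical_cvar(samples, alpha):
    """Mean of the worst ceil(alpha * N) samples."""
    ordered = sorted(samples)
    k = max(1, math.ceil(alpha * len(ordered)))
    return sum(ordered[:k]) / k


def cvar_gradient_estimate(samples, scores, alpha):
    """Likelihood-ratio-style estimate of d CVaR / d theta.

    scores[i] is d/d theta of log f(samples[i]; theta). Only samples
    at or below the empirical quantile contribute to the sum.
    """
    nu = empirical_var(samples, alpha)
    total = sum(s * (z - nu) for z, s in zip(samples, scores) if z <= nu)
    return total / (alpha * len(samples))


def demo(theta=0.5, alpha=0.1, n=20000, seed=0):
    """Toy assumption: Z ~ Normal(theta, 1), so the score is z - theta."""
    rng = random.Random(seed)
    samples = [rng.gauss(theta, 1.0) for _ in range(n)]
    scores = [z - theta for z in samples]
    cvar = empirical_cvar(samples, alpha)
    grad = cvar_gradient_estimate(samples, scores, alpha)
    return cvar, grad


if __name__ == "__main__":
    cvar, grad = demo()
    print("estimated CVaR:", cvar)
    print("estimated gradient:", grad)
```

    In this toy model, changing `theta` shifts the whole return distribution, so the true gradient of the CVaR with respect to `theta` is 1, which gives a simple sanity check for the estimate. A stochastic gradient step would then move `theta` along the estimated gradient.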

    DYNAMIC PROGRAMMING: HAS ITS DAY ARRIVED?

    Research Methods/Statistical Methods

    Econometric Methods for Endogenously Sampled Time Series: The Case of Commodity Price Speculation in the Steel Market

    This paper studies the econometric problems associated with estimation of a stochastic process that is endogenously sampled. Our interest is to infer the law of motion of a discrete-time stochastic process {p_t} that is observed only at a subset of times {t_1,...,t_n} determined by a probabilistic sampling rule that depends on the history of the process as well as other observed covariates x_t. We focus on a particular example where p_t denotes the daily wholesale price of a standardized steel product. However, there are no formal exchanges or centralized markets where steel is traded and p_t can be observed. Instead, nearly all steel transaction prices are a result of private bilateral negotiations between buyers and sellers, typically intermediated by middlemen known as steel service centers. Even though there is no central record of daily transaction prices in the steel market, we do observe transaction prices for a particular firm -- a steel service center that purchases large quantities of steel in the wholesale market for subsequent resale in the retail market. The endogenous sampling problem arises from the fact that the firm only records p_t on the days that it purchases steel. We present a parametric analysis of this problem under the assumption that the timing of steel purchases is part of an optimal trading strategy that maximizes the firm's expected discounted trading profits. We derive a parametric partial information maximum likelihood (PIML) estimator that solves the endogenous sampling problem and efficiently estimates the unknown parameters of a Markov transition probability that determines the law of motion for the underlying {p_t} process. The PIML estimator also yields estimates of the structural parameters that determine the optimal trading rule.
We also introduce an alternative consistent, less efficient, but computationally simpler simulated minimum distance (SMD) estimator that avoids the high-dimensional numerical integrations required by the PIML estimator. Using the SMD estimator, we provide estimates of a truncated lognormal AR(1) model of the wholesale price processes for particular types of steel plate. We use this to infer the share of the middleman's discounted profits that are due to markups paid by its retail customers, and the share due to price speculation. The latter measures the firm's success in forecasting steel prices and in timing its purchases in order to "buy low and sell high." The more successful the firm is in speculation (i.e., in strategically timing its purchases), the more serious are the potential biases that would result from failing to account for the endogeneity of the sampling process.
Keywords: Endogenous sampling, Markov processes, Maximum likelihood, Simulation estimation
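    The bias described in this abstract can be illustrated with a small simulation. The sketch below is not the paper's model or estimator: it simulates a truncated lognormal AR(1) price series with made-up parameters, and it replaces the optimal trading strategy with a hypothetical logistic purchase rule under which the firm is more likely to buy (and therefore record the price) when the price is low. All names and parameter values are assumptions for illustration.

```python
import math
import random


def simulate_prices(n_days, mu, rho, sigma, lo, hi, rng):
    """Truncated lognormal AR(1):
    log p_t = mu + rho * log p_{t-1} + sigma * eps_t,
    redrawn until lo <= p_t <= hi.
    """
    log_p = mu / (1.0 - rho)  # start at the untruncated stationary mean
    prices = []
    for _ in range(n_days):
        while True:
            candidate = mu + rho * log_p + sigma * rng.gauss(0.0, 1.0)
            if lo <= math.exp(candidate) <= hi:
                break
        log_p = candidate
        prices.append(math.exp(log_p))
    return prices


def endogenous_sample(prices, threshold, scale, rng):
    """Hypothetical purchase rule: buy with a probability that falls
    as the price rises above the threshold. Only purchase days are
    recorded, as (day index, price) pairs.
    """
    observed = []
    for t, p in enumerate(prices):
        buy_prob = 1.0 / (1.0 + math.exp((p - threshold) / scale))
        if rng.random() < buy_prob:
            observed.append((t, p))
    return observed


def demo(seed=0):
    rng = random.Random(seed)
    prices = simulate_prices(5000, mu=0.3, rho=0.9, sigma=0.1,
                             lo=5.0, hi=80.0, rng=rng)
    observed = endogenous_sample(prices, threshold=18.0, scale=2.0, rng=rng)
    mean_all = sum(prices) / len(prices)
    mean_observed = sum(p for _, p in observed) / len(observed)
    return mean_all, mean_observed, len(observed)


if __name__ == "__main__":
    mean_all, mean_observed, n_obs = demo()
    print("mean of all daily prices:", mean_all)
    print("mean of recorded prices: ", mean_observed)
    print("number of purchase days: ", n_obs)
```

    Because purchases cluster on low-price days, the average recorded price understates the average daily price. Treating the recorded prices as a random sample of the price process would therefore give biased estimates, which is the problem the PIML and SMD estimators are designed to address.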
