50,461 research outputs found

    The application of dynamic programming to optimal inventory control


    Reliability-based economic model predictive control for generalized flow-based networks including actuators' health-aware capabilities

    This paper proposes a reliability-based economic model predictive control (MPC) strategy for the management of generalized flow-based networks, integrating ideas on network service reliability, dynamic safety stock planning, and degradation of equipment health. The proposed strategy is based on a single-layer economic optimisation problem with dynamic constraints, which includes two enhancements with respect to existing approaches. The first enhancement uses chance-constrained programming to compute an optimal inventory replenishment policy based on a desired risk acceptability level, dynamically allocating safety stocks in flow-based networks to satisfy non-stationary flow demands. The second enhancement computes a smart distribution of the control effort and maximises actuators' availability by estimating their degradation and reliability. The proposed approach is illustrated with an application to water transport networks, using the Barcelona network as the case study.
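The chance-constrained safety-stock idea can be sketched numerically: pick a buffer large enough that demand exceeds it only with the accepted risk. This is a minimal sketch assuming normally distributed demand; the forecast figures and the 5% risk level are illustrative, not taken from the paper.

```python
from statistics import NormalDist

def safety_stock(std_demand, risk_level):
    """Smallest buffer s with P(demand <= mean + s) >= 1 - risk_level,
    assuming normally distributed demand."""
    z = NormalDist().inv_cdf(1.0 - risk_level)  # ~1.645 for a 5% risk level
    return z * std_demand

# Nonstationary forecast: (mean, std) of demand per period; 5% risk level.
forecast = [(100, 10), (150, 20), (80, 8)]
stocks = [round(safety_stock(s, 0.05), 1) for m, s in forecast]
# The dynamically allocated safety stock tracks each period's demand volatility.
```

Note how the buffer scales with the per-period standard deviation rather than being a single static quantity.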

    Computing policy parameters for stochastic inventory control using stochastic dynamic programming approaches

    The objective of this work is to introduce techniques for the computation of optimal and near-optimal inventory control policy parameters for the stochastic inventory control problem under Scarf's setting. A common aspect of the solutions presented herein is the use of stochastic dynamic programming, a mathematical programming technique introduced by Bellman. Stochastic dynamic programming is hybridised with branch-and-bound, binary search, constraint programming and other computational techniques to develop innovative and competitive solutions. In this work, the classic single-item, single-location inventory control problem with penalty cost under independent stochastic demand is extended to model a fixed review cost, charged when the inventory level is assessed at the beginning of a period. This operation is costly in practice, and modelling it can lead to significant savings; it also makes it possible to model an order cancellation penalty charge. The first contribution presented here is the first stochastic dynamic programming formulation that captures Bookbinder and Tan's static-dynamic uncertainty control policy with penalty cost. Numerous techniques are available in the literature to compute such parameters; however, they all make assumptions on the demand probability distribution. This technique has many similarities to Scarf's stochastic dynamic programming formulation, and it does not require any external solver to be deployed. Memoisation and binary search techniques are deployed to improve computational performance. Extensive computational studies show that this new model has a tighter optimality gap compared to the state of the art. The second contribution is the first procedure to compute cost-optimal parameters for the well-known (R, s, S) policy. Practitioners widely use such a policy; however, the determination of its parameters is considered computationally prohibitive.
    A technique that hybridises stochastic dynamic programming and branch-and-bound is presented, alongside computational enhancements. Computing the optimal policy allows the determination of optimality gaps for future heuristics. This approach can solve instances of considerable size, making it usable by practitioners. The computational study shows the cost reduction that such a system can provide. Thirdly, this work presents the first heuristics for determining near-optimal parameters for the (R, s, S) policy. The first is an algorithm that formally models the (R, s, S) policy computation in the form of a functional equation. The second is a heuristic formed by a hybridisation of (R, S) and (s, S) policy parameter solvers. These heuristics can compute near-optimal parameters in a fraction of the time required by the exact methods, and they can be used to speed up the optimal branch-and-bound technique. The last contribution is the introduction of a technique to encode dynamic programming in constraint programming. Constraint programming provides the user with an expressive modelling language and delegates the search for the solution to a dedicated solver. The possibility to seamlessly encode dynamic programming provides new modelling options, e.g. the computation of optimal (R, s, S) policy parameters. The performance in this specific application is not competitive with the other techniques proposed herein; however, this encoding opens up new connections between constraint programming and dynamic programming, and allows DP-based constraints to be deployed in modelling languages such as MiniZinc. The computational study shows how this technique can outperform a similar encoding for mixed-integer programming.
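The stochastic dynamic programming with memoisation that underpins these contributions can be sketched on a toy finite-horizon, single-item problem. All parameters below (costs, demand distribution, horizon) are illustrative, and the lost-sales simplification is an assumption of this sketch, not the thesis's model.

```python
from functools import lru_cache

K, c, h, p = 30.0, 2.0, 1.0, 10.0     # fixed, unit, holding, penalty costs
PMF = [(3, 0.3), (4, 0.4), (5, 0.3)]  # discrete demand distribution (E[d] = 4)
T, MAX_ORDER = 3, 10                  # horizon and order-quantity bound

@lru_cache(maxsize=None)              # memoisation over (period, inventory) states
def cost_to_go(t, inv):
    """Minimum expected cost from period t with on-hand inventory inv."""
    if t == T:
        return 0.0
    best = float("inf")
    for q in range(MAX_ORDER + 1):    # enumerate order quantities
        cost = (K if q > 0 else 0.0) + c * q
        for d, pr in PMF:
            left = inv + q - d
            stage = h * max(left, 0) + p * max(-left, 0)
            cost += pr * (stage + cost_to_go(t + 1, max(left, 0)))
        best = min(best, cost)
    return best

optimal_cost = cost_to_go(0, 0)
```

The thesis's binary search and branch-and-bound enhancements then prune this enumeration; the sketch only shows the underlying Bellman recursion.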

    Mathematical programming heuristics for nonstationary stochastic inventory control

    This work focuses on the computation of near-optimal inventory policies for a wide range of problems in the field of nonstationary stochastic inventory control. These problems are modelled and solved by leveraging novel mathematical programming models built upon the application of stochastic programming bounding techniques: Jensen's lower bound and the Edmundson-Madanski upper bound. The single-item, single-stock-location inventory problem under the classical assumption of independent demand is a long-standing problem in the literature of stochastic inventory control. The first contribution presented here is the development of the first mathematical programming based model for computing near-optimal inventory policy parameters for this problem; the model is then paired with a binary search procedure to tackle large-scale problems. The second contribution is to relax the independence assumption and investigate the case in which demand in different periods is correlated. More specifically, this work introduces the first stochastic programming model that captures Bookbinder and Tan's static-dynamic uncertainty control policy under nonstationary correlated demand; in addition, it discusses a mathematical programming heuristic that computes near-optimal policy parameters under normally distributed demand featuring correlation, as well as under a collection of time-series-based demand processes. Finally, the third contribution is to consider a multi-item stochastic inventory system subject to joint replenishment costs. This work presents the first mathematical programming heuristic for determining near-optimal inventory policy parameters for this system. The model comes with the advantage of tackling nonstationary demand, a variant which has not been previously explored in the literature.
    Unlike other existing approaches in the literature, these mathematical programming models can be easily implemented and solved using off-the-shelf mathematical programming packages, such as IBM ILOG optimisation studio and XPRESS Optimizer, and do not require tedious computer coding. Extensive computational studies demonstrate that these new models are competitive in terms of cost performance: in the case of independent demand, they provide the best optimality gap in the literature; in the case of correlated demand, they yield tight optimality gaps; and in the case of the nonstationary joint replenishment problem, they are competitive with state-of-the-art approaches and come with the advantage of being able to tackle nonstationary problems.
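The two bounds named above have a simple shape: for a convex cost f and random demand X, Jensen gives f(E[X]) <= E[f(X)], while Edmundson-Madanski bounds E[f(X)] from above by the chord of f over the support of X. A minimal numeric sketch with an illustrative piecewise-linear holding/penalty cost (not a model from the thesis):

```python
def mean(xs, ps):
    return sum(x * p for x, p in zip(xs, ps))

def jensen_lower(f, xs, ps):
    """Jensen: f(E[X]) lower-bounds E[f(X)] for convex f."""
    return f(mean(xs, ps))

def edmundson_madanski_upper(f, xs, ps):
    """Chord of f between the endpoints of the support, evaluated at E[X]."""
    a, b, m = min(xs), max(xs), mean(xs, ps)
    return ((b - m) * f(a) + (m - a) * f(b)) / (b - a)

def expectation(f, xs, ps):
    return sum(f(x) * p for x, p in zip(xs, ps))

# Convex holding/penalty cost around an order-up-to level S.
S, hold, pen = 10, 1.0, 5.0
cost = lambda d: hold * max(S - d, 0) + pen * max(d - S, 0)

xs, ps = [6, 10, 14], [0.25, 0.5, 0.25]
lo = jensen_lower(cost, xs, ps)        # lower bound on expected cost
ex = expectation(cost, xs, ps)         # exact expected cost
hi = edmundson_madanski_upper(cost, xs, ps)  # upper bound on expected cost
```

Replacing an expectation by these two bounds is what makes the resulting mathematical programs tractable for off-the-shelf solvers.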

    Convex Q Learning in a Stochastic Environment: Extended Version

    The paper introduces the first formulation of convex Q-learning for Markov decision processes with function approximation. The algorithms and theory rest on a relaxation of a dual of Manne's celebrated linear programming characterization of optimal control. The main contributions firstly concern properties of the relaxation, described as a deterministic convex program: we identify conditions for a bounded solution, and a significant relationship between the solution to the new convex program and the solution to standard Q-learning. The second set of contributions concerns algorithm design and analysis: (i) a direct model-free method for approximating the convex program for Q-learning shares properties with its ideal; in particular, a bounded solution is ensured subject to a simple property of the basis functions; (ii) the proposed algorithms are convergent, and new techniques are introduced to obtain the rate of convergence in a mean-square sense; (iii) the approach can be generalized to a range of performance criteria, and it is found that variance can be reduced by considering "relative" dynamic programming equations; (iv) the theory is illustrated with an application to a classical inventory control problem.
    Comment: Extended version of "Convex Q-learning in a stochastic environment", IEEE Conference on Decision and Control, 2023 (to appear).

    Generalizing backdoors

    A powerful intuition in the design of search methods is that one wants to proactively select variables that simplify the problem instance as much as possible when these variables are assigned values. The notion of “Backdoor” variables follows this intuition. In this work we generalize Backdoors in such a way as to allow more general classes of sub-solvers, both complete and heuristic. In order to do so, Pseudo-Backdoors and Heuristic-Backdoors are formally introduced and then applied, first to a simple Multiple Knapsack Problem and second to a complex combinatorial optimization problem in the area of stochastic inventory control. Our preliminary computational experience shows the effectiveness of these approaches, which are able to produce very low run times and, in the case of Heuristic-Backdoors, high-quality solutions by employing very simple heuristic rules such as greedy local search strategies.
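The Heuristic-Backdoor idea can be sketched on a toy 0/1 knapsack: branch exhaustively on a small set of "backdoor" variables, and hand each partial assignment to a cheap greedy sub-solver for the remaining variables. The instance and the choice of backdoor set below are illustrative, not from the paper.

```python
from itertools import product

# Illustrative 0/1 knapsack: items are (value, weight), one capacity.
items = [(10, 5), (7, 4), (6, 3), (5, 3), (4, 2)]
CAP = 10

def greedy_rest(fixed, rest):
    """Heuristic sub-solver: greedy by value density on the remaining items."""
    load = sum(items[i][1] for i in fixed)
    value = sum(items[i][0] for i in fixed)
    if load > CAP:
        return -1  # this assignment of the backdoor variables is infeasible
    for i in sorted(rest, key=lambda i: items[i][0] / items[i][1], reverse=True):
        if load + items[i][1] <= CAP:
            load += items[i][1]
            value += items[i][0]
    return value

backdoor = [0, 1]  # variables we branch on exhaustively
rest = [i for i in range(len(items)) if i not in backdoor]
best = max(
    greedy_rest([i for i, bit in zip(backdoor, bits) if bit], rest)
    for bits in product([0, 1], repeat=len(backdoor))
)
```

With the backdoor variables fixed, the residual problem is easy enough that a greedy rule recovers the optimum here; the paper's point is that well-chosen backdoor sets make this reliable on much harder instances.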

    Local Water Storage Control for the Developing World

    Most cities in India do not have water distribution networks that provide water throughout the entire day. As a result, it is common for homes and apartment buildings to use water storage systems that are filled during the small window of time each day when the water distribution network is active. However, these water storage systems do not have disinfection capabilities, so long durations of storage (as few as four days) of the same water lead to substantial increases in the amount of bacteria and viruses in that water. This paper considers the stochastic control problem of deciding how much water to store each day in the system, as well as deciding when to completely empty the water system, in order to trade off: the financial cost of the water, the health costs implicit in long durations of storing the same water, the potential for a shortfall in the quantity of stored versus demanded water, and the water wasted when emptying the system. To solve this problem, we develop a new Binary Dynamic Search (BiDS) algorithm that uses binary search in one dimension to compute the value function of stochastic optimal control problems with controlled resets to a single state and with constraints on the maximum time span between resets of the system.
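The core trade-off (health cost growing with storage age versus wastage on emptying) can be seen in a toy average-cost model. This sketch simply enumerates fixed reset periods; it is not the BiDS algorithm, and all cost figures are invented for illustration.

```python
def health_cost(age):
    """Illustrative health cost of consuming water stored for `age` days;
    grows superlinearly as bacteria and viruses accumulate."""
    return 0.5 * age ** 2

WASTAGE = 4.0  # illustrative cost of the water wasted when the tank is emptied

def avg_daily_cost(reset_every):
    """Long-run average daily cost if the tank is emptied every `reset_every` days."""
    return (sum(health_cost(a) for a in range(reset_every)) + WASTAGE) / reset_every

# Emptying too often wastes water; emptying too rarely accrues health cost.
best_period = min(range(1, 30), key=avg_daily_cost)
```

The paper's setting additionally has stochastic demand and a state-dependent storage decision, which is why it needs a value-function method rather than this enumeration.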

    A Neuroevolutionary Approach to Stochastic Inventory Control in Multi-Echelon Systems

    Stochastic inventory control in multi-echelon systems poses hard problems in optimisation under uncertainty. Stochastic programming can solve small instances optimally, and approximately solve larger instances via scenario reduction techniques, but it cannot handle arbitrary nonlinear constraints or other non-standard features. Simulation optimisation is an alternative approach that has recently been applied to such problems, using policies that require only a few decision variables to be determined. However, to find optimal or near-optimal solutions we must consider exponentially large scenario trees with a corresponding number of decision variables. We propose instead a neuroevolutionary approach: using an artificial neural network to compactly represent the scenario tree, and training the network by a simulation-based evolutionary algorithm. We show experimentally that this method can quickly find high-quality plans using networks of a very simple form.
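The neuroevolutionary recipe (a small policy network trained by a simulation-based evolutionary algorithm) can be sketched on a toy single-echelon simulator. Everything below, including the one-hidden-unit network, the (1+1) evolution strategy, and all cost parameters, is an illustrative reduction, not the paper's multi-echelon setup.

```python
import math
import random

random.seed(0)

def simulate(weights, periods=20, reps=30):
    """Average simulated cost of a one-hidden-unit policy network."""
    w1, b1, w2, b2 = weights
    total = 0.0
    for _ in range(reps):
        inv = 0.0
        for _ in range(periods):
            hidden = math.tanh(w1 * inv + b1)
            order = max(0.0, w2 * hidden + b2)       # network output = order size
            inv += order - random.uniform(2, 8)      # stochastic demand
            total += 1.0 * max(inv, 0) + 5.0 * max(-inv, 0)  # holding / penalty
            inv = max(inv, 0.0)                      # lost sales
    return total / reps

def evolve(generations=200, sigma=0.5):
    """(1+1) evolution strategy: keep a mutated child whenever it is cheaper."""
    best = [random.gauss(0, 1) for _ in range(4)]
    best_cost = simulate(best)
    for _ in range(generations):
        child = [w + random.gauss(0, sigma) for w in best]
        child_cost = simulate(child)
        if child_cost < best_cost:
            best, best_cost = child, child_cost
    return best, best_cost

weights, cost = evolve()
```

Because fitness comes only from simulation, the same loop works unchanged when the simulator gains nonlinear constraints or other non-standard features that stochastic programming cannot express.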
