96 research outputs found

    Projected Inventory Level Policies for Lost Sales Inventory Systems: Asymptotic Optimality in Two Regimes

    We consider the canonical periodic review lost sales inventory system with positive lead-times and stochastic i.i.d. demand under the average cost criterion. We introduce a new policy that places orders such that the expected inventory level at the time of arrival of an order is at a fixed level and call it the Projected Inventory Level (PIL) policy. We prove that this policy has a cost-rate superior to the equivalent system where excess demand is back-ordered instead of lost and is therefore asymptotically optimal as the cost of losing a sale approaches infinity under mild distributional assumptions. We further show that this policy dominates the constant order policy for any finite lead-time and is therefore asymptotically optimal as the lead-time approaches infinity for the case of exponentially distributed demand per period. Numerical results show that this policy also performs better than other policies.
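The ordering rule described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the timing convention (demand is served first, then the next pipeline order is received), the discrete demand distribution, and the function names `expected_level_at_arrival` and `pil_order` are all assumptions made for illustration.

```python
def expected_level_at_arrival(on_hand, pipeline, demand_pmf):
    """Expected on-hand stock just before a newly placed order would arrive.

    Assumed convention: `pipeline` lists the outstanding orders in arrival
    order, one per period; each period demand is served first (unmet demand
    is lost), then that period's order is received. The new order would
    arrive after one further period of demand.
    """
    dist = {on_hand: 1.0}  # exact distribution of the on-hand level
    for arriving in list(pipeline) + [0]:
        nxt = {}
        for level, p_level in dist.items():
            for demand, p_demand in demand_pmf.items():
                new_level = max(level - demand, 0) + arriving  # lost sales
                nxt[new_level] = nxt.get(new_level, 0.0) + p_level * p_demand
        dist = nxt
    return sum(level * p for level, p in dist.items())


def pil_order(on_hand, pipeline, demand_pmf, target_level):
    """Order quantity that raises the projected level to `target_level`."""
    projected = expected_level_at_arrival(on_hand, pipeline, demand_pmf)
    return max(target_level - projected, 0.0)


# Hypothetical example: demand is 0 or 2 with equal probability.
demand_pmf = {0: 0.5, 2: 0.5}
order = pil_order(on_hand=1, pipeline=[2], demand_pmf=demand_pmf, target_level=3)
```

The sketch allows fractional order quantities for simplicity; an integer-valued system would need a rounding rule, which the abstract does not specify.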

    Learning to Order for Inventory Systems with Lost Sales and Uncertain Supplies

    We consider a stochastic lost-sales inventory control system with a lead time $L$ over a planning horizon $T$. Supply is uncertain, and is a function of the order quantity (due to random yield/capacity, etc). We aim to minimize the $T$-period cost, a problem that is known to be computationally intractable even under known distributions of demand and supply. In this paper, we assume that both the demand and supply distributions are unknown and develop a computationally efficient online learning algorithm. We show that our algorithm achieves a regret (i.e. the performance gap between the cost of our algorithm and that of an optimal policy over $T$ periods) of $O(L+\sqrt{T})$ when $L\geq\log(T)$. We do so by 1) showing our algorithm cost is higher by at most $O(L+\sqrt{T})$ for any $L\geq 0$ compared to an optimal constant-order policy under complete information (a well-known and widely-used algorithm) and 2) leveraging its known performance guarantee from the existing literature. To the best of our knowledge, a finite-sample $O(\sqrt{T})$ (and polynomial in $L$) regret bound when benchmarked against an optimal policy is not known before in the online inventory control literature. A key challenge in this learning problem is that both demand and supply data can be censored; hence only truncated values are observable. We circumvent this challenge by showing that the data generated under an order quantity $q^2$ allows us to simulate the performance of not only $q^2$ but also $q^1$ for all $q^1<q^2$, a key observation to obtain sufficient information even under data censoring. By establishing a high probability coupling argument, we are able to evaluate and compare the performance of different order policies at their steady state within a finite time horizon. Since the problem lacks convexity, we develop an active elimination method that adaptively rules out suboptimal solutions.
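The censoring observation in this abstract can be made concrete with a small sketch. It assumes a pure random-capacity model, in which the delivered quantity is the minimum of the order and an unobserved capacity; the abstract's supply model is more general, and the function names here are hypothetical.

```python
def censored_supply(order_qty, capacity):
    """Delivered quantity under the assumed random-capacity model."""
    return min(order_qty, capacity)


def replay_smaller_order(observed_under_q2, q1, q2):
    """Reconstruct deliveries for order q1 from data collected under q2.

    Valid only for q1 <= q2: min(q1, min(q2, c)) == min(q1, c) for any
    capacity c, so the censored observations still carry enough information.
    """
    if q1 > q2:
        raise ValueError("can only replay order quantities no larger than q2")
    return [min(q1, delivered) for delivered in observed_under_q2]


# Hypothetical capacities; the learner never sees these directly.
capacities = [3, 7, 5, 10]
q1, q2 = 4, 6
observed = [censored_supply(q2, c) for c in capacities]
replayed = replay_smaller_order(observed, q1, q2)
```

The reverse direction fails: data collected under the smaller quantity cannot reveal what a larger order would have delivered, which is why the observation is one-sided.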

    Base-stock policies for lost-sales models: Aggregation and asymptotics

    This paper considers the optimization of the base-stock level for the classical periodic review lost-sales inventory system. The optimal policy for this system is not fully understood and computationally expensive to obtain. Base-stock policies for this system are asymptotically optimal as lost-sales costs approach infinity, easy to implement and prevalent in practice. Unfortunately, the state space needed to evaluate a base-stock policy exactly grows exponentially in both the lead time and the base-stock level. We show that the dynamics of this system can be aggregated into a one-dimensional state space description that grows linearly in the base-stock level only by taking a non-traditional view of the dynamics. We provide asymptotics for the transition probabilities within this single dimensional state space and show that these asymptotics have good convergence properties that are independent of the lead time under mild conditions on the demand distribution. Furthermore, we show that these asymptotics satisfy a certain flow conservation property. These results lead to a new and computationally efficient heuristic to set base-stock levels in lost-sales systems. In a numerical study we demonstrate that this approach performs better than existing heuristics with an average gap with the best base-stock policy of 0.01% across a large test-bed.
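For readers unfamiliar with the system being optimized, the following is a minimal simulation sketch of a base-stock policy with lost sales. It is a brute-force evaluator, not the aggregation heuristic of the paper; the event order (receive, order, serve demand), the cost parameters, and the function name `simulate_base_stock` are assumptions.

```python
from collections import deque


def simulate_base_stock(base_stock, lead_time, demands,
                        holding_cost=1.0, lost_sale_cost=4.0):
    """Return (average cost per period, total lost sales).

    Assumed event order in each period: receive the order placed
    `lead_time` periods ago, order up to `base_stock`, then serve demand.
    """
    if lead_time < 1:
        raise ValueError("this sketch assumes a positive lead time")
    on_hand = base_stock
    pipeline = deque([0] * lead_time)  # pipeline[0] arrives next
    total_cost = 0.0
    total_lost = 0
    for demand in demands:
        on_hand += pipeline.popleft()
        # Inventory position counts on-hand stock plus outstanding orders.
        position = on_hand + sum(pipeline)
        pipeline.append(max(base_stock - position, 0))
        sold = min(on_hand, demand)
        lost = demand - sold  # unmet demand is lost, not back-ordered
        on_hand -= sold
        total_cost += holding_cost * on_hand + lost_sale_cost * lost
        total_lost += lost
    return total_cost / len(demands), total_lost


# Hypothetical usage with a constant demand stream.
avg_cost, lost = simulate_base_stock(base_stock=5, lead_time=2, demands=[2] * 6)
```

Evaluating a policy this way requires long simulated demand streams for each candidate level, which is part of why an exact or asymptotic evaluation method is attractive.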

    Data Driven Optimization: Theory and Applications in Supply Chain Systems

    Supply chain optimization plays a critical role in many business enterprises. In a data driven environment, rather than pre-specifying the underlying demand distribution and then optimizing the system’s objective, it is much more robust to have a nonparametric approach directly leveraging the past observed data. In the supply chain context, we propose and design online learning algorithms that make adaptive decisions based on historical sales (a.k.a. censored demand). We measure the performance of an online learning algorithm by cumulative regret or simply regret, which is defined as the cost difference between the proposed algorithm and the clairvoyant optimal one. In the supply chain context, to design efficient learning algorithms, we typically face two major challenges. First, we need to identify a suitable recurrent state that decouples system dynamics into cycles with good properties: (1) smoothness and rich feedback information necessary to apply the zeroth order optimization method effectively; (2) convexity and gradient information essential for the first order methods. Second, we require the learning algorithms to be adaptive to the physical constraints, e.g., positive inventory carry-over, warehouse capacity constraint, ordering/production capacity constraint, and these constraints limit the policy search space in a dynamic fashion. To design efficient and provably-good data driven supply chain algorithms, we zoom into the detailed structure of each system, and carefully trade off between exploration and exploitation.
    PhD, Industrial & Operations Engineering, University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/150030/1/haoyuan_1.pd
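The regret measure defined in this abstract can be written down in a few lines. This is a sketch under the assumption that per-period costs of both the learning algorithm and the clairvoyant benchmark are available as sequences; the function name and the sample numbers are hypothetical.

```python
from itertools import accumulate


def cumulative_regret(algorithm_costs, benchmark_costs):
    """Running sum of the per-period cost gap to the clairvoyant benchmark."""
    if len(algorithm_costs) != len(benchmark_costs):
        raise ValueError("cost sequences must cover the same periods")
    gaps = (a - b for a, b in zip(algorithm_costs, benchmark_costs))
    return list(accumulate(gaps))


# Hypothetical per-period costs: the gap shrinks as the algorithm learns.
regret_path = cumulative_regret([5, 4, 3], [3, 3, 3])
```

A learning algorithm is usually judged by how slowly the last entry of this path grows with the horizon, for example proportionally to the square root of the number of periods.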