
    Generalized restless bandits and the knapsack problem for perishable inventories

    In this paper we introduce the Knapsack Problem for Perishable Inventories, which concerns the optimal dynamic allocation of a collection of products to a limited knapsack. The motivation for designing such a problem comes from retail revenue management, where different products often have an associated lifetime during which they can only be sold, and the managers can regularly select some products to be allocated to a limited promotion space that is expected to attract more customers than the standard shelves. Another motivation comes from scheduling of requests in modern multi-server data centers so that Quality-of-Service requirements given by completion deadlines are satisfied. Using the Lagrangian approach, we derive an optimal index policy for the Whittle relaxation of the problem, in which the knapsack capacity is used only on average. Assuming a certain structure of the optimal policy for the single-inventory control, we prove indexability and derive an efficient, linear-time algorithm for computing the index values. To the best of our knowledge, our paper is the first to provide indexability analysis of a restless bandit with bi-dimensional state (lifetime and inventory level). We illustrate that these index values are numerically close to the true index values when such a structure is not present. We test two index-based heuristics for the original, non-relaxed problem: (1) a conventional index rule, which orders the products according to their current index values and promotes as many products as fit in the knapsack, and (2) a recently proposed index-knapsack heuristic, which employs the index values as a proxy for the price of promotion and solves a deterministic knapsack problem to select the products. In a systematic computational study, we show that the performance of both heuristics is nearly optimal, and that the index-knapsack heuristic outperforms the conventional index rule.
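
The two heuristics named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the integer space requirements, and the toy numbers are assumptions made for illustration, the index values are taken as already computed, and the greedy rule is assumed to skip a product that does not fit and continue down the ranking.

```python
from typing import List, Sequence


def index_rule(indices: Sequence[float], weights: Sequence[int], capacity: int) -> List[int]:
    """Conventional index rule (sketch): rank products by decreasing index
    value and promote each one that still fits in the remaining space."""
    chosen, remaining = [], capacity
    for i in sorted(range(len(indices)), key=lambda k: -indices[k]):
        # Assumption: only products with a positive index are worth promoting.
        if indices[i] > 0 and weights[i] <= remaining:
            chosen.append(i)
            remaining -= weights[i]
    return sorted(chosen)


def index_knapsack(indices: Sequence[float], weights: Sequence[int], capacity: int) -> List[int]:
    """Index-knapsack heuristic (sketch): treat each index value as the value
    of promoting that product and solve a deterministic 0/1 knapsack."""
    n = len(indices)
    # best[i][c] = largest total index achievable with products i..n-1 and space c
    best = [[0.0] * (capacity + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for c in range(capacity + 1):
            best[i][c] = best[i + 1][c]  # option 1: do not promote product i
            if indices[i] > 0 and weights[i] <= c:
                take = indices[i] + best[i + 1][c - weights[i]]
                best[i][c] = max(best[i][c], take)  # option 2: promote it
    # Walk forward through the table to recover one optimal selection.
    chosen, c = [], capacity
    for i in range(n):
        if best[i][c] > best[i + 1][c]:  # promoting product i was strictly better
            chosen.append(i)
            c -= weights[i]
    return chosen


# Hypothetical data: three products with index values and space requirements.
idx = [6.0, 5.0, 5.0]
space = [3, 2, 2]
greedy_pick = index_rule(idx, space, capacity=4)
knapsack_pick = index_knapsack(idx, space, capacity=4)
print(greedy_pick, knapsack_pick)
```

In this toy instance the greedy rule promotes only the highest-index product, which leaves one unit of space unused, while the knapsack formulation selects the two smaller products with a larger total index. The sketch shows only how the two selection rules can differ; it says nothing about their relative performance in the paper's experiments.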

    Marginal productivity index policies for dynamic priority allocation in restless bandit models

    This dissertation addresses three complex stochastic and dynamic resource allocation problems: (i) Admission Control and Routing with Delayed Information, (ii) Dynamic Product Promotion and the Knapsack Problem for Perishable Items, and (iii) Congestion Control in Routers with Future-Path Information. Since finding an optimal solution to these problems is computationally intractable at medium and large scale, we instead focus on designing tractable and well-performing heuristic priority rules. We model the above problems as multi-armed restless bandit problems in the framework of Markov decision processes with special structure. We employ and enrich existing results in the literature, which identified a unifying principle to design dynamic priority index policies based on the Lagrangian relaxation and decomposition of such problems. This decomposition allows one to consider parametric-optimization subproblems and, in certain “indexable” cases, to solve them optimally via the marginal productivity (MP) index. The MP index is then used as a dynamic priority measure to define heuristic priority rules for the original intractable problems.
For each of the problems considered we perform such a decomposition, identify indexability conditions, and obtain formulae for the MP indices or tractable algorithms for their computation. The MP indices admit the following priority interpretations in the three respective problems: (i) undesirability of routing a job to a particular queue, (ii) promotion necessity of a particular perishable product, and (iii) usefulness of a particular flow transmission. Apart from the practical contribution of deriving the heuristic priority rules for the three intractable problems considered, our main theoretical contributions are the following: (i) a linear-time algorithm for computing MP indices in the admission control problem with delayed information, thus matching the complexity of the best existing algorithm under no delays, (ii) a new type of priority index policy based on solving a (deterministic) knapsack problem, and (iii) a new extension of the existing multi-armed restless bandit model by incorporating random arrivals of restless bandits.

    Analytics for Sustainable and Retail Operations

    This dissertation focuses on two applications of analytics in sustainable and retail operations. In chapter 2, we design a priority-based inspection strategy for the Environmental Protection Agency (EPA) Region 2. Government regulators such as the U.S. EPA are obligated to inspect facilities regularly to ensure compliance with environmental laws and requirements. Faced with limited budget and resources, regulators can only inspect a small fraction of facilities within a specific time frame. We propose a new inspection strategy that can help environmental regulators prioritize facilities to be inspected under a limited budget. We formulate the problem as a restless multiarmed bandit model and develop an index-based inspection policy. We also demonstrate how to extend the model to incorporate heterogeneous inspection costs and the possibility of environmental disaster. Simulations using data from EPA Region 2 indicate that our proposed index-based compliance monitoring strategy reduces the harm to the environment and public health relative to other benchmark policies used in academic literature and practice. In chapter 3, we partner with a consumer electronics retailer and show how incorporating substitution and competition effects, two integral components of today's competitive markets, enhances the accuracy of demand prediction models. The complicated relationship between the demand of the focal product and the substitutes' prices makes linear models incapable of estimating the cross-price elasticities. We suggest a structure-imposed neural network and demonstrate how it can be utilized in multiproduct pricing decision tools. Our imposed structure mitigates the practical concerns around the interpretability of neural networks, which have hindered their adoption in revenue management. Ph.D.

    Dynamic retail assortment models with demand learning for seasonal consumer goods

    Thesis (Ph.D.), Massachusetts Institute of Technology, Sloan School of Management, 2005. Includes bibliographical references (leaves [104]-108). The main research question we explore in this dissertation is: How should a retailer modify its product assortment over time in order to maximize overall profits for a given selling season? Historically, long development, procurement, and production lead times have constrained fashion retailers to make supply and assortment decisions well in advance of the selling season, when only limited and uncertain demand information is available. As a result, many retailers are seemingly cursed with simultaneously missing sales for want of popular products, while having to use markdowns in order to sell the many unpopular products still accumulating in their stores. Recently, however, a few innovative firms, such as Spain-based Zara and Mango and Japan-based World Co. (referred to as "Fast Fashion" retailers), have gone substantially further, implementing product development processes and supply chain architectures that allow them to make most product design and assortment decisions during the selling season. Remarkably, their higher flexibility and responsiveness is partly achieved through an increased reliance on more costly local production relative to the supply networks of more traditional retailers. At the operational level, leveraging the ability to introduce and test new products once the season has started motivates a new and important decision problem, which seems crucial to the success of these fast-fashion companies: given the constantly evolving demand information available, which products should be included in the assortment at each point in time? The problem just described seems challenging, in part because it relates to the classical trade-off known as exploration versus exploitation, usually represented via the multiarmed bandit problem.
In this thesis we analyze the dynamic assortment problem under different sets of assumptions, including: (i) without lost sales; (ii) with lost sales but observable demand; (iii) with lost sales and censored information; and (iv) with time-varying demand rates. In each case we formulate an appropriate model and suggest a (near-optimal) policy that can be implemented in practice, together with associated suboptimality bounds. We also study the incorporation of substitution effects and the extension of the models to a generic family of demand distributions. The common solution approach involves the Lagrangian relaxation and the decomposition of weakly coupled dynamic programs. The dissertation makes three contributions: (1) it is the first attempt at providing mathematical optimization models with near-optimal solutions for the dynamic assortment problem faced by a fast-fashion retailer; (2) our analysis contributes to the literature on the multiarmed bandit problem, in particular for its finite-horizon version, for which we derive a general closed-form dynamic index policy that performs remarkably well; and (3) the solution approach contributes to the emerging literature on duality in dynamic programming. By Felipe Caro.

    Supply Side Optimisation in Online Display Advertising

    On the Internet there are publishers (the supply side) who provide free content (e.g., news) and services (e.g., email) to attract users. Publishers get paid by selling ad display opportunities (i.e., impressions) to advertisers. Advertisers then sell products to users who are converted by ads. Better supply side revenue allows more free content and services to be created, thus benefiting the entire online advertising ecosystem. This thesis addresses several optimisation problems for the supply side. When a publisher creates an ad-supported website, he first needs to decide the percentage of ads. The thesis reports a large-scale empirical study of Internet ad density over the past seven years, then presents a model that includes many factors, especially the competition among similar publishers, and gives an optimal dynamic ad density that generates the maximum revenue over time. This study also unveils the tragedy of the commons in online advertising, where users' attention has been overgrazed, which results in a global sub-optimum. After deciding the ad density, the publisher retrieves ads from various sources, including contracts, ad networks, and ad exchanges. This forms an exploration-exploitation problem, since ad sources are typically unknown before trial. This problem is modelled as a Partially Observable Markov Decision Process (POMDP), and the exploration efficiency is increased by utilising the correlation of ads. The proposed method performs 23.4% better than the best-performing baseline in experiments based on real-world data. Since some ad networks allow (or expect) an input of keywords, the thesis also presents an adaptive keyword extraction system using the BM25F algorithm and the multi-armed bandit model. This system has been tested by a domain service provider in crowdsourcing-based experiments. If the publisher selects a Real-Time Bidding (RTB) ad source, he can use the reserve price to manipulate auctions for better payoff.
This thesis proposes a simplified game model that considers the competition between seller and buyer to be one-shot instead of repeated, and gives heuristics that can be easily implemented. The model has been evaluated in a production environment, where it achieved a 12.3% average increase in revenue. The documentation of a prototype system for reserve price optimisation is also presented in the appendix of the thesis.

    Operational Research: Methods and Applications

    Throughout its history, Operational Research has evolved to include a variety of methods, models and algorithms that have been applied to a diverse and wide range of contexts. This encyclopedic article consists of two main sections: methods and applications. The first aims to summarise the up-to-date knowledge and provide an overview of the state-of-the-art methods and key developments in the various subdomains of the field. The second offers a wide-ranging list of areas where Operational Research has been applied. The article is meant to be read in a nonlinear fashion. It should be used as a point of reference or first port of call for a diverse pool of readers: academics, researchers, students, and practitioners. The entries within the methods and applications sections are presented in alphabetical order. The authors dedicate this paper to the 2023 Turkey/Syria earthquake victims. We sincerely hope that advances in OR will play a role towards minimising the pain and suffering caused by this and future catastrophes.

    Evaluation of the Intelligence Collection and Analysis Process

    Intelligence is a critical tool in modern security operations that provides insight into current and future operational conditions. It is a concept that transfers to other applications where monitoring activities or situations is imperative, such as ecological research. As technological advances in the past decades have led to increased availability of potential intelligence, we concentrate on source selection to ensure the resulting intelligence is of high quality and fit for purpose. We wish to bring focus to the more varied nature of intelligence than what is currently reflected in models of its collection and evaluation. Therefore, we examine the intelligence collection and analysis process in two separate scenarios: one treats it as an ongoing strategic activity; in the other, intelligence collection is carried out with an investigative intent. The first problem we formulate concerns source selection with a random time delay in feedback, corresponding to the collection and evaluation time of the intelligence. Both the distribution of this time delay and the outcome of the intelligence evaluation are unknown, giving rise to the classic exploration-exploitation dilemma in a long-run setting. We develop promising approaches to accommodate the novel features of the model based on Gittins indices and the knowledge gradient, and examine the issues presented when incorporating structures of dependence between the time delay and the outcome of the evaluation. Next, we develop a novel intelligence collection problem rooted in tactical-level source selection, aiming to piece together an intelligence picture comprising multiple types of information, for example, where and when an attack is planned. We demonstrate that when all elements of the model are known, dynamic programming provides the optimal policy. When some elements are unknown, which introduces an exploration-exploitation aspect to the model, we find that in certain cases the ability to learn is severely limited.
