57 research outputs found

    Robust and Heterogenous Odds Ratio: Estimating Price Sensitivity for Unbought Items

    Problem definition: Mining for heterogeneous responses to an intervention is a crucial step for data-driven operations, for instance to personalize treatment or pricing. We investigate how to estimate price sensitivity from transaction-level data. In causal inference terms, we estimate heterogeneous treatment effects when (a) the response to treatment (here, whether a customer buys a product) is binary, and (b) treatment assignments are partially observed (here, full information is only available for purchased items). Methodology/Results: We propose a recursive partitioning procedure to estimate the heterogeneous odds ratio, a widely used measure of treatment effect in medicine and the social sciences. We integrate an adversarial imputation step to allow for robust inference even in the presence of partially observed treatment assignments. We validate our methodology on synthetic data and apply it to three case studies from political science, medicine, and revenue management. Managerial Implications: Our robust heterogeneous odds ratio estimation method is a simple and intuitive tool to quantify heterogeneity in patients or customers and personalize interventions, while lifting a central limitation of many revenue management datasets.
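
    As a point of reference for the quantity being estimated, the sketch below computes the odds ratio of purchase under a high versus a low price within pre-defined customer segments. It is a minimal illustration only: the column names (segment, high_price, purchased) and the data are hypothetical, and the paper's recursive partitioning and adversarial imputation steps are not shown.

        # Minimal sketch: odds ratio of a binary outcome under a binary treatment,
        # computed within each pre-defined segment. Column names are illustrative.
        import pandas as pd

        def odds_ratio(df, treatment="high_price", outcome="purchased"):
            """(odds of purchase when treated) / (odds of purchase when untreated), from a 2x2 table."""
            n11 = ((df[treatment] == 1) & (df[outcome] == 1)).sum()
            n10 = ((df[treatment] == 1) & (df[outcome] == 0)).sum()
            n01 = ((df[treatment] == 0) & (df[outcome] == 1)).sum()
            n00 = ((df[treatment] == 0) & (df[outcome] == 0)).sum()
            return (n11 * n00) / (n10 * n01)

        transactions = pd.DataFrame({
            "segment":    ["leisure"] * 8 + ["business"] * 8,
            "high_price": [0, 0, 0, 0, 1, 1, 1, 1] * 2,
            "purchased":  [1, 1, 1, 0, 1, 0, 0, 0,    # leisure: price-sensitive
                           1, 1, 1, 0, 1, 1, 1, 0],   # business: barely price-sensitive
        })
        # Heterogeneity shows up as different odds ratios across segments.
        print(transactions.groupby("segment").apply(odds_ratio))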

    Sparse PCA With Multiple Components

    Sparse Principal Component Analysis (sPCA) is a cardinal technique for obtaining combinations of features, or principal components (PCs), that explain the variance of high-dimensional datasets in an interpretable manner. This involves solving a sparsity- and orthogonality-constrained convex maximization problem, which is extremely computationally challenging. Most existing works address sparse PCA via methods, such as iteratively computing one sparse PC and deflating the covariance matrix, that do not guarantee the orthogonality, let alone the optimality, of the resulting solution when we seek multiple mutually orthogonal PCs. We challenge this status quo by reformulating the orthogonality conditions as rank constraints and optimizing over the sparsity and rank constraints simultaneously. We design tight semidefinite relaxations to supply high-quality upper bounds, which we strengthen via additional second-order cone inequalities when each PC's individual sparsity is specified. Further, we derive a combinatorial upper bound on the maximum amount of variance explained as a function of the support. We exploit these relaxations and bounds to propose exact methods and rounding mechanisms that, together, obtain solutions with a bound gap on the order of 0%-15% for real-world datasets with p = 100s or 1,000s of features and r ∈ {2, 3} components. Numerically, our algorithms match (and sometimes surpass) the best performing methods in terms of fraction of variance explained and systematically return PCs that are sparse and orthogonal. In contrast, we find that existing methods like deflation return solutions that violate the orthogonality constraints, even when the data is generated according to sparse orthogonal PCs. Altogether, our approach solves sparse PCA problems with multiple components to certifiable (near) optimality in a practically tractable fashion.
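
    To make the orthogonality issue concrete, here is a small numpy sketch, not the paper's method, of the common hard-thresholding-plus-deflation heuristic for two sparse loadings, followed by a check of how far they are from orthogonal. The function name and the random data are made up for illustration.

        # Sketch of the deflation heuristic discussed above (not the paper's method):
        # keep the k largest-magnitude entries of the leading eigenvector, renormalize,
        # deflate the covariance matrix, and repeat for r components.
        import numpy as np

        def sparse_pcs_by_deflation(Sigma, k, r):
            Sigma = Sigma.copy()
            loadings = []
            for _ in range(r):
                v = np.linalg.eigh(Sigma)[1][:, -1].copy()   # leading eigenvector
                v[np.argsort(np.abs(v))[:-k]] = 0.0          # hard-threshold to k nonzeros
                v /= np.linalg.norm(v)
                loadings.append(v)
                Sigma -= (v @ Sigma @ v) * np.outer(v, v)    # deflation step
            return np.column_stack(loadings)

        rng = np.random.default_rng(0)
        A = rng.standard_normal((200, 30))
        Sigma = A.T @ A / 200                                # sample covariance, p = 30
        U = sparse_pcs_by_deflation(Sigma, k=5, r=2)
        # The returned loadings are sparse but generally not orthogonal:
        print("orthogonality violation |u1 . u2| =", abs(float(U[:, 0] @ U[:, 1])))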

    Prediction with Missing Data

    Missing information is inevitable in real-world data sets. While imputation is well suited and theoretically sound for statistical inference, its relevance and practical implementation for out-of-sample prediction remain unsettled. We provide a theoretical analysis of widely used data imputation methods and highlight their key deficiencies in making accurate predictions. As an alternative, we propose adaptive linear regression, a new class of models that can be trained and evaluated directly on partially observed data, adapting to the set of available features. In particular, we show that certain adaptive regression models are equivalent to impute-then-regress methods in which the imputation and the regression models are learned simultaneously rather than sequentially. We validate our theoretical findings and our adaptive regression approach with numerical results on real-world data sets.
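
    As a rough illustration of the impute-then-regress connection (a simplified stand-in, not the paper's adaptive regression models): for a linear model, zero-imputing each missing feature and adding a per-feature missingness indicator lets the fit recover feature-specific imputation constants jointly with the regression coefficients. The data and the 30% missingness rate below are synthetic.

        # Sketch (not the paper's algorithm): linear regression on zero-imputed features
        # plus per-feature missingness indicators. For a linear model, fitting the
        # indicator coefficients jointly is essentially equivalent to learning
        # feature-specific imputation constants together with the regression.
        import numpy as np
        from sklearn.linear_model import LinearRegression

        def with_indicators(X):
            """Concatenate zero-imputed features with their missingness indicators."""
            return np.hstack([np.nan_to_num(X, nan=0.0), np.isnan(X).astype(float)])

        rng = np.random.default_rng(0)
        n, p = 500, 5
        X_full = rng.standard_normal((n, p))
        y = X_full @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + 0.1 * rng.standard_normal(n)
        X = X_full.copy()
        X[rng.random((n, p)) < 0.3] = np.nan          # 30% of entries go missing

        model = LinearRegression().fit(with_indicators(X), y)

        # At prediction time the model adapts to whichever features are observed.
        x_new = np.array([[0.5, np.nan, 1.0, np.nan, -0.2]])
        print(model.predict(with_indicators(x_new)))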

    A Stochastic Benders Decomposition Scheme for Large-Scale Data-Driven Network Design

    Network design problems involve constructing edges in a transportation or supply chain network to minimize construction and daily operational costs. We study a data-driven version of network design where operational costs are uncertain and estimated using historical data. This problem is notoriously computationally challenging, and instances with as few as fifty nodes cannot be solved to optimality by current decomposition techniques. Accordingly, we propose a stochastic variant of Benders decomposition that mitigates the high computational cost of generating each cut by sampling a subset of the data at each iteration and nonetheless generates deterministically valid cuts (as opposed to the probabilistically valid cuts frequently proposed in the stochastic optimization literature) via a dual averaging technique. We implement both single-cut and multi-cut variants of this Benders decomposition algorithm, as well as a k-cut variant that uses clustering of the historical scenarios. On instances with 100-200 nodes, our algorithm achieves 4-5% optimality gaps, compared with 13-16% for deterministic Benders schemes, and scales to instances with 700 nodes and 50 commodities within hours. Beyond network design, our strategy could be adapted to generic two-stage stochastic mixed-integer optimization problems where second-stage costs are estimated via a sample average
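
    To illustrate the sampling idea on a toy two-stage problem (a capacity-versus-shortfall model rather than network design, and without the dual-averaging correction that makes the paper's sampled cuts deterministically valid), a minimal sampled Benders loop might look as follows; all names and data are made up.

        # Toy sketch of a sampled Benders (L-shaped) loop, NOT the paper's algorithm:
        # each iteration builds one cut from a random mini-batch of scenarios.
        import numpy as np
        from scipy.optimize import linprog

        rng = np.random.default_rng(0)
        n, N = 3, 1000                                   # resources, scenarios
        c = np.array([1.0, 2.0, 1.5])                    # first-stage (build) costs
        q = np.array([5.0, 5.0, 5.0])                    # per-unit shortfall penalties
        xi = rng.uniform(0, 10, size=(N, n))             # demand scenarios

        def second_stage(x, xi_s):
            """Value and subgradient (w.r.t. x) of Q(x, xi) = sum_j q_j * max(xi_j - x_j, 0)."""
            return q @ np.maximum(xi_s - x, 0.0), -q * (xi_s > x)

        cuts_g, cuts_a = [], []                          # each cut reads: theta >= a + g @ x
        x_hat = np.zeros(n)
        for _ in range(50):
            batch = xi[rng.choice(N, size=64, replace=False)]
            vals, grads = zip(*(second_stage(x_hat, s) for s in batch))
            g = np.mean(grads, axis=0)
            a = np.mean(vals) - g @ x_hat
            cuts_g.append(g); cuts_a.append(a)
            # Master: min c @ x + theta  s.t.  theta >= a_k + g_k @ x for every cut k.
            A_ub = np.hstack([np.array(cuts_g), -np.ones((len(cuts_g), 1))])
            res = linprog(np.append(c, 1.0), A_ub=A_ub, b_ub=-np.array(cuts_a),
                          bounds=[(0, 10)] * n + [(0, None)])
            x_hat = res.x[:n]

        full_cost = c @ x_hat + np.mean([second_stage(x_hat, s)[0] for s in xi])
        print("first-stage decision:", np.round(x_hat, 2), "estimated total cost:", round(float(full_cost), 2))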

    Hospital-Wide Inpatient Flow Optimization

    An ideal that supports quality and delivery of care is to have hospital operations that are coordinated and optimized across all services in real time. As a step toward this goal, we propose a multistage adaptive robust optimization approach combined with machine learning techniques. Informed by data and predictions, our framework unifies the bed assignment process across the entire hospital and accounts for present and future inpatient flows, discharges as well as bed requests – from the emergency department, scheduled surgeries and admissions, and outside transfers. We evaluate our approach through simulations calibrated on historical data from a large academic medical center. For the 600-bed institution, our optimization model was solved in seconds, reduced off-service placement by 24% on average, and reduced boarding delays in the emergency department and post-anesthesia units by 35% and 18%, respectively. We also illustrate the benefit of using adaptive linear decision rules instead of static assignment decisions.
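
    For readers unfamiliar with the last sentence: an adaptive linear decision rule restricts a future decision to an affine function of the uncertainty revealed so far (generic notation, not the paper's), e.g.

        x_t(\xi) \;=\; x_t^0 + \sum_{s < t} X_{t,s}\, \xi_s ,

    so assignments can react to realized arrivals and discharges while the optimization over the coefficients x_t^0 and X_{t,s} remains tractable, in contrast to a static rule that fixes x_t in advance.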

    Mixed-Projection Conic Optimization: A New Paradigm for Modeling Rank Constraints

    We propose a framework for modeling and solving low-rank optimization problems to certifiable optimality. We introduce symmetric projection matrices that satisfy Y² = Y, the matrix analog of binary variables that satisfy z² = z, to model rank constraints. By leveraging regularization and strong duality, we prove that this modeling paradigm yields tractable convex optimization problems over the non-convex set of orthogonal projection matrices. Furthermore, we design outer-approximation algorithms to solve low-rank problems to certifiable optimality, compute lower bounds via their semidefinite relaxations, and provide near-optimal solutions through rounding and local search techniques. We implement these numerical ingredients and, for the first time, solve low-rank optimization problems to certifiable optimality. Our algorithms also supply certifiably near-optimal solutions for larger problem sizes and outperform existing heuristics by deriving an alternative to the popular nuclear norm relaxation, which generalizes the perspective relaxation from vectors to matrices. Using currently available spatial branch-and-bound codes, not tailored to projection matrices, we can scale our exact (resp. near-exact) algorithms to matrices with up to 30 (resp. 600) rows/columns. All in all, our framework, which we name Mixed-Projection Conic Optimization, solves low-rank problems to certifiable optimality in a tractable and unified fashion.
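
    In generic notation (an illustration consistent with the abstract, not quoted from the paper), the projection-matrix device encodes a rank constraint as

        \mathrm{rank}(X) \le k \;\Longleftrightarrow\; \exists\, Y:\; Y = Y^\top,\; Y^2 = Y,\; \mathrm{tr}(Y) \le k,\; YX = X,

    mirroring how binary variables z_j with z_j^2 = z_j, z_j x_j = x_j, and \sum_j z_j \le k encode a cardinality constraint in the vector case.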

    Solving Large-Scale Sparse PCA to Certifiable (Near) Optimality

    Sparse principal component analysis (PCA) is a popular dimensionality reduction technique for obtaining principal components which are linear combinations of a small subset of the original features. Existing approaches cannot supply certifiably optimal principal components with more than p = 100s of variables. By reformulating sparse PCA as a convex mixed-integer semidefinite optimization problem, we design a cutting-plane method which solves the problem to certifiable optimality at the scale of selecting k = 5 covariates from p = 300 variables, and provides small bound gaps at a larger scale. We also propose a convex relaxation and greedy rounding scheme that provides bound gaps of 1-2% in practice within minutes for p = 100s or hours for p = 1,000s and is therefore a viable alternative to the exact method at scale. Using real-world financial and medical data sets, we illustrate our approach's ability to derive interpretable principal components tractably at scale.
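
    For intuition on the combinatorial structure (a simple greedy baseline, not the paper's cutting-plane method): once a support S of size k is fixed, the best sparse loading is the leading eigenvector of the corresponding k-by-k submatrix of the covariance, so a heuristic can grow the support one feature at a time. The data and function name below are illustrative.

        # Greedy baseline for sparse PCA (not the paper's method): grow a support of
        # size k, each time adding the feature that most increases the leading
        # eigenvalue of the covariance submatrix (the variance explained).
        import numpy as np

        def greedy_sparse_pca(Sigma, k):
            p = Sigma.shape[0]
            support = []
            for _ in range(k):
                best_j, best_val = None, -np.inf
                for j in range(p):
                    if j in support:
                        continue
                    val = np.linalg.eigvalsh(Sigma[np.ix_(support + [j], support + [j])])[-1]
                    if val > best_val:
                        best_j, best_val = j, val
                support.append(best_j)
            return sorted(support), best_val

        rng = np.random.default_rng(0)
        A = rng.standard_normal((200, 50))
        Sigma = A.T @ A / 200                             # sample covariance, p = 50
        support, variance = greedy_sparse_pca(Sigma, k=5)
        print("support:", support, "variance explained:", round(float(variance), 3))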

    A new perspective on low-rank optimization

    A key question in many low-rank problems throughout optimization, machine learning, and statistics is to characterize the convex hulls of simple low-rank sets and judiciously apply these convex hulls to obtain strong yet computationally tractable relaxations. We invoke the matrix perspective function, the matrix analog of the perspective function, to characterize explicitly the convex hull of epigraphs of simple matrix convex functions under low-rank constraints. Further, we combine the matrix perspective function with orthogonal projection matrices, the matrix analog of binary variables which capture the row-space of a matrix, to develop a matrix perspective reformulation technique that reliably obtains strong relaxations for a variety of low-rank problems, including reduced rank regression, non-negative matrix factorization, and factor analysis. Moreover, we establish that these relaxations can be modeled via semidefinite constraints and thus optimized over tractably. The proposed approach parallels and generalizes the perspective reformulation technique in mixed-integer optimization and leads to new relaxations for a broad class of problems.
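
    For the scalar case that this work generalizes (a standard illustration, not taken from the paper): the perspective of a convex function f is g(x, z) = z f(x/z), and it yields tight convex formulations of logical constraints. For instance, with z ∈ {0,1} enforcing x = 0 whenever z = 0, the epigraph of x² is captured by

        \theta \ge x^2 / z \quad\Longleftrightarrow\quad \theta z \ge x^2,\;\; \theta \ge 0,\; z \in [0,1],

    a rotated second-order cone constraint. The matrix perspective reformulation plays the same role with the scalar z replaced by an orthogonal projection matrix Y.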

    A unified approach to mixed-integer optimization with logical constraints

    We propose a unified framework to address a family of classical mixed-integer optimization problems, including network design, sparse portfolio selection, sparse principal component analysis, and sparse learning problems. These problems exhibit logical relationships between continuous and discrete variables, which are usually reformulated linearly using a big-M formulation. In this work, we challenge this longstanding modeling practice and express the logical constraints in a non-linear way. By imposing a regularization condition, we reformulate these problems as convex binary optimization problems, which are solvable using an outer-approximation procedure. In numerical experiments, we establish that a general-purpose numerical strategy, which combines cutting-plane, first-order, and local search methods, solves these problems faster and at a larger scale than state-of-the-art mixed-integer linear or second-order cone methods. Our approach successfully solves network design problems with 100s of nodes and provides solutions up to 40% better than the state of the art; sparse portfolio selection problems with up to 3,200 securities, compared with 400 securities for previous attempts; and sparse regression problems with up to 100,000 covariates.
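
    To make the outer-approximation idea concrete for one member of this family, sparse ridge regression, here is a minimal sketch (not the paper's implementation). For a fixed binary support z, the inner problem min_w 0.5*||y - X diag(z) w||^2 + 1/(2 gamma)*||w||^2 has closed-form value 0.5 * y'(I + gamma X diag(z) X')^{-1} y with subgradient entries -(gamma/2)(x_j' alpha)^2, where alpha solves the corresponding linear system. The cut-based master problem is solved below by brute-force enumeration of supports, standing in for a mixed-integer solver, and all data are synthetic.

        # Outer-approximation sketch for sparse ridge regression (illustrative only).
        import itertools
        import numpy as np

        rng = np.random.default_rng(0)
        n, p, k, gamma = 100, 10, 3, 10.0
        X = rng.standard_normal((n, p))
        y = X[:, :3] @ np.array([2.0, -3.0, 1.5]) + 0.1 * rng.standard_normal(n)

        def f_and_grad(z):
            """Inner-problem value f(z) and a subgradient with respect to z."""
            alpha = np.linalg.solve(np.eye(n) + gamma * (X * z) @ X.T, y)
            return 0.5 * y @ alpha, -0.5 * gamma * (X.T @ alpha) ** 2

        supports = [np.array([1.0 if j in S else 0.0 for j in range(p)])
                    for S in itertools.combinations(range(p), k)]
        cuts, upper, best_z, z = [], np.inf, None, supports[0]
        for _ in range(20):
            val, grad = f_and_grad(z)
            if val < upper:
                upper, best_z = val, z
            cuts.append((val, grad, z))
            # Master: minimize the piecewise-linear underestimator max_k [v_k + g_k @ (z - z_k)].
            master = [(max(v + g @ (cand - zk) for v, g, zk in cuts), cand) for cand in supports]
            lower, z = min(master, key=lambda t: t[0])
            if upper - lower < 1e-6:
                break
        print("selected features:", np.flatnonzero(best_z), "objective:", round(float(upper), 4))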