57 research outputs found

    Robust and Heterogenous Odds Ratio: Estimating Price Sensitivity for Unbought Items

    Problem definition: Mining for heterogeneous responses to an intervention is a crucial step for data-driven operations, for instance to personalize treatment or pricing. We investigate how to estimate price sensitivity from transaction-level data. In causal inference terms, we estimate heterogeneous treatment effects when (a) the response to treatment (here, whether a customer buys a product) is binary, and (b) treatment assignments are partially observed (here, full information is only available for purchased items). Methodology/Results: We propose a recursive partitioning procedure to estimate the heterogeneous odds ratio, a widely used measure of treatment effect in medicine and the social sciences. We integrate an adversarial imputation step to allow for robust inference even in the presence of partially observed treatment assignments. We validate our methodology on synthetic data and apply it to three case studies from political science, medicine, and revenue management. Managerial Implications: Our robust heterogeneous odds ratio estimation method is a simple and intuitive tool to quantify heterogeneity in patients or customers and personalize interventions, while lifting a central limitation of many revenue management datasets.
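
    As a point of reference for the quantity being estimated, the sketch below computes the odds ratio of purchase under a high versus a low price within pre-defined customer segments. It is a minimal illustration only: the column names (segment, high_price, purchased) and the data are hypothetical, and the paper's recursive partitioning and adversarial imputation steps are not shown.

        # Minimal sketch: odds ratio of a binary outcome under a binary treatment,
        # computed within each pre-defined segment. Column names are illustrative.
        import pandas as pd

        def odds_ratio(df, treatment="high_price", outcome="purchased"):
            """(odds of purchase when treated) / (odds of purchase when untreated), from a 2x2 table."""
            n11 = ((df[treatment] == 1) & (df[outcome] == 1)).sum()
            n10 = ((df[treatment] == 1) & (df[outcome] == 0)).sum()
            n01 = ((df[treatment] == 0) & (df[outcome] == 1)).sum()
            n00 = ((df[treatment] == 0) & (df[outcome] == 0)).sum()
            return (n11 * n00) / (n10 * n01)

        transactions = pd.DataFrame({
            "segment":    ["leisure"] * 8 + ["business"] * 8,
            "high_price": [0, 0, 0, 0, 1, 1, 1, 1] * 2,
            "purchased":  [1, 1, 1, 0, 1, 0, 0, 0,    # leisure: price-sensitive
                           1, 1, 1, 0, 1, 1, 1, 0],   # business: barely price-sensitive
        })
        # Heterogeneity shows up as different odds ratios across segments.
        print(transactions.groupby("segment").apply(odds_ratio))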

    Sparse PCA With Multiple Components

    Sparse Principal Component Analysis (sPCA) is a cardinal technique for obtaining combinations of features, or principal components (PCs), that explain the variance of high-dimensional datasets in an interpretable manner. This involves solving a sparsity- and orthogonality-constrained convex maximization problem, which is extremely computationally challenging. Most existing works address sparse PCA via methods, such as iteratively computing one sparse PC and deflating the covariance matrix, that do not guarantee the orthogonality, let alone the optimality, of the resulting solution when we seek multiple mutually orthogonal PCs. We challenge this status quo by reformulating the orthogonality conditions as rank constraints and optimizing over the sparsity and rank constraints simultaneously. We design tight semidefinite relaxations to supply high-quality upper bounds, which we strengthen via additional second-order cone inequalities when each PC's individual sparsity is specified. Further, we derive a combinatorial upper bound on the maximum amount of variance explained as a function of the support. We exploit these relaxations and bounds to propose exact methods and rounding mechanisms that, together, obtain solutions with a bound gap on the order of 0%-15% for real-world datasets with p = 100s or 1,000s of features and r ∈ {2, 3} components. Numerically, our algorithms match (and sometimes surpass) the best performing methods in terms of fraction of variance explained and systematically return PCs that are sparse and orthogonal. In contrast, we find that existing methods like deflation return solutions that violate the orthogonality constraints, even when the data is generated according to sparse orthogonal PCs. Altogether, our approach solves sparse PCA problems with multiple components to certifiable (near) optimality in a practically tractable fashion.
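
    To make the orthogonality issue concrete, here is a small numpy sketch, not the paper's method, of the common hard-thresholding-plus-deflation heuristic for two sparse loadings, followed by a check of how far they are from orthogonal. The function name and the random data are made up for illustration.

        # Sketch of the deflation heuristic discussed above (not the paper's method):
        # keep the k largest-magnitude entries of the leading eigenvector, renormalize,
        # deflate the covariance matrix, and repeat for r components.
        import numpy as np

        def sparse_pcs_by_deflation(Sigma, k, r):
            Sigma = Sigma.copy()
            loadings = []
            for _ in range(r):
                v = np.linalg.eigh(Sigma)[1][:, -1].copy()   # leading eigenvector
                v[np.argsort(np.abs(v))[:-k]] = 0.0          # hard-threshold to k nonzeros
                v /= np.linalg.norm(v)
                loadings.append(v)
                Sigma -= (v @ Sigma @ v) * np.outer(v, v)    # deflation step
            return np.column_stack(loadings)

        rng = np.random.default_rng(0)
        A = rng.standard_normal((200, 30))
        Sigma = A.T @ A / 200                                # sample covariance, p = 30
        U = sparse_pcs_by_deflation(Sigma, k=5, r=2)
        # The returned loadings are sparse but generally not orthogonal:
        print("orthogonality violation |u1 . u2| =", abs(float(U[:, 0] @ U[:, 1])))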

    Prediction with Missing Data

    Missing information is inevitable in real-world data sets. While imputation is well suited and theoretically sound for statistical inference, its relevance and practical implementation for out-of-sample prediction remain unsettled. We provide a theoretical analysis of widely used data imputation methods and highlight their key deficiencies in making accurate predictions. As an alternative, we propose adaptive linear regression, a new class of models that can be trained and evaluated directly on partially observed data, adapting to the set of available features. In particular, we show that certain adaptive regression models are equivalent to impute-then-regress methods in which the imputation and the regression models are learned simultaneously rather than sequentially. We validate our theoretical findings and our adaptive regression approach with numerical results on real-world data sets.
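
    As a rough illustration of the impute-then-regress connection (a simplified stand-in, not the paper's adaptive regression models): for a linear model, zero-imputing each missing feature and adding a per-feature missingness indicator lets the fit recover feature-specific imputation constants jointly with the regression coefficients. The data and the 30% missingness rate below are synthetic.

        # Sketch (not the paper's algorithm): linear regression on zero-imputed features
        # plus per-feature missingness indicators. For a linear model, fitting the
        # indicator coefficients jointly is essentially equivalent to learning
        # feature-specific imputation constants together with the regression.
        import numpy as np
        from sklearn.linear_model import LinearRegression

        def with_indicators(X):
            """Concatenate zero-imputed features with their missingness indicators."""
            return np.hstack([np.nan_to_num(X, nan=0.0), np.isnan(X).astype(float)])

        rng = np.random.default_rng(0)
        n, p = 500, 5
        X_full = rng.standard_normal((n, p))
        y = X_full @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + 0.1 * rng.standard_normal(n)
        X = X_full.copy()
        X[rng.random((n, p)) < 0.3] = np.nan          # 30% of entries go missing

        model = LinearRegression().fit(with_indicators(X), y)

        # At prediction time the model adapts to whichever features are observed.
        x_new = np.array([[0.5, np.nan, 1.0, np.nan, -0.2]])
        print(model.predict(with_indicators(x_new)))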

    A Stochastic Benders Decomposition Scheme for Large-Scale Data-Driven Network Design

    Network design problems involve constructing edges in a transportation or supply chain network to minimize construction and daily operational costs. We study a data-driven version of network design where operational costs are uncertain and estimated using historical data. This problem is notoriously computationally challenging, and instances with as few as fifty nodes cannot be solved to optimality by current decomposition techniques. Accordingly, we propose a stochastic variant of Benders decomposition that mitigates the high computational cost of generating each cut by sampling a subset of the data at each iteration and nonetheless generates deterministically valid cuts (as opposed to the probabilistically valid cuts frequently proposed in the stochastic optimization literature) via a dual averaging technique. We implement both single-cut and multi-cut variants of this Benders decomposition algorithm, as well as a k-cut variant that uses clustering of the historical scenarios. On instances with 100-200 nodes, our algorithm achieves 4-5% optimality gaps, compared with 13-16% for deterministic Benders schemes, and scales to instances with 700 nodes and 50 commodities within hours. Beyond network design, our strategy could be adapted to generic two-stage stochastic mixed-integer optimization problems where second-stage costs are estimated via a sample average
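
    To illustrate the sampling idea on a toy two-stage problem (a capacity-versus-shortfall model rather than network design, and without the dual-averaging correction that makes the paper's sampled cuts deterministically valid), a minimal sampled Benders loop might look as follows; all names and data are made up.

        # Toy sketch of a sampled Benders (L-shaped) loop, NOT the paper's algorithm:
        # each iteration builds one cut from a random mini-batch of scenarios.
        import numpy as np
        from scipy.optimize import linprog

        rng = np.random.default_rng(0)
        n, N = 3, 1000                                   # resources, scenarios
        c = np.array([1.0, 2.0, 1.5])                    # first-stage (build) costs
        q = np.array([5.0, 5.0, 5.0])                    # per-unit shortfall penalties
        xi = rng.uniform(0, 10, size=(N, n))             # demand scenarios

        def second_stage(x, xi_s):
            """Value and subgradient (w.r.t. x) of Q(x, xi) = sum_j q_j * max(xi_j - x_j, 0)."""
            return q @ np.maximum(xi_s - x, 0.0), -q * (xi_s > x)

        cuts_g, cuts_a = [], []                          # each cut reads: theta >= a + g @ x
        x_hat = np.zeros(n)
        for _ in range(50):
            batch = xi[rng.choice(N, size=64, replace=False)]
            vals, grads = zip(*(second_stage(x_hat, s) for s in batch))
            g = np.mean(grads, axis=0)
            a = np.mean(vals) - g @ x_hat
            cuts_g.append(g); cuts_a.append(a)
            # Master: min c @ x + theta  s.t.  theta >= a_k + g_k @ x for every cut k.
            A_ub = np.hstack([np.array(cuts_g), -np.ones((len(cuts_g), 1))])
            res = linprog(np.append(c, 1.0), A_ub=A_ub, b_ub=-np.array(cuts_a),
                          bounds=[(0, 10)] * n + [(0, None)])
            x_hat = res.x[:n]

        full_cost = c @ x_hat + np.mean([second_stage(x_hat, s)[0] for s in xi])
        print("first-stage decision:", np.round(x_hat, 2), "estimated total cost:", round(float(full_cost), 2))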

    Hospital-Wide Inpatient Flow Optimization

    An ideal that supports quality and delivery of care is to have hospital operations that are coordinated and optimized across all services in real time. As a step toward this goal, we propose a multistage adaptive robust optimization approach combined with machine learning techniques. Informed by data and predictions, our framework unifies the bed assignment process across the entire hospital and accounts for present and future inpatient flows, discharges as well as bed requests – from the emergency department, scheduled surgeries and admissions, and outside transfers. We evaluate our approach through simulations calibrated on historical data from a large academic medical center. For the 600-bed institution, our optimization model was solved in seconds, reduced off-service placement by 24% on average, and reduced boarding delays in the emergency department and post-anesthesia units by 35% and 18%, respectively. We also illustrate the benefit of using adaptive linear decision rules instead of static assignment decisions.
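
    For readers unfamiliar with the last sentence: an adaptive linear decision rule restricts a future decision to an affine function of the uncertainty revealed so far (generic notation, not the paper's), e.g.

        x_t(\xi) \;=\; x_t^0 + \sum_{s < t} X_{t,s}\, \xi_s ,

    so assignments can react to realized arrivals and discharges while the optimization over the coefficients x_t^0 and X_{t,s} remains tractable, in contrast to a static rule that fixes x_t in advance.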

    Mixed-Projection Conic Optimization: A New Paradigm for Modeling Rank Constraints

    We propose a framework for modeling and solving low-rank optimization problems to certifiable optimality. We introduce symmetric projection matrices that satisfy Y² = Y, the matrix analog of binary variables that satisfy z² = z, to model rank constraints. By leveraging regularization and strong duality, we prove that this modeling paradigm yields tractable convex optimization problems over the non-convex set of orthogonal projection matrices. Furthermore, we design outer-approximation algorithms to solve low-rank problems to certifiable optimality, compute lower bounds via their semidefinite relaxations, and provide near-optimal solutions through rounding and local search techniques. We implement these numerical ingredients and, for the first time, solve low-rank optimization problems to certifiable optimality. Our algorithms also supply certifiably near-optimal solutions for larger problem sizes and outperform existing heuristics by deriving an alternative to the popular nuclear norm relaxation, which generalizes the perspective relaxation from vectors to matrices. Using currently available spatial branch-and-bound codes, not tailored to projection matrices, we can scale our exact (resp. near-exact) algorithms to matrices with up to 30 (resp. 600) rows/columns. All in all, our framework, which we name Mixed-Projection Conic Optimization, solves low-rank problems to certifiable optimality in a tractable and unified fashion.
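
    In generic notation (an illustration consistent with the abstract, not quoted from the paper), the projection-matrix device encodes a rank constraint as

        \mathrm{rank}(X) \le k \;\Longleftrightarrow\; \exists\, Y:\; Y = Y^\top,\; Y^2 = Y,\; \mathrm{tr}(Y) \le k,\; YX = X,

    mirroring how binary variables z_j with z_j^2 = z_j, z_j x_j = x_j, and \sum_j z_j \le k encode a cardinality constraint in the vector case.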

    Solving Large-Scale Sparse PCA to Certifiable (Near) Optimality

    Sparse principal component analysis (PCA) is a popular dimensionality reduction technique for obtaining principal components which are linear combinations of a small subset of the original features. Existing approaches cannot supply certifiably optimal principal components with more than p = 100s of variables. By reformulating sparse PCA as a convex mixed-integer semidefinite optimization problem, we design a cutting-plane method which solves the problem to certifiable optimality at the scale of selecting k = 5 covariates from p = 300 variables, and provides small bound gaps at a larger scale. We also propose a convex relaxation and greedy rounding scheme that provides bound gaps of 1-2% in practice within minutes for p = 100s or hours for p = 1,000s and is therefore a viable alternative to the exact method at scale. Using real-world financial and medical data sets, we illustrate our approach's ability to derive interpretable principal components tractably at scale.
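
    For intuition on the combinatorial structure (a simple greedy baseline, not the paper's cutting-plane method): once a support S of size k is fixed, the best sparse loading is the leading eigenvector of the corresponding k-by-k submatrix of the covariance, so a heuristic can grow the support one feature at a time. The data and function name below are illustrative.

        # Greedy baseline for sparse PCA (not the paper's method): grow a support of
        # size k, each time adding the feature that most increases the leading
        # eigenvalue of the covariance submatrix (the variance explained).
        import numpy as np

        def greedy_sparse_pca(Sigma, k):
            p = Sigma.shape[0]
            support = []
            for _ in range(k):
                best_j, best_val = None, -np.inf
                for j in range(p):
                    if j in support:
                        continue
                    val = np.linalg.eigvalsh(Sigma[np.ix_(support + [j], support + [j])])[-1]
                    if val > best_val:
                        best_j, best_val = j, val
                support.append(best_j)
            return sorted(support), best_val

        rng = np.random.default_rng(0)
        A = rng.standard_normal((200, 50))
        Sigma = A.T @ A / 200                             # sample covariance, p = 50
        support, variance = greedy_sparse_pca(Sigma, k=5)
        print("support:", support, "variance explained:", round(float(variance), 3))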

    A new perspective on low-rank optimization

    A key question in many low-rank problems throughout optimization, machine learning, and statistics is to characterize the convex hulls of simple low-rank sets and judiciously apply these convex hulls to obtain strong yet computationally tractable relaxations. We invoke the matrix perspective function, the matrix analog of the perspective function, to characterize explicitly the convex hull of epigraphs of simple matrix convex functions under low-rank constraints. Further, we combine the matrix perspective function with orthogonal projection matrices, the matrix analog of binary variables which capture the row-space of a matrix, to develop a matrix perspective reformulation technique that reliably obtains strong relaxations for a variety of low-rank problems, including reduced rank regression, non-negative matrix factorization, and factor analysis. Moreover, we establish that these relaxations can be modeled via semidefinite constraints and thus optimized over tractably. The proposed approach parallels and generalizes the perspective reformulation technique in mixed-integer optimization and leads to new relaxations for a broad class of problems.
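
    For the scalar case that this work generalizes (a standard illustration, not taken from the paper): the perspective of a convex function f is g(x, z) = z f(x/z), and it yields tight convex formulations of logical constraints. For instance, with z ∈ {0,1} enforcing x = 0 whenever z = 0, the epigraph of x² is captured by

        \theta \ge x^2 / z \quad\Longleftrightarrow\quad \theta z \ge x^2,\;\; \theta \ge 0,\; z \in [0,1],

    a rotated second-order cone constraint. The matrix perspective reformulation plays the same role with the scalar z replaced by an orthogonal projection matrix Y.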

    A unified approach to mixed-integer optimization with logical constraints

    We propose a unified framework to address a family of classical mixed-integer optimization problems, including network design, sparse portfolio selection, sparse principal component analysis, and sparse learning problems. These problems exhibit logical relationships between continuous and discrete variables, which are usually reformulated linearly using a big-M formulation. In this work, we challenge this longstanding modeling practice and express the logical constraints in a non-linear way. By imposing a regularization condition, we reformulate these problems as convex binary optimization problems, which are solvable using an outer-approximation procedure. In numerical experiments, we establish that a general-purpose numerical strategy, which combines cutting-plane, first-order, and local search methods, solves these problems faster and at a larger scale than state-of-the-art mixed-integer linear or second-order cone methods. Our approach successfully solves network design problems with 100s of nodes and provides solutions up to 40% better than the state of the art; sparse portfolio selection problems with up to 3,200 securities, compared with 400 securities for previous attempts; and sparse regression problems with up to 100,000 covariates.
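
    To make the outer-approximation idea concrete for one member of this family, sparse ridge regression, here is a minimal sketch (not the paper's implementation). For a fixed binary support z, the inner problem min_w 0.5*||y - X diag(z) w||^2 + 1/(2 gamma)*||w||^2 has closed-form value 0.5 * y'(I + gamma X diag(z) X')^{-1} y with subgradient entries -(gamma/2)(x_j' alpha)^2, where alpha solves the corresponding linear system. The cut-based master problem is solved below by brute-force enumeration of supports, standing in for a mixed-integer solver, and all data are synthetic.

        # Outer-approximation sketch for sparse ridge regression (illustrative only).
        import itertools
        import numpy as np

        rng = np.random.default_rng(0)
        n, p, k, gamma = 100, 10, 3, 10.0
        X = rng.standard_normal((n, p))
        y = X[:, :3] @ np.array([2.0, -3.0, 1.5]) + 0.1 * rng.standard_normal(n)

        def f_and_grad(z):
            """Inner-problem value f(z) and a subgradient with respect to z."""
            alpha = np.linalg.solve(np.eye(n) + gamma * (X * z) @ X.T, y)
            return 0.5 * y @ alpha, -0.5 * gamma * (X.T @ alpha) ** 2

        supports = [np.array([1.0 if j in S else 0.0 for j in range(p)])
                    for S in itertools.combinations(range(p), k)]
        cuts, upper, best_z, z = [], np.inf, None, supports[0]
        for _ in range(20):
            val, grad = f_and_grad(z)
            if val < upper:
                upper, best_z = val, z
            cuts.append((val, grad, z))
            # Master: minimize the piecewise-linear underestimator max_k [v_k + g_k @ (z - z_k)].
            master = [(max(v + g @ (cand - zk) for v, g, zk in cuts), cand) for cand in supports]
            lower, z = min(master, key=lambda t: t[0])
            if upper - lower < 1e-6:
                break
        print("selected features:", np.flatnonzero(best_z), "objective:", round(float(upper), 4))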