
    A Data-Driven Approach to Modeling Choice

    We revisit the following fundamental problem: for a 'generic' model of consumer choice (namely, a distribution over preference lists) and a limited amount of data on how consumers actually make decisions (such as marginal preference information), how may one predict the revenue from offering a particular assortment of choices? This problem is central to operations research, marketing, and econometrics. We present a framework for answering such questions and design a number of algorithms that are tractable from both a data and a computational standpoint.
    National Science Foundation (U.S.) (CAREER CNS 0546590)
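
    As a concrete illustration of this model class, the following minimal Python sketch shows how a distribution over preference lists induces choice probabilities and assortment revenues. The three products, their prices, and the distribution are invented for illustration, and the no-purchase option is omitted for brevity.

```python
# A minimal sketch of the "distribution over preference lists" choice model.
# Products, prices, and the toy distribution below are illustrative only.

products = [0, 1, 2]
prices = {0: 5.0, 1: 3.0, 2: 4.0}

# A sparse distribution over preference lists (most-preferred item first).
model = {
    (0, 1, 2): 0.5,
    (2, 1, 0): 0.3,
    (1, 0, 2): 0.2,
}

def choice_prob(item, assortment, model):
    """P(item chosen from assortment): item is the top-ranked offered product."""
    return sum(p for ranking, p in model.items()
               if next(x for x in ranking if x in assortment) == item)

def expected_revenue(assortment, model, prices):
    """Expected revenue of offering the given assortment under the model."""
    return sum(prices[i] * choice_prob(i, assortment, model) for i in assortment)

print(expected_revenue({0, 2}, model, prices))  # 5*0.7 + 4*0.3 = 4.7
```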

    A Topic Modeling Approach to Ranking

    We propose a topic modeling approach to the prediction of preferences in pairwise comparisons. We develop a new generative model for pairwise comparisons that accounts for multiple shared latent rankings that are prevalent in a population of users. This new model also captures inconsistent user behavior in a natural way. We show how the estimation of latent rankings in the new generative model can be formally reduced to the estimation of topics in a statistically equivalent topic modeling problem. We leverage recent advances in the topic modeling literature to develop an algorithm that can learn shared latent rankings with provable consistency as well as sample and computational complexity guarantees. We demonstrate that the new approach is empirically competitive with the current state-of-the-art approaches in predicting preferences on semi-synthetic and real-world datasets.
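
    The sketch below illustrates one plausible reading of this generative story: a user's answers to pairwise comparisons are driven by a small set of shared latent rankings, mixed according to user-specific weights, with occasional flipped answers modeling inconsistency. The rankings, weights, and noise level are invented; this is not the paper's exact parameterization.

```python
import random

# Hypothetical sketch: shared latent rankings, a user's mixture weights over
# them, and a flip probability capturing inconsistent behavior.

latent_rankings = [(0, 1, 2, 3), (3, 2, 1, 0)]   # shared rankings, best first
mixture_weights = [0.7, 0.3]                      # one user's mixed membership
noise = 0.1                                       # chance of an inconsistent answer

def compare(i, j):
    """Sample the outcome of 'does this user prefer item i over item j?'."""
    ranking = random.choices(latent_rankings, weights=mixture_weights)[0]
    prefers_i = ranking.index(i) < ranking.index(j)
    if random.random() < noise:                   # inconsistent response
        prefers_i = not prefers_i
    return prefers_i

random.seed(0)
print([compare(0, 3) for _ in range(5)])
```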

    Learning mixed membership models with a separable latent structure: theory, provably efficient algorithms, and applications

    In a wide spectrum of problems in science and engineering, including hyperspectral imaging, gene expression analysis, and machine learning tasks such as topic modeling, the observed data is high-dimensional and can be modeled as arising from a data-specific probabilistic mixture of a small collection of latent factors. Being able to successfully learn the latent factors from the observed data is important for efficient data representation, inference, and prediction. Popular approaches such as variational Bayesian and MCMC methods exhibit good empirical performance on some real-world datasets, but make heavy use of approximations and heuristics for dealing with the highly non-convex and computationally intractable optimization objectives that accompany them. As a consequence, consistency and efficiency guarantees for these algorithms are rather weak.

    This thesis develops a suite of algorithms with provable polynomial statistical and computational efficiency guarantees for learning a wide class of high-dimensional Mixed Membership Latent Variable Models (MMLVMs). Our approach is based on a natural separability property of the shared latent factors that is known to be either exactly or approximately satisfied by the estimates produced by variational Bayesian and MCMC methods. Latent factors are called separable when each factor contains a novel part that is predominantly unique to that factor. For a broad class of problems, we establish that separability is not only an algorithmically convenient structural condition, but is in fact an inevitable consequence of having a relatively small number of latent factors in a high-dimensional observation space. The key insight underlying our algorithms is the identification of the novel parts of each latent factor as extreme points of certain convex polytopes in a suitable representation space. We show that this can be done efficiently through appropriately defined random projections in the representation space. We establish statistical and computational efficiency bounds that are both polynomial in all the model parameters. Furthermore, the proposed random-projections-based algorithm turns out to be naturally amenable to a low-communication-cost distributed implementation, which is attractive for modern web-scale distributed data mining applications.

    We explore in detail two distinct classes of MMLVMs in this thesis: learning topic models for text documents based on their empirical word frequencies, and learning mixed membership ranking models based on pairwise comparison data. For each problem, we demonstrate that separability is inevitable when the data dimension scales up, and then establish consistency and efficiency guarantees for identifying all novel parts and estimating the latent factors. As a by-product of this analysis, we obtain the first asymptotic consistency and polynomial sample and computational complexity results for learning permutation-mixture and Mallows-mixture models for rankings based on pairwise comparison data. We demonstrate empirically that the performance of our approach is competitive with the current state-of-the-art on a number of real-world datasets.
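
    The geometric core of this approach can be illustrated with a toy example: an extreme point of a polytope is exactly a point that maximizes some linear functional, so repeatedly projecting the data onto random directions and keeping the argmax recovers the extreme points with high probability. The sketch below is illustrative only and omits the normalization and noise handling of the actual algorithms.

```python
import numpy as np

# Toy detector for extreme points via 1-D random projections. Under
# separability, rows corresponding to "novel" words/items are the extreme
# points of a convex polytope in a suitable representation space.

rng = np.random.default_rng(0)

def candidate_extreme_points(X, num_projections=200):
    """Return row indices that maximize some random linear functional."""
    n, d = X.shape
    hits = set()
    for _ in range(num_projections):
        direction = rng.standard_normal(d)       # random projection direction
        hits.add(int(np.argmax(X @ direction)))  # an extreme point along it
    return sorted(hits)

# Toy data: three true extreme points plus convex combinations of them.
V = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
W = rng.dirichlet(np.ones(3), size=50)           # convex combination weights
X = np.vstack([V, W @ V])
print(candidate_extreme_points(X))               # should recover rows 0, 1, 2
```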

    Nonparametric choice modeling: applications to operations management

    Thesis (Ph.D.), Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2011. Cataloged from the PDF version of thesis. Includes bibliographical references (p. 257-263).

    With the recent explosion of choices available to us in every walk of life, capturing the choice behavior exhibited by individuals has become increasingly important to many businesses. At its core, capturing choice behavior boils down to being able to predict the probability of choosing a particular alternative from an offer set, given historical choice data about an individual or a group of "similar" individuals. Such predictions use what is called a choice model, which models each choice occasion as follows: given an offer set, a preference list over alternatives is sampled according to a certain distribution, and the individual chooses the most preferred alternative according to the sampled preference list. Most existing literature, which dates back to at least the 1920s, considers parametric approaches to choice modeling. The goal of this thesis is to deviate from the existing approaches and propose a nonparametric approach to modeling choice. Apart from the usual advantages, the primary strength of a nonparametric model is its ability to scale with the data, which is crucial to the applications of interest here, where choice behavior is highly dynamic. Given this, the main contribution of the thesis is to operationalize the nonparametric approach and demonstrate its success in several important applications.

    Specifically, we consider two broad setups: (1) solving decision problems using choice models, and (2) learning the choice models. In both setups, the available data corresponds to marginal information about the underlying distribution over rankings, so the problems essentially boil down to designing the 'right' criterion to pick a model from one of the (several) distributions that are consistent with the available marginal information. First, we consider a central decision problem in operations management (OM): find an assortment of products that maximizes revenue subject to a capacity constraint on the size of the assortment. Solving this problem requires two components: (a) predicting revenues for assortments and (b) searching over all subsets of a certain size for the optimal assortment. To predict the revenue of an assortment, of all models consistent with the data, we use the choice model that results in the 'worst-case' revenue. We derive theoretical guarantees for the predictions and show that their accuracy is good when the choice data comes from several different parametric models. By applying our approach to real-world sales transaction data from a major US automaker, we demonstrate an improvement in accuracy of around 20% over state-of-the-art parametric approaches. A toy formulation of this worst-case prediction appears in the sketch below.
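
    The following sketch casts worst-case revenue prediction as a linear program over distributions on rankings. It enumerates all rankings explicitly, which is feasible only for a tiny product universe, and uses invented first-choice marginal data; the thesis works with more general marginal information and avoids explicit enumeration.

```python
import numpy as np
from itertools import permutations
from scipy.optimize import linprog

# Toy worst-case revenue prediction: over all distributions on rankings
# consistent with observed marginals, minimize the revenue of an assortment.
# All data below is invented; the no-purchase option is omitted.

products = [0, 1, 2]
prices = np.array([5.0, 3.0, 4.0])
rankings = list(permutations(products))

# Marginal data: observed probability that each product is chosen first.
first_choice_marginals = np.array([0.5, 0.2, 0.3])

# Maps a distribution lambda over rankings to its first-choice marginals.
A_marg = np.array([[1.0 if r[0] == i else 0.0 for r in rankings]
                   for i in products])

def worst_case_revenue(assortment):
    # Revenue under each ranking: price of its top-ranked offered product.
    rev = np.array([prices[next(x for x in r if x in assortment)]
                    for r in rankings])
    eq_A = np.vstack([A_marg, np.ones(len(rankings))])  # marginals + sum-to-1
    eq_b = np.append(first_choice_marginals, 1.0)
    res = linprog(rev, A_eq=eq_A, b_eq=eq_b, bounds=(0, 1))  # minimizes rev
    return res.fun

print(worst_case_revenue({0, 2}))  # 4.5 for this toy data
```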
    Once we have revenue predictions, we consider the problem of finding the optimal assortment. This problem has been shown to be provably hard for most of the important families of parametric choice models, except the multinomial logit (MNL) model. In addition, most of the approximation schemes proposed in the literature are tailored to a specific parametric structure. We deviate from this and propose a general algorithm to find the optimal assortment assuming access only to a subroutine that gives revenue predictions; this means that the algorithm can be applied with any choice model. We prove that when the underlying choice model is the MNL model, our algorithm finds the optimal assortment efficiently.

    Next, we consider the problem of learning the underlying distribution from the given marginal information. Of all the models consistent with the data, we propose to select the sparsest or simplest model, where sparsity is measured as the support size of the distribution. Finding the sparsest distribution is hard in general, so we restrict the search to what we call the 'signature family' to obtain an algorithm that is computationally efficient compared to the brute-force approach. We show that the price one pays for restricting the search to the signature family is minimal by establishing that, for a large class of models, there exists a "sparse enough" model in the signature family that fits the given marginal information well. We demonstrate the efficacy of learning sparse models on the well-known American Psychological Association (APA) dataset by showing that our sparse approximation captures useful structural properties of the underlying model. Finally, our results suggest that the signature condition can be considered an alternative to the recently popularized Restricted Null Space condition for efficient recovery of sparse models. The brute-force version of the sparsest-fit criterion is sketched below.

    by Srikanth Jagabathula. Ph.D.
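
    For contrast with the signature-family search, here is the brute-force baseline it improves on: enumerate supports of increasing size and accept the first nonnegative weighting that reproduces the marginals. The data matrix and marginals are reused from the worst-case-revenue sketch above and are invented.

```python
import numpy as np
from itertools import permutations, combinations
from scipy.optimize import nnls

# Brute-force sparsest fit: try supports of increasing size and return the
# first distribution over rankings that exactly matches the marginals.
# Exponential in general; the thesis's signature-family search avoids this.

rankings = list(permutations([0, 1, 2]))
A = np.array([[1.0 if r[0] == i else 0.0 for r in rankings] for i in [0, 1, 2]])
y = np.array([0.5, 0.2, 0.3])   # invented first-choice marginals

def sparsest_fit(A, y, tol=1e-9):
    n = A.shape[1]
    for k in range(1, n + 1):                        # support size, smallest first
        for support in combinations(range(n), k):
            weights, residual = nnls(A[:, list(support)], y)
            if residual < tol and abs(weights.sum() - 1.0) < 1e-6:
                return dict(zip(support, weights))   # ranking index -> probability
    return None

print(sparsest_fit(A, y))   # a 3-ranking support reproducing the marginals
```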