7 research outputs found

    Modeling toothpaste brand choice: An empirical comparison of artificial neural networks and multinomial probit model

    Copyright © 2010 Atlantis Press. The purpose of this study is to compare the performance of Artificial Neural Network (ANN) and Multinomial Probit (MNP) approaches in modeling choice decisions within the fast-moving consumer goods sector. To do this, choice models are built on 2597 toothpaste purchases by a panel sample of 404 households, and their performance is compared on the 861 purchases of a test sample of 135 households. Results show that ANN's predictions are better, while MNP is useful in providing marketing insight.

    A Neural-embedded Choice Model: TasteNet-MNL Modeling Taste Heterogeneity with Flexibility and Interpretability

    Discrete choice models (DCMs) and neural networks (NNs) can complement each other. We propose a neural network embedded choice model, TasteNet-MNL, to improve the flexibility in modeling taste heterogeneity while keeping model interpretability. The hybrid model consists of a TasteNet module, a feed-forward neural network that learns taste parameters as flexible functions of individual characteristics, and a choice module, a multinomial logit model (MNL) with manually specified utility. TasteNet and MNL are fully integrated and jointly estimated. By embedding a neural network into a DCM, we exploit a neural network's function approximation capacity to reduce specification bias. Through special structure and parameter constraints, we incorporate expert knowledge to regularize the neural network and maintain interpretability. On synthetic data, we show that TasteNet-MNL can recover the underlying non-linear utility function and provide predictions and interpretations as accurate as the true model, while examples of logit or random coefficient logit models with misspecified utility functions result in large parameter bias and low predictability. In the case study of Swissmetro mode choice, TasteNet-MNL outperforms benchmark MNLs in predictability, and discovers a wider spectrum of taste variations within the population and higher values of time on average. This study takes an initial step towards developing a framework to combine theory-based and data-driven approaches for discrete choice modeling.
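
    The two-module structure described in this abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the layer sizes, the tanh activation, the negative-softplus sign constraint, and all names (`taste_net`, `tastenet_mnl_probs`) are assumptions chosen only to show how a network's outputs can serve as individual-specific MNL coefficients.

```python
import numpy as np

def softmax(u):
    """Numerically stable softmax over the last axis."""
    u = u - u.max(axis=-1, keepdims=True)
    e = np.exp(u)
    return e / e.sum(axis=-1, keepdims=True)

def taste_net(z, W1, b1, W2, b2):
    """Hypothetical taste module: maps characteristics z (n, D) to tastes (n, K).

    The negative softplus forces every taste to be negative, standing in for
    the kind of sign constraint an analyst might impose on time or cost.
    """
    hidden = np.tanh(z @ W1 + b1)
    raw = hidden @ W2 + b2
    return -np.log1p(np.exp(raw))

def tastenet_mnl_probs(z, x, weights):
    """Choice module: MNL with utility linear in attributes x (n, J, K)."""
    betas = taste_net(z, *weights)                 # one taste vector per person
    utilities = np.einsum("njk,nk->nj", x, betas)  # U[n, j] = sum_k beta[n, k] * x[n, j, k]
    return softmax(utilities)

# Toy data: 5 individuals, 3 alternatives, 2 attributes, 4 characteristics.
rng = np.random.default_rng(0)
n, J, K, D, H = 5, 3, 2, 4, 8
z = rng.normal(size=(n, D))
x = rng.uniform(0.5, 2.0, size=(n, J, K))
weights = (rng.normal(size=(D, H)), np.zeros(H),
           rng.normal(size=(H, K)), np.zeros(K))
probs = tastenet_mnl_probs(z, x, weights)
```

    Because the utility stays linear in the attributes, each row of `taste_net(z, *weights)` can still be read as an ordinary set of MNL coefficients for that individual, which is the interpretability argument the abstract makes.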

    Enhancing Discrete Choice Models with Representation Learning

    In discrete choice modeling (DCM), model misspecifications may lead to limited predictability and biased parameter estimates. In this paper, we propose a new approach for estimating choice models in which we divide the systematic part of the utility specification into (i) a knowledge-driven part, and (ii) a data-driven one, which learns a new representation from available explanatory variables. Our formulation increases the predictive power of standard DCMs without sacrificing their interpretability. We show the effectiveness of our formulation by augmenting the utility specification of the Multinomial Logit (MNL) and the Nested Logit (NL) models with a new non-linear representation arising from a Neural Network (NN), leading to new choice models referred to as the Learning Multinomial Logit (L-MNL) and Learning Nested Logit (L-NL) models. Using multiple publicly available datasets based on revealed and stated preferences, we show that our models outperform the traditional ones, both in terms of predictive performance and accuracy in parameter estimation. All source code of the models is shared to promote open science. Comment: 35 pages, 12 tables, 6 figures, +11 p. Appendi
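
    The utility split described in this abstract can be sketched as a linear, analyst-specified term plus a learned term. The code below is an illustration under assumptions, not the shared source code: the single ReLU layer, the choice to feed the network a separate feature set `q`, and the name `lmnl_probs` are all hypothetical.

```python
import numpy as np

def softmax(u):
    """Numerically stable softmax over the last axis."""
    u = u - u.max(axis=-1, keepdims=True)
    e = np.exp(u)
    return e / e.sum(axis=-1, keepdims=True)

def lmnl_probs(x, q, beta, W1, b1, W2, b2):
    """Choice probabilities with utility = knowledge-driven + data-driven part.

    x: (n, J, K) attributes entering the interpretable linear utility.
    q: (n, D) remaining explanatory variables given to the network (assumed).
    """
    knowledge_part = x @ beta                   # (n, J), coefficients beta stay readable
    hidden = np.maximum(q @ W1 + b1, 0.0)       # one ReLU layer, an assumed architecture
    learned_part = hidden @ W2 + b2             # (n, J), one learned term per alternative
    return softmax(knowledge_part + learned_part)

# Toy data with hypothetical coefficients.
rng = np.random.default_rng(1)
n, J, K, D, H = 4, 3, 2, 5, 6
x = rng.normal(size=(n, J, K))
q = rng.normal(size=(n, D))
beta = np.array([-1.0, 0.5])
W1, b1 = rng.normal(size=(D, H)), np.zeros(H)
W2, b2 = rng.normal(size=(H, J)), np.zeros(J)

p_lmnl = lmnl_probs(x, q, beta, W1, b1, W2, b2)
# Zeroing the output layer switches the learned term off and leaves a plain MNL.
p_mnl = lmnl_probs(x, q, beta, W1, b1, np.zeros((H, J)), b2)
```

    Setting the learned term to zero recovers the standard MNL, which makes the sketch a nested extension of the traditional model rather than a replacement for it.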

    An empirical comparison of the validity of a neural net based multinomial logit choice model to alternative model specifications

    Applications of choice models to brand purchase data as a rule specify a linear deterministic utility function. We estimate deterministic utility by means of a neural net able to approximate any continuous multivariate function and its derivatives to a desired level of precision. We compare this model to related alternatives with both linear and nonlinear utility functions. Alternatives with nonlinear utility functions are based on generalized additive modeling and Taylor series expansion, respectively. We analyze purchase data of the six largest brands in terms of market share for two product groups. Neural choice models outperform the alternative models studied with respect to posterior probabilities. They also attain the best cross-validated log-likelihood values. These results demonstrate that the increase in complexity caused by the neural choice model is justified by higher validity. In the empirical study, the neural choice models imply elasticities different from those obtained by linear utility multinomial logit models for several predictors. Neural choice models discover inversely S-shaped, saturation and interaction effects on utility.
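
    The elasticity comparison mentioned in this abstract can be illustrated with a small sketch. For a linear-utility MNL, the own elasticity of the choice probability P_j with respect to attribute x_jk has the well-known closed form beta_k * x_jk * (1 - P_j); for a nonlinear utility, one generic option is a finite-difference approximation. The code is an assumption-laden illustration, not the paper's procedure: the coefficients, the data, and the function names are hypothetical.

```python
import numpy as np

def mnl_probs(utilities):
    """Logit choice probabilities for one choice occasion."""
    e = np.exp(utilities - utilities.max())
    return e / e.sum()

def own_elasticity(utility_fn, x, j, k, eps=1e-6):
    """Central finite-difference elasticity of P_j with respect to x[j, k].

    utility_fn maps an attribute matrix (J, K) to a utility vector (J,),
    so it can be linear or any nonlinear function such as a neural net.
    """
    up, down = x.copy(), x.copy()
    up[j, k] += eps
    down[j, k] -= eps
    slope = (mnl_probs(utility_fn(up))[j] - mnl_probs(utility_fn(down))[j]) / (2 * eps)
    return slope * x[j, k] / mnl_probs(utility_fn(x))[j]

# Hypothetical coefficients and attributes for three brands.
beta = np.array([-0.8, 0.5])
x = np.array([[2.0, 1.0],
              [2.5, 0.0],
              [1.8, 1.0]])

def linear_utility(attrs):
    return attrs @ beta

numeric = own_elasticity(linear_utility, x, j=0, k=0)
analytic = beta[0] * x[0, 0] * (1.0 - mnl_probs(linear_utility(x))[0])
```

    In the linear case `numeric` and `analytic` agree up to discretization error; passing a nonlinear `utility_fn` to the same routine gives elasticities that can vary with the level of the attribute.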

    An integrated framework for exploring finite mixture heterogeneity in travel demand and behavior

    In recent years we have faced a plethora of social trends and new technologies such as shared mobility, micro-mobility, and information and communication technologies, and we will be facing many more in the future (e.g. self-driving cars, disruptive events). In this context, the perennial mission of transportation behavior analysts and modelers - to model behavior/demand so as to understand behavior, help craft responsive policies, and accurately forecast future demand - has become far more challenging. Specifically, behavioral realism and predictive ability are two key goals of modeling (travel) behavior/demand, and a key strategy for achieving those goals has been to introduce some type of heterogeneity in modeling. Thus, this thesis aims to improve our behavioral modeling by accounting for heterogeneity, with clues from the ideas of data/market segmentation, finite mixture, and mixture modeling. The objectives of the thesis are: (1) to build a framework for modeling finite mixture heterogeneity that connects seemingly less related models and various methodological ideas across domains; (2) to tackle various heterogeneity-related research questions in travel behavior and thus show the empirical usefulness of the models under the framework; and (3) to examine the potential, challenges, and implications of the framework with conceptual considerations and practical applications. Five inter-related studies in this thesis illuminate some part(s) of the framework and delineate how key concepts in the framework are connected to each other. (a) The thesis overviews the topics of heterogeneity and mixture modeling in transportation and provides the landscape and details of how we have used mixture modeling. (b) Extending the idea of a finite segmentation approach, the thesis connects and compares three models for treating finite-valued parameter heterogeneity: deterministic segmentation, endogenous switching, and latent class models. The study discusses their similarities and differences from conceptual and empirical standpoints. (c) The thesis explains the confirmatory latent class approach and its potential usefulness, as opposed to the conventional exploratory approach. Adopting this perspective, the study embraces zero-inflated models under the confirmatory latent class approach and demonstrates their empirical value. (d) The thesis introduces the idea of combining latent class and endogenous switching models. Conceptual and empirical differences between the standard latent class model and the proposed approach are discussed. (e) The dissertation illuminates the linkage between finite mixture modeling (specifically in “indirect application”) and the mixture of experts (MoE) architecture, introduced in machine learning. The study proposes to use MoE as a data-driven exploratory tool to capture nonlinear/interaction effects (which are types of parameter heterogeneity), and exhibits its ability using synthetic and empirical data. The thesis concludes with discussions about challenges, potential technical advances, and outlook for the framework. The dissertation is expected to give conceptual/methodological insights on the framework for modeling finite mixture heterogeneity and how various methodologies are connected under the framework. As well, the studies provide rich discussions about study-specific empirical findings and their implications. Thus, the dissertation can help improve our behavior/demand models by serving as a navigational compass for analysts. Ph.D.
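
    The link between latent class models and the mixture of experts architecture mentioned in item (e) can be sketched as follows: a softmax gate plays the role of the class membership model, and each expert is a class-specific choice model. This is a generic illustration under assumptions, not the thesis's specification: the MNL experts, the linear gate, and all names and dimensions are hypothetical.

```python
import numpy as np

def softmax(u):
    """Numerically stable softmax over the last axis."""
    u = u - u.max(axis=-1, keepdims=True)
    e = np.exp(u)
    return e / e.sum(axis=-1, keepdims=True)

def latent_class_probs(z, x, gate_W, class_betas):
    """Marginal choice probabilities of a finite mixture of MNL models.

    z: (n, D) individual characteristics driving class membership (the gate).
    x: (n, J, K) alternative attributes.
    class_betas: (S, K), one coefficient vector per latent class (the experts).
    """
    membership = softmax(z @ gate_W)                       # (n, S) class shares per person
    utilities = np.einsum("njk,sk->nsj", x, class_betas)   # (n, S, J) class-specific utility
    within_class = softmax(utilities)                      # P(choice j | class s)
    return np.einsum("ns,nsj->nj", membership, within_class)

# Toy data: 4 individuals, 3 alternatives, 2 attributes, 2 classes.
rng = np.random.default_rng(2)
n, J, K, D, S = 4, 3, 2, 3, 2
z = rng.normal(size=(n, D))
x = rng.normal(size=(n, J, K))
gate_W = rng.normal(size=(D, S))
class_betas = np.array([[-1.0, 0.2],
                        [-0.3, 1.5]])

p_mix = latent_class_probs(z, x, gate_W, class_betas)
# With identical classes the mixture collapses to a single MNL.
same = np.tile(class_betas[0], (S, 1))
p_single = latent_class_probs(z, x, gate_W, same)
```

    When every class shares the same coefficients the gate has no effect, so any gain from the mixture comes from classes whose parameters actually differ, which is the heterogeneity the framework targets.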