Modeling toothpaste brand choice: An empirical comparison of artificial neural networks and multinomial probit model
Copyright © 2010 Atlantis Press. The purpose of this study is to compare the performance of Artificial Neural Networks (ANN) and Multinomial Probit (MNP) approaches in modeling choice decisions in the fast-moving consumer goods sector. To do this, choice models are built on 2597 toothpaste purchases from a panel sample of 404 households, and their performance is compared on the 861 purchases of a test sample of 135 households. Results show that the ANN's predictions are better, while MNP is useful in providing marketing insight.
A Neural-embedded Choice Model: TasteNet-MNL Modeling Taste Heterogeneity with Flexibility and Interpretability
Discrete choice models (DCMs) and neural networks (NNs) can complement each
other. We propose a neural network embedded choice model - TasteNet-MNL, to
improve the flexibility in modeling taste heterogeneity while keeping model
interpretability. The hybrid model consists of a TasteNet module: a
feed-forward neural network that learns taste parameters as flexible functions
of individual characteristics; and a choice module: a multinomial logit model
(MNL) with manually specified utility. TasteNet and MNL are fully integrated
and jointly estimated. By embedding a neural network into a DCM, we exploit a
neural network's function approximation capacity to reduce specification bias.
Through special structure and parameter constraints, we incorporate expert
knowledge to regularize the neural network and maintain interpretability. On
synthetic data, we show that TasteNet-MNL can recover the underlying non-linear
utility function, and provide predictions and interpretations as accurate as
the true model; while examples of logit or random coefficient logit models with
misspecified utility functions result in large parameter bias and low
predictability. In the case study of Swissmetro mode choice, TasteNet-MNL
outperforms benchmarking MNLs' predictability; and discovers a wider spectrum
of taste variations within the population, and higher values of time on
average. This study takes an initial step towards developing a framework to
combine theory-based and data-driven approaches for discrete choice modeling.
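The structure described in this abstract (a feed-forward network that maps individual characteristics to taste parameters, feeding a multinomial logit with a manually specified utility) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the layer sizes, the weights, the names `taste_net` and `choice_probabilities`, and the use of a negative softplus as one possible way to impose a sign constraint on the tastes.

```python
import math

def softmax(utilities):
    """Multinomial logit choice probabilities from systematic utilities."""
    m = max(utilities)  # subtract the max for numerical stability
    exps = [math.exp(u - m) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]

def taste_net(z, W1, b1, W2, b2):
    """Hypothetical TasteNet: individual characteristics z -> taste parameters."""
    # One hidden layer with ReLU activation.
    hidden = [max(0.0, sum(w * zi for w, zi in zip(row, z)) + b)
              for row, b in zip(W1, b1)]
    raw = [sum(w * h for w, h in zip(row, hidden)) + b
           for row, b in zip(W2, b2)]
    # Assumed constraint: tastes for time and cost must be negative,
    # enforced here with a negative softplus transform.
    return [-math.log1p(math.exp(r)) for r in raw]

def choice_probabilities(z, alternatives, params):
    """MNL choice module with utility V_j = sum_k beta_k(z) * x_jk."""
    betas = taste_net(z, *params)
    utilities = [sum(b * x for b, x in zip(betas, attrs))
                 for attrs in alternatives]
    return softmax(utilities)

# Illustrative (made-up) weights: 2 characteristics -> 2 hidden units -> 2 tastes.
params = (
    [[0.5, -0.2], [0.1, 0.3]],  # W1
    [0.0, 0.1],                 # b1
    [[0.4, 0.2], [-0.3, 0.6]],  # W2
    [0.0, 0.0],                 # b2
)
z = [1.0, 0.5]                               # individual characteristics
alternatives = [[0.30, 0.5], [0.45, 0.8]]    # (travel time, cost), rescaled
betas = taste_net(z, *params)
probs = choice_probabilities(z, alternatives, params)
```

Because the tastes are functions of `z`, two individuals facing the same alternatives can receive different coefficients, which is how taste heterogeneity enters while the utility itself stays linear in the attributes and therefore readable.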
Enhancing Discrete Choice Models with Representation Learning
In discrete choice modeling (DCM), model misspecifications may lead to
limited predictability and biased parameter estimates. In this paper, we
propose a new approach for estimating choice models in which we divide the
systematic part of the utility specification into (i) a knowledge-driven part,
and (ii) a data-driven one, which learns a new representation from available
explanatory variables. Our formulation increases the predictive power of
standard DCM without sacrificing their interpretability. We show the
effectiveness of our formulation by augmenting the utility specification of the
Multinomial Logit (MNL) and the Nested Logit (NL) models with a new non-linear
representation arising from a Neural Network (NN), leading to new choice models
referred to as the Learning Multinomial Logit (L-MNL) and Learning Nested Logit
(L-NL) models. Using multiple publicly available datasets based on revealed and
stated preferences, we show that our models outperform the traditional ones,
both in terms of predictive performance and accuracy in parameter estimation.
All source code for the models is shared to promote open science.
Comment: 35 pages, 12 tables, 6 figures, +11 p. Appendi
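The split of the systematic utility into a knowledge-driven part and a data-driven part can be written out as a short worked formula. The notation is an assumption chosen for illustration: $x_{ink}$ are the attributes the analyst specifies by hand, $\beta_k$ their interpretable coefficients, $q_n$ the remaining explanatory variables, and $r_i(\cdot\,; w)$ the representation term produced by a neural network with weights $w$.

```latex
V_{in} \;=\; \underbrace{\sum_{k} \beta_k \, x_{ink}}_{\text{knowledge-driven}}
       \;+\; \underbrace{r_i(q_n; w)}_{\text{data-driven}},
\qquad
P_n(i) \;=\; \frac{\exp(V_{in})}{\sum_{j} \exp(V_{jn})}
```

Under this reading, the MNL probability is unchanged in form; only the utility gains an extra learned term. A nested logit variant would presumably feed the same $V_{in}$ into the usual nested logit probability expression.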
An empirical comparison of the validity of a neural net based multinomial logit choice model to alternative model specifications
Applications of choice models to brand purchase data as a rule specify a linear deterministic utility function. We estimate deterministic utility by means of a neural net able to approximate any continuous multivariate function and its derivatives to a desired level of precision. We compare this model to related alternatives with both linear and nonlinear utility functions. Alternatives with nonlinear utility functions are based on generalized additive modeling and Taylor series expansion, respectively. We analyze purchase data of the six largest brands in terms of market share for two product groups. Neural choice models outperform the alternative models studied with respect to posterior probabilities. They also attain the best cross-validated log-likelihood values. These results demonstrate that the increase in complexity caused by the neural choice model is justified by higher validity. In the empirical study, the neural choice models imply elasticities different from those obtained by linear utility multinomial logit models for several predictors. Neural choice models discover inversely S-shaped, saturation and interaction effects on utility.
An integrated framework for exploring finite mixture heterogeneity in travel demand and behavior
In recent years we have faced a plethora of social trends and new technologies such as shared mobility, micro-mobility, and information and communication technologies, and we will be facing many more in the future (e.g. self-driving cars, disruptive events). In this context, the perennial mission of transportation behavior analysts and modelers - to model behavior/demand so as to understand behavior, help craft responsive policies, and accurately forecast future demand - has become far more challenging.
Specifically, behavioral realism and predictive ability are two key goals of modeling (travel) behavior/demand, and a key strategy for achieving those goals has been to introduce some type of heterogeneity in modeling. Thus, this thesis aims to improve our behavioral modeling by accounting for heterogeneity, with clues from the ideas of data/market segmentation, finite mixture, and mixture modeling. The objectives of the thesis are: (1) to build a framework for modeling finite mixture heterogeneity that connects seemingly less related models and various methodological ideas across domains, (2) to tackle various heterogeneity-related research questions in travel behavior and thus show the empirical usefulness of the models under the framework; and (3) to examine the potential, challenges, and implications of the framework with conceptual considerations and practical applications.
Five inter-related studies in this thesis illuminate some part(s) of the framework and delineate how key concepts in the framework are connected to each other. (a) The thesis overviews the topics of heterogeneity and mixture modeling in transportation and provides the landscape and details of how we have used mixture modeling. (b) Extending the idea of a finite segmentation approach, the thesis connects and compares three models for treating finite-valued parameter heterogeneity: deterministic segmentation, endogenous switching, and latent class models. The study discusses their similarities and differences from conceptual and empirical standpoints. (c) The thesis explains the confirmatory latent class approach and its potential usefulness, as opposed to the conventional exploratory approach. Adopting this perspective, the study embraces zero-inflated models under the confirmatory latent class approach and demonstrates their empirical value. (d) The thesis introduces the idea of combining latent class and endogenous switching models. Conceptual and empirical differences between the standard latent class model and the proposed approach are discussed. (e) The dissertation illuminates the linkage between finite mixture modeling (specifically in “indirect application”) and the mixture of experts (MoE) architecture, introduced in machine learning. The study proposes to use MoE as a data-driven exploratory tool to capture nonlinear/interaction effects (which are types of parameter heterogeneity), and exhibits its ability using synthetic and empirical data. The thesis concludes with discussions about challenges, potential technical advances, and outlook for the framework.
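The link drawn in item (e) between finite mixtures and the mixture of experts (MoE) architecture can be made concrete with a small sketch. It assumes a binary outcome, two binary logit experts, and a softmax gate; the function name `mixture_of_experts` and all weights are hypothetical and are not taken from the thesis.

```python
import math

def softmax(scores):
    """Normalize gate scores into mixing weights that sum to one."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def mixture_of_experts(x, gate_weights, expert_weights):
    """P(y = 1 | x) = sum_k g_k(x) * P_k(y = 1 | x).

    Unlike a standard latent class model with fixed class shares,
    the mixing weights g_k(x) here depend on the input x.
    """
    gate = softmax([dot(w, x) for w in gate_weights])
    expert_probs = [1.0 / (1.0 + math.exp(-dot(w, x))) for w in expert_weights]
    prob = sum(g * p for g, p in zip(gate, expert_probs))
    return prob, gate, expert_probs

# Made-up weights; x = (intercept, one explanatory variable).
gate_weights = [[0.0, 1.0], [0.0, -1.0]]
expert_weights = [[0.0, 0.5], [0.0, -0.5]]

p_pos, gate_pos, experts_pos = mixture_of_experts([1.0, 2.0], gate_weights, expert_weights)
p_neg, gate_neg, experts_neg = mixture_of_experts([1.0, -2.0], gate_weights, expert_weights)
```

With these weights, each expert is linear in the explanatory variable, yet the mixture gives the same probability at +2 and -2, because the gate hands control to the expert whose slope matches the sign of the input. That is one simple way an MoE can express a nonlinear effect as input-dependent parameter heterogeneity.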
The dissertation is expected to give conceptual/methodological insights on the framework for modeling finite mixture heterogeneity and how various methodologies are connected under the framework. As well, the studies provide rich discussions about study-specific empirical findings and their implications. Thus, the dissertation can help improve our behavior/demand models by serving as a navigational compass for analysts.
Ph.D.