
    Dynamic Conversion Behavior at E-Commerce Sites

    This paper develops a model of conversion behavior (i.e., converting store visits into purchases) that predicts each customer's probability of purchasing based on an observed history of visits and purchases. We offer an individual-level probability model that allows for different forms of customer heterogeneity in a very flexible manner. Specifically, we decompose an individual's conversion behavior into two components: one for accumulating visit effects and another for purchasing threshold effects. Each component is allowed to vary across households as well as over time. Visit effects capture the notion that store visits can play different roles in the purchasing process. For example, some visits are motivated by planned purchases, while others are associated with hedonic browsing (akin to window shopping); our model is able to accommodate these (and several other) types of visit-purchase relationships in a logical, parsimonious manner. The purchasing threshold captures the psychological resistance to online purchasing that may grow or shrink as a customer gains more experience with the purchasing process at a given website. We test different versions of the model that vary in the complexity of these two key components and also compare our general framework with popular alternatives such as logistic regression. We find that the proposed model offers excellent statistical properties, including its performance in a holdout validation sample, and also provides useful managerial diagnostics about the patterns underlying online buyer behavior.
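    The two-component decomposition can be sketched in a few lines. This is a minimal illustration, not the paper's specification: the functional form (a logistic link comparing accumulated visit effects against a threshold) and all parameter values are assumptions for exposition, and the actual model lets both components vary across households and over time.

    ```python
    import math

    def conversion_prob(visit_effects, threshold):
        """Probability that the current visit converts into a purchase:
        accumulated visit effects are compared against a purchasing
        threshold through a logistic link. (Hypothetical functional form,
        for illustration only.)"""
        accumulated = sum(visit_effects)
        return 1.0 / (1.0 + math.exp(-(accumulated - threshold)))

    # Hedonic browsing: visits that carry little purchase intent...
    p_browse = conversion_prob([0.1, 0.1, 0.1], threshold=2.0)
    # ...versus a planned purchase, where visits build quickly toward buying.
    p_planned = conversion_prob([1.5, 1.5], threshold=2.0)
    assert p_planned > p_browse
    ```

    The same machinery accommodates a shrinking threshold as the customer gains purchasing experience at the site: simply lower `threshold` with each completed purchase.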

    Aggregation Bias in Sponsored Search Data: The Curse and the Cure

    Recently there has been significant interest in studying consumer behavior in sponsored search advertising (SSA). Researchers have typically used daily data from search engines containing measures such as average bid, average ad position, total impressions, clicks, and cost for each keyword in the advertiser’s campaign. A variety of random utility models have been estimated using such data and the results have helped researchers explore the factors that drive consumer click and conversion propensities. However, virtually every analysis of this kind has ignored the intraday variation in ad position. We show that estimating random utility models on aggregated (daily) data without accounting for this variation will lead to systematically biased estimates. Specifically, the impact of ad position on click-through rate (CTR) is attenuated and the predicted CTR is higher than the actual CTR. We analytically demonstrate the existence of the bias and show the effect of the bias on the equilibrium of the SSA auction. Using a large data set from a major search engine, we measure the magnitude of bias and quantify the losses suffered by the search engine and an advertiser using aggregate data. The search engine revenue loss can be as high as 11% due to aggregation bias. We also present a few data summarization techniques that can be used by search engines to reduce or eliminate the bias.
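    The core of the aggregation problem is that CTR is a nonlinear function of position, so evaluating it at the *average* daily position is not the same as averaging CTR over the intraday positions. The toy below only illustrates that such a gap exists (via Jensen's inequality for a convex curve); the curve, position distribution, and the direction and size of the gap are assumptions, not the paper's estimates.

    ```python
    import random

    def ctr(position):
        # Hypothetical click-through-rate curve: CTR decays with ad position.
        return 0.30 / position

    random.seed(1)
    # Intraday ad positions vary; a daily feed reports only their average.
    positions = [random.choice([1, 2, 3, 4]) for _ in range(1000)]

    true_ctr = sum(ctr(p) for p in positions) / len(positions)  # impression-level truth
    avg_pos = sum(positions) / len(positions)
    aggregated_ctr = ctr(avg_pos)                               # what daily averages imply

    # Because this toy ctr() is convex in position, Jensen's inequality
    # guarantees the two quantities differ -- the wedge that biases models
    # estimated on daily aggregates.
    assert aggregated_ctr < true_ctr
    ```

    The summarization techniques the paper proposes amount to reporting enough of the intraday position distribution (rather than a single daily mean) that this wedge can be corrected.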

    New Perspectives on Customer “Death” Using a Generalization of the Pareto/NBD Model

    Several researchers have proposed models of buyer behavior in noncontractual settings that assume that customers are “alive” for some period of time and then become permanently inactive. The best-known such model is the Pareto/NBD, which assumes that customer attrition (dropout or “death”) can occur at any point in calendar time. A recent alternative model, the BG/NBD, assumes that customer attrition follows a Bernoulli “coin-flipping” process that occurs in “transaction time” (i.e., after every purchase occasion). Although the modification results in a model that is much easier to implement, it means that heavy buyers have more opportunities to “die.” In this paper, we develop a model with a discrete-time dropout process tied to calendar time. Specifically, we assume that every customer periodically “flips a coin” to determine whether she “drops out” or continues as a customer. For the component of purchasing while alive, we maintain the assumptions of the Pareto/NBD and BG/NBD models. This periodic death opportunity (PDO) model allows us to take a closer look at how assumptions about customer death influence model fit and various metrics typically used by managers to characterize a cohort of customers. When the time period after which each customer makes her dropout decision (which we call period length) is very small, we show analytically that the PDO model reduces to the Pareto/NBD. When the period length is longer than the calibration period, the dropout process is “shut off,” and the PDO model collapses to the negative binomial distribution (NBD) model. By systematically varying the period length between these limits, we can explore the full spectrum of models between the “continuous-time-death” Pareto/NBD and the naïve “no-death” NBD. In covering this spectrum, the PDO model performs at least as well as either of these models; our empirical analysis demonstrates the superior performance of the PDO model on two data sets. 
We also show that the different models provide significantly different estimates of both purchasing-related and death-related metrics for both data sets, and these differences can be quite dramatic for the death-related metrics. As more researchers and managers make decisions that directly relate to the death process, we assert that the model employed to generate these metrics should be chosen carefully.
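    The PDO mechanism is easy to simulate, which also makes its two limiting cases concrete. The sketch below is illustrative only: the paper places heterogeneity distributions over the purchase rate and dropout probability, whereas here each customer shares fixed, assumed parameters.

    ```python
    import random

    def simulate_pdo(lam, p_die, period, T, rng):
        """Simulate one customer under the periodic death opportunity (PDO)
        idea: Poisson purchasing at rate lam while 'alive', with a Bernoulli
        dropout coin flipped at every period boundary."""
        death_time = T
        t = period
        while t < T:                      # coin flips at period, 2*period, ...
            if rng.random() < p_die:
                death_time = t
                break
            t += period
        n = 0                             # Poisson purchases on [0, death_time]
        t = rng.expovariate(lam)
        while t < death_time:
            n += 1
            t += rng.expovariate(lam)
        return n

    rng = random.Random(7)
    # Short period: many dropout opportunities (Pareto/NBD-like limit).
    many_flips = [simulate_pdo(1.0, 0.2, 0.5, 10, rng) for _ in range(2000)]
    # Period longer than the horizon: dropout shut off (plain NBD limit).
    no_flips = [simulate_pdo(1.0, 0.2, 20, 10, rng) for _ in range(2000)]
    assert sum(many_flips) < sum(no_flips)
    ```

    Varying `period` between these two extremes traces out the spectrum of models the paper explores between the continuous-time-death Pareto/NBD and the no-death NBD.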

    Estimating CLV Using Aggregated Data: The Tuscan Lifestyles Case Revisited

    The Tuscan Lifestyles case (Mason, 2003) offers a simple twist on the standard view of how to value a newly acquired customer, highlighting how standard retention-based approaches to the calculation of expected customer lifetime value (CLV) are not applicable in a noncontractual setting. Using the data presented in the case (a series of annual histograms showing the aggregate distribution of purchases for two different cohorts of customers newly “acquired” by a catalog marketer), it is a simple exercise to compute an estimate of “expected 5 year CLV.” If we wish to arrive at an estimate of CLV that includes the customer's “life” beyond five years or are interested in, say, sorting out the purchasing process (while “alive”) from the attrition process, we need to use a formal model of buying behavior that can be applied to such coarse data. To tackle this problem, we utilize the Pareto/NBD model developed by Schmittlein, Morrison, and Colombo (1987). However, existing analytical results do not allow us to estimate the model parameters using the data summaries presented in the case. We therefore derive an expression that enables us to do this. The resulting parameter estimates and subsequent calculations offer useful insights that could not have been obtained without the formal model. For instance, we were able to decompose the lifetime value into four factors, namely purchasing while active, dropout, surge in sales in the first year and monetary value of the average purchase. We observed a kind of “triple jeopardy” in that the more valuable cohort proved to be better on the three most critical factors.
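    The "simple exercise" the case permits can be made explicit: from each year's histogram, compute the mean number of purchases per acquired customer, multiply by a margin, and discount. The numbers below are invented for illustration; they are not the Tuscan Lifestyles data, and the margin and discount rate are assumptions.

    ```python
    # Hypothetical annual histograms for one cohort of 1,000 acquired customers:
    # {purchases_in_year: number_of_customers}. Illustrative numbers only.
    histograms = {
        1: {0: 600, 1: 250, 2: 100, 3: 50},
        2: {0: 750, 1: 150, 2: 70, 3: 30},
        3: {0: 820, 1: 110, 2: 50, 3: 20},
        4: {0: 860, 1: 90, 2: 35, 3: 15},
        5: {0: 890, 1: 70, 2: 28, 3: 12},
    }
    margin_per_purchase = 20.0   # assumed average margin per purchase
    discount = 0.10              # assumed annual discount rate

    clv5 = 0.0
    for year, hist in histograms.items():
        n = sum(hist.values())
        mean_purchases = sum(k * c for k, c in hist.items()) / n
        clv5 += margin_per_purchase * mean_purchases / (1 + discount) ** year
    ```

    `clv5` is the "expected 5-year CLV" per acquired customer. Extending beyond five years, or separating purchasing-while-alive from attrition, is exactly what requires the Pareto/NBD machinery the paper develops.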

    Customer-Base Analysis using Repeated Cross-Sectional Summary (RCSS) Data

    We address a critical question that many firms are facing today: Can customer data be stored and analyzed in an easy-to-manage and scalable manner without significantly compromising the inferences that can be made about the customers’ transaction activity? We address this question in the context of customer-base analysis. A number of researchers have developed customer-base analysis models that perform very well given detailed individual-level data. We explore the possibility of estimating these models using aggregated data summaries alone, namely repeated cross-sectional summaries (RCSS) of the transaction data. Such summaries are easy to create, visualize, and distribute, irrespective of the size of the customer base. An added advantage of the RCSS data structure is that individual customers cannot be identified, which makes it desirable from a data privacy and security viewpoint as well. We focus on the widely used Pareto/NBD model and carry out a comprehensive simulation study covering a vast spectrum of market scenarios. We find that the RCSS format of four quarterly histograms serves as a suitable substitute for individual-level data. We confirm the results of the simulations on a real dataset of purchases from an online fashion retailer.
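    Constructing an RCSS summary from a raw transaction log is a one-pass aggregation. The sketch below (function name and toy data are mine, not the paper's) builds one histogram of per-customer purchase counts per quarter; note that customer identities do not survive the summary, which is the privacy point of the format.

    ```python
    from collections import Counter

    def rcss(transactions, customers, quarter_ends):
        """Collapse an individual-level transaction log into repeated
        cross-sectional summaries (RCSS): one histogram of per-customer
        purchase counts for each quarter. Customers with zero purchases
        in a quarter are included in that quarter's histogram."""
        summaries = []
        start = 0.0
        for end in quarter_ends:
            per_cust = Counter(c for c, t in transactions if start <= t < end)
            hist = Counter(per_cust.get(c, 0) for c in customers)
            summaries.append(dict(hist))
            start = end
        return summaries

    # Toy log: (customer_id, purchase_time in weeks).
    log = [("a", 1.0), ("a", 5.0), ("b", 2.0), ("b", 14.0), ("c", 30.0)]
    quarters = rcss(log, ["a", "b", "c"], [13.0, 26.0, 39.0, 52.0])
    # First quarter: one customer made 2 purchases, one made 1, one made 0.
    assert quarters[0] == {2: 1, 1: 1, 0: 1}
    ```

    The paper's finding is that four such quarterly histograms carry enough information to estimate the Pareto/NBD nearly as well as the full log.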

    Multi-Attribute Loss Aversion and Reference Dependence: Evidence from the Performing Arts Industry

    We study the prevalence of multiattribute loss aversion and reference effects in a revenue management setting, using individual-level purchase data over a series of concert performances. The reference dependence that drives consumer choice is based not only on the price but also on observed sales (as a fraction of the seating capacity) during past visits. We find that consumers suffer from loss aversion on both prices and seats sold: consumers incur significant utility loss when prices are above their references or when the actual seat sales are lower than their references. We suggest pricing policies that can address consumer decisions driven by such reference dependence and loss aversion.
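    A reference-dependent utility with two attributes can be written as gains and losses around each reference point, with losses weighted more heavily. The coefficients below are illustrative assumptions, not the paper's estimates, and the piecewise-linear form is the standard textbook shape rather than the paper's exact specification.

    ```python
    def reference_utility(price, ref_price, occupancy, ref_occupancy,
                          beta_p=1.0, lam_p=2.0, beta_s=1.0, lam_s=2.0):
        """Multiattribute reference-dependent utility sketch: gains and
        losses are measured against a price reference and a seats-sold
        reference; lam_p, lam_s > 1 encode loss aversion on each attribute.
        (Illustrative coefficients, not estimated values.)"""
        price_gap = ref_price - price          # > 0: price below reference (a gain)
        seat_gap = occupancy - ref_occupancy   # > 0: fuller hall than expected (a gain)
        u = 0.0
        u += beta_p * price_gap if price_gap >= 0 else lam_p * beta_p * price_gap
        u += beta_s * seat_gap if seat_gap >= 0 else lam_s * beta_s * seat_gap
        return u

    # Loss aversion: a price hike above the reference hurts more than an
    # equal-sized cut helps.
    assert abs(reference_utility(60, 50, 0.8, 0.8)) > abs(reference_utility(40, 50, 0.8, 0.8))
    ```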

    An Exploratory Look at Supermarket Shopping Paths

    We present analyses of an extraordinary new dataset that reveals the path taken by individual shoppers in an actual grocery store, as provided by RFID (radio frequency identification) tags located on their shopping carts. The analysis is performed using a multivariate clustering algorithm not yet seen in the marketing literature that is able to handle data sets with unique (and numerous) spatial constraints. This allows us to take into account physical impediments (such as the location of aisles and other inaccessible areas of the store) to ensure that we only deal with feasible paths. We also recognize that time spent in the store plays an important role, leading to different cluster configurations for short, medium, and long trips. The resulting three sets of clusters identify a total of 14 canonical path types that are typical of grocery store travel, and we carefully describe (and cross-validate) each set of clusters. These results dispel certain myths about shopper travel behavior that common intuition perpetuates, including behavior related to aisles, end-cap displays, and the racetrack. We briefly relate these results to previous research (using much more limited datasets) covering travel behavior in retail stores and other related settings.

    Pricing Theater Seats: The Value of Price Commitment and Monotone Discounting

    We examine the value of price commitment in a non-profit organization using individual-level purchases over a series of concert performances. To decide on a pricing policy, the performing arts organization must be able to accurately predict when each ticket will be sold and what type of audience will purchase the tickets for each performance. We use a competing hazards framework to model the timing of ticket purchases when customer segments differ in their valuations and arrival times. We show that the customer purchase likelihoods change based on the prices observed earlier in the season. Hence, price commitment can aid in improving sales, revenues, and customer visits. In particular, we show that price commitment to a decreasing monotone discount policy can improve revenues by 2.1%–6.7% per concert.
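    The competing-hazards idea can be illustrated with two segments racing to buy a ticket, each with its own time-to-purchase distribution; whichever draws the earlier time buys first. This toy uses independent exponential hazards with assumed rates, whereas the paper's hazards shift with the prices observed earlier in the season.

    ```python
    import random

    def first_buyer_share(rate_early, rate_late, n=20000, seed=3):
        """Competing-hazards toy: two customer segments have exponential
        times-to-purchase; the segment with the smaller draw buys first.
        Returns the fraction of tickets won by the early segment.
        (Rates are illustrative assumptions.)"""
        rng = random.Random(seed)
        wins = 0
        for _ in range(n):
            if rng.expovariate(rate_early) < rng.expovariate(rate_late):
                wins += 1
        return wins / n

    share = first_buyer_share(2.0, 1.0)
    # For independent exponentials the early segment wins with probability
    # rate_early / (rate_early + rate_late) = 2/3.
    assert abs(share - 2 / 3) < 0.02
    ```

    Making the rates functions of the posted price path is what lets the framework evaluate commitment to a monotone discount policy against discretionary pricing.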

    A Cross-Cohort Changepoint Model for Customer-Base Analysis

    We introduce a new methodology that can capture and explain differences across a series of cohorts of new customers in a repeat-transaction setting. More specifically, this new framework, which we call a vector changepoint model, exploits the underlying regime structure in a sequence of acquired customer cohorts to make predictive statements about new cohorts for which the firm has little or no longitudinal transaction data. To accomplish this, we develop our model within a hierarchical Bayesian framework to uncover evidence of (latent) regime changes for each cohort-level parameter separately, while disentangling cross-cohort changes from calendar-time changes. Calibrating the model using multicohort donation data from a nonprofit organization, we find that holdout predictions for new cohorts using this model have greater accuracy, and greater diagnostic value, compared to a variety of strong benchmarks. Our modeling approach also highlights the perils of pooling data across cohorts without accounting for cross-cohort shifts, thus enabling managers to quantify their uncertainty about potential regime changes and avoid “old data” aggregation bias.