
    Customer purchase behavior prediction in E-commerce: a conceptual framework and research agenda

    Digital retailers are handling an increasing number of online transactions from their consumers, a consequence of the convenience of buying goods via E-commerce platforms. These interactions form complex behavioral patterns that can be analyzed through predictive analytics, enabling businesses to understand consumer needs. Despite this abundance of big data and of tools to analyze it, a systematic review of the literature has been missing. This paper therefore presents a systematic literature review of recent research on customer purchase prediction in the E-commerce context. Its main contributions are a novel analytical framework and a research agenda for the field. The framework identifies three main tasks in this review, namely the prediction of customer intents, buying sessions, and purchase decisions; each task is characterized by the predictive methodologies employed and is analyzed from three perspectives. Finally, the research agenda lays out the major open issues for further research in the field of online purchase behavior prediction.
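    The purchase-decision task surveyed above is commonly framed as binary classification over session features. A minimal sketch, assuming scikit-learn and hypothetical feature names (pages viewed, dwell time, cart adds); it illustrates the general task, not any specific model from the review:

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    # Hypothetical session features: pages viewed, dwell time (s), cart adds.
    X = np.array([[12, 340, 1], [3, 45, 0], [25, 900, 2],
                  [5, 60, 0], [8, 120, 1], [2, 30, 0]])
    y = np.array([1, 0, 1, 0, 1, 0])  # 1 = session ended in a purchase

    clf = LogisticRegression().fit(X, y)
    # Estimated purchase probability for a new session.
    print(clf.predict_proba([[10, 200, 1]])[:, 1])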

    Modeling Conversions in Online Advertising

    This work investigates online purchasers and how to predict such sales. Advertising as a field has long been required to pay for itself: money spent reaching potential consumers evaporates if that potential is not realized. Academic marketers look at advertising through a traditional lens, measuring input (advertising) and output (purchases) with methods inherited from TV and print advertising. Online advertising practitioners have developed their own models for predicting purchases. Moreover, online advertising generates an enormous amount of data, whose analysis has long been the province of statisticians. My work sits at the intersection of these three areas: marketing, statistics, and computer science. Academic statisticians have approached the modeling of response to advertising through a proportional hazards framework. We extend that work and modify the underlying software to allow estimation on voluminous online data sets. We investigate a data visualization technique that allows online advertising histories to be compared easily. We also provide a framework for using existing clustering algorithms to better understand the paths to conversion taken by consumers. We modify an existing solution to the number-of-clusters problem to allow application to mixed-variable data sets. Finally, we marry the leading edge of online advertising conversion attribution (Engagement Mapping) to the proportional hazards model, showing how this tool can be used to find optimal settings for advertiser models of conversion attribution.
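    To make the proportional hazards framing concrete, here is a minimal sketch assuming the lifelines library and hypothetical column names for ad-exposure histories; it is an illustration of the general technique, not the modified software described above:

    import pandas as pd
    from lifelines import CoxPHFitter

    # Hypothetical exposure histories: hours until conversion (or censoring),
    # whether a conversion was observed, and covariates of the exposure.
    df = pd.DataFrame({
        "hours_to_event": [5, 48, 12, 72, 30, 7],
        "converted":      [1, 0, 1, 0, 1, 1],
        "impressions":    [3, 1, 7, 2, 5, 4],
        "clicked":        [1, 0, 1, 0, 0, 1],
    })

    cph = CoxPHFitter()
    cph.fit(df, duration_col="hours_to_event", event_col="converted")
    cph.print_summary()  # hazard ratios for impressions and clicks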

    Using predictive modeling for targeted marketing in a non-contractual retail setting


    Implications of Computational Cognitive Models for Information Retrieval

    This dissertation explores the implications of computational cognitive modeling for information retrieval. The parallel between information retrieval and human memory is that the goal of an information retrieval system is to find the set of documents most relevant to the query, whereas the goal of the human memory system is to assess the relevance of items stored in memory given a memory probe (Steyvers & Griffiths, 2010). The two major topics of this dissertation are desirability and information scent. Desirability is the context-independent probability of an item receiving attention (Recker & Pitkow, 1996). Desirability has been widely utilized in numerous experiments to model the probability that a given memory item would be retrieved (Anderson, 2007). Information scent is a context-dependent measure defined as the utility of an information item (Pirolli & Card, 1996b). Information scent has been widely utilized to predict the memory item that would be retrieved given a probe (Anderson, 2007) and to predict the browsing behavior of humans (Pirolli & Card, 1996b). In this dissertation, I proposed the theory that the desirability observed in human memory is caused by preferential attachment in networks. Additionally, I showed that documents accessed in large repositories mirror the statistical properties observed in human memory and that these properties can be used to improve document ranking. Finally, I showed that the combination of information scent and desirability improves document ranking over existing well-established approaches.
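    A minimal sketch of the ranking idea, assuming hypothetical access counts as the desirability prior and a precomputed query-dependent scent score; the log prior loosely mirrors base-level activation in ACT-R-style memory models, and is not the dissertation's exact formulation:

    import math

    # Hypothetical document statistics: prior access counts (desirability)
    # and a context-dependent relevance score for the current query (scent).
    docs = {
        "doc_a": {"accesses": 500, "scent": 0.30},
        "doc_b": {"accesses": 10,  "scent": 0.90},
        "doc_c": {"accesses": 120, "scent": 0.55},
    }

    def combined_score(d):
        # Context-independent log prior plus context-dependent scent.
        return math.log(d["accesses"]) + d["scent"]

    ranking = sorted(docs, key=lambda k: combined_score(docs[k]), reverse=True)
    print(ranking)  # most promising documents first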

    Customer lifetime value: a framework for application in the insurance industry - building a business process to generate and maintain an automatic estimation agent

    Research project submitted as partial fulfilment of the Master Degree in Statistics and Information Management, specialization in Knowledge Management and Business Intelligence. In recent years the topic of Customer Lifetime Value (CLV), or in its expanded version Customer Equity (CE), has become popular as a strategic tool across several industries, in particular retail and services. Although the core concepts of CLV modelling have been studied for several years and the mathematics that underpins the concept is well understood, the application to specific industries is not trivial. The complexities associated with developing a CLV programme as a business process are considerable, raising a myriad of obstacles to its implementation. This research project builds a framework to develop and implement the CLV concept as a maintainable business process, with a focus on the insurance industry, in particular the non-life line of business. Key concepts such as churn modelling, portfolio premium stationarity, fiscal policies, and balance sheet information must be integrated into the CLV framework. In addition, an automatic estimation machine (AEM) is developed to standardize CLV calculations. The AEM concept is important given that CLV information must be "fit for purpose" when used in other business processes, such as distribution or sales. The field work is carried out in a Portuguese bancassurance company that is part of an important Portuguese financial group. Firstly, this is done by investigating how to translate and apply the known CLV concepts to the insurance industry context. Secondly, a sensitivity study is carried out to establish the optimal parameter strategy, incorporating and comparing several data mining techniques applied to churn prediction and customer base segmentation; scenarios for the use of balance sheet information and other actuarial concepts are analyzed to calibrate the cash flow component of the CLV framework. Thirdly, an automatic estimation agent is defined for application to the current or the expanding firm portfolio, and the advantages of using an SOA approach for deployment are verified. Additionally, a comparative impact study is carried out between two valuation views: premium/cost driven versus CLV driven. Finally, a framework for a BPM is presented, not only for building the AEM but also for maintaining it according to an explicit performance threshold.
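    The core CLV calculation underlying such a framework is a discounted cash flow over the customer's expected tenure. A minimal sketch, assuming hypothetical values for the per-period margin, the retention rate supplied by the churn model, and the discount rate:

    # Discounted-cash-flow CLV: margins weighted by survival and discounting.
    def clv(margin, retention, discount, horizon_years):
        return sum(
            margin * retention**t / (1 + discount)**t
            for t in range(1, horizon_years + 1)
        )

    # Hypothetical non-life policy: 120 annual margin, 85% retention, 5% discount.
    print(clv(margin=120.0, retention=0.85, discount=0.05, horizon_years=10))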

    Analysis of Clickstream Data

    This thesis is concerned with providing further statistical development in the area of web usage analysis to explore web browsing behaviour patterns. We received two data sources: web log files and operational data files for the websites, which contained information on online purchases. There are many research questions regarding web browsing behaviour. Specifically, we focused on the depth-of-visit metric and implemented an exploratory analysis of this feature using clickstream data. Due to the large volume of data available in this context, we present effect size measures alongside all statistical analyses of the data. We introduce two new robust measures of effect size for two-sample comparison studies in non-normal situations, specifically where the difference between two populations is due to the shape parameter. The proposed effect sizes perform adequately for non-normal data, as well as when the two distributions differ in their shape parameters. We then focus on conversion analysis, investigating the relationship between general clickstream information and online purchasing using a logistic regression approach. The aim is to build a classifier by assigning the probability of the event of online shopping on an e-commerce website. We also develop an application of a mixture of hidden Markov models (MixHMM) to model web browsing behaviour using sequences of web pages viewed by users of an e-commerce website. The mixture of hidden Markov models is estimated in a Bayesian framework using Gibbs sampling. We address the slow mixing problem of Gibbs sampling in high-dimensional models, using over-relaxed Gibbs sampling as well as a forward-backward EM algorithm to obtain an adequate sample from the posterior distributions of the parameters. The MixHMM offers the advantage of clustering users based on their browsing behaviour, and also gives an automatic classification of web pages based on the probability of a web page being observed by visitors to the website.
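    In the MixHMM setup, each mixture component is a hidden Markov model over page-view sequences, and a user's sequence is scored under each component with the forward algorithm. A minimal single-HMM sketch with entirely hypothetical parameters (the thesis fits several such models and clusters users by posterior component membership):

    import numpy as np

    pi = np.array([0.7, 0.3])                 # initial state probabilities
    A  = np.array([[0.8, 0.2], [0.3, 0.7]])   # state transition matrix
    B  = np.array([[0.6, 0.3, 0.1],           # P(page | state) for pages:
                   [0.1, 0.4, 0.5]])          # home, product, cart

    def loglik(pages):
        # Forward algorithm: accumulate P(observations so far, current state).
        alpha = pi * B[:, pages[0]]
        for p in pages[1:]:
            alpha = (alpha @ A) * B[:, p]
        return np.log(alpha.sum())

    print(loglik([0, 1, 1, 2]))  # home -> product -> product -> cart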

    Information selection and belief updating in hypothesis evaluation

    This thesis is concerned with the factors underlying both the selection and the use of evidence in the testing of hypotheses. The work it describes examines the role played in hypothesis evaluation by background knowledge about the probability of events in the environment, as well as the influence of more general constraints. Experiments on information choice showed that subjects were sensitive both to explicitly presented probabilistic information and to the likelihood of evidence with regard to background beliefs. It is argued, in contrast with other views in the literature, that subjects' choice of evidence to test hypotheses is rational, allowing for certain constraints on subjects' cognitive representations. The majority of experiments in this thesis, however, focus on how the information subjects receive when testing hypotheses affects their beliefs. A major finding is that receipt of early information creates expectations which influence the response to later information. This typically produces a recency effect, in which presenting strong evidence after weak evidence affects beliefs more than presenting the same evidence in the opposite order. These findings run contrary to the view of the belief revision process prevalent in the literature, in which it is generally assumed that the effects of successive pieces of information are independent. The experiments reported here also provide evidence that processes of selective attention influence evidence interpretation: subjects tend to focus on the most informative part of the evidence and may switch focus from one part of the evidence to another as the task progresses. In some cases, such changes of attention can eliminate the recency effect. In summary, the present research provides new evidence about the role of background beliefs, expectations, and cognitive constraints in the selection and use of information to test hypotheses. Several new findings emerge which require revision to current accounts of information integration in the belief revision literature.
    Faculty of Human Sciences, University of Plymouth
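    The order independence that these recency findings violate is easy to state: under sequential Bayesian updating, posterior odds are the prior odds multiplied by each likelihood ratio, so evidence order cannot matter. A worked toy example with hypothetical likelihood ratios:

    # Posterior odds = prior odds x likelihood ratio, applied sequentially.
    def update(prior_odds, likelihood_ratio):
        return prior_odds * likelihood_ratio

    weak, strong = 1.5, 4.0  # hypothetical likelihood ratios
    print(update(update(1.0, weak), strong))  # weak then strong -> 6.0
    print(update(update(1.0, strong), weak))  # strong then weak -> 6.0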

    Data acquisition and cost-effective predictive modeling: targeting offers for electronic commerce

    Electronic commerce is revolutionizing the way we think about data modeling by making it possible to integrate the processes of (costly) data acquisition and model induction. The opportunity to improve modeling through costly data acquisition presents itself for a diverse set of electronic commerce modeling tasks, from personalization to customer lifetime value modeling; we illustrate with the running example of choosing offers to display to web-site visitors, which captures important aspects in a familiar setting. Considering data acquisition costs explicitly can allow predictive models to be built at significantly lower cost, and a modeler may be able to improve performance via new sources of information that previously were too expensive to consider. However, existing techniques for integrating modeling and data acquisition cannot deal with the rich environment that electronic commerce presents. We discuss several possible data acquisition settings, the challenges involved in integrating them with modeling, and various research areas that may supply parts of an ultimate solution. We also present, and briefly demonstrate, a unified framework within which one can integrate acquisitions of different types, with any cost structure and any predictive modeling objective.
    NYU, Stern School of Business, IOMS Department, Center for Digital Economy Research
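    At its simplest, the acquisition decision such a framework formalizes reduces to comparing a source's expected contribution against its cost. A hedged toy sketch with entirely hypothetical figures, not the paper's actual framework:

    # Acquire a data source only if its estimated lift exceeds its cost.
    candidates = {
        "purchase_history":   {"est_profit_lift": 5000.0, "cost": 1200.0},
        "panel_demographics": {"est_profit_lift": 800.0,  "cost": 2000.0},
    }

    net_value = {
        name: v["est_profit_lift"] - v["cost"]
        for name, v in candidates.items()
        if v["est_profit_lift"] > v["cost"]
    }
    print(net_value)  # acquisitions that pay for themselves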