2,586 research outputs found
Mutual information and sensitivity analysis for feature selection in customer targeting: a comparative study
Feature selection is a highly relevant task in any data-driven knowledge discovery project. The present research analyses the advantages and disadvantages of using mutual information (MI) and data-based sensitivity analysis (DSA) for feature selection in classification problems, by applying both to a bank telemarketing case. A logistic regression model is built on the tuned set of features that each technique identifies as most influential on the success of a telemarketing contact: 13 features for MI and 9 for DSA. DSA performs better at lower false-positive rates, while MI is slightly better at higher false-positive rates. Thus, MI is the better choice if the intention is to slightly reduce the cost of contacts without risking the loss of a high number of successes. However, DSA achieved good prediction results with fewer features. (WOS:000454945400004)
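As a rough illustration of the MI-based ranking this abstract refers to, the sketch below scores discrete features by their mutual information with a binary outcome. It is not the authors' pipeline: the feature names (`prior_contact`, `weekday`), the toy rows, and the helper functions are all hypothetical, and the paper's tuning and logistic regression steps are left out.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Mutual information I(X;Y) in bits between two discrete sequences."""
    n = len(xs)
    px = Counter(xs)            # marginal counts of X
    py = Counter(ys)            # marginal counts of Y
    pxy = Counter(zip(xs, ys))  # joint counts of (X, Y)
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi


def rank_features(rows, target):
    """Return (feature name, MI score) pairs, highest score first."""
    names = list(rows[0].keys())
    scores = {
        name: mutual_information([r[name] for r in rows], target)
        for name in names
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Made-up telemarketing contacts (hypothetical data, for illustration only).
rows = [
    {"prior_contact": "yes", "weekday": "mon"},
    {"prior_contact": "yes", "weekday": "tue"},
    {"prior_contact": "no", "weekday": "mon"},
    {"prior_contact": "no", "weekday": "tue"},
]
subscribed = [1, 1, 0, 0]  # hypothetical outcome of each contact

ranking = rank_features(rows, subscribed)
for name, score in ranking:
    print(f"{name}: {score:.3f} bits")
```

In this toy data `prior_contact` determines the outcome exactly, so it ranks first, while `weekday` is independent of the outcome and scores zero. A real study would also need to discretise continuous features and choose how many top-ranked features to keep.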
Uncertainty representation and risk management for direct segmented marketing
Mining for truly responsive customers has become an integral part of customer portfolio management. Combined with operational tactics to reach these customers, it requires an integrated approach to meeting customer needs that often draws on concepts from traditionally distinct fields: marketing, statistics, and operations research. This article brings such concepts together to address customer value and revenue maximization as well as risk minimization for direct marketing decision-making problems under uncertainty. We focus on customer lift optimization given the uncertainty associated with lift estimation models, and develop risk management and operational tools for the multiple treatment (recommendation) problem using stochastic and robust optimization techniques. Results from numerical experiments are presented to illustrate the effect of incorporating uncertainty on the performance of recommendation models.
The solar energy consumer agent decision (SECAD) model : addressing complexity through GIS-integrated agent-based modeling
This thesis presents a step-by-step implementation of the Solar Energy Consumer Agent Decision (SECAD) model: an empirically grounded multi-agent model of residential solar photovoltaic (PV) adoption with an integrated geospatial topology. Solar PV diffusion is a complex system with geographic heterogeneity, uncertain information, high financial risk, and important social interaction and feedback effects between consumers. A key limitation for agent-based models in human socio-technical systems is the integration of empirical patterns in the model structure, initialization, and validation efforts. This limitation is addressed through highly granular and interlocking data streams covering the geographic, social network, financial, demographic, and decision-making characteristics of real households in the study. The fitted and validated model is used to simulate the implementation of potential policies to inform decision-makers: i) targeted informational dissemination campaigns, ii) tiered rebates, iii) locational pricing, and iv) alternative rebate schedules. Informational campaigns can increase cumulative installations by as much as 12%, but vary greatly in their effectiveness depending on which agents are targeted. Simulations suggest that lowering the cost barrier for lower-wealth households through a slightly higher rebate (+0.25 higher offering) increased the percentage of adopters in the target area from less than 1% to over 10%. Relative to flatter rebate schedules, sharply decreasing schedules are effective in terms of motivating adoption but inefficient in small markets. It is our hope that this work will provide a working example for other agent-based models of human socio-technical systems, as well as insight into the likely outcomes of novel policy levers such as those described above.
Display Advertising with Real-Time Bidding (RTB) and Behavioural Targeting
The most significant progress in recent years in online display advertising is what is known as the Real-Time Bidding (RTB) mechanism to buy and sell ads. RTB essentially facilitates buying an individual ad impression in real time while it is still being generated from a user’s visit. RTB not only scales up the buying process by aggregating a large volume of available inventory across publishers but, most importantly, enables direct targeting of individual users. As such, RTB has fundamentally changed the landscape of digital marketing. Scientifically, the demand for automation, integration and optimisation in RTB also brings new research opportunities in information retrieval, data mining, machine learning and other related fields. This monograph gives an overview of the fundamental infrastructure, algorithms, and technical solutions of this new frontier of computational advertising. The covered topics include user response prediction, bid landscape forecasting, bidding algorithms, revenue optimisation, statistical arbitrage, dynamic pricing, and ad fraud detection.
Feature selection strategies for improving data-driven decision support in bank telemarketing
The use of data mining techniques to unveil previously undiscovered knowledge has been applied in recent years to a wide range of domains, including banking and marketing. Raw data is the basic ingredient for successfully detecting interesting patterns. A key aspect of raw data manipulation is feature engineering, which concerns the correct characterization or selection of relevant features (or variables) that conceal relations with the target goal. This study focuses on feature engineering, aiming to uncover the features that best characterize the problem of selling long-term bank deposits through telemarketing campaigns. For the experimental setup, a case study from a Portuguese bank was addressed, spanning the 2008-2013 period and encompassing the recent global financial crisis. To assess the relevance of this problem, a novel literature analysis using text mining and the latent Dirichlet allocation algorithm was conducted, confirming the existence of a research gap for bank telemarketing. Starting from a dataset containing typical telemarketing contacts and client information, the research followed three different and complementary strategies: first, enriching the dataset with social and economic context features; then, including customer lifetime value related features; finally, applying a divide-and-conquer strategy to split the problem into smaller fractions, leading to optimized sub-problems. Each of the three approaches improved on previous results in terms of model metrics related to prediction performance. The relevance of the proposed features was evaluated, confirming the obtained models as credible and valuable for telemarketing campaign managers.
The use of data mining techniques for knowledge discovery has been applied in recent years to a wide variety of domains, including banking and marketing. Data in its raw state is the basic ingredient for detecting information patterns. A key aspect of raw data manipulation is "feature engineering", which comprises the correct definition and selection of relevant features (or variables) that relate to the target of knowledge discovery. This work focuses on a "feature engineering" approach to define the variables that best characterize the problem of selling term bank deposits through telemarketing campaigns. As an empirical study, it used a case study of a Portuguese bank covering the 2008-2013 period, which includes the effects of the international financial crisis. To gauge the importance of this problem, an innovative literature analysis was carried out using text mining and the latent Dirichlet allocation algorithm, confirming the existence of a gap in this area. Using a dataset of telemarketing contacts and client information as a basis, three different and complementary strategies were proposed: first, the data were enriched with socioeconomic features; next, features associated with customer lifetime value were added; finally, the problem was divided into more specific problems, allowing optimized approaches to each sub-problem. Each approach improved the metrics associated with the model's predictive capability. Additionally, the relevance of the features was evaluated, confirming the obtained models as credible and valuable for telemarketing campaign managers.
Automating lead scoring with machine learning: An experimental study
Companies often gather a tremendous amount of data, such as browsing behavior, email activities and other contact data. This data can be a source of important competitive advantage when it is used to estimate a contact's purchase probability with predictive analytics. The calculated purchase probability can then be used by companies to solve different business problems, such as optimizing their sales processes. The purpose of this article is to study how machine learning can be used to perform lead scoring as a special application case of making use of purchase probability. Historical behavioral data is used as training data for the classification algorithm, and purchase moments are used to limit the behavioral data for the contacts that have purchased a product in the past. Different ways of aggregating time-series data are tested to ensure that limiting the activities for buyers does not result in model bias. The results suggest that it is possible to estimate the purchase probability of leads using supervised learning algorithms, such as random forest, and that it is possible to obtain business insights from the results using visual analytics.
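One way to read the idea of limiting a buyer's behavioral data by the purchase moment is sketched below: activity events are counted per type, and for contacts with a known purchase date only events strictly before that date are kept. This is a minimal sketch under stated assumptions, not the article's method: the event types, dates, function name, and the "strictly before" cutoff rule are all hypothetical.

```python
from datetime import date


def aggregate_activity(events, cutoff=None):
    """Count events per type, keeping only those strictly before `cutoff`.

    `events` is a list of (date, event_type) pairs. When `cutoff` is None
    (a contact with no purchase), every event is counted.
    """
    counts = {}
    for when, kind in events:
        if cutoff is not None and when >= cutoff:
            continue  # skip activity on or after the purchase moment
        counts[kind] = counts.get(kind, 0) + 1
    return counts


# Hypothetical activity log for one contact who purchased on 2020-03-10.
events = [
    (date(2020, 3, 1), "email_open"),
    (date(2020, 3, 5), "page_view"),
    (date(2020, 3, 7), "page_view"),
    (date(2020, 3, 12), "page_view"),  # after the purchase
]

buyer_features = aggregate_activity(events, cutoff=date(2020, 3, 10))
non_buyer_features = aggregate_activity(events)
print(buyer_features)
print(non_buyer_features)
```

Count vectors like these could then be passed to a supervised classifier such as a random forest. Buyers and non-buyers end up with observation windows of different lengths, which is the kind of bias the abstract says the tested aggregation schemes are meant to guard against.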