657 research outputs found

    Deposit subscribe Prediction using Data Mining Techniques based Real Marketing Dataset

    The recent worldwide economic depression has affected business organizations and the banking sector. These conditions cause severe customer attrition for banks, and customer retention becomes impossible. Marketing managers therefore need to increase marketing campaigns, while organizations try to avoid both expenses and business expansion. To resolve this dilemma, data mining techniques are used as a key tool for data analysis, data summarization, hidden pattern discovery, and data interpretation. In this paper, rough set theory and decision tree mining techniques are implemented using real marketing data obtained from a Portuguese marketing campaign related to bank deposit subscription [Moro et al., 2011]. The paper aims to improve the efficiency of marketing campaigns and to help decision makers by reducing the number of features that describe the dataset, identifying the most significant ones, and predicting the deposit customer retention criteria based on potential predictive rules.

    Improving the accuracy of predicting bank depositor's behavior using decision tree

    Telemarketing is a widely adopted direct marketing technique in banks. Since customers rarely respond positively, prediction models can help select the most likely prospective customers. We aim to develop an accurate classifier to predict which customers will subscribe to a long-term deposit proposed by a bank. Accordingly, this paper combines resampling, to reduce the class imbalance, with feature selection, to reduce computational complexity and the dimensionality that makes data modeling inefficient. These operations improved the performance of the classification algorithm in terms of accuracy. The experiments were run on a real bank dataset, and the J48 decision tree achieved 94.39% prediction accuracy, with 0.975 sensitivity and 0.709 specificity, showing better results than other approaches reported in the existing literature, such as logistic regression (91.79% accuracy; 0.975 sensitivity; 0.495 specificity) and the Naive Bayes classifier (90.82% accuracy; 0.961 sensitivity; 0.507 specificity). Furthermore, our resampling and feature selection approach resulted in improved accuracy (94.39%) when compared to a state-of-the-art approach based on a fuzzy algorithm (92.89%).
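The abstract above reports accuracy, sensitivity, and specificity side by side. As a minimal sketch of how these three figures relate to a confusion matrix (the function name and the toy labels are illustrative assumptions, not taken from the paper, and the abstract does not say which class the authors treat as positive):

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, sensitivity and specificity for a binary problem.

    `positive` marks the class treated as positive; this choice is an
    assumption of the sketch, not something stated in the abstract.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,
        # Sensitivity: share of actual positives that were predicted positive.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Specificity: share of actual negatives that were predicted negative.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Toy labels (hypothetical): 3 positives, 5 negatives.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 1, 0]
metrics = classification_metrics(y_true, y_pred)
```

On an imbalanced dataset, accuracy alone can look high while one of the other two figures stays low, which is why the abstract lists all three for each classifier.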

    Identifying Prospective Clients for Long-Term Bank Deposit

    Bank databases often hold numerous customer characteristics, which are used to understand who the customers are. In recent years, however, studies using different data mining and feature selection (PCA) methods have found that customer traits and other factors connected to bank services have a big influence on consumers' decisions. Business analytics is an approach to conducting business that uses an organization's transactional data to learn how business operations can be enhanced, employing data mining methods to find existing patterns and significant variables that a firm can use to make data-driven choices. In this project, we apply data mining techniques to the prediction of long-term bank deposits using a well-known bank dataset. From PCA it is seen that customers’ income level, poutcome, pdays, and previous (first PC) in general may seem to have a higher impact on prospective clients, but this is not actually the case. Also, the bank's prior campaign and the social attributes of the clients (Age, Marital Status, Education, Campaign, Duration) are primarily essential compared to other variables. Finally, k-means clustering is applied to the PCA-reduced data to determine groups of potential customers, which gives an accuracy score of 87.76%.

    Telemarketing outcome prediction using an Ensemble-based machine learning technique

    Business organisations often use telemarketing, a form of direct marketing, to reach a wide range of customers within a short time. However, such marketing strategies need to target an appropriate subset of customers to offer them products or services instead of contacting everyone, as people often get annoyed and disengaged when they receive pre-emptive communication. Machine learning techniques can help in this scenario by selecting customers who are likely to respond positively to a telemarketing campaign. Business organisations can use their CRM-based customer information and embed machine learning techniques in the data analysis process to develop an automated decision-making system that recommends the set of customers to contact. A few works in the literature have used machine learning techniques to predict the outcome of telemarketing; however, the majority of them used a single classifier algorithm or used only a balanced dataset. To address this issue, this article proposes an ensemble-based machine learning technique to predict the outcome of telemarketing, which works well even with an imbalanced dataset and achieves 90.29% accuracy.

    A Data-Driven Approach to Predict the Success of Bank Telemarketing

    We propose a data mining (DM) approach to predict the success of telemarketing calls for selling bank long-term deposits. A Portuguese retail bank was addressed, with data collected from 2008 to 2013, thus including the effects of the recent financial crisis. We analyzed a large set of 150 features related to bank client, product and social-economic attributes. A semi-automatic feature selection was explored in the modeling phase, performed with the data prior to July 2012, which allowed us to select a reduced set of 22 features. We also compared four DM models: logistic regression, decision trees (DT), neural network (NN) and support vector machine. Using two metrics, area of the receiver operating characteristic curve (AUC) and area of the LIFT cumulative curve (ALIFT), the four models were tested in an evaluation phase, using the most recent data (after July 2012) and a rolling windows scheme. The NN presented the best results (AUC=0.8 and ALIFT=0.7), making it possible to reach 79% of the subscribers by selecting the better-classified half of the clients. Also, two knowledge extraction methods, a sensitivity analysis and a DT, were applied to the NN model and revealed several key attributes (e.g., Euribor rate, direction of the call and bank agent experience). Such knowledge extraction confirmed the obtained model as credible and valuable for telemarketing campaign managers.
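The two ideas behind the metrics in the abstract above can be illustrated with a short sketch. It is not the authors' implementation: the function names and toy data are assumptions, AUC is computed through its pairwise-ranking interpretation, and the second function returns a single point of the cumulative curve (the share of subscribers reached in the top-scored fraction) rather than the full ALIFT area.

```python
def auc(y_true, scores):
    """AUC as the probability that a random positive outranks a random negative.

    Ties in score count as half a win (the Mann-Whitney formulation).
    """
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def share_reached(y_true, scores, fraction):
    """Share of all positives found among the top `fraction` of clients by score."""
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    n_top = max(1, int(fraction * len(ranked)))
    reached = sum(y for _, y in ranked[:n_top])
    return reached / sum(y_true)


# Toy data (hypothetical): 1 = subscribed, 0 = did not subscribe.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
toy_auc = auc(y_true, scores)
toy_half = share_reached(y_true, scores, 0.5)
```

The "79% of the subscribers by selecting the better-classified half" figure in the abstract is a value of this second kind, evaluated at a fraction of 0.5.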

    Data Mining for Potential Customer Segmentation in the Marketing Bank Dataset

    Direct marketing is an effort made by a bank to increase sales of its products and services, but the bank sometimes has to contact a customer or prospective customer more than once to ascertain whether they are willing to subscribe to a product or service. Several data mining methods have been proposed to overcome this ineffective process. This study compares several such methods (Naïve Bayes, K-NN, Random Forest, SVM, J48, and AdaBoost J48), each preceded by the SMOTE pre-processing technique to eliminate the class imbalance problem in the Bank Marketing dataset. The SMOTE + Random Forest method produced the highest accuracy in this study, 92.61%.
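SMOTE, named in the abstract above, balances classes by creating synthetic minority samples on the line segments between existing minority samples and their nearest minority neighbours. The sketch below illustrates that interpolation step only. It is a simplified variant under stated assumptions (the base sample is drawn at random, numeric features only, hypothetical function name and data) and not the configuration used in the study.

```python
import random


def smote_sketch(minority, n_new, k=5, seed=0):
    """Generate `n_new` synthetic points by interpolating between minority samples.

    `minority` is a list of numeric feature vectors from the minority class.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by squared Euclidean distance to base.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: sum((a - b) ** 2 for a, b in zip(base, minority[j])))
        neighbour = minority[rng.choice(others[:k])]
        # The new point lies at a random position on the segment base -> neighbour.
        gap = rng.random()
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Toy minority class (hypothetical): the four corners of the unit square.
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_points = smote_sketch(minority, n_new=6, k=2)
```

Oversampling of this kind is normally applied to the training data only, so that synthetic points do not leak into the evaluation set.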

    A data mining approach for bank telemarketing using the rminer package and R tool

    Due to the global financial crisis, credit on international markets became more restricted for banks, turning their attention to internal clients and their deposits as a way to gather funds. This led to a demand for knowledge about clients' behavior towards deposits and especially their response to telemarketing campaigns. This work describes a data mining approach to extract valuable knowledge from recent Portuguese bank telemarketing campaign data. The approach was guided by the CRISP-DM methodology, and the data analysis was conducted using the rminer package and R tool. Three classification models were tested (i.e., Decision Trees, Naïve Bayes and Support Vector Machines) and compared using two relevant criteria: ROC and Lift curve analysis. Overall, the Support Vector Machine obtained the best results, and a sensitivity analysis was applied to extract useful knowledge from this model, such as the best months for contacts and the influence of the last campaign result and of whether the client has a mortgage credit on a successful deposit subscription.

    Feature selection strategies for improving data-driven decision support in bank telemarketing

    The usage of data mining techniques to unveil previously undiscovered knowledge has been applied in past years to a wide number of domains, including banking and marketing. Raw data is the basic ingredient for successfully detecting interesting patterns. A key aspect of raw data manipulation is feature engineering, which concerns the correct characterization or selection of relevant features (or variables) that conceal relations with the target goal. This study is particularly focused on feature engineering, aiming to uncover the features that best characterize the problem of selling long-term bank deposits through telemarketing campaigns. For the experimental setup, a case study from a Portuguese bank, spanning the 2008-2013 period and encompassing the recent global financial crisis, was addressed. To assess the relevance of this problem, a novel literature analysis using text mining and the latent Dirichlet allocation algorithm was conducted, confirming the existence of a research gap for bank telemarketing. Starting from a dataset containing typical telemarketing contacts and client information, the research followed three different and complementary strategies: first, enriching the dataset with social and economic context features; then, including customer lifetime value related features; finally, applying a divide and conquer strategy to split the problem into smaller fractions, leading to optimized sub-problems. Each of the three approaches improved previous results in terms of model metrics related to prediction performance. The relevance of the proposed features was evaluated, confirming the obtained models as credible and valuable for telemarketing campaign managers.

    A Comparison of Two Modeling Techniques in Customer Targeting For Bank Telemarketing

    Customer targeting is the key to the success of bank telemarketing. To compare flexible discriminant analysis and logistic regression in customer targeting, a survey dataset from a Portuguese bank was used. For the flexible discriminant analysis model, backward elimination of explanatory variables was used with several rounds of manual re-defining of dummy variables. For the logistic regression model, automatic stepwise selection was performed to decide which explanatory variables should be left in the final model. Ten-fold stratified cross-validation was performed to estimate the model parameters and accuracies. Although employing different sets of explanatory variables, the flexible discriminant analysis model and the logistic regression model show equally satisfactory performance in customer classification based on the areas under the receiver operating characteristic curves. Focusing on the predicted “right” customers, the logistic regression model shows slightly better classification and a higher overall correct prediction rate.
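The ten-fold stratified cross-validation mentioned in the abstract above keeps the class proportions roughly equal across folds, which matters when subscribers are a small minority. A minimal sketch of the splitting step follows; the function names and the toy labels are assumptions for illustration, and the study's own tooling is not described in the abstract.

```python
import random
from collections import defaultdict


def stratified_k_fold(labels, k=10, seed=0):
    """Assign sample indices to k folds while preserving class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class round-robin so every fold gets its share.
        for pos, idx in enumerate(indices):
            folds[pos % k].append(idx)
    return folds


def train_test_splits(labels, k=10, seed=0):
    """Yield (train_indices, test_indices), holding out each fold once."""
    folds = stratified_k_fold(labels, k, seed)
    for held_out in range(k):
        test = folds[held_out]
        train = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        yield train, test


# Toy labels (hypothetical): 90 non-subscribers and 10 subscribers.
labels = ["no"] * 90 + ["yes"] * 10
folds = stratified_k_fold(labels, k=10)
splits = list(train_test_splits(labels, k=10))
```

Each model is then fitted on the training indices and scored on the held-out fold, and the ten scores are averaged.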