50 research outputs found

    Churn prediction using customers' implicit behavioral patterns and deep learning

    Market globalization is rapidly changing competitive conditions in the business and financial sectors. With new competitors emerging and investment in banking services increasing, today’s economy demands closer customer relationships. In this setting, churn, i.e. a customer’s willingness to change service provider, has become a competitive concern for organizations. In the banking sector, the need to retain valuable customers has pushed management to work preemptively on customer data and devise engagement strategies that reduce the churn rate. Valuable information and implicit behavioral patterns can be derived from customers’ transaction and demographic data. Various supervised models have been developed in the past to predict churning customers; our model uses features derived jointly from location- and time-stamped data, and has shown significant improvement in customer churn prediction. These sequence-based feature vectors are then fed to a neural network for churn prediction. In this study, we found that time-sequenced data used in a recurrent neural network based Long Short-Term Memory (LSTM) model predicts with better precision and recall than the baseline model. The feature vector output of our LSTM model, combined with other demographic and computed behavioral features of customers, gave better prediction results. We have also proposed and developed a model to find out whether connections between customers can assist churn prediction using graph convolutional networks (GCN), which incorporate customer network connections defined over three dimension

    Feature selection strategies for improving data-driven decision support in bank telemarketing

    Data mining techniques for unveiling previously undiscovered knowledge have been applied in recent years to a wide range of domains, including banking and marketing. Raw data is the basic ingredient for successfully detecting interesting patterns. A key aspect of raw data manipulation is feature engineering, which concerns the correct characterization or selection of relevant features (or variables) that conceal relations with the target goal. This study focuses on feature engineering, aiming to uncover the features that best characterize the problem of selling long-term bank deposits through telemarketing campaigns. For the experimental setup, a case study from a Portuguese bank was addressed, spanning the 2008-2013 period and encompassing the recent global financial crisis. To assess the relevance of this problem, a novel literature analysis using text mining and the latent Dirichlet allocation algorithm was conducted, confirming the existence of a research gap for bank telemarketing. Starting from a dataset containing typical telemarketing contacts and client information, the research followed three different and complementary strategies: first, enriching the dataset with social and economic context features; then, including customer lifetime value related features; finally, applying a divide-and-conquer strategy that splits the problem into smaller fractions, leading to optimized sub-problems. Each of the three approaches improved on previous results in terms of model metrics related to prediction performance. The relevance of the proposed features was evaluated, confirming the obtained models as credible and valuable for telemarketing campaign managers.

    Predictive Modelling of Retail Banking Transactions for Credit Scoring, Cross-Selling and Payment Pattern Discovery

    Evaluating transactional payment behaviour offers a competitive advantage in the modern payment ecosystem, not only for confirming the presence of good credit applicants or unlocking the cross-selling potential between the respective product and service portfolios of financial institutions, but also for precisely ruling out bad credit applicants in transactional payment streams. In a diagnostic test for analysing payment behaviour, I have used a hybrid approach that combines supervised and unsupervised learning algorithms to discover behavioural patterns. Supervised learning algorithms can compute a range of credit scores and cross-sell candidates, although the applied methods discover only limited behavioural patterns across the payment streams. Moreover, the performance of the applied supervised learning algorithms varies across the different data models and their optimisation is inversely related to the pre-processed dataset. The research experiments conducted suggest that the Two-Class Decision Forest is an effective algorithm to determine both the cross-sell candidates and the creditworthiness of customers. In addition, a deep-learning model using a neural network has been considered, with a meaningful interpretation of future payment behaviour through categorised payment transactions, in particular by providing additional insights through graph-based visualisations. However, the research shows that unsupervised learning algorithms play a central role in evaluating the transactional payment behaviour of customers: they discover associations using market basket analysis based on previous payment transactions, find the frequent transaction categories, and develop interesting rules when each transaction category is performed on the same payment stream. 
Current research also reveals that the transactional payment behaviour analysis is multifaceted in the financial industry for assessing the diagnostic ability of promotion candidates and classifying bad credit applicants from among the entire customer base. The developed predictive models can also be commonly used to estimate the credit risk of any credit applicant based on his/her transactional payment behaviour profile, combined with deep insights from the categorised payment transactions analysis. The research study provides a full review of the performance characteristic results from different developed data models. Thus, the demonstrated data science approach is a possible proof of how machine learning models can be turned into cost-sensitive data models
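
The market basket analysis mentioned in this abstract can be illustrated with a minimal sketch that computes support and confidence for pairs of transaction categories. The category names, the `pair_rules` helper and the `min_support` threshold are illustrative assumptions, not taken from the thesis.

```python
from collections import Counter
from itertools import combinations

def pair_rules(baskets, min_support=0.5):
    """Return (antecedent, consequent, support, confidence) for frequent pairs."""
    n = len(baskets)
    # Count in how many baskets each category and each category pair appears.
    item_counts = Counter(item for b in baskets for item in set(b))
    pair_counts = Counter(
        pair for b in baskets for pair in combinations(sorted(set(b)), 2)
    )
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        # confidence(a -> b) = count(a and b) / count(a)
        rules.append((a, b, support, count / item_counts[a]))
        rules.append((b, a, support, count / item_counts[b]))
    return rules

# Hypothetical payment streams, one list of categories per customer.
baskets = [
    ["groceries", "fuel"],
    ["groceries", "fuel", "dining"],
    ["groceries", "dining"],
    ["fuel"],
]
rules = pair_rules(baskets)
```

With this toy data, the rule "dining implies groceries" has support 0.5 and confidence 1.0, since every stream containing dining also contains groceries.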

    Modelling customers credit card behaviour using bidirectional LSTM neural networks

    With the rapid growth of consumer credit and the huge amount of financial data, developing effective credit scoring models is crucial. Researchers have developed complex credit scoring models using statistical and artificial intelligence (AI) techniques to help banks and financial institutions support their financial decisions. Neural networks are among the most widely used techniques in finance and business applications. The main aim of this paper is therefore to help bank management score credit card clients using machine learning, by modelling and predicting consumer behaviour with respect to two aspects: the probability of single and of consecutive missed payments for credit card customers. The proposed model is based on the bidirectional Long Short-Term Memory (LSTM) model and gives the probability of a missed payment during the next month for each customer. The model was trained on a real credit card dataset, and the customer behavioural scores are analysed using classical measures such as accuracy, Area Under the Curve, Brier score, Kolmogorov–Smirnov test, and H-measure. Calibration analysis of the LSTM model scores showed that they can be considered as probabilities of missed payments. The LSTM model was compared to four traditional machine learning algorithms: support vector machine, random forest, multi-layer perceptron neural network, and logistic regression. Experimental results show that, compared with traditional methods, the LSTM-based method significantly improves consumer credit scoring
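
Two of the measures listed in this abstract, the Brier score and the Kolmogorov–Smirnov statistic, can be sketched in plain Python. The labels and probabilities below are made-up illustrative values, not data from the paper, and the function names are hypothetical.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and 0/1 outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)

def ks_statistic(y_true, y_prob):
    """Largest gap between the empirical score CDFs of positives and negatives."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    best = 0.0
    for t in sorted(set(y_prob)):
        cdf_pos = sum(p <= t for p in pos) / len(pos)
        cdf_neg = sum(p <= t for p in neg) / len(neg)
        best = max(best, abs(cdf_pos - cdf_neg))
    return best

# Hypothetical data: 1 = missed payment next month, 0 = payment made.
y_true = [0, 0, 1, 1]
y_prob = [0.1, 0.4, 0.35, 0.8]

brier = brier_score(y_true, y_prob)
ks = ks_statistic(y_true, y_prob)
```

A lower Brier score indicates better calibrated probabilities, while a larger KS statistic indicates better separation between customers who miss a payment and those who do not.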

    Cross channel fraud detection framework in financial services using recurrent neural networks

    The reliability and performance of real-time fraud detection techniques have been a major concern for financial institutions, as traditional fraud detection models cannot cope with the new and innovative attacks that deceive banks. The problems are further exacerbated by evolving customer behaviour, as existing fraud detection models are unable to cope with the class imbalance problem and a longer feedback loop. This thesis takes a holistic view of fraud detection and proposes a conceptual fraud detection framework that can detect anomalous transactions quickly and accurately, and can evolve dynamically to maintain efficiency with minimum input from subject matter experts. The framework is used to analyse Internet Banking (IB) transactions and contextual information to reduce false positives and improve fraud detection rates. Based on the proposed framework, a Long Short-Term Memory (LSTM) based Recurrent Neural Network model for detecting fraud in remote banking is implemented, and its performance is evaluated against Support Vector Machine (SVM) and Markov models. The main research element is to model events as state vectors so that sequence-based learning can be applied, followed by a weak classifier to deal with noise. Firstly, the study focuses on feature engineering where, alongside raw attributes such as IP address, amount and others, two novel features for remote banking fraud are evaluated, i.e., the time spent on a page and the time between page transitions. The second focus is on modelling, which is performed on an anonymised real-life dataset provided by a large financial institution in Europe. The modelling results demonstrate that, given the labelled dataset, all models can detect payment fraud with acceptable accuracy. Various tests showed that the LSTM model achieves an F1 score of 97.7%, whereas the SVM and Markov models achieve 93.5% and 95.0% respectively. 
As time elapsed and the sequence of events became longer, the LSTM model's performance improved significantly. As the dataset grows, the time it takes to train traditional models becomes a bottleneck. This proves the hypothesis that events across banking channels can be modelled as time series data, and that sequence-based learners such as Recurrent Neural Networks (RNN) can then be applied to reduce the False Positive Rate (FPR) and False Negative Rate (FNR)

    A multi-attribute data mining model for rule extraction and service operations benchmarking

    The file attached to this record is the author's final peer reviewed version. The Publisher's final version can be found by following the DOI link. Purpose: Customer differences and similarities play a crucial role in service operations, and service industries need to develop various strategies for different customer types. This study aims to understand the behavioral pattern of customers in the banking industry by proposing a hybrid data mining approach with rule extraction and service operation benchmarking. Design/methodology/approach: The authors analyze customer data to identify the best customers using a modified recency, frequency and monetary (RFM) model and K-means clustering. The number of clusters is determined with a two-step K-means quality analysis based on the Silhouette, Davies–Bouldin and Calinski–Harabasz indices and the evaluation based on distance from average solution (EDAS). The best–worst method (BWM) and the total area based on orthogonal vectors (TAOV) are used next to sort the clusters. Finally, the associative rules and the Apriori algorithm are used to derive the customers' behavior patterns. Findings: As a result of implementing the proposed approach in the financial service industry, customers were segmented and ranked into six clusters by analyzing 20,000 records. Furthermore, frequent customer financial behavior patterns were recognized based on demographic characteristics and financial transactions of customers. Thus, customer types were classified as highly loyal, loyal, high-interacting, low-interacting and missing customers. Eventually, appropriate strategies for interacting with each customer type were proposed. Originality/value: The authors propose a novel hybrid multi-attribute data mining approach for rule extraction and the service operations benchmarking approach by combining data mining tools with a multilayer decision-making approach. 
The proposed hybrid approach has been implemented in a large-scale problem in the financial services industry
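
As a rough illustration of the recency, frequency and monetary (RFM) inputs that feed the clustering step, the sketch below derives the three values per customer from a transaction log. It shows plain RFM, not the authors' modified variant; the customer IDs, dates, amounts and the `rfm_table` helper are all hypothetical.

```python
from datetime import date

def rfm_table(transactions, today):
    """Map customer -> (days since last purchase, purchase count, total spend)."""
    table = {}
    for customer, day, amount in transactions:
        recency, frequency, monetary = table.get(customer, (None, 0, 0.0))
        days_ago = (today - day).days
        # Recency is the gap to the most recent transaction, so keep the minimum.
        recency = days_ago if recency is None else min(recency, days_ago)
        table[customer] = (recency, frequency + 1, monetary + amount)
    return table

# Hypothetical transaction log: (customer, date, amount).
transactions = [
    ("c1", date(2024, 1, 5), 120.0),
    ("c1", date(2024, 1, 20), 80.0),
    ("c2", date(2023, 11, 2), 40.0),
]
rfm = rfm_table(transactions, today=date(2024, 1, 31))
```

The resulting triples would typically be scaled before being passed to a clustering algorithm such as K-means, since the three dimensions have very different ranges.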

    Collaborative-demographic hybrid for financial: product recommendation

    Internship Report presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics. Due to the increased availability of mature data mining and analysis technologies supporting CRM processes, several financial institutions are striving to leverage customer data and integrate insights regarding customer behaviour, needs, and preferences into their marketing approach. As decision support systems assisting marketing and commercial efforts, Recommender Systems applied to the financial domain have been gaining increased attention. This thesis studies a Collaborative-Demographic Hybrid Recommendation System, applied to the financial services sector, based on real data provided by a Portuguese private commercial bank. This work establishes a framework to support account managers’ advice on which financial product is most suitable for each of the bank’s corporate clients. The recommendation problem is further developed by conducting a performance comparison of multi-output regression and multiclass classification prediction approaches. Experimental results indicate that multiclass architectures are better suited for the prediction task, outperforming alternative multi-output regression models on the evaluation metrics considered. Moreover, multiclass Feed-Forward Neural Networks, combined with Recursive Feature Elimination, are identified as the top-performing algorithm, yielding a 10-fold cross-validated F1 Measure of 83.16%, with corresponding Precision and Recall values of 84.34% and 85.29%, respectively. Overall, this study provides important contributions for positioning the bank’s commercial efforts around customers’ future requirements. By allowing for a better understanding of customers’ needs and preferences, the proposed Recommender allows for more personalized and targeted marketing contacts, leading to higher conversion rates, corporate profitability, and customer satisfaction and loyalty