1,040 research outputs found

    Default or profit scoring credit systems? Evidence from European and US peer-to-peer lending markets

    For the emerging peer-to-peer (P2P) lending markets to survive, they need to employ credit-risk management practices such that an investor base is profitable in the long run. Traditionally, credit-risk management relies on credit scoring that predicts loans’ probability of default. In this paper, we use a profit scoring approach that is based on modeling the annualized adjusted internal rate of return of loans. To validate our profit scoring models against traditional credit scoring models, we use data from a European P2P lending market, Bondora, and also a random sample of loans from the Lending Club P2P lending market. We compare the out-of-sample accuracy and profitability of the credit and profit scoring models within several classes of statistical and machine learning models, including logistic and linear regression, lasso, ridge, elastic net, random forest, and neural networks. We found that our approach outperforms standard credit scoring models for Lending Club and Bondora loans. More specifically, compared with credit scoring models, returns across all loans are 24.0% (Bondora) and 15.5% (Lending Club) higher, whereas accuracy is 6.7% (Bondora) and 3.1% (Lending Club) higher for the proposed profit scoring models. Moreover, our results are not driven by manual selection, as profit scoring models suggest investing in more loans. Finally, even if we consider data sampling bias, we found that the set of superior models consists almost exclusively of profit scoring models. Thus, our results contribute to the literature by suggesting a paradigm shift in modeling credit risk in the P2P market: preferring profit scoring over credit-risk scoring models.

    Internet Financial Credit Risk Assessment with Sliding Window and Attention Mechanism LSTM Model

    With the accelerated pace of market-oriented reform, Internet finance has gained a broad and healthy development environment. Existing studies lack consideration of time trends in financial risk, and treating all features equally may lead to inaccurate predictions. To address these problems, we propose an LSTM model based on a sliding window and an attention mechanism. The sliding windows enable the model to effectively exploit the contextual relevance of loan data, and the attention mechanism enables it to focus on important information. Results on the Lending Club public desensitized dataset show that our model outperforms ARIMA, SVM, ANN, LSTM, and GRU models.
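
    The sliding-window idea mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `sliding_windows`, the window length, and the toy values are all assumptions chosen for illustration. Each run of consecutive observations is paired with the observation that follows it, which is one common way to prepare sequence inputs for a recurrent model.

```python
from typing import List, Sequence, Tuple


def sliding_windows(
    series: Sequence[float], window: int
) -> List[Tuple[List[float], float]]:
    """Pair each run of `window` consecutive values with the next value.

    Hypothetical helper for illustration only; the paper's own
    windowing scheme is not described in the abstract.
    """
    if window < 1:
        raise ValueError("window must be >= 1")
    pairs = []
    # A series shorter than window + 1 yields no pairs.
    for start in range(len(series) - window):
        inputs = list(series[start:start + window])
        target = series[start + window]
        pairs.append((inputs, target))
    return pairs


# Toy values, not taken from any dataset.
samples = sliding_windows([0.1, 0.2, 0.4, 0.3, 0.5], window=3)
for inputs, target in samples:
    print(inputs, "->", target)
```

    With a window of 3, the five toy values yield two input/target pairs; a longer window gives each pair more context but produces fewer pairs.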

    The Value of Alternative Data in Credit Risk Prediction: Evidence from a Large Field Experiment

    Recently, the high penetration of mobile devices and internet access has offered a new source of fine-grained user behavior data (aka “alternative data”) to improve financial credit risk assessment. This paper conducts a comprehensive evaluation of the value of alternative data on microloan platforms with a large field experiment. Our machine-learning-based empirical analyses demonstrate that alternative data can significantly improve the prediction accuracy of borrowers’ default behavior and increase platform profits. Cellphone usage and mobility trace information perform the best among the multiple sources of alternative data. Moreover, we find that our proposed framework helps financial institutions extend their service to more lower-income and less-educated loan applicants from less-developed geographical areas, historically disadvantaged populations that have been largely neglected in the past. Our study demonstrates the tremendous potential of leveraging alternative data to alleviate such inequality in the financial service markets while at the same time achieving higher platform revenues.

    Credit Risk Scoring: A Stacking Generalization Approach

    Dissertation presented as the partial requirement for obtaining a Master's degree in Statistics and Information Management, specialization in Risk Analysis and Management. Credit risk regulation has been receiving tremendous attention as a result of the effects of the latest global financial crisis. According to the developments made in the Internal Rating Based approach, under the Basel guidelines, banks are allowed to use internal risk measures as key drivers to assess the possibility of granting a loan to an applicant. Credit scoring is a statistical approach used for evaluating potential loan applications in both financial and banking institutions. When applying for a loan, an applicant must fill out an application form detailing their characteristics (e.g., income, marital status, and loan purpose), which serve as inputs to a credit scoring model that produces a score used to determine whether a loan should be granted or not. This enables faster and more consistent credit approvals and a reduction in bad debt. Currently, many machine learning and statistical approaches, such as logistic regression and tree-based algorithms, have been used individually for credit scoring models. Newer machine learning techniques can outperform classic methods by simply combining models. This dissertation is an empirical study of banking loan default on a publicly available bank loan dataset, using ensemble-based techniques to increase model robustness and predictive power. The proposed ensemble method is based on stacking generalization, an extension of various preceding studies that used different techniques to further enhance the model's predictive capabilities. The results show that combining different models provides a great deal of flexibility to credit scoring models.

    Credit Scoring with AHP and Fuzzy Comprehensive Evaluation Based on Behavioural Data from Weibo Platform

    It is increasingly necessary to evaluate customers' credit. In the era of big data, information on the Internet is commonly used to judge the creditworthiness of customers. Some users' credit information is incomplete or unavailable, so credit managers cannot judge the true credit situation of these users. However, with the support of social data, especially behavioural data, and a credit evaluation system, this problem can be effectively solved. This study used Weibo to obtain the behavioural data of Chinese users for credit evaluation. Two methods are used to calculate the credit scores of Weibo users: the analytic hierarchy process (AHP) and fuzzy comprehensive evaluation. By analysing social processes and inviting experts to make decisions, we constructed a credit evaluation system to expose users' behavioural characteristics. We found that the three key indexes determining the user’s social credit are personal identification, behavioural characteristics and interaction among friends. Then, AHP was used to determine the weight of each index. Finally, a static algorithm was proposed to compute the credit evaluation system of Weibo users using fuzzy comprehensive evaluation methods.
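
    As a rough illustration of how AHP turns pairwise expert judgements into index weights, the sketch below uses the row geometric-mean approximation, a common shortcut for the principal-eigenvector method. The function name `ahp_weights` and the comparison matrix are hypothetical; the abstract does not report the study's actual judgements or weights.

```python
import math
from typing import List, Sequence


def ahp_weights(pairwise: Sequence[Sequence[float]]) -> List[float]:
    """Approximate AHP priority weights by normalised row geometric means.

    pairwise[i][j] states how much more important index i is than index j,
    so pairwise[j][i] is expected to be its reciprocal.
    """
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]


# Hypothetical judgements for the three indexes named in the abstract:
# personal identification, behavioural characteristics, interaction among friends.
matrix = [
    [1.0, 2.0, 4.0],
    [0.5, 1.0, 2.0],
    [0.25, 0.5, 1.0],
]
weights = ahp_weights(matrix)
print([round(w, 3) for w in weights])
```

    For a perfectly consistent matrix such as this one, the geometric-mean weights coincide with the eigenvector weights; with real expert judgements, a consistency check would normally accompany this step.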

    Understanding, Analyzing and Predicting Online User Behavior

    Due to the growing popularity of the Internet and smart mobile devices, massive amounts of data are produced every day; in particular, more and more of users’ online behavior and activities have been digitized. Making better use of this data and better understanding user behavior have become central concerns for industry as well as academia. However, the large size and unstructured format of user behavioral data, together with the heterogeneous nature of individuals, make it harder to identify the SPECIFIC behavior that researchers are looking at, HOW to distinguish it, and WHAT results from the behavior. Differences in user behavior have different causes; in my dissertation, I study three circumstances of behavior that potentially bring turbulent or detrimental effects, from precursory culture to preparatory strategy and delusory fraudulence. I draw on a versatile toolkit of analysis: econometrics and quasi-experiments, together with machine learning techniques such as text mining, sentiment analysis, and predictive analytics. This study leverages the combined methodologies and applies them beyond individual-level data and network data. This dissertation takes a first step toward discovering user behavior in these newly emerging contexts. My study conceptualizes theoretically and tests empirically the effect of cultural values on rating, and I find that an individualist cultural background is more likely to lead to deviation and more expression in review behaviors. I also find evidence of strategic behavior: users tend to leverage reporting to increase the likelihood of maximizing their benefits. Moreover, the study proposes features that moderate the preparation behavior. Finally, it introduces a unified and scalable framework for delusory behavior detection that meets the current need to fully utilize multiple data sources. Dissertation/Thesis. Doctoral Dissertation Business Administration 201

    Machine Learning Approach for Credit Score Predictions

    This paper addresses the problem of managing the significant rise in requests for credit products that banking and financial institutions face. The aim is to propose an adaptive, dynamic heterogeneous ensemble credit model that integrates the XGBoost and Support Vector Machine models to improve the accuracy and reliability of risk assessment credit scoring models. The method employs machine learning techniques to recognise patterns and trends from past data to anticipate future occurrences. The proposed approach is compared with existing credit score models to validate its efficacy using five popular evaluation metrics: accuracy, ROC AUC, precision, recall and F1 score. The paper highlights the challenges of credit scoring models, such as class imbalance, verification latency and concept drift. The results show that the proposed approach outperforms the existing models on the evaluation metrics, achieving a balance between predictive accuracy and computational cost. The conclusion emphasises the significance of the proposed approach for the banking and financial sector in developing robust and reliable credit scoring models to evaluate the creditworthiness of their clients.

    Analysis of the probability of default in peer-to-peer lending. Application of different classification techniques.

    In this thesis, peer-to-peer lending is explored and analyzed with the objective of fitting a model to accurately predict whether borrowers default on their loans. The foundation for the thesis is a dataset from LendingClub, a peer-to-peer lending platform based in San Francisco, USA. Detailed information on borrowers’ financial history, personal characteristics and the specifics of each loan is used to predict the probability of default for the various loans in the portfolio. Methods used include elastic net regularization of logistic regression, boosting of decision trees, and bagging with random forests. The results are compared using accuracy metrics and a profitability measure, before a final model selection is carried out. Master's thesis in statistics. MAMN-STAT STAT39