1,096 research outputs found

    Analyzing Three Different Tuning Strategies for Random Forest Hyperparameters for Fraud Detection

    Technology is advancing rapidly, and more tasks are moving online than ever. Along with the benefits come the disadvantages of this advancement. While online services spare users the trouble of in-person activities, they also expose them to the risk of being deceived by fraudsters. This paper aims to detect fraudulent online bank transactions using a synthetically produced dataset. A random forest model has been trained to predict the fraudulent transactions. To achieve the best performance, the hyperparameters of the model have been tuned using three different tuning methods. Grid search proved to be the best tuning strategy in terms of mean CV score, precision, recall, F1-score and accuracy. It fell short only in run time, where Bayesian Optimization performed better than the others.
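The trade-off the abstract describes (exhaustive search scoring best but costing the most run time) can be illustrated with a minimal, standard-library sketch. Everything here is an assumption made for illustration: the search space, the `toy_cv_score` stand-in (no model is actually fitted), and the use of random search as the cheaper baseline, which is not necessarily one of the three methods the paper compared.

```python
import itertools
import random

# Hypothetical search space for a random forest; values are illustrative only.
SEARCH_SPACE = {
    "n_estimators": [50, 100, 200],
    "max_depth": [4, 8, 16],
    "min_samples_leaf": [1, 5],
}


def toy_cv_score(params):
    """Stand-in for a mean cross-validation score (no real model is trained)."""
    return (
        1.0
        - abs(params["n_estimators"] - 100) / 1000
        - abs(params["max_depth"] - 8) / 100
        - params["min_samples_leaf"] / 100
    )


def grid_search(space, score_fn):
    """Score every combination; return (best_params, best_score, n_evaluations)."""
    names = list(space)
    best_params, best_score, n_evals = None, float("-inf"), 0
    for values in itertools.product(*(space[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        n_evals += 1
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score, n_evals


def random_search(space, score_fn, n_iter, seed=0):
    """Score n_iter randomly drawn combinations; same return shape as grid_search."""
    rng = random.Random(seed)
    best_params, best_score = None, float("-inf")
    for _ in range(n_iter):
        params = {name: rng.choice(values) for name, values in space.items()}
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score, n_iter


grid_best, grid_score, grid_evals = grid_search(SEARCH_SPACE, toy_cv_score)
rand_best, rand_score, rand_evals = random_search(SEARCH_SPACE, toy_cv_score, n_iter=6)
```

Grid search is guaranteed to find the best point on the grid but its cost grows multiplicatively with each added hyperparameter, whereas a budgeted search evaluates a fixed number of candidates and may miss the optimum.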

    Financial Fraud Detection using Improved Artificial Humming Bird Algorithm with Modified Extreme Learning Machine

    More and more industries, including the financial sector, are moving their operations online as internet usage continues to rise at an exponential rate. As a result, financial fraud is on the rise in all its guises and in all parts of the world, causing enormous economic damage. The purpose of financial fraud detection systems is to identify potential dangers, such as unauthorised access or unusual attacks. In recent years, this problem has been attacked using a variety of machine learning and data mining methods. Semi-supervised algorithms, on the other hand, can work with only a small quantity of labelled data and a large amount of unlabelled data, making them useful in situations where it would be impractical to rely solely on supervised learning algorithms to train a good-performing classifier. In this research, we propose a Semi-supervised Extreme Learning Machine (SKELM) built on top of the weighted kernel, which we call SELMWK. For the purpose of detecting financial fraud, this research also proposes an improved artificial hummingbird algorithm (IAHA). The algorithm combines two techniques to enhance its capacity for optimisation. First, the Chebyshev chaotic map is used to seed the initial population of artificial hummingbirds, which boosts the population's overall ability to search globally. Second, the guided foraging phase incorporates the Levy flight to enlarge the search field and forestall premature convergence. The experimental results demonstrate that the suggested technique improves Internet financial fraud detection.
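The two building blocks named in the abstract, Chebyshev chaotic seeding and Levy-flight steps, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' IAHA: the map order, the starting value `x0`, the Mantegna method for drawing Levy steps, and the hypothetical `levy_guided_move` update rule are all choices made here, since the abstract gives no formulas.

```python
import math
import random


def chebyshev_population(n_agents, dim, lower, upper, order=4, x0=0.7):
    """Seed positions with the Chebyshev map x_{k+1} = cos(order * arccos(x_k))."""
    population = []
    x = x0
    for _ in range(n_agents):
        agent = []
        for _ in range(dim):
            x = math.cos(order * math.acos(x))
            x = max(-1.0, min(1.0, x))  # guard against rounding outside [-1, 1]
            # Rescale the chaotic value from [-1, 1] to [lower, upper].
            agent.append(lower + (x + 1.0) / 2.0 * (upper - lower))
        population.append(agent)
    return population


def levy_step(rng, beta=1.5):
    """Draw one heavy-tailed step length using Mantegna's method (assumed here)."""
    sigma_u = (
        math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
        / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))
    ) ** (1 / beta)
    u = rng.gauss(0.0, sigma_u)
    v = rng.gauss(0.0, 1.0)
    while v == 0.0:  # avoid division by zero in the (very unlikely) exact-zero case
        v = rng.gauss(0.0, 1.0)
    return u / abs(v) ** (1 / beta)


def levy_guided_move(position, target, rng, beta=1.5):
    """Hypothetical update: move relative to a target food source by a Levy step."""
    step = levy_step(rng, beta)
    return [t + step * (p - t) for p, t in zip(position, target)]


rng = random.Random(42)
population = chebyshev_population(n_agents=5, dim=3, lower=-10.0, upper=10.0)
candidate = levy_guided_move(population[0], population[1], rng)
```

Most Levy steps are small, with occasional very large jumps; that occasional long jump is what lets a search escape a local optimum instead of converging early.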

    Automated Machine Learning implementation framework in the banking sector

    Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Business Analytics. Automated Machine Learning is a subject in the Machine Learning field designed to make Machine Learning usable by non-expert users. It arose from the shortage of subject matter experts and tries to remove humans from the implementation of these topics. The advantages of automated machine learning lie in the removal of human implementation work, which speeds up machine learning deployment. Organizations will benefit from effective benchmarking and validation of solutions. The use of an automated machine learning implementation framework can deeply transform an organization, adding value to the business by freeing subject matter experts from low-level machine learning projects and letting them focus on high-level projects. This will also help the organization reach new levels of competence, customization, and decision-making at a higher level of analytical maturity. This work aims, firstly, to investigate the impact and benefits of automated machine learning implementation in the banking sector, and afterwards to develop an implementation framework that banking institutions could use as a guideline for implementing automated machine learning across their departments. The advantages and benefits of AutoML are evaluated in terms of business value and competitive advantage, and an implementation in a fictitious institution is presented, considering all the needed steps and the possible setbacks that could arise. Banking institutions have different business processes, and since most of them are old institutions, the main concerns relate to automating their business processes, improving their analytical maturity, and making their workforce aware of the benefits of implementing new forms of work.
A successful implementation plan requires knowing the particularities of the institution, adapting to them, and ensuring that the workforce and management are aware of the investments that need to be made and of the resulting changes at all levels of their organizational work, which will make everyone’s daily work considerably easier.

    An academic review: applications of data mining techniques in finance industry

    With the development of Internet technologies, data volumes are doubling every two years, faster than predicted by Moore’s Law. Big Data Analytics has become particularly important for enterprise business. Modern computational technologies provide effective tools to help understand the huge volumes of accumulated data and leverage this information to gain insights into the finance industry. Data has become the most valuable asset of financial organisations, as there are no physical products in the finance industry to manufacture. This is where data mining techniques come to the rescue by allowing access to the right information at the right time. These techniques are used by the finance industry in various areas such as fraud detection, intelligent forecasting, credit rating, loan management, customer profiling, money laundering, marketing and prediction of price movements, to name a few. This work surveys the research on data mining techniques applied to the finance industry from 2010 to 2015. The review finds that stock prediction and credit rating have received the most attention from researchers, compared to loan prediction, money laundering and time series prediction. Due to the dynamics, uncertainty and variety of data, nonlinear mapping techniques have been studied more deeply than linear techniques. It has also been shown that hybrid methods are more accurate in prediction, closely followed by neural network techniques. This survey could provide an overview of applications of data mining techniques in the finance industry, and a summary of methodologies for researchers in this area. In particular, it could give beginners who want to work in computational finance a good picture of the data mining techniques used in the field.

    Explainable Artificial Intelligence and Causal Inference based ATM Fraud Detection

    Gaining the trust of customers and showing them empathy are critical in the financial domain. Frequent fraudulent activity undermines both. Hence, financial organizations and banks must take the utmost care to mitigate it. ATM transaction fraud in particular is a common problem faced by banks. The following are critical challenges in fraud datasets: the dataset is highly imbalanced, the fraud pattern changes over time, etc. Owing to the rarity of fraudulent activities, fraud detection can be formulated as either a binary classification problem or a one-class classification (OCC) problem. In this study, we applied both formulations to an ATM transactions dataset collected from India. In binary classification, we investigated the effectiveness of various oversampling techniques, such as the Synthetic Minority Oversampling Technique (SMOTE) and its variants, and Generative Adversarial Networks (GAN). Further, we employed various machine learning techniques, viz., Naive Bayes (NB), Logistic Regression (LR), Support Vector Machine (SVM), Decision Tree (DT), Random Forest (RF), Gradient Boosting Tree (GBT), and Multi-layer Perceptron (MLP). GBT outperformed the rest of the models by achieving 0.963 AUC, and DT came second with 0.958 AUC. DT is the winner if complexity and interpretability are considered. Among all the oversampling approaches, SMOTE and its variants were observed to perform better. In OCC, IForest attained 0.959 CR, and OCSVM secured second place with 0.947 CR. Further, we incorporated explainable artificial intelligence (XAI) and causal inference (CI) in the fraud detection framework and studied it through various analyses. Comment: 34 pages; 21 figures; 8 tables
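The core idea behind SMOTE, creating synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbours, can be sketched in a few lines. This is a simplified illustration and not the study's pipeline: the function name `smote_like_oversample`, the random choice of base points, and the toy data are assumptions; the original SMOTE formulation iterates over every minority sample instead.

```python
import math
import random


def smote_like_oversample(minority, n_synthetic, k=3, seed=0):
    """Create synthetic minority samples by interpolating towards near neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # The k nearest minority neighbours of the base point (excluding itself).
        neighbours = sorted(
            (point for point in minority if point is not base),
            key=lambda point: math.dist(base, point),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy minority class (e.g. fraudulent transactions) with two made-up features.
fraud_samples = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
new_samples = smote_like_oversample(fraud_samples, n_synthetic=10, k=2)
```

Because every synthetic point lies on a segment between two real minority points, the method densifies the minority region without simply duplicating records.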

    A rule-based machine learning model for financial fraud detection

    Financial fraud is a growing problem that poses a significant threat to the banking industry, the government sector, and the public. In response, financial institutions must continuously improve their fraud detection systems. Although preventative and security precautions are implemented to reduce financial fraud, criminals are constantly adapting and devising new ways to evade fraud prevention systems. The classification of transactions as legitimate or fraudulent poses a significant challenge for existing classification models due to highly imbalanced datasets. This research aims to develop rules to detect fraudulent transactions without using any resampling technique. The effectiveness of the rule-based model (RBM) is assessed using a variety of metrics such as accuracy, specificity, precision, recall, confusion matrix, Matthews correlation coefficient (MCC), and receiver operating characteristic (ROC) values. The proposed rule-based model is compared to several existing machine learning models such as random forest (RF), decision tree (DT), multi-layer perceptron (MLP), k-nearest neighbor (KNN), naive Bayes (NB), and logistic regression (LR) using two benchmark datasets. The results of the experiment show that the proposed rule-based model outperformed the other methods, reaching accuracy and precision of 0.99 and 0.99, respectively.
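The metrics listed in the abstract can all be derived from the four confusion-matrix counts. The sketch below shows the standard formulas; the function name `confusion_metrics` and the example counts are invented for illustration and are not taken from the paper. It also shows why several metrics are usually reported together on imbalanced data: accuracy alone can look strong while MCC stays moderate.

```python
import math


def confusion_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall, specificity and MCC from raw counts."""
    total = tp + fp + fn + tn
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
        # MCC is conventionally set to 0 when any marginal count is zero.
        "mcc": (tp * tn - fp * fn) / mcc_denominator if mcc_denominator else 0.0,
    }


# Made-up imbalanced example: 10 fraudulent cases among 1,000 transactions.
metrics = confusion_metrics(tp=5, fp=5, fn=5, tn=985)
```

With these invented counts, accuracy is 0.99 even though only half of the fraudulent cases are caught, which precision, recall and MCC make visible.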