389 research outputs found

    Probabilistic XGBoost Threshold Classification with Autoencoder for Credit Card Fraud Detection

    Get PDF
    Due to the imbalanced data of outnumbered legitimate transactions than the fraudulent transaction, the detection of fraud is a challenging task to find an effective solution. In this study, autoencoder with probabilistic threshold shifting of XGBoost (AE-XGB) for credit card fraud detection is designed. Initially, AE-XGB employs autoencoder the prevalent dimensionality reduction technique to extract data features from latent space representation. Then the reconstructed lower dimensional features utilize eXtreame Gradient Boost (XGBoost), an ensemble boosting algorithm with probabilistic threshold to classify the data as fraudulent or legitimate. In addition to AE-XGB, other existing ensemble algorithms such as Adaptive Boosting (AdaBoost), Gradient Boosting Machine (GBM), Random Forest, Categorical Boosting (CatBoost), LightGBM and XGBoost are compared with optimal and default threshold. To validate the methodology, we used IEEE-CIS fraud detection dataset for our experiment. Class imbalance and high dimensionality characteristics of dataset reduce the performance of model hence the data is preprocessed and trained. To evaluate the performance of the model, evaluation indicators such as precision, recall, f1-score, g-mean and Mathews Correlation Coefficient (MCC) are accomplished. The findings revealed that the performance of the proposed AE-XGB model is effective in handling imbalanced data and able to detect fraudulent transactions with 90.4% of recall and 90.5% of f1-score from incoming new transactions

    Unsupervised Intrusion Detection with Cross-Domain Artificial Intelligence Methods

    Get PDF
    Cybercrime is a major concern for corporations, business owners, governments and citizens, and it continues to grow in spite of increasing investments in security and fraud prevention. The main challenges in this research field are: being able to detect unknown attacks, and reducing the false positive ratio. The aim of this research work was to target both problems by leveraging four artificial intelligence techniques. The first technique is a novel unsupervised learning method based on skip-gram modeling. It was designed, developed and tested against a public dataset with popular intrusion patterns. A high accuracy and a low false positive rate were achieved without prior knowledge of attack patterns. The second technique is a novel unsupervised learning method based on topic modeling. It was applied to three related domains (network attacks, payments fraud, IoT malware traffic). A high accuracy was achieved in the three scenarios, even though the malicious activity significantly differs from one domain to the other. The third technique is a novel unsupervised learning method based on deep autoencoders, with feature selection performed by a supervised method, random forest. Obtained results showed that this technique can outperform other similar techniques. The fourth technique is based on an MLP neural network, and is applied to alert reduction in fraud prevention. This method automates manual reviews previously done by human experts, without significantly impacting accuracy

    One-Class Adversarial Nets for Fraud Detection

    Full text link
    Many online applications, such as online social networks or knowledge bases, are often attacked by malicious users who commit different types of actions such as vandalism on Wikipedia or fraudulent reviews on eBay. Currently, most of the fraud detection approaches require a training dataset that contains records of both benign and malicious users. However, in practice, there are often no or very few records of malicious users. In this paper, we develop one-class adversarial nets (OCAN) for fraud detection using training data with only benign users. OCAN first uses LSTM-Autoencoder to learn the representations of benign users from their sequences of online activities. It then detects malicious users by training a discriminator with a complementary GAN model that is different from the regular GAN model. Experimental results show that our OCAN outperforms the state-of-the-art one-class classification models and achieves comparable performance with the latest multi-source LSTM model that requires both benign and malicious users in the training phase.Comment: Update Fig 2, add Fig 7, and add reference

    Anomaly Detection In Blockchain

    Get PDF
    Anomaly detection has been a well-studied area for a long time. Its applications in the financial sector have aided in identifying suspicious activities of hackers. However, with the advancements in the financial domain such as blockchain and artificial intelligence, it is more challenging to deceive financial systems. Despite these technological advancements many fraudulent cases have still emerged. Many artificial intelligence techniques have been proposed to deal with the anomaly detection problem; some results appear to be considerably assuring, but there is no explicit superior solution. This thesis leaps to bridge the gap between artificial intelligence and blockchain by pursuing various anomaly detection techniques on transactional network data of a public financial blockchain named 'Bitcoin'. This thesis also presents an overview of the blockchain technology and its application in the financial sector in light of anomaly detection. Furthermore, it extracts the transactional data of bitcoin blockchain and analyses for malicious transactions using unsupervised machine learning techniques. A range of algorithms such as isolation forest, histogram based outlier detection (HBOS), cluster based local outlier factor (CBLOF), principal component analysis (PCA), K-means, deep autoencoder networks and ensemble method are evaluated and compared

    Predicting Fraudulence Transaction under Data Imbalance using Neural Network (Deep Learning)

    Get PDF
    The number of financial transactions has the potential to cause many violations of the law (fraud). Conventional machine learning has been widely used, including logistic regression, random forest, and gradient boosted. However, the machine learning can work as long as the dataset contains fraud. Many new financial technology companies need to anticipate the potential for fraud, which they have not experienced much. This potential for a crime can also be experienced by old service providers with a low frequency of previous fraud. With the data imbalance, traditional machine learningis likely to produce false negatives so that they do not accurately predict potential fraud. This study optimizes the machine learning approach based on Neural Networks to improve model accuracy through the integration of KNIME and Python Programming with KERAS and TensorFlow models. The study also conducts a comparative analysis to scrutinize the performance of Adam and Adamax Optimizer. Using data from European cardholders in 2013, this study proves that workflows and neural network algorithms can detect with up to 95% accuracy even with a very small fraud sample of only 0.17% or 492 of 284,807 transactions. In addition, the Adam optimizer performs higher accuracy than the Adamax optimizer. The implication is that this supervisory technology innovation can be developed to minimize transaction crimes in the financial services sector

    A rule-based machine learning model for financial fraud detection

    Get PDF
    Financial fraud is a growing problem that poses a significant threat to the banking industry, the government sector, and the public. In response, financial institutions must continuously improve their fraud detection systems. Although preventative and security precautions are implemented to reduce financial fraud, criminals are constantly adapting and devising new ways to evade fraud prevention systems. The classification of transactions as legitimate or fraudulent poses a significant challenge for existing classification models due to highly imbalanced datasets. This research aims to develop rules to detect fraud transactions that do not involve any resampling technique. The effectiveness of the rule-based model (RBM) is assessed using a variety of metrics such as accuracy, specificity, precision, recall, confusion matrix, Matthew’s correlation coefficient (MCC), and receiver operating characteristic (ROC) values. The proposed rule-based model is compared to several existing machine learning models such as random forest (RF), decision tree (DT), multi-layer perceptron (MLP), k-nearest neighbor (KNN), naive Bayes (NB), and logistic regression (LR) using two benchmark datasets. The results of the experiment show that the proposed rule-based model beat the other methods, reaching accuracy and precision of 0.99 and 0.99, respectively

    Artificial neural network technique for improving prediction of credit card default: A stacked sparse autoencoder approach

    Get PDF
    Presently, the use of a credit card has become an integral part of contemporary banking and financial system. Predicting potential credit card defaulters or debtors is a crucial business opportunity for financial institutions. For now, some machine learning methods have been applied to achieve this task. However, with the dynamic and imbalanced nature of credit card default data, it is challenging for classical machine learning algorithms to proffer robust models with optimal performance. Research has shown that the performance of machine learning algorithms can be significantly improved when provided with optimal features. In this paper, we propose an unsupervised feature learning method to improve the performance of various classifiers using a stacked sparse autoencoder (SSAE). The SSAE was optimized to achieve improved performance. The proposed SSAE learned excellent feature representations that were used to train the classifiers. The performance of the proposed approach is compared with an instance where the classifiers were trained using the raw data. Also, a comparison is made with previous scholarly works, and the proposed approach showed superior performance over other methods

    Unveiling the frontiers of deep learning: innovations shaping diverse domains

    Full text link
    Deep learning (DL) enables the development of computer models that are capable of learning, visualizing, optimizing, refining, and predicting data. In recent years, DL has been applied in a range of fields, including audio-visual data processing, agriculture, transportation prediction, natural language, biomedicine, disaster management, bioinformatics, drug design, genomics, face recognition, and ecology. To explore the current state of deep learning, it is necessary to investigate the latest developments and applications of deep learning in these disciplines. However, the literature is lacking in exploring the applications of deep learning in all potential sectors. This paper thus extensively investigates the potential applications of deep learning across all major fields of study as well as the associated benefits and challenges. As evidenced in the literature, DL exhibits accuracy in prediction and analysis, makes it a powerful computational tool, and has the ability to articulate itself and optimize, making it effective in processing data with no prior training. Given its independence from training data, deep learning necessitates massive amounts of data for effective analysis and processing, much like data volume. To handle the challenge of compiling huge amounts of medical, scientific, healthcare, and environmental data for use in deep learning, gated architectures like LSTMs and GRUs can be utilized. For multimodal learning, shared neurons in the neural network for all activities and specialized neurons for particular tasks are necessary.Comment: 64 pages, 3 figures, 3 table
    corecore