8 research outputs found

    Incremental learning strategies for credit cards fraud detection.

    Every second, thousands of credit or debit card transactions are processed in financial institutions. This extensive amount of data and its sequential nature make the problem of fraud detection particularly challenging. Most analytical strategies used in production are still based on batch learning, which is inadequate for two reasons: models quickly become outdated, and batch learning requires the storage of sensitive data. The evolving nature of bank fraud underscores the importance of having up-to-date models, and the retention of sensitive data makes companies vulnerable to infringements of the European General Data Protection Regulation. For these reasons, evaluating incremental learning strategies is recommended. This paper designs and evaluates incremental learning solutions for real-world fraud detection systems. The aim is to demonstrate the competitiveness of incremental learning over conventional batch approaches and, consequently, to improve its accuracy by employing ensemble learning, diversity, and transfer learning. An experimental analysis is conducted on a full-scale case study including five months of e-commerce transactions, made available by our industry partner, Worldline.
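    The batch-versus-incremental contrast described above can be made concrete with a small sketch. The snippet below is not the authors' pipeline: it assumes a scikit-learn style workflow, random placeholder data, and a hypothetical daily_batches helper standing in for the real transaction feed; it only illustrates how an incremental model is updated chunk by chunk with partial_fit, while a batch baseline must retain and refit on the full history.

```python
# Minimal sketch of incremental vs. batch learning for fraud detection.
# Not the authors' pipeline: placeholder data and a hypothetical helper
# (daily_batches) stand in for the real transaction stream.
import numpy as np
from sklearn.linear_model import SGDClassifier

def daily_batches(X, y, batch_size=10_000):
    """Yield the transaction stream in chronological chunks (hypothetical helper)."""
    for start in range(0, len(X), batch_size):
        yield X[start:start + batch_size], y[start:start + batch_size]

X = np.random.rand(50_000, 30)                     # placeholder features
y = (np.random.rand(50_000) < 0.002).astype(int)   # placeholder labels (~0.2% fraud)
classes = np.array([0, 1])                         # 0 = genuine, 1 = fraudulent

# Incremental model: updated chunk by chunk, no raw-data retention needed.
incremental_model = SGDClassifier(loss="log_loss", random_state=0)
for X_chunk, y_chunk in daily_batches(X, y):
    incremental_model.partial_fit(X_chunk, y_chunk, classes=classes)

# A batch baseline, by contrast, must keep the full history and refit from scratch.
batch_model = SGDClassifier(loss="log_loss", random_state=0).fit(X, y)
```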

    Transfer Learning Strategies for Credit Card Fraud Detection.

    Credit card fraud jeopardizes the trust of customers in e-commerce transactions. This has led in recent years to major advances in the design of automatic Fraud Detection Systems (FDS) able to detect fraudulent transactions with short reaction time and high precision. Nevertheless, the heterogeneous nature of fraud behavior makes it difficult to tailor existing systems to different contexts (e.g. new payment systems, different countries and/or population segments). Given the high cost (research, prototype development, and implementation in production) of designing data-driven FDSs, it is crucial for transactional companies to define procedures able to adapt existing pipelines to new challenges. From an AI/machine learning perspective, this is known as the problem of transfer learning. This paper discusses the design and implementation of transfer learning approaches for e-commerce credit card fraud detection and their assessment in a real setting. The case study, based on a six-month dataset (more than 200 million e-commerce transactions) provided by the industrial partner, relates to the transfer of detection models developed for a European country to another country. In particular, we present and discuss 15 transfer learning techniques (ranging from naive baselines to state-of-the-art and new approaches), making a critical and quantitative comparison in terms of precision for different transfer scenarios. Our contributions are twofold: (i) we show that the accuracy of many transfer methods is strongly dependent on the number of labeled samples in the target domain, and (ii) we propose an ensemble solution to this problem based on self-supervised and semi-supervised domain adaptation classifiers. The thorough experimental assessment shows that this solution is both highly accurate and hardly sensitive to the number of labeled samples.
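    As a rough illustration of the kind of naive transfer baselines such a comparison starts from, the sketch below trains a source-country model and a model pooling source data with a handful of labeled target samples, then scores both on held-out target data. It uses synthetic placeholder data and generic scikit-learn components; it is not the paper's ensemble of self-supervised and semi-supervised domain-adaptation classifiers.

```python
# Two naive transfer-learning baselines: reuse the source model unchanged,
# or pool source data with the few labeled target samples. Synthetic data
# and hyperparameters are placeholders, not the paper's setup.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import average_precision_score

rng = np.random.default_rng(0)
X_src, y_src = rng.random((20_000, 30)), (rng.random(20_000) < 0.003).astype(int)
X_tgt, y_tgt = rng.random((2_000, 30)), (rng.random(2_000) < 0.003).astype(int)
X_tgt_test, y_tgt_test = rng.random((5_000, 30)), (rng.random(5_000) < 0.003).astype(int)

# Baseline 1: apply the source-country model directly to the target country.
source_only = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_src, y_src)

# Baseline 2: pool the source data with the few labeled target samples.
X_pool = np.vstack([X_src, X_tgt])
y_pool = np.concatenate([y_src, y_tgt])
pooled = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_pool, y_pool)

for name, model in [("source only", source_only), ("pooled", pooled)]:
    ap = average_precision_score(y_tgt_test, model.predict_proba(X_tgt_test)[:, 1])
    print(f"{name}: average precision = {ap:.3f}")
```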

    Application of Machine Learning Techniques in Credit Card Fraud Detection

    Credit card fraud is an ever-growing problem in today’s financial market. There has been a rapid increase in the rate of fraudulent activities in recent years, causing substantial financial losses to many organizations, companies, and government agencies. The numbers are expected to increase in the future, which is why many researchers in this field have focused on detecting fraudulent behaviors early using advanced machine learning techniques. However, credit card fraud detection is not a straightforward task, mainly for two reasons: (i) fraudulent behaviors usually differ for each attempt and (ii) the dataset is highly imbalanced, i.e., the majority samples (genuine cases) far outnumber the minority samples (fraudulent cases). When a predictive model is given input data with a highly unbalanced class distribution, it tends to be biased towards the majority samples and, as a result, to misclassify fraudulent transactions as genuine. To tackle this problem, a data-level approach, in which different resampling methods such as undersampling, oversampling, and hybrid strategies are applied, has been implemented alongside an algorithmic approach in which ensemble models such as bagging and boosting are applied to a highly skewed dataset containing 284,807 transactions, of which only 492 are labeled as fraudulent. Predictive models such as logistic regression, random forest, and XGBoost, in combination with different resampling techniques, have been applied to predict whether a transaction is fraudulent or genuine. Model performance is evaluated based on recall, precision, F1-score, the precision-recall (PR) curve, and the receiver operating characteristic (ROC) curve. The experimental results showed that random forest in combination with a hybrid resampling approach of Synthetic Minority Over-sampling Technique (SMOTE) and Tomek Links removal performed better than the other models.
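    A minimal sketch of the resampling-plus-ensemble setup described above (SMOTE followed by Tomek Links removal, feeding a random forest) might look as follows; the file path, split, and hyperparameters are placeholders rather than the authors' exact configuration.

```python
# Hybrid resampling (SMOTE + Tomek Links) with a random forest, evaluated on
# an untouched test split. Sketch only: path and parameters are assumptions.
import pandas as pd
from imblearn.combine import SMOTETomek
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, average_precision_score
from sklearn.model_selection import train_test_split

df = pd.read_csv("creditcard.csv")   # assumed path to the 284,807-row dataset
X, y = df.drop(columns=["Class"]), df["Class"]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, test_size=0.3, random_state=0)

# Resample only the training split: oversample frauds with SMOTE, then remove
# ambiguous majority samples via Tomek Links.
X_res, y_res = SMOTETomek(random_state=0).fit_resample(X_tr, y_tr)

clf = RandomForestClassifier(n_estimators=200, n_jobs=-1, random_state=0).fit(X_res, y_res)

print(classification_report(y_te, clf.predict(X_te), digits=3))
print("PR-AUC:", average_precision_score(y_te, clf.predict_proba(X_te)[:, 1]))
```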

    A Methodology for Detecting Credit Card Fraud

    Fraud detection is relevant to many industries, such as banking, retail, financial services, and healthcare. Fraud detection refers to the set of measures undertaken to prevent money or property from being obtained under false pretenses. With a growing number of ways in which fraudsters commit crimes, detecting online fraud is difficult to achieve. This research work aims to examine feasible ways to identify fraudulent credit card activities that negatively impact financial institutions. According to CPO Magazine, U.S. consumers lost a median of $429 to credit card fraud in 2017, and almost 79% of consumers who experienced credit card fraud did not suffer any financial impact whatsoever [35]. One question is: who is paying for these losses, if not the consumers? The answer is the financial institutions. According to a Federal Trade Commission report, credit card theft increased by 44.6% from 2019 to 2020, and the amount of money lost to credit card fraud in 2020 was about $149 million in total. Financial institutions should implement technological safeguards and cybersecurity measures without delay to decrease the impact of credit card fraud. To compare our chosen machine learning algorithms with existing machine learning techniques, we carried out a comparative analysis and determined which algorithm can best predict fraudulent transactions by recognizing patterns that differ from the rest. We trained our algorithms on two sampling variants (undersampling and oversampling) of the credit card fraud dataset, and the best algorithm was selected to predict frauds. The AUC score and other metrics were used to compare and contrast the results of our algorithms. The following results are concluded from our study: 1. The algorithms proposed in our study, such as Random Forest, Decision Trees, XGBoost, K-Means, Logistic Regression, and Neural Networks, performed better than the machine learning algorithms researchers have used in previous studies to predict credit card fraud. 2. Our tree-based algorithms, Random Forest, Decision Trees, and XGBoost, came out as the best models for predicting credit card fraud, with AUC scores of 1.00, 0.99, and 0.99, respectively. 3. The best algorithm in this study showed considerable improvement on the oversampled dataset, with an overall AUC score of 1.00.
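    The undersampling-versus-oversampling comparison described above can be sketched as follows; the samplers, model, parameters, and dataset path are illustrative assumptions, not the study's exact setup.

```python
# Compare random undersampling and random oversampling of the training split,
# scoring a random forest by AUC on an untouched test split (illustrative only).
import pandas as pd
from imblearn.over_sampling import RandomOverSampler
from imblearn.under_sampling import RandomUnderSampler
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

df = pd.read_csv("creditcard.csv")   # assumed dataset path
X, y = df.drop(columns=["Class"]), df["Class"]
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, test_size=0.3, random_state=0)

samplers = {
    "undersampling": RandomUnderSampler(random_state=0),
    "oversampling": RandomOverSampler(random_state=0),
}
for name, sampler in samplers.items():
    X_res, y_res = sampler.fit_resample(X_tr, y_tr)
    model = RandomForestClassifier(n_estimators=200, n_jobs=-1, random_state=0)
    model.fit(X_res, y_res)
    auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
    print(f"{name}: AUC = {auc:.3f}")
```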

    A simulation-driven approach to non-compliance

    This dissertation proposes a methodological framework for the use of simulation-based methods to investigate questions of non-compliance in a legal context. Its aim is to generate observed or previously unobserved instances of non-compliance and use them to improve compliance and trust in a given socio-economic infrastructure. The framework consists of three components: a normative system implemented as an agent-based model, a profit-driven agent generating instances of non-compliance, and a formalization process transforming the generated behavior into a formal model. The most sophisticated ways of law-breaking are typically associated with economic crime. For this reason, we investigated three case studies in the financial domain. The first case study develops an agent-based model investigating the collective response of compliant agents to market disturbances originating from fraudulent activity, as during the U.S. subprime mortgage crisis in 2007. The second case study investigates the price evolution in the Bitcoin market under the influence of the price manipulation that occurred in 2017/18. The third case study investigates Ponzi schemes on smart contracts. All case studies showed a high level of agreement with qualitative and quantitative observations. The identification, extraction, and formalization of non-compliant behavior generated via simulation is a central topic in the later chapters of the thesis. We introduce a method that treats fraudulent schemes as neighborhoods of profitable non-compliant behavior and illustrate it in a grid environment with a path-finding agent. This simplified case study was chosen because it captures fundamental features of non-compliance, yet further generalization is needed for real-world scenarios.
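    The grid-world case can be illustrated with a toy calculation (this is not the dissertation's actual model): a profit-driven agent compares a compliant route with a shortcut through hypothetical forbidden cells, and its preferred behavior flips once the expected penalty outweighs the steps saved, which is the sense in which non-compliance forms a profitable neighborhood.

```python
# Toy illustration of a profit-driven agent in a grid world (not the thesis' model):
# the agent weighs steps saved by cutting through a forbidden region against an
# expected penalty per violation.
FORBIDDEN = {(1, 1), (1, 2), (2, 1), (2, 2)}  # hypothetical no-go cells

def path_cost(path, penalty_per_violation=0.0):
    """Cost = number of steps plus penalties for entering forbidden cells."""
    steps = len(path) - 1
    violations = sum(cell in FORBIDDEN for cell in path)
    return steps + penalty_per_violation * violations

compliant_path = [(0, 0), (0, 1), (0, 2), (0, 3), (1, 3), (2, 3), (3, 3)]
shortcut_path = [(0, 0), (1, 1), (2, 2), (3, 3)]  # diagonal cut through the forbidden region

for penalty in (0.0, 1.0, 5.0):
    compliant = path_cost(compliant_path, penalty)
    shortcut = path_cost(shortcut_path, penalty)
    choice = "non-compliant shortcut" if shortcut < compliant else "compliant route"
    print(f"penalty={penalty}: agent prefers the {choice} "
          f"(costs {shortcut:.1f} vs {compliant:.1f})")
```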

    A POWER INDEX BASED FRAMEWORK FOR FEATURE SELECTION PROBLEMS

    One of the most challenging tasks in the machine learning context is feature selection. It consists of selecting the best set of features to use in the training and prediction processes. Pruning the set of actually operational features brings several benefits: reduced computation time, often better prediction quality, and the possibility of using less data to create a good predictor. In its most common form, the problem is called the single-view feature selection problem, to distinguish it from the feature selection task in multi-view learning. In the latter, each view corresponds to a set of features and one would like to enact feature selection on each view, subject to some global constraints. A related problem in the context of multi-view learning is feature partitioning: it consists of splitting the set of features of a single large view into two or more views so that it becomes possible to create a good predictor based on each view. In this case, the best features must be distributed among the views: each view should contain synergistic features, while features that interfere disruptively must be placed in different views. In the semi-supervised multi-view task known as co-training, one also requires that each predictor trained on an individual view is able to teach something to the other views: in classification tasks, for instance, one view should learn to classify unlabelled examples based on the guesses provided by the other views. There are several ways to address these problems. One set of techniques is inspired by coalitional game theory, which defines several useful concepts, two of which are of high practical importance: the power index and the interaction index. Used in the context of feature selection, they take the following meaning: the power index is a (context-dependent) synthesis measure of the predictive capability of a feature, while the interaction index is a (context-dependent) synthesis measure of the interaction (constructive/disruptive interference) between two features; it can be used to quantify how the collaboration between two features enhances their prediction capabilities. An important point is that the power index of a feature differs from the predictive power of the feature in isolation: through a suitable averaging, it takes into account the context, i.e. the fact that the feature acts together with other features to train a model. Similarly, the interaction index between two features takes the context into account by suitably averaging the interaction with all the other features. In this work we address both the single-view and the multi-view problems as follows. The single-view feature selection problem is formalized as the maximization of a pseudo-Boolean function, i.e. a real-valued set function that maps sets of features to a performance metric. Since one has to search over (a considerable portion of) the Boolean lattice, without any special guarantees except, perhaps, positivity, the problem is in general NP-hard. We address the problem by producing candidate maximum coalitions through the selection of the subset of features with the highest power indices, and we use this coalition to approximate the actual maximum. Although the exact computation of the power indices is an exponential task, estimates of the power indices sufficient for the present problem can be obtained in polynomial time.
    The multi-view feature selection problem is formalized as the generalization of the above set-up to multi-variable pseudo-Boolean functions. The multi-view splitting problem is instead formalized as the maximization of a real function defined over the partition lattice. This problem, too, is typically NP-hard; however, candidate solutions can be found by suitably partitioning the top power-index features and keeping the pairs of features that are less interactive or negatively interactive in different views. The sum of the power indices of the participating features can be used to approximate the prediction capability of a view (i.e. as a proxy for its predictive power), while the sum of feature-pair interactivity across views can be used as a proxy for the orthogonality of the views. The capability of a view to pass information (to teach) to other views within a co-training procedure can also benefit from power indices based on a suitable definition of information transfer (a set of features, a coalition, classifies examples that are subsequently used in the training of a second set of features). As to the feature selection task, we not only demonstrate the use of state-of-the-art power index concepts (e.g. the Shapley Value and the Banzhaf Value, along the lines described above), but also define new power indices within the more general class of probabilistic power indices, which contains the Shapley and Banzhaf Values as special cases. Since the number of features to select is often a predefined parameter of the problem, we also introduce some novel power indices, namely the k-Power Index (and its specializations, the k-Shapley Value and the k-Banzhaf Value): they help select the features in a more efficient way. For feature partitioning, we use the more general class of probabilistic interaction indices, which contains the Shapley and Banzhaf Interaction Indices as members. We also address the problem of evaluating the teaching ability of a view by introducing a suitable teaching capability index. The last contribution of the present work consists in comparing the game theory approach with the classical greedy forward selection approach for feature selection. In the latter, the candidate is obtained by adding one feature at a time to the current maximal coalition, always choosing the feature with the maximal marginal contribution. We show that in typical cases the two methods are complementary and that, when used in conjunction, they reduce one another's error in the estimate of the maximum value. Moreover, the approach based on game theory has two advantages: it samples the space of all possible feature subsets, whereas the greedy algorithm scans a selected subspace and totally excludes the rest, and it is able to assign each feature a score that describes a context-aware measure of its importance in the prediction process.
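    The core computational idea, estimating a feature's power index as its average marginal contribution to coalitions, can be sketched with a standard Monte Carlo (random-permutation) approximation of the Shapley value, using a model's cross-validated score as the characteristic function. This is a generic approximation scheme on synthetic data, not the thesis' exact indices or estimators.

```python
# Permutation-sampling approximation of per-feature Shapley values, with
# cross-validated accuracy as the characteristic function v(S). Sketch only.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=500, n_features=8, n_informative=4, random_state=0)
n_features = X.shape[1]
rng = np.random.default_rng(0)

def coalition_value(features):
    """Characteristic function v(S): CV accuracy of a model trained on feature set S."""
    if not list(features):
        return 0.0
    model = LogisticRegression(max_iter=1000)
    return cross_val_score(model, X[:, list(features)], y, cv=3).mean()

def shapley_estimates(n_permutations=30):
    """Estimate Shapley values by averaging marginal contributions over random permutations."""
    phi = np.zeros(n_features)
    for _ in range(n_permutations):
        order = rng.permutation(n_features)
        coalition, prev_value = [], 0.0
        for f in order:
            coalition.append(f)
            value = coalition_value(coalition)
            phi[f] += value - prev_value   # marginal contribution of f to this coalition
            prev_value = value
    return phi / n_permutations

phi = shapley_estimates()
k = 4
top_k = sorted(np.argsort(phi)[::-1][:k].tolist())  # candidate coalition: top-k power-index features
print("Shapley estimates:", np.round(phi, 3))
print(f"top-{k} features:", top_k, "CV score:", round(coalition_value(top_k), 3))
```

    A greedy forward selection baseline would instead grow the coalition one feature at a time, always adding the feature with the largest marginal gain in coalition_value, which makes the complementarity between the two approaches discussed above easy to check empirically.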