19 research outputs found

    Inspecting Credit Card Fraud Identification Via Data Mining Classification Methods And Machine Learning Algorithms

    The rapid adoption of online transactional activity has increased fraud cases worldwide and caused significant losses to both the financial sector and individuals. Credit card fraud is among the most prevalent and concerning crimes in the financial industry, and the one online shoppers worry about most. Data mining techniques were primarily used to investigate the patterns and traits of suspicious and non-suspicious transactions using normalised and anomaly data, while machine learning (ML) techniques used classifiers to automatically determine which transactions were fraudulent and which were not. By learning the patterns in the data, the combination of data mining and machine learning algorithms was thus able to distinguish genuine transactions from fraudulent ones.

    Ensemble of Example-Dependent Cost-Sensitive Decision Trees

    Several real-world classification problems are example-dependent cost-sensitive in nature, where the costs due to misclassification vary between examples and not only within classes. However, standard classification methods do not take these costs into account and assume a constant cost of misclassification errors. In previous works, some methods that incorporate the financial costs into the training of different algorithms have been proposed, with the example-dependent cost-sensitive decision tree algorithm being the one that yields the highest savings. In this paper we propose a new framework of ensembles of example-dependent cost-sensitive decision trees. The framework consists of creating different example-dependent cost-sensitive decision trees on random subsamples of the training set and then combining them using three different combination approaches. Moreover, we propose two new cost-sensitive combination approaches: cost-sensitive weighted voting and cost-sensitive stacking, the latter being based on the cost-sensitive logistic regression method. Finally, using five different databases from four real-world applications (credit card fraud detection, churn modeling, credit scoring and direct marketing), we evaluate the proposed method against state-of-the-art example-dependent cost-sensitive techniques, namely cost-proportionate sampling, Bayes minimum risk and cost-sensitive decision trees. The results show that the proposed algorithms yield better results for all databases, in the sense of higher savings. (Comment: 13 pages, 6 figures, submitted for possible publication)
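
    The sketch below illustrates the general idea of bagging decision trees on random subsamples and combining them with cost-sensitive weighted voting; it is not the authors' implementation. The cost model is a hypothetical example-dependent one (a missed fraud costs the transaction amount, a false alarm costs a fixed fee C_FP), and all function names are illustrative.

    ```python
    import numpy as np
    from sklearn.tree import DecisionTreeClassifier

    C_FP = 5.0  # assumed fixed cost of investigating a flagged transaction

    def example_cost(y_true, y_pred, amount):
        """Example-dependent cost: amount lost on missed frauds, fee on false alarms."""
        fn = (y_true == 1) & (y_pred == 0)
        fp = (y_true == 0) & (y_pred == 1)
        return np.sum(amount[fn]) + C_FP * np.sum(fp)

    def savings(y_true, y_pred, amount):
        """Savings relative to the trivial 'accept everything' policy."""
        base = example_cost(y_true, np.zeros_like(y_true), amount)
        return 0.0 if base == 0 else 1.0 - example_cost(y_true, y_pred, amount) / base

    def fit_ensemble(X, y, amount, n_trees=10, sample_frac=0.5, seed=0):
        """Train decision trees on random subsamples; weight each tree by its savings."""
        rng = np.random.default_rng(seed)
        trees, weights = [], []
        for _ in range(n_trees):
            idx = rng.choice(len(X), size=int(sample_frac * len(X)), replace=True)
            tree = DecisionTreeClassifier(max_depth=5).fit(X[idx], y[idx])
            trees.append(tree)
            weights.append(max(savings(y, tree.predict(X), amount), 1e-6))
        return trees, np.array(weights)

    def predict_ensemble(trees, weights, X):
        """Cost-sensitive weighted vote over the ensemble."""
        votes = np.array([t.predict(X) for t in trees], dtype=float)
        return (weights @ votes / weights.sum() >= 0.5).astype(int)
    ```

    Here X, y and amount are assumed to be NumPy arrays of features, 0/1 fraud labels and transaction amounts; the cost-sensitive stacking variant mentioned in the abstract would replace the weighted vote with a second-level cost-sensitive model.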

    Association Parameters for Determining the Correlation Between Major and Cumulative Grade Point Average (Parameter Asosiasi untuk Menentukan Korelasi Jurusan dan Indeks Prestasi Kumulatif)

    One of the problems in higher education is prospective students choosing the wrong major. This happens when the fit between the major taken at the original school and the major chosen in higher education is not considered, which affects not only the learning process and its outcomes (for example, a low GPA) but also social life, for instance through increased unemployment. Selecting the right major is therefore very important, and helping prospective students choose requires an online system, accessible to everyone, that matches the original school major against majors in higher education. This system uses association rules and the support and confidence parameters from data mining. The purpose of this research is to determine the correlation between the major in the original school, the major in higher education and the GPA achieved, using the support and confidence parameters to process a knowledge base in the form of an alumni database within the online system created. Training and testing were conducted on 10,254 records in the database and produced new information and knowledge: the original school major, the chosen major in higher education and the GPA are strongly correlated, with confidence values reaching 100%.
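
    For reference, a minimal sketch of the support and confidence parameters named above, computed over a toy alumni table; the column names and values are illustrative only and do not come from the paper's database.

    ```python
    import pandas as pd

    alumni = pd.DataFrame({
        "school_major": ["Science", "Science", "Social", "Science", "Social"],
        "degree_major": ["Informatics", "Informatics", "Economics", "Informatics", "Law"],
        "gpa_class":    ["High", "High", "Medium", "High", "Medium"],
    })

    def support(df, antecedent, consequent):
        """Fraction of all records matching both the antecedent and the consequent."""
        both = df.loc[(df[list(antecedent)] == pd.Series(antecedent)).all(axis=1) &
                      (df[list(consequent)] == pd.Series(consequent)).all(axis=1)]
        return len(both) / len(df)

    def confidence(df, antecedent, consequent):
        """Of the records matching the antecedent, the fraction also matching the consequent."""
        ante = df.loc[(df[list(antecedent)] == pd.Series(antecedent)).all(axis=1)]
        if len(ante) == 0:
            return 0.0
        cons = ante.loc[(ante[list(consequent)] == pd.Series(consequent)).all(axis=1)]
        return len(cons) / len(ante)

    rule_if  = {"school_major": "Science", "degree_major": "Informatics"}
    rule_then = {"gpa_class": "High"}
    print(support(alumni, rule_if, rule_then))     # 0.6
    print(confidence(alumni, rule_if, rule_then))  # 1.0, i.e. confidence of 100%
    ```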

    Machine Learning in the Detection of E-commerce Fraud Applied to Banking Services (Machine Learning en la detección de fraudes de comercio electrónico aplicado a los servicios bancarios)

    One of the main risks to which financial institutions are exposed is electronic fraud attacks. Billions of dollars in losses are absorbed each year by financial institutions due to fraudulent transactions. This article presents a model that addresses the main challenges in designing a fraud detection system: a) highly unbalanced classes, b) the stationary distribution of the data, and c) the online incorporation of feedback from fraud investigators on transactions labelled as suspicious. Applying the model to a test dataset made it possible to correctly predict the majority of fraudulent transactions with a minimal percentage of false negatives.
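
    As a rough illustration (not the paper's model), the sketch below shows two of the listed challenges handled with scikit-learn stand-ins: class weighting for the highly unbalanced classes, and an incremental update that folds investigator feedback back into the model. All data and field names are synthetic.

    ```python
    import numpy as np
    from sklearn.linear_model import SGDClassifier
    from sklearn.preprocessing import StandardScaler
    from sklearn.utils.class_weight import compute_sample_weight

    rng = np.random.default_rng(0)
    X = rng.normal(size=(5000, 10))
    y = (rng.random(5000) < 0.02).astype(int)   # ~2% fraud: highly unbalanced classes

    scaler = StandardScaler().fit(X)
    clf = SGDClassifier(loss="log_loss", random_state=0)

    # Weight the rare fraud class more heavily to counter the imbalance.
    w = compute_sample_weight("balanced", y)
    clf.partial_fit(scaler.transform(X), y, classes=np.array([0, 1]), sample_weight=w)

    # Later: investigators confirm or reject a batch of suspicious transactions;
    # the feedback is folded into the model without retraining from scratch.
    X_fb = rng.normal(size=(50, 10))
    y_fb = rng.integers(0, 2, size=50)
    clf.partial_fit(scaler.transform(X_fb), y_fb, sample_weight=np.ones(len(y_fb)))

    print(clf.decision_function(scaler.transform(X[:5])))  # higher score = more suspicious
    ```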

    Credit Card Fraud Detection Using Machine Learning As Data Mining Technique

    The rapid participation in online transactional activities has raised fraud cases all over the world and caused tremendous losses to individuals and the financial industry. Although many criminal activities occur in the financial industry, credit card fraud is among the most prevalent and the one online customers worry about most. Countering fraud through data mining and machine learning is thus one of the prominent approaches introduced by scholars to prevent the losses caused by these illegal acts. Primarily, data mining techniques were employed to study the patterns and characteristics of suspicious and non-suspicious transactions based on normalized and anomalous data. Machine learning (ML) techniques, on the other hand, were employed to predict suspicious and non-suspicious transactions automatically by using classifiers. The combination of machine learning and data mining techniques was therefore able to identify genuine and non-genuine transactions by learning the patterns of the data. This paper discusses supervised classification using Bayesian network classifiers, namely K2, Tree Augmented Naïve Bayes (TAN) and Naïve Bayes, together with logistic and J48 classifiers. After preprocessing the dataset using normalization and Principal Component Analysis, all the classifiers achieved more than 95.0% accuracy, compared to the results attained before preprocessing the dataset.
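
    A minimal sketch of the preprocessing plus supervised classification pipeline described above, using scikit-learn stand-ins: min-max normalisation, PCA, then a Gaussian Naïve Bayes classifier. The paper's K2, TAN and J48 classifiers are Weka algorithms and are not reproduced here; the data below is synthetic.

    ```python
    import numpy as np
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import MinMaxScaler
    from sklearn.decomposition import PCA
    from sklearn.naive_bayes import GaussianNB
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import accuracy_score

    rng = np.random.default_rng(0)
    X = rng.normal(size=(2000, 20))               # stand-in transaction features
    y = (rng.random(2000) < 0.05).astype(int)     # stand-in fraud labels

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0, stratify=y)

    model = Pipeline([
        ("normalise", MinMaxScaler()),    # normalisation step
        ("pca", PCA(n_components=10)),    # Principal Component Analysis
        ("nb", GaussianNB()),             # Naïve Bayes classifier
    ])
    model.fit(X_tr, y_tr)
    print("accuracy:", accuracy_score(y_te, model.predict(X_te)))
    ```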

    Comparative Analysis of Different Distributions Dataset by Using Data Mining Techniques on Credit Card Fraud Detection

    Banks suffer multimillion-dollar losses each year for several reasons, the most important of which is credit card fraud. The issue is how to cope with the challenges this kind of fraud poses, the most significant being the skewed class imbalance. In this study we therefore explore four data mining techniques, namely Naïve Bayes (NB), Support Vector Machine (SVM), K-Nearest Neighbor (KNN) and Random Forest (RF), on actual credit card transactions from European cardholders. This paper offers four major contributions. First, we used under-sampling to balance the dataset because of its high class imbalance, which implies a skewed distribution. Second, we applied NB, SVM, KNN and RF to the under-sampled data to classify transactions as fraudulent or genuine, then measured and compared their performance using a confusion matrix. Third, we adopted 10-fold cross-validation (CV) to test the accuracy of the four models, with standard deviations, and compared the results across models. Next, we examined these models against the entire (skewed) dataset using the confusion matrix and the AUC (Area Under the ROC Curve) ranking measure to determine which model is best suited to this type of fraud. The best accuracies for the NB, SVM, KNN and RF classifiers are 97.80%, 97.46%, 98.16% and 98.23%, respectively. Comparative results obtained with four train/test splits (75:25, 90:10, 66:34 and 80:20) show that RF outperforms NB, SVM and KNN, and applying the proposed models to the entire (skewed) dataset achieved better outcomes than on the under-sampled dataset.
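
    The sketch below outlines the evaluation protocol described above: random under-sampling of the majority class, the four classifiers, 10-fold cross-validation on the balanced data, and confusion-matrix / AUC scoring on the full skewed data. It is an illustrative scikit-learn approximation on synthetic data, not the authors' code.

    ```python
    import numpy as np
    from sklearn.naive_bayes import GaussianNB
    from sklearn.svm import SVC
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score
    from sklearn.metrics import confusion_matrix, roc_auc_score

    rng = np.random.default_rng(0)
    X = rng.normal(size=(10000, 15))
    y = (rng.random(10000) < 0.01).astype(int)       # heavily skewed classes

    # Random under-sampling: keep all frauds and an equal number of genuine rows.
    fraud = np.flatnonzero(y == 1)
    genuine = rng.choice(np.flatnonzero(y == 0), size=len(fraud), replace=False)
    idx = np.concatenate([fraud, genuine])
    X_bal, y_bal = X[idx], y[idx]

    models = {
        "NB": GaussianNB(),
        "SVM": SVC(probability=True),
        "KNN": KNeighborsClassifier(),
        "RF": RandomForestClassifier(n_estimators=100, random_state=0),
    }
    for name, model in models.items():
        acc = cross_val_score(model, X_bal, y_bal, cv=10)        # 10-fold CV accuracy
        model.fit(X_bal, y_bal)
        auc = roc_auc_score(y, model.predict_proba(X)[:, 1])     # AUC on the full skewed set
        print(name, acc.mean(), acc.std(), auc)
        print(confusion_matrix(y, model.predict(X)))
    ```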

    Maximizing gain in high-throughput screening using conformal prediction

    Iterative screening has emerged as a promising approach to increase the efficiency of screening campaigns compared to traditional high-throughput approaches. By learning from a subset of the compound library, predictive models can infer which compounds to screen next, resulting in more efficient screening. One way to evaluate screening is to consider the cost of screening compared to the gain associated with finding an active compound. In this work, we introduce a conformal predictor coupled with a gain-cost function with the aim of maximising gain in iterative screening. Using this setup we were able to show that, by evaluating the predictions on the training data, very accurate predictions can be made about which settings will produce the highest gain on the test data. We evaluate the approach on 12 bioactivity datasets from PubChem, training the models using 20% of the data. Depending on the settings of the gain-cost function, the settings generating the maximum gain were accurately identified in 8–10 out of the 12 datasets. Broadly, our approach can predict which strategy generates the highest gain based on the results of the cost-gain evaluation: screen the compounds predicted to be active, screen all the remaining data, or do not screen any additional compounds. When the algorithm indicates that the predicted active compounds should be screened, our approach also indicates what confidence level to apply in order to maximise gain. Hence, our approach facilitates decision-making and allocates resources where they deliver the most value by indicating in advance the likely outcome of a screening campaign. The research at Swetox (UN) was supported by the Knut and Alice Wallenberg Foundation and the Swedish Research Council FORMAS. AMA was supported by AstraZeneca.
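
    A minimal sketch, not the authors' code, of an inductive (Mondrian-style) conformal predictor combined with a simple gain-cost rule: screen the compounds whose prediction set at significance eps contains the "active" class. The gain and cost figures, the 20%/80% split and the underlying model are illustrative assumptions.

    ```python
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    GAIN_PER_ACTIVE, COST_PER_COMPOUND = 100.0, 1.0   # hypothetical gain-cost parameters

    rng = np.random.default_rng(0)
    X = rng.normal(size=(5000, 30))
    y = (X[:, 0] + rng.normal(scale=1.5, size=5000) > 1.5).astype(int)   # 1 = active

    # Train on 20% of the library, as in the evaluation above; screen the rest.
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.8, random_state=0)
    X_prop, X_cal, y_prop, y_cal = train_test_split(X_train, y_train, test_size=0.3, random_state=0)

    clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_prop, y_prop)

    # Mondrian nonconformity: 1 - probability assigned to the example's own class.
    cal_p = clf.predict_proba(X_cal)
    alpha_cal = {c: 1.0 - cal_p[y_cal == c, c] for c in (0, 1)}

    def p_value(prob_row, c):
        alpha = 1.0 - prob_row[c]
        return (np.sum(alpha_cal[c] >= alpha) + 1) / (len(alpha_cal[c]) + 1)

    test_p = clf.predict_proba(X_test)
    eps = 0.2                                          # significance level (1 - confidence)
    screen = np.array([p_value(row, 1) > eps for row in test_p])

    # Gain-cost evaluation of the "screen predicted actives" strategy.
    gain = GAIN_PER_ACTIVE * np.sum(y_test[screen]) - COST_PER_COMPOUND * np.sum(screen)
    print(f"screen {screen.sum()} compounds, gain = {gain:.1f}")
    ```

    Sweeping eps over a grid and picking the value with the highest estimated gain mirrors, in spirit, the abstract's point that the approach indicates what confidence level to apply in order to maximise gain.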