9 research outputs found

    Synthetic Demographic Data Generation for Card Fraud Detection Using GANs

    Using machine learning models to generate synthetic data has become common in many fields. Technology for generating synthetic transactions that can be used to detect fraud is also growing fast. Generally, this synthetic data contains only information about the transaction, such as the time, place, and amount of money. It does not usually contain the individual user's characteristics (age and gender are occasionally included). Using relatively complex synthetic demographic data may enrich the features of transaction data and thus improve fraud detection performance. Thanks to developments in machine learning, some deep learning models have the potential to perform better than other well-established synthetic data generation methods, such as microsimulation. In this study, we built a deep-learning Generative Adversarial Network (GAN), called DGGAN, for demographic data generation. Our model generates samples during model training, which we found important for overcoming class imbalance issues. This study can help improve the understanding of synthetic data and further explore the application of synthetic data generation in card fraud detection.

    An overview of decision table literature.

    The present report contains an overview of the literature on decision tables since their origin. The goal is to analyze the dissemination of decision tables across different areas of knowledge, countries and languages, highlighting those that show the most interest in decision table use. The first part describes the scope of the overview. Next, the classification results by topic are explained. An abstract and some keywords, normally provided by the authors, are included for each reference. In some cases our own comments are added. The purpose of these comments is to show where, how and why decision tables are used. Other topics examined are the theoretical or practical nature of each document, as well as its country of origin and language. Finally, the main body of the paper consists of the ordered list of publications with abstract, classification and comments.

    Algoritma Decision Table Menggunakan Inner Join Bersyarat untuk Klasifikasi Hasil Penilaian Angka Kredit Perekayasa

    One of the requirements for the promotion in rank and position of an engineer (perekayasa) is the credit score determination letter (penetapan angka kredit, PAK). To obtain it, the engineer submits a list of proposed credit scores (DUPAK), which describes the activities carried out during a given period, to the Assessment Team secretariat for assessment. The DUPAK assessment is carried out manually and is therefore often problematic, for example through errors in recording engineer data and in compiling the assessment results. To overcome these problems, an application that the secretariat can use is needed. This paper discusses the design of the Sistem Administrasi Penilaian PAK Perekayasa (SAPPP) application, with emphasis on the use of Ajax technology in the application interface for ease of user interaction. The design results show that the SAPPP application with Ajax support can help the assessment process run smoothly and overcome the problems that previously occurred. Keywords: engineer, assessment, DUPAK, decision table, classification, decision making

    Abstract

    One of the requirements for the promotion and position of an engineer is a letter of determination of credit score. To obtain it, the engineer submits a list of proposed credit scores to the Appraiser Secretariat for the assessment process. Manual appraisal processes are often problematic, with errors in recording data, assessing, and assigning promotion recommendations. To overcome these problems, the Administration System for Assessment and Designation of Engineer Credit Rate (SAPPP) application was designed and developed. The development method emphasizes the implementation of the Decision Table (DT) algorithm using a conditional inner join; this method transforms the composition of the credit code into decision rules to obtain the classification of the assessment results used in decision making. The development results show that the SAPPP application, with the support of a classification and visualization system, can help the process of appraising and determining engineers' credit scores more effectively and efficiently. Keywords: engineer, assessment, proposed credit scores, decision table, classification, decision making
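    The abstract above does not spell out the join itself. As a rough sketch only, a decision table applied through a conditional inner join might look like the following, assuming each rule covers a range of scores. The `RULES` table, its thresholds, the labels, and the `classify` helper are all hypothetical and are not taken from the paper.

```python
# Hypothetical decision table: each rule maps a score range to a class.
# The ranges and labels are invented for illustration only.
RULES = [
    {"min": 0, "max": 50, "result": "not eligible"},
    {"min": 50, "max": 100, "result": "eligible"},
]


def classify(records, rules):
    """Inner join of records and rules on the condition min <= score < max."""
    return [
        {**rec, "result": rule["result"]}
        for rec in records
        for rule in rules
        if rule["min"] <= rec["score"] < rule["max"]
    ]


records = [
    {"name": "A", "score": 30},
    {"name": "B", "score": 75},
    {"name": "C", "score": 120},  # matches no rule, so the inner join drops it
]
classified = classify(records, RULES)
```

    Because the join is an inner join, a record that satisfies no rule simply does not appear in the output, so unmatched records need a separate check if they must be reported.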

    CrystalClear: Active visualization of association rules

    Effective visualization is an important aspect of active data mining. In the context of association rules, this need has been driven by the large number of rules produced from a run of the algorithm. To address real user needs, the rules need to be summarized and organized so that they can be interpreted and applied in a timely manner. In this paper, we propose two visualization techniques that improve on those used by existing data mining packages. In particular, we address the visualization of "differences" in the set of rules due to incremental changes in the data source. We show that visualization in this respect is important to active data mining, as it uncovers new insights not possible from inspecting individual data mining results.

    Using plantar pressure for free-living posture recognition and sedentary behaviour monitoring

    Health authorities in numerous countries and even the World Health Organization (WHO) are concerned with low levels of physical activity and increasing sedentary behaviour amongst the general population. In fact, emerging evidence identifies sedentary behaviour as a ubiquitous characteristic of contemporary lifestyles. This has major implications for the general health of people worldwide, particularly for the prevalence of non-communicable conditions (NCDs) such as cardiovascular disease, diabetes and cancer, and their risk factors such as raised blood pressure, raised blood sugar and overweight. Moreover, sedentary time appears to be uniquely associated with health risks independent of physical activity intensity levels. However, habitual sedentary behaviour may prove complex to measure accurately, as it occurs across different domains, including work, transport, domestic duties and even leisure. Since sedentary behaviour is mostly reflected as too much sitting, one of the main concerns is being able to distinguish among different activities, such as sitting and standing. Widely used devices such as accelerometer-based activity monitors have a limited ability to detect sedentary activities accurately. Thus, there is a need for a viable large-scale method to efficiently monitor sedentary behaviour. This thesis proposes and demonstrates how a plantar pressure based wearable device and machine learning classification techniques have significant capability to monitor daily life sedentary behaviour. Firstly, an in-depth review of research and market-ready plantar pressure and force technologies is performed to assess their capabilities and limitations for measuring sedentary behaviour. Afterwards, a novel methodology for measuring daily life sedentary behaviour using plantar pressure data and a machine learning predictive model is developed. The proposed model and its algorithm are constructed using a dataset of 20 participants collected in both laboratory-based and free-living conditions. Sitting and standing variations are included in the analysis, as well as potential novel activities such as leaning. Video footage is continuously collected using a wearable camera as an equivalent of direct observation to allow the labelling of the training data for the machine learning model. The optimal parameters of the model, such as feature set, epoch length and type of classifier, are determined by experimenting over multiple iterations. Different numbers and locations of plantar pressure sensors are explored to determine the optimal trade-off between low computational cost and accurate performance. The model's performance is calculated using both subject-dependent and subject-independent validation, by performing 10-fold stratified cross-validation and leave-one-user-out validation respectively. Furthermore, the proposed model's performance for daily life monitoring is validated against the current criterion (i.e. direct observation) and against the de facto standard, the activPAL. The results show that the proposed machine learning classification model exhibits excellent recall rates of 98.83% with subject-dependent training and 95.93% with subject-independent training. This work sets the groundwork for developing a future plantar pressure wearable device for daily life sedentary behaviour monitoring in free-living conditions that uses the proposed machine learning classification model. Moreover, this research also considers important design characteristics of wearable devices, such as low computational cost and improved performance, addressing the current gap in the physical activity and sedentary behaviour wearable market.
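    To illustrate the subject-independent scheme named in this abstract, here is a minimal sketch of leave-one-user-out splitting in plain Python. The function name `leave_one_user_out` and the sample `user_ids` are hypothetical. The thesis's own pipeline is not described here and may differ.

```python
def leave_one_user_out(user_ids):
    """Yield (held_out_user, train_indices, test_indices), one split per user."""
    for held_out in sorted(set(user_ids)):
        # Train on every sample from the other users.
        train = [i for i, u in enumerate(user_ids) if u != held_out]
        # Test only on samples from the held-out user.
        test = [i for i, u in enumerate(user_ids) if u == held_out]
        yield held_out, train, test


# Hypothetical data: one user label per recorded epoch.
user_ids = ["u1", "u1", "u2", "u3", "u3", "u3"]
splits = list(leave_one_user_out(user_ids))
```

    Each user's samples appear in the test set exactly once and never in the matching training set, which is what makes the resulting estimate subject-independent. Stratified 10-fold cross-validation, by contrast, mixes samples from the same user across training and test folds.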

    MevaL: A Visual Machine Learning Model Evaluation Tool for Financial Crime Detection

    Data Science and Machine Learning are two valuable allies in the fight against financial crime, the domain where Feedzai seeks to leverage its value proposition in support of its mission: to make banking and commerce safe. Data is at the core of both fields and of this domain, so structuring instances for visual consumption provides an effective way of understanding the data and communicating insights. The development of a solution for each project and use case requires a careful and effective Machine Learning Model Evaluation stage, as it is the major source of feedback before deployment. The tooling for this stage available at Feedzai can be improved, accelerated, visually supported, and diversified to enable data scientists to boost their daily work and the quality of the models. In this work, I propose to collect and compile internal and external input, in terms of workflow and Model Evaluation, in a proposal hierarchically segmented by well-defined objectives and tasks, to instantiate the proposal in a Python package, and to iteratively validate the package with Feedzai's data scientists. Therefore, the first contribution is MevaL, a Python package for Model Evaluation with visual support, integrated into Feedzai's Data Science environment by design. In fact, MevaL is already being leveraged as a visualization package on two internal reporting projects that are serving some of Feedzai's major clients. In addition to MevaL, the second contribution of this work is the Model Evaluation Topology developed to ensure clear communication and design of features.

    Visualizing decision table classifiers
