4 research outputs found

    Handling class imbalance in credit card fraud using resampling methods

    Credit card based online payments have grown intensely, compelling financial organisations to implement and continuously improve their fraud detection systems. However, credit card fraud datasets are heavily imbalanced, and different types of misclassification errors may carry different costs, so it is essential to control these errors to a certain degree and trade them off against one another. Classification techniques are promising solutions for distinguishing fraudulent from non-fraudulent transactions. Unfortunately, classification techniques do not perform well when the numbers of minority and majority cases differ greatly. Hence, in this study, three resampling methods, Random Under Sampling, Random Over Sampling and Synthetic Minority Oversampling Technique, were applied to a credit card dataset to address the rarity of fraud cases. The three resampled datasets were then classified using classification techniques. Performance was measured by sensitivity, specificity, accuracy, precision, area under the curve (AUC) and error rate. The findings showed that, after resampling, the models were more practicable, performed better and were statistically better.
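The three resampling methods and most of the listed metrics can be sketched in a few lines. This is a minimal illustration, not the study's implementation: the toy two-feature data, the function names, the neighbour count `k` and the confusion-matrix counts are all assumptions, and `smote_like` is a simplified interpolation in the spirit of SMOTE rather than a reference implementation. AUC is omitted because it needs ranked scores, not just counts.

```python
import random

def random_under_sample(majority, minority, rng):
    """Keep a random subset of the majority class, as large as the minority class."""
    return rng.sample(majority, len(minority)), list(minority)

def random_over_sample(majority, minority, rng):
    """Duplicate random minority rows until the minority class matches the majority."""
    extra = [rng.choice(minority) for _ in range(len(majority) - len(minority))]
    return list(majority), list(minority) + extra

def smote_like(majority, minority, rng, k=3):
    """Add synthetic minority rows on the segment between a row and a near neighbour."""
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    synthetic = []
    while len(minority) + len(synthetic) < len(majority):
        base = rng.choice(minority)
        others = [m for m in minority if m is not base]
        neighbours = sorted(others, key=lambda m: dist2(base, m))[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return list(majority), list(minority) + synthetic

def evaluation_metrics(tp, fp, tn, fn):
    """Metrics from confusion-matrix counts, with fraud as the positive class."""
    total = tp + fp + tn + fn
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
        "accuracy": (tp + tn) / total,
        "error_rate": (fp + fn) / total,
    }

# Toy data (assumed): 95 "non-fraud" points and 5 "fraud" points in two features.
rng = random.Random(42)
majority = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(95)]
minority = [(rng.gauss(3, 1), rng.gauss(3, 1)) for _ in range(5)]

rus_major, rus_minor = random_under_sample(majority, minority, rng)
ros_major, ros_minor = random_over_sample(majority, minority, rng)
smote_major, smote_minor = smote_like(majority, minority, rng)

# Hypothetical counts, chosen only to show how the metrics are computed.
example_metrics = evaluation_metrics(tp=8, fp=2, tn=88, fn=2)
```

Under-sampling discards majority information, over-sampling repeats minority rows exactly, and the interpolation variant creates new minority rows between existing ones. These are the trade-offs a comparison of the three resampled datasets would expose.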

    Forecasting rainfall distribution using artificial neural networks for Johor rivers

    The study is conducted to forecast the rainfall distribution in the areas around Johor, Malaysia. Although many other factors are involved, only the rainfall distribution factor is used. The forecasting method used in this study is the Artificial Neural Network (ANN), trained using the back-propagation learning algorithm. To produce the best model, several back-propagation models are constructed. The values of the learning rate and momentum parameters are also varied according to the number of hidden nodes. The data is prepared and filtered using data pre-processing, which includes data cleaning, normalisation, transformation, feature extraction and selection. The product of data pre-processing is the final training set. At the end of the experiment, the best model was selected, and the strength of the relationship between actual and forecast values was compared across models according to the activation functions used. The best model is the one that produces the minimum error value and the strongest relationship between the actual and forecast data values.
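The training setup described above (normalised inputs, one hidden layer, back-propagation with a learning rate and a momentum term) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the study's model: the rainfall series is synthetic (a sine wave), the network predicts each value from the previous one, and the function names, hidden-node count, `lr` and `momentum` values are all hypothetical.

```python
import math
import random

def min_max_normalise(values):
    """Scale values linearly into the range [0, 1]."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def train(xs, ys, hidden=3, lr=0.1, momentum=0.5, epochs=500, seed=0):
    """Batch back-propagation with momentum for a 1-input, 1-output tanh network.

    Returns the mean squared error recorded at the start of each epoch.
    """
    rng = random.Random(seed)
    w1 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]  # input -> hidden weights
    b1 = [0.0] * hidden                                   # hidden biases
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]  # hidden -> output weights
    b2 = 0.0                                              # output bias
    v_w1, v_b1, v_w2, v_b2 = [0.0] * hidden, [0.0] * hidden, [0.0] * hidden, 0.0
    n = len(xs)
    history = []
    for _ in range(epochs):
        g_w1, g_b1, g_w2, g_b2 = [0.0] * hidden, [0.0] * hidden, [0.0] * hidden, 0.0
        squared_error = 0.0
        for x, y in zip(xs, ys):
            # Forward pass.
            h = [math.tanh(w1[j] * x + b1[j]) for j in range(hidden)]
            out = sum(w2[j] * h[j] for j in range(hidden)) + b2
            err = out - y
            squared_error += err * err
            # Backward pass: accumulate gradients of 0.5 * err^2.
            for j in range(hidden):
                g_w2[j] += err * h[j]
                delta = err * w2[j] * (1.0 - h[j] ** 2)
                g_w1[j] += delta * x
                g_b1[j] += delta
            g_b2 += err
        history.append(squared_error / n)
        # Momentum update: velocity = momentum * velocity - lr * mean gradient.
        for j in range(hidden):
            v_w1[j] = momentum * v_w1[j] - lr * g_w1[j] / n
            v_b1[j] = momentum * v_b1[j] - lr * g_b1[j] / n
            v_w2[j] = momentum * v_w2[j] - lr * g_w2[j] / n
            w1[j] += v_w1[j]
            b1[j] += v_b1[j]
            w2[j] += v_w2[j]
        v_b2 = momentum * v_b2 - lr * g_b2 / n
        b2 += v_b2
    return history

# Synthetic monthly "rainfall" (assumed, not real Johor data): a seasonal sine wave.
rainfall = [100.0 + 80.0 * math.sin(2.0 * math.pi * t / 12.0) for t in range(36)]
scaled = min_max_normalise(rainfall)
inputs, targets = scaled[:-1], scaled[1:]  # predict each month from the previous one

mse_history = train(inputs, targets)
```

Repeating `train` over a grid of `hidden`, `lr` and `momentum` values and keeping the run with the lowest final error is one simple way to mimic the model selection the abstract describes.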

    A systematic literature review on features of deep learning in big data analytics

    The aims of this study are to identify the existing features of deep learning (DL) approaches for use in big data analytics (BDA) and to identify the key features that affect the effectiveness of DL approaches. Method: A Systematic Literature Review (SLR) was carried out and reported based on the preferred reporting items for systematic reviews. 4065 papers were retrieved by manual search in four databases: Google Scholar, Taylor & Francis, Springer Link and Science Direct. 34 primary studies were finally included. Result: Of these studies, 70% were journal articles, 25% were conference papers and 5% were book chapters. Five features of DL were identified and analysed. The features are (1) hierarchical layers, (2) high-level abstraction, (3) processing a high volume of data, (4) a universal model and (5) not overfitting the training data. Conclusion: This review provides evidence that DL in BDA is an active research area. The review provides researchers with some guidelines for future research on this topic. It also provides broad information on DL in BDA which could be useful for practitioners.

    Classification of malware analytics techniques: A systematic literature review

    Malware refers to a variety of forms of hostile or intrusive software that is spread online. Data analytics is the process of examining data sets in order to draw conclusions about the information they contain, increasingly with the aid of specialized systems and software. Objectives: The aims of the study are to identify the types of malware analytics and to identify the purpose of malware analytics. Method: A Systematic Literature Review (SLR) was carried out and reported based on the preferred reporting items for systematic reviews. 1114 papers were retrieved by manual search in six databases: IEEE, Science Direct, Taylor and Francis, ACM, Wiley and Springer Link. 53 primary studies were finally included. Results: Of these studies, 70% were conference papers and 30% were journal articles. Five classifications of malware analytics techniques were identified and analysed. The classifications are (1) descriptive analytics, (2) diagnostic analytics, (3) predictive analytics, (4) prescriptive analytics and (5) visual analytics. Conclusion: This review provides evidence that malware analytics is an active research area. The review provides researchers with some guidelines for future research on this topic. It also provides broad information on malware analytics techniques which could be useful for practitioners.