26 research outputs found

    High Accuracy Phishing Detection Based on Convolutional Neural Networks

    The persistent growth in phishing and the rising volume of phishing websites have left individuals and organizations worldwide increasingly exposed to various cyber-attacks. Consequently, more effective phishing detection is required for improved cyber defence. Hence, in this paper we present a deep learning-based approach to enable high accuracy detection of phishing sites. The proposed approach utilizes convolutional neural networks (CNN) for high accuracy classification to distinguish genuine sites from phishing sites. We evaluate the models using a dataset obtained from 6,157 genuine and 4,898 phishing websites. Based on the results of extensive experiments, our CNN-based models proved to be highly effective in detecting unknown phishing sites. Furthermore, the CNN-based approach performed better than traditional machine learning classifiers evaluated on the same dataset, reaching a 98.2% phishing detection rate with an F1-score of 0.976. The method presented in this paper compares favourably to the state-of-the-art in deep learning-based phishing website detection.

    Intelligent Detection for Cyber Phishing Attacks using Fuzzy rule-Based Systems

    Cyber phishing attacks are increasing rapidly, causing monetary losses to the world economy. Although various phishing detection approaches have been proposed, they still lack accuracy, producing false positives and false negatives that undermine online transactions. This study constructs a fuzzy rule model utilizing combined features based on a fuzzy inference system to tackle this inaccuracy in online transactions. Intelligent detection of cyber phishing matters because it can discriminate emerging phishing websites with higher accuracy. The experimental results achieved an excellent accuracy compared to the reported results in the field, which demonstrates the effectiveness of the fuzzy rule model and the feature set. The findings indicate that the new approach can be used to discriminate between phishing and legitimate websites. This paper contributes by constructing a fuzzy rule model using a combined effective feature set that has shown excellent performance. Phishing deceptions evolve rapidly, so the model should be updated regularly to keep ahead of the changes.

    Intelligent Security for Phishing Online using Adaptive Neuro Fuzzy Systems

    Anti-phishing detection solutions employed in industry use blacklist-based approaches to achieve low false-positive rates, but blacklist approaches utilize website URLs only. This study analyses and combines phishing emails and phishing web-forms in a single framework, which allows feature extraction and feature model construction. The outcome should classify sites as phishing, suspicious, or legitimate and detect emerging phishing attacks accurately. The intelligent phishing security approach is based on machine learning techniques, using an Adaptive Neuro-Fuzzy Inference System and a combination of sources from which features are extracted. An experiment was performed using the two-fold cross-validation method to measure the system's accuracy. The intelligent phishing security approach achieved a higher accuracy. The finding indicates that the feature model from combined sources can detect phishing websites with a higher accuracy. This paper contributes to the phishing field a feature model that combines sources in a single framework. The implication is that phishing attacks evolve rapidly; therefore, regular updates and staying ahead of phishing strategies are the way forward.

    An Improved Associative Classification Algorithm based on Incremental Rules

    In associative classification (AC), the step of rule generation is necessarily exhaustive because of the search problems inherited from association rule mining. Moreover, the entire rule set must be induced prior to constructing the classifier. This article proposes a new AC algorithm called Dynamic Covering Associative Classification (DCAC) that learns each rule from a training dataset, removes its classified instances, and then learns the next rule from the remaining unclassified data rather than the original training dataset. This ensures that the exhaustive steps of rule evaluation and candidate generation will no longer be needed, thereby maintaining a real-time rule generation process. The proposed algorithm constantly amends the support and confidence for each rule rather than restricting itself to the support and confidence computed from the original dataset. Experiments on 20 datasets from different domains showed that the proposed algorithm generates higher quality and more accurate classifiers than other AC rule induction approaches.
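The covering idea described in this abstract (learn one rule, drop the instances it covers, then recompute support and confidence on what is left) can be illustrated with a minimal Python sketch. This is not the DCAC algorithm itself: the single-attribute rule form, the function names `best_rule`, `learn_rules` and `predict`, the thresholds, and the toy feature names are all assumptions made for illustration.

```python
from collections import Counter


def best_rule(rows, min_support):
    """Pick the single-attribute rule with the highest (confidence, support).

    rows is a list of (features_dict, label) pairs. Counts are taken from
    the rows passed in, so they reflect only the data still uncovered.
    """
    body_counts = Counter()
    rule_counts = Counter()
    for features, label in rows:
        for attr, value in features.items():
            body_counts[(attr, value)] += 1
            rule_counts[(attr, value, label)] += 1
    best = None
    for (attr, value, label), support in rule_counts.items():
        if support < min_support:
            continue
        confidence = support / body_counts[(attr, value)]
        if best is None or (confidence, support) > (best[4], best[3]):
            best = (attr, value, label, support, confidence)
    return best


def learn_rules(rows, min_support=1, min_confidence=0.5):
    """Covering loop: learn a rule, remove the rows it covers, repeat."""
    remaining = list(rows)
    rules = []
    while remaining:
        rule = best_rule(remaining, min_support)
        if rule is None or rule[4] < min_confidence:
            break
        rules.append(rule)
        attr, value = rule[0], rule[1]
        # Assumption: every row matching the rule body counts as covered.
        remaining = [(f, y) for f, y in remaining if f.get(attr) != value]
    leftovers = remaining or list(rows)
    default = None
    if leftovers:
        default = Counter(y for _, y in leftovers).most_common(1)[0][0]
    return rules, default


def predict(rules, default, features):
    """Return the label of the first rule whose body matches."""
    for attr, value, label, _support, _confidence in rules:
        if features.get(attr) == value:
            return label
    return default


# Toy data with hypothetical feature names.
rows = [
    ({"https": "no", "long_url": "yes"}, "phish"),
    ({"https": "no", "long_url": "no"}, "phish"),
    ({"https": "yes", "long_url": "yes"}, "legit"),
    ({"https": "yes", "long_url": "no"}, "legit"),
]
rules, default = learn_rules(rows)
```

Because `best_rule` only ever sees the rows still uncovered, the support and confidence of later rules are measured against the shrinking dataset rather than the original one, which is the property the abstract emphasises.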

    A Dynamic Self-Structuring Neural Network

    Creating a neural network-based classification model is commonly accomplished using the trial and error technique. However, the trial and error structuring method has several difficulties, such as time and availability of experts. In this article, an algorithm that simplifies structuring neural network classification models is proposed. The algorithm aims at creating a structure large enough to learn models from the training dataset that generalise well to the testing dataset. Our algorithm dynamically tunes the structure parameters during the training phase, aiming to derive accurate non-overfitting classifiers. The proposed algorithm has been applied to the phishing website classification problem, and it shows competitive results with respect to various evaluation measures such as harmonic mean (F1-score), precision, and accuracy.

    Towards Adversarial Phishing Detection

    Voting margin: A scheme for error-tolerant k nearest neighbors classifiers for machine learning

    Machine learning (ML) techniques such as classifiers are used in many applications, some of which are related to safety or critical systems. In this case, correct processing is a strict requirement and thus ML algorithms (such as for classification) must be error tolerant. A naive approach to implementing error-tolerant classifiers is to resort to general protection techniques such as modular redundancy. However, modular redundancy incurs large overheads in many metrics, such as hardware utilization and power consumption, that may not be acceptable in applications that run on embedded or battery-powered systems. Another option is to exploit the algorithmic properties of the classifier to provide protection and error tolerance at a lower cost. This paper explores this approach for a widely used classifier, the k Nearest Neighbors (kNN), and proposes an efficient scheme to protect it against errors. The proposed technique is based on a time-based modular redundancy (TBMR) scheme. The proposed scheme exploits the intrinsic redundancy of kNN to drastically reduce the number of re-computations needed to detect errors. This is achieved by noting that when voting among the k nearest neighbors has a large majority, an error in one of the voters cannot change the result, hence voting margin (VM). This observation has been refined and extended in the proposed VM scheme to also avoid re-computations in some cases in which the majority vote is tight. The VM scheme has been implemented and evaluated with publicly available data sets that cover a wide range of applications and settings. The results show that by exploiting the intrinsic redundancy of the classifier, the proposed scheme is able to reduce the cost compared to modular redundancy by more than 60 percent in all configurations evaluated. Pedro Reviriego and José Alberto Hernández would like to acknowledge the support of the TEXEO project TEC2016-80339-R funded by the Spanish Ministry of Economy and Competitivity and of the Madrid Community research project TAPIR-CM Grant no. P2018/TCS-4496.
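The basic observation behind the voting margin, that a large majority among the k neighbors cannot be overturned by one faulty vote, can be sketched in a few lines of Python. This is a simplified illustration only, not the VM scheme from the paper: the function names, the single-fault model, and the rule that one corrupted vote can shift the margin by at most 2 (one vote lost by the winner, one gained by a rival) are assumptions, and the refinements for tight votes are not covered.

```python
from collections import Counter


def voting_margin(neighbor_labels):
    """Difference between the top class count and the runner-up count."""
    counts = Counter(neighbor_labels).most_common()
    top = counts[0][1]
    runner_up = counts[1][1] if len(counts) > 1 else 0
    return top - runner_up


def vote_is_error_tolerant(neighbor_labels, max_faulty_votes=1):
    """True if no set of max_faulty_votes corrupted votes can flip the winner.

    Assumption: each corrupted vote moves one vote away from the winner
    and onto a rival, so it reduces the margin by at most 2.
    """
    return voting_margin(neighbor_labels) > 2 * max_faulty_votes


def classify(neighbor_labels):
    """Return (winner, needs_recomputation) for one kNN vote."""
    winner = Counter(neighbor_labels).most_common(1)[0][0]
    return winner, not vote_is_error_tolerant(neighbor_labels)
```

With k = 5, a 4-to-1 vote has margin 3 and would be accepted without re-computation under these assumptions, while a 3-to-2 vote has margin 1 and would be flagged for a second pass.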

    Deep Learning-Based Attack Detection and Classification in Android Devices.

    The increasing proliferation of Android-based devices, which currently dominate the market with a staggering 72% global market share, has made them a prime target for attackers. Consequently, the detection of Android malware has emerged as a critical research area. Both academia and industry have explored various approaches to develop robust and efficient solutions for Android malware detection and classification, yet it remains an ongoing challenge. In this study, we present a supervised learning technique that demonstrates promising results in Android malware detection. The key to our approach lies in the creation of a comprehensive labeled dataset, comprising over 18,000 samples classified into five distinct categories: Adware, Banking, SMS, Riskware, and Benign applications. The effectiveness of our proposed model is validated using well-established datasets such as CICMalDroid2020, CICMalDroid2017, and CICAndMal2017. Comparing our results with state-of-the-art techniques in terms of precision, recall, efficiency, and other relevant factors, our approach outperforms other semi-supervised methods in specific parameters. However, we acknowledge that our model does not exhibit significant deviations when compared to alternative approaches concerning certain aspects. Overall, our research contributes to the ongoing efforts in the development of advanced techniques for Android malware detection and classification. We believe that our findings will inspire further investigations, leading to enhanced security measures and protection for Android devices in the face of evolving threats. Partial funding for open access charge: Universidad de Málag

    Selective Neuron Re-Computation (SNRC) for Error-Tolerant Neural Networks

    Artificial neural networks (ANNs) are widely used to solve classification problems for many machine learning applications. When errors occur in the computational units of an ANN implementation due to, for example, radiation effects, the result of an arithmetic operation can be changed, and therefore the predicted classification class may be erroneously affected. This is not acceptable when ANNs are used in safety-critical applications, because an incorrect classification may result in a system failure. Existing error-tolerant techniques usually rely on physically replicating parts of the ANN implementation or on incurring a significant computation overhead. Therefore, efficient protection schemes are needed for ANNs that are run on a processor and used in resource-limited platforms. A technique referred to as Selective Neuron Re-Computation (SNRC) is proposed in this paper. Based on the ANN structure and algorithmic properties, SNRC can identify the cases in which the errors have no impact on the outcome; therefore, errors only need to be handled by re-computation when the classification result is detected as unreliable. Compared with existing temporal redundancy-based protection schemes, SNRC saves more than 60 percent of the re-computation overhead (more than 90 percent in many cases) while achieving complete error protection, as assessed over a wide range of datasets. Different activation functions are also evaluated. This research was supported by the National Science Foundation Grants CCF-1953961 and 1812467, by the ACHILLES project PID2019-104207RB-I00 and the Go2Edge network RED2018-102585-T funded by the Spanish Ministry of Science and Innovation and by the Madrid Community research project TAPIR-CM P2018/TCS-4496.

    Categorization of Phishing Detection Features And Using the Feature Vectors to Classify Phishing Websites

    Phishing is a form of online fraud where a spoofed website tries to gain access to a user's sensitive information by tricking the user into believing that it is a benign website. There are several solutions to detect phishing attacks, such as educating users, using blacklists, or extracting characteristics found to exist in phishing attacks. In this thesis, we analyze approaches that extract features from phishing websites and train classification models with the extracted feature set to classify phishing websites. We create an exhaustive list of all features used in these approaches and categorize them into 6 broader categories and 33 finer categories. We extract 59 features from the URL, URL redirects, hosting domain (WHOIS and DNS records) and popularity of the website and analyze their robustness in classifying a phishing website. Our emphasis is on determining the predictive performance of robust features. We evaluate the classification accuracy when using the entire feature set and when URL features or site popularity features are excluded from the feature set, and show how our approach can be used to effectively predict specific types of phishing attacks such as shortened URLs and randomized URLs. Using both decision table classifiers and neural network classifiers, our results indicate that robust features seem to have enough predictive power to be used in practice. Dissertation/Thesis. Masters Thesis, Computer Science 201