137 research outputs found

    An Evasion Attack against ML-based Phishing URL Detectors

    Background: Over the years, Machine Learning Phishing URL classification (MLPU) systems have gained tremendous popularity to detect phishing URLs proactively. Despite this vogue, the security vulnerabilities of MLPUs remain mostly unknown. Aim: To address this concern, we conduct a study to understand the test time security vulnerabilities of the state-of-the-art MLPU systems, aiming at providing guidelines for the future development of these systems. Method: In this paper, we propose an evasion attack framework against MLPU systems. To achieve this, we first develop an algorithm to generate adversarial phishing URLs. We then reproduce 41 MLPU systems and record their baseline performance. Finally, we simulate an evasion attack to evaluate these MLPU systems against our generated adversarial URLs. Results: In comparison to previous works, our attack is: (i) effective as it evades all the models with an average success rate of 66% and 85% for famous (such as Netflix, Google) and less popular phishing targets (e.g., Wish, JBHIFI, Officeworks) respectively; (ii) realistic as it requires only 23ms to produce a new adversarial URL variant that is available for registration with a median cost of only $11.99/year. We also found that popular online services such as Google SafeBrowsing and VirusTotal are unable to detect these URLs. (iii) We find that Adversarial training (successful defence against evasion attack) does not significantly improve the robustness of these systems as it decreases the success rate of our attack by only 6% on average for all the models. (iv) Further, we identify the security vulnerabilities of the considered MLPU systems. Our findings lead to promising directions for future research. Conclusion: Our study not only illustrates vulnerabilities in MLPU systems but also highlights implications for future study towards assessing and improving these systems. Comment: Draft for ACM TOP

    Look Before You Leap: Detecting Phishing Web Pages by Exploiting Raw URL And HTML Characteristics

    Cybercriminals resort to phishing as a simple and cost-effective medium to perpetrate cyber-attacks on today's Internet. Recent studies in phishing detection are increasingly adopting automated feature selection over traditional manually engineered features. This transition is due to the inability of existing traditional methods to extrapolate their learning to new data. To this end, in this paper, we propose WebPhish, a deep learning technique using automatic feature selection extracted from the raw URL and HTML of a web page. This approach is the first of its kind, which uses the concatenation of URL and HTML embedding feature vectors as input into a Convolutional Neural Network model to detect phishing attacks on web pages. Extensive experiments on a real-world dataset yielded an accuracy of 98 percent, outperforming other state-of-the-art techniques. Also, WebPhish is a client-side strategy that is completely language-independent and can conduct lightweight phishing detection regardless of the web page's textual language
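
    The input construction described in the abstract above (URL and HTML embeddings concatenated before a convolutional model) can be sketched roughly as follows. This is a minimal illustration, not the WebPhish implementation: the character-level vocabulary, the sequence lengths, the embedding size, the randomly initialised embedding table, and every function name here are assumptions made for the example, and the CNN itself is omitted.

```python
import random
from typing import Dict, List

PAD_ID, UNK_ID = 0, 1  # reserved ids: padding and unseen characters (assumed convention)


def build_vocab(texts: List[str]) -> Dict[str, int]:
    """Map every character seen in the texts to an integer id (ids 0 and 1 are reserved)."""
    chars = sorted({ch for text in texts for ch in text})
    return {ch: i + 2 for i, ch in enumerate(chars)}


def encode(text: str, vocab: Dict[str, int], max_len: int) -> List[int]:
    """Truncate or right-pad a string to max_len character ids."""
    ids = [vocab.get(ch, UNK_ID) for ch in text[:max_len]]
    return ids + [PAD_ID] * (max_len - len(ids))


def make_embedding_table(vocab_size: int, dim: int, seed: int = 0) -> List[List[float]]:
    """Hypothetical stand-in for a learned embedding layer: one random vector per id."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(vocab_size)]


def embed(ids: List[int], table: List[List[float]]) -> List[List[float]]:
    """Look up the embedding vector for each id."""
    return [table[i] for i in ids]


def build_model_input(url: str, html: str, vocab: Dict[str, int],
                      table: List[List[float]], url_len: int, html_len: int) -> List[List[float]]:
    """Concatenate the URL and HTML embedding sequences along the sequence axis."""
    url_vectors = embed(encode(url, vocab, url_len), table)
    html_vectors = embed(encode(html, vocab, html_len), table)
    return url_vectors + html_vectors


# Illustrative inputs only; not taken from any dataset.
url = "http://example.test/login"
html = "<html><body><form action='/login'></form></body></html>"
vocab = build_vocab([url, html])
table = make_embedding_table(len(vocab) + 2, dim=4)
features = build_model_input(url, html, vocab, table, url_len=32, html_len=64)
print(len(features), len(features[0]))  # sequence length, embedding dimension
```

    The resulting matrix has one row per character position (URL positions first, then HTML positions) and could be passed to a 1D convolutional classifier. In a trained model the embedding table would be learned rather than random.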

    Deep Learning for Phishing Detection: Taxonomy, Current Challenges and Future Directions

    This work was supported in part by the Ministry of Higher Education under the Fundamental Research Grant Scheme under Grant FRGS/1/2018/ICT04/UTM/01/1; and in part by the Faculty of Informatics and Management, University of Hradec Kralove, through SPEV project under Grant 2102/2022. Phishing has become an increasing concern and captured the attention of end-users as well as security experts. Existing phishing detection techniques still suffer from the deficiency in performance accuracy and inability to detect unknown attacks despite decades of development and improvement. Motivated to solve these problems, many researchers in the cybersecurity domain have shifted their attention to phishing detection that capitalizes on machine learning techniques. Deep learning has emerged as a branch of machine learning that becomes a promising solution for phishing detection in recent years. As a result, this study proposes a taxonomy of deep learning algorithm for phishing detection by examining 81 selected papers using a systematic literature review approach. The paper first introduces the concept of phishing and deep learning in the context of cybersecurity. Then, taxonomies of phishing detection and deep learning algorithm are provided to classify the existing literature into various categories. Next, taking the proposed taxonomy as a baseline, this study comprehensively reviews the state-of-the-art deep learning techniques and analyzes their advantages as well as disadvantages. Subsequently, the paper discusses various issues that deep learning faces in phishing detection and proposes future research directions to overcome these challenges. Finally, an empirical analysis is conducted to evaluate the performance of various deep learning techniques in a practical context, and to highlight the related issues that motivate researchers in their future works. The results obtained from the empirical experiment showed that the common issues among most of the state-of-the-art deep learning algorithms are manual parameter-tuning, long training time, and deficient detection accuracy.

    Shielding against Web Application Attacks - Detection Techniques and Classification

    The field of IoT web applications is facing a range of security risks and system attacks due to the increasing complexity and size of home automation datasets. One of the primary concerns is the identification of Distributed Denial of Service (DDoS) attacks in home automation systems. Attackers can easily access various IoT web application assets by entering a home automation dataset or clicking a link, making them vulnerable to different types of web attacks. To address these challenges, the cloud has introduced the Edge of Things paradigm, which uses multiple concurrent deep models to enhance system stability and enable easy data revelation updates. Therefore, identifying malicious attacks is crucial for improving the reliability and security of IoT web applications. This paper uses a Machine Learning algorithm that can accurately identify web attacks using unique keywords. Smart home devices are classified into four classes based on their traffic predictability levels, and a neural system recognition model is proposed to classify these attacks with a high degree of accuracy, outperforming other classification models. The application of deep learning in identifying and classifying attacks has significant theoretical and scientific value for web security investigations. It also provides innovative ideas for intelligent security detection by classifying web visitors, making it possible to identify and prevent potential security threats

    Catching the Phish: Detecting Phishing Attacks using Recurrent Neural Networks (RNNs)

    The emergence of online services in our daily lives has been accompanied by a range of malicious attempts to trick individuals into performing undesired actions, often to the benefit of the adversary. The most popular medium of these attempts is phishing attacks, particularly through emails and websites. In order to defend against such attacks, there is an urgent need for automated mechanisms to identify this malevolent content before it reaches users. Machine learning techniques have gradually become the standard for such classification problems. However, identifying common measurable features of phishing content (e.g., in emails) is notoriously difficult. To address this problem, we engage in a novel study into a phishing content classifier based on a recurrent neural network (RNN), which identifies such features without human input. At this stage, we scope our research to emails, but our approach can be extended to apply to websites. Our results show that the proposed system outperforms state-of-the-art tools. Furthermore, our classifier is efficient and takes into account only the text and, in particular, the textual structure of the email. Since these features are rarely considered in email classification, we argue that our classifier can complement existing classifiers with high information gain

    CLASSIFICATION OF PHISHING ATTACKS IN SOCIAL MEDIA USING ASSOCIATIVE RULE MINING AUGMENTED WITH FIREFLY ALGORITHM

    Social media has grown significantly as a preferred medium of communication for individuals and groups. It is also a tool for disseminating information to the public. Social media offers several advantages, most notably the ability to reach millions of people at the same time. Social media attacks such as phishing evolved as a result of the messaging and disseminating capabilities of social media network sites. This challenge of continuous attacks has led many researchers to propose different techniques to detect and classify both phishing attacks and legitimate messages. Studies in the literature revealed that some of the models proposed for phishing attacks may not be sufficient to stop adversaries, and that there are still different phishing attacks that undermine the robustness of social media. This study proposed associative rule mining augmented with the Firefly algorithm, which attained a high degree of accuracy in classifying both phishing attack messages and legitimate messages.

    A Security Model for the Classification of Suspicious Data Using Machine Learning Techniques

    Cybercrime first emerged in 1981 and gained significant attention in the 20th century. The proliferation of technology and our increasing reliance on the internet have been major factors contributing to the growth of cybercrime. Different countries face varying types and levels of cyber-attacks, with developing countries often dealing with different types of attacks compared to developed countries. The response to cybercrime is usually based on the resources and technological capabilities available in each country. For example, sophisticated attacks involving machine learning may not be common in countries with limited technological advancements. Despite the variations in technology and resources, cybercrime remains a costly issue worldwide, with costs projected to reach around 8 trillion by 2023. Preventing and combating cybercrime has become crucial in our society. Machine learning techniques, such as convolutional neural networks (CNN), recurrent neural networks (RNN), and more, have gained popularity in the fight against cybercrime. Researchers and authors have made significant contributions to predicting and protecting against cybercrime. Nowadays, many corporations implement cyber defense strategies based on machine learning to safeguard their data. In this study, we utilized five different machine learning algorithms, namely CNN, LSTM, RNN, GRU, and MLP DNN, to address cybercrime. The models were trained and tested using the InSDN public dataset. Each model achieved a different level of training and test accuracy.