4 research outputs found

    Cyber Threat Intelligence-Based Malicious URL Detection Model Using Ensemble Learning

    Get PDF
    Web applications have become ubiquitous for many business sectors due to their platform independence and low operation cost. Billions of users are visiting these applications to accomplish their daily tasks. However, many of these applications are either vulnerable to web defacement attacks or created and managed by hackers such as fraudulent and phishing websites. Detecting malicious websites is essential to prevent the spreading of malware and protect end-users from being victims. However, most existing solutions rely on extracting features from the website’s content which can be harmful to the detection machines themselves and subject to obfuscations. Detecting malicious Uniform Resource Locators (URLs) is safer and more efficient than content analysis. However, the detection of malicious URLs is still not well addressed due to insufficient features and inaccurate classification. This study aims at improving the detection accuracy of malicious URL detection by designing and developing a cyber threat intelligence-based malicious URL detection model using two-stage ensemble learning. The cyber threat intelligence-based features are extracted from web searches to improve detection accuracy. Cybersecurity analysts and users reports around the globe can provide important information regarding malicious websites. Therefore, cyber threat intelligence-based (CTI) features extracted from Google searches and Whois websites are used to improve detection performance. The study also proposed a two-stage ensemble learning model that combines the random forest (RF) algorithm for preclassification with multilayer perceptron (MLP) for final decision making. The trained MLP classifier has replaced the majority voting scheme of the three trained random forest classifiers for decision making. The probabilistic output of the weak classifiers of the random forest was aggregated and used as input for the MLP classifier for adequate classification. Results show that the extracted CTI-based features with the two-stage classification outperform other studies’ detection models. The proposed CTI-based detection model achieved a 7.8% accuracy improvement and 6.7% reduction in false-positive rates compared with the traditional URL-based model

    An investigation of phishing awareness and education over time: When and how to best remind users

    Get PDF
    Security awareness and education programmes are rolled out in more and more organisations. However, their effectiveness over time and, correspondingly, appropriate intervals to remind users’ awareness and knowledge are an open question. In an attempt to address this open question, we present a field investigation in a German organisation from the public administration sector. With overall 409 employees, we evaluated (a) the effectiveness of their newly deployed security awareness and education programme in the phishing context over time and (b) the effectiveness of four different reminder measures – administered after the initial effect had worn off to a degree that no significant improvement to before its deployment was detected anymore. We find a significantly improved performance of correctly identifying phishing and legitimate emails directly after and four months after the programme’s deployment. This was not the case anymore after six months, indicating that reminding users after half a year is recommended. The investigation of the reminder measures indicates that measures based on videos and interactive examples perform best, lasting for at least another six months

    An investigation of phishing awareness and education over time: When and how to best remind users

    Get PDF
    Security awareness and education programmes are rolled out in more and more organisations. However, their effectiveness over time and, correspondingly, appropriate intervals to remind users’ awareness and knowledge are an open question. In an attempt to address this open question, we present a field investigation in a German organisation from the public administration sector. With overall 409 employees, we evaluated (a) the effectiveness of their newly deployed security awareness and education programme in the phishing context over time and (b) the effectiveness of four different reminder measures – administered after the initial effect had worn off to a degree that no significant improvement to before its deployment was detected anymore. We find a significantly improved performance of correctly identifying phishing and legitimate emails directly after and four months after the programme’s deployment. This was not the case anymore after six months, indicating that reminding users after half a year is recommended. The investigation of the reminder measures indicates that measures based on videos and interactive examples perform best, lasting for at least another six months

    Multi-modal Features Representation-based Convolutional Neural Network Model for Malicious Website Detection

    Get PDF
    Web applications have proliferated across various business sectors, serving as essential tools for billions of users in their daily lives activities. However, many of these applications are malicious which is a major threat to Internet users as they can steal sensitive information, install malware, and propagate spam. Detecting malicious websites by analyzing web content is ineffective due to the complexity of extraction of the representative features, the huge data volume, the evolving nature of the malicious patterns, the stealthy nature of the attacks, and the limitations of traditional classifiers. Uniform Resource Locators (URL) features are static and can often provide immediate insights about the website without the need to load its content. However, existing solutions for detecting malicious web applications through web content analysis often struggle due to complex feature extraction, massive data volumes, evolving attack patterns, and limitations of traditional classifiers. Leveraging solely lexical URL features proves insufficient, potentially leading to inaccurate classifications. This study proposes a multimodal representation approach that fuses textual and image-based features to enhance the performance of the malicious website detection. Textual features facilitate the deep learning model’s ability to understand and represent detailed semantic information related to attack patterns, while image features are effective in recognizing more general malicious patterns. In doing so, patterns that are hidden in textual format may be recognizable in image format. Two Convolutional Neural Network (CNN) models were constructed to extract the hidden features from both textual and image-represented features. The output layers of both models were combined and used as input for an artificial neural network classifier for decision-making. Results show the effectiveness of the proposed model when compared to other models. The overall performance in terms of Matthews..
    corecore