
    An Evasion Attack against ML-based Phishing URL Detectors

    Background: Over the years, Machine Learning Phishing URL classification (MLPU) systems have gained tremendous popularity for detecting phishing URLs proactively. Despite this vogue, the security vulnerabilities of MLPUs remain mostly unknown. Aim: To address this concern, we conduct a study to understand the test-time security vulnerabilities of state-of-the-art MLPU systems, aiming to provide guidelines for the future development of these systems. Method: In this paper, we propose an evasion attack framework against MLPU systems. To achieve this, we first develop an algorithm to generate adversarial phishing URLs. We then reproduce 41 MLPU systems and record their baseline performance. Finally, we simulate an evasion attack to evaluate these MLPU systems against our generated adversarial URLs. Results: In comparison to previous works, our attack is: (i) effective, as it evades all the models with an average success rate of 66% and 85% for famous (e.g., Netflix, Google) and less popular phishing targets (e.g., Wish, JBHIFI, Officeworks), respectively; (ii) realistic, as it requires only 23 ms to produce a new adversarial URL variant that is available for registration at a median cost of only $11.99/year. We also found that popular online services such as Google SafeBrowsing and VirusTotal are unable to detect these URLs. (iii) We find that adversarial training (a common defence against evasion attacks) does not significantly improve the robustness of these systems, as it decreases the success rate of our attack by only 6% on average across all models. (iv) Further, we identify the security vulnerabilities of the considered MLPU systems. Our findings lead to promising directions for future research. Conclusion: Our study not only illustrates vulnerabilities in MLPU systems but also highlights implications for future work on assessing and improving these systems.
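
    As a hedged illustration of the kind of perturbation such an attack might apply, the sketch below generates look-alike URL variants via homoglyph substitution; the function names, substitution table, and URL template are hypothetical and are not taken from the paper's actual algorithm.

```python
# Hypothetical sketch of adversarial URL variant generation (not the
# paper's algorithm). Homoglyph substitution is one common evasion
# tactic against lexical URL classifiers.
import itertools

HOMOGLYPHS = {"o": "0", "l": "1", "e": "3", "a": "4", "s": "5"}

def homoglyph_variants(domain: str, max_subs: int = 2):
    """Yield domains with up to `max_subs` look-alike character swaps."""
    positions = [i for i, c in enumerate(domain) if c in HOMOGLYPHS]
    for k in range(1, max_subs + 1):
        for combo in itertools.combinations(positions, k):
            chars = list(domain)
            for i in combo:
                chars[i] = HOMOGLYPHS[chars[i]]
            yield "".join(chars)

def candidate_urls(target: str):
    """Wrap each perturbed domain in a plausible phishing URL template."""
    for variant in homoglyph_variants(target):
        yield f"https://{variant}-login.example/account/verify"

# Example: perturbations of a well-known brand name (illustrative only).
for url in itertools.islice(candidate_urls("netflix"), 5):
    print(url)
```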

    High Accuracy Phishing Detection Based on Convolutional Neural Networks

    The persistent growth in phishing and the rising volume of phishing websites have led to individuals and organizations worldwide becoming increasingly exposed to various cyber-attacks. Consequently, more effective phishing detection is required for improved cyber defence. Hence, in this paper we present a deep learning-based approach to enable high-accuracy detection of phishing sites. The proposed approach utilizes convolutional neural networks (CNN) for high-accuracy classification to distinguish genuine sites from phishing sites. We evaluate the models using a dataset obtained from 6,157 genuine and 4,898 phishing websites. Based on the results of extensive experiments, our CNN-based models proved to be highly effective in detecting unknown phishing sites. Furthermore, the CNN-based approach performed better than traditional machine learning classifiers evaluated on the same dataset, reaching a 98.2% phishing detection rate with an F1-score of 0.976. The method presented in this paper compares favourably to the state-of-the-art in deep learning-based phishing website detection.
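
    The abstract does not give the exact network architecture, so the following is a minimal character-level CNN classifier sketch in PyTorch; the character-level input encoding, layer sizes, and sequence length are assumptions made for illustration, not the authors' published configuration.

```python
# Minimal character-level CNN classifier sketch (PyTorch); layer sizes
# and the character-level input are illustrative assumptions.
import torch
import torch.nn as nn

class PhishCNN(nn.Module):
    def __init__(self, vocab_size=128, embed_dim=32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.conv = nn.Sequential(
            nn.Conv1d(embed_dim, 64, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.MaxPool1d(2),
            nn.Conv1d(64, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveMaxPool1d(1),
        )
        self.fc = nn.Linear(64, 2)  # genuine vs. phishing

    def forward(self, x):                    # x: (batch, seq_len) char codes
        e = self.embed(x).transpose(1, 2)    # -> (batch, embed_dim, seq_len)
        return self.fc(self.conv(e).squeeze(-1))

# Toy forward pass on ASCII-encoded inputs, truncated/padded to 200 chars.
batch = torch.randint(0, 128, (4, 200))
logits = PhishCNN()(batch)
print(logits.shape)  # torch.Size([4, 2])
```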

    Malicious URL Website Detection using Selective Hyper Feature Link Stability based on Soft-Max Deep Featured Convolution Neural Network

    Web resources contain many domains with different Uniform Resource Locators (URLs). As the amount of information on Internet resources grows, hackers carry out malicious activities by placing malicious websites behind URL sub-links, and growing information theft leaves data sources spread across huge hosting mediums. To analyse web features and identify malicious webpages with a deep learning approach, we propose a Selective Hyper Feature Link Stability Rate (SHFLSR) based on a Soft-max Deep Featured Convolution Neural Network (SmDFCNN) for detecting malicious websites, depending on the actions performed and their feature responses. Initially, the URL Signature Frame Rate (USFR) is estimated to verify domain-specific hosting. Link stability is then confirmed via the post-response rate using the HyperLink Stability Post-Response State (LSPRS). Depending on the Spectral Successive Domain Propagation Rate (S2DPR), features are selected and trained with a deep neural classifier using a logically defined Softmax Logical Activator (SmLA) within a Deep Featured Convolution Neural Network (DFCNN). The proposed system achieves high performance by detecting malicious URLs based on the behavioural response of the domain, increasing the detection rate, prediction rate, and classifier performance.
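
    Since the abstract does not define USFR, LSPRS, or S2DPR precisely, the sketch below uses generic lexical URL features feeding a small softmax layer as a stand-in for the feature-based classification step; the feature set and hand-set weights are illustrative assumptions rather than the paper's pipeline.

```python
# Illustrative stand-in for a feature-based softmax URL classifier;
# the features and weights below are assumptions, not the paper's method.
import math
from urllib.parse import urlparse

def url_features(url: str) -> list[float]:
    """Crude lexical features standing in for the paper's hyper features."""
    p = urlparse(url)
    host = p.hostname or ""
    return [
        len(url),                                          # overall URL length
        host.count("."),                                   # subdomain depth
        sum(c.isdigit() for c in url) / max(len(url), 1),  # digit ratio
        float("@" in url or "-" in host),                  # suspicious punctuation
    ]

def softmax(logits: list[float]) -> list[float]:
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    s = sum(exps)
    return [e / s for e in exps]

# Hand-set linear layer for demonstration only; in practice the weights
# would be learned by the deep classifier described in the paper.
WEIGHTS = [[0.01, 0.5, 2.0, 1.5],      # "malicious" row
           [-0.01, -0.5, -2.0, -1.5]]  # "benign" row
BIAS = [0.0, 0.5]

def classify(url: str) -> list[float]:
    f = url_features(url)
    logits = [sum(w * x for w, x in zip(row, f)) + b
              for row, b in zip(WEIGHTS, BIAS)]
    return softmax(logits)

print(classify("http://paypa1-login.example.com/@verify?id=123456"))
```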

    A method based on hierarchical spatiotemporal features for trojan traffic detection

    Trojans are currently among the most threatening network attacks, and HTTP-based Trojans in particular account for a considerable proportion of them. Moreover, as the network environment becomes more complex, HTTP-based Trojans are more concealed than others. At present, many intrusion detection systems (IDSs) find it increasingly difficult to detect such Trojan traffic effectively, due to the inherent shortcomings of the methods used and the backwardness of their training data. Classical anomaly detection and traditional machine learning-based (TML-based) anomaly detection depend heavily on expert knowledge to extract features manually, which is difficult to implement for HTTP-based Trojan traffic detection. Deep learning-based (DL-based) anomaly detection has been applied locally to IDSs, but it cannot be transplanted to HTTP-based Trojan traffic detection directly. To solve this problem, in this paper we propose a neural network detection model (HSTF-Model) based on hierarchical spatiotemporal features of traffic. Meanwhile, we combine deep learning algorithms with expert knowledge through feature encoders and statistical characteristics to improve the self-learning ability of the model. Experiments indicate that the F1-score of HSTF-Model can reach 99.4% on real traffic. In addition, we present a dataset, BTHT, consisting of HTTP-based benign and Trojan traffic to facilitate related research in the field.
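
    A common way to realise hierarchical spatiotemporal features is a per-packet CNN whose outputs feed a flow-level LSTM, concatenated with statistical features before classification; the sketch below follows that pattern, but its layer sizes and input shapes are assumptions rather than the published HSTF-Model configuration.

```python
# Hedged sketch of a hierarchical spatiotemporal model (PyTorch): a CNN
# encodes each packet's bytes (spatial), an LSTM aggregates the packet
# sequence (temporal), and statistical features are concatenated before
# classification. Sizes are assumptions, not the HSTF-Model's.
import torch
import torch.nn as nn

class HSTFSketch(nn.Module):
    def __init__(self, stat_dim=16):
        super().__init__()
        self.pkt_cnn = nn.Sequential(
            nn.Conv1d(1, 32, kernel_size=5, padding=2), nn.ReLU(),
            nn.AdaptiveMaxPool1d(1),
        )
        self.flow_lstm = nn.LSTM(32, 64, batch_first=True)
        self.head = nn.Linear(64 + stat_dim, 2)  # benign vs. Trojan

    def forward(self, packets, stats):
        # packets: (batch, n_pkts, pkt_len) normalized byte values
        b, n, l = packets.shape
        x = self.pkt_cnn(packets.view(b * n, 1, l)).view(b, n, 32)
        _, (h, _) = self.flow_lstm(x)            # h: (1, batch, 64)
        return self.head(torch.cat([h[-1], stats], dim=1))

model = HSTFSketch()
out = model(torch.rand(8, 10, 256), torch.rand(8, 16))
print(out.shape)  # torch.Size([8, 2])
```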

    An adaptive anomaly request detection framework based on dynamic web application profiles

    A web application firewall is highly effective in protecting the application layer and database layer of websites from attacks. This paper proposes a new web application firewall deployment method based on a Dynamic Web Application Profiling (DWAP) analysis technique, a method that deploys a firewall based on analysing website access data. DWAP is improved to integrate deeply into the structure of the website, increasing the compatibility of the anomaly detection system with each website and thereby improving the ability to detect abnormal requests. To improve the compatibility of the web application firewall with protected objects, the proposed system consists of two parts whose main tasks are: i) detect abnormal accesses in web application (WA) traffic; ii) semi-automatically update the attack data in the abnormal access detection system during WA access. This new method is applicable in real-time detection systems, where updating with new attack data is essential since web attacks are increasingly complex and sophisticated.
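
    To make the profile idea concrete, the sketch below learns per-parameter length bounds and character classes from benign requests and flags deviations; it is a deliberate simplification, and the class names and thresholds are assumptions, not the DWAP profiles described in the paper.

```python
# Minimal sketch of profile-based anomaly detection for web requests;
# the per-parameter length/character-class profile is an illustrative
# simplification of a web application profile.
import re
from collections import defaultdict

class EndpointProfile:
    """Learn per-parameter length bounds and character classes from benign traffic."""
    def __init__(self):
        self.params = defaultdict(lambda: {"max_len": 0, "charset": set()})

    def learn(self, path: str, params: dict[str, str]):
        for name, value in params.items():
            p = self.params[(path, name)]
            p["max_len"] = max(p["max_len"], len(value))
            p["charset"].update(re.sub(r"[a-zA-Z0-9]", "A", value))  # fold letters/digits

    def is_anomalous(self, path: str, params: dict[str, str]) -> bool:
        for name, value in params.items():
            key = (path, name)
            if key not in self.params:
                return True                      # unseen parameter
            p = self.params[key]
            folded = set(re.sub(r"[a-zA-Z0-9]", "A", value))
            if len(value) > 2 * p["max_len"] or not folded <= p["charset"]:
                return True                      # unusual length or symbols
        return False

profile = EndpointProfile()
profile.learn("/login", {"user": "alice", "next": "/home"})
print(profile.is_anomalous("/login", {"user": "bob' OR '1'='1", "next": "/home"}))  # True
```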

    ConvXSS: a deep learning-based smart ICT framework against code injection attacks for HTML5 web applications in sustainable smart city infrastructure

    In this paper we propose ConvXSS, a novel deep learning approach for the detection of XSS and code injection attacks, followed by context-based sanitization of the malicious code if the model detects any malicious code in the application. Firstly, we briefly discuss XSS and code injection attacks that may pose a threat to sustainable smart cities. Along with this, we discuss various approaches proposed previously for the detection and alleviation of these attacks, followed by their respective limitations. We then propose our deep learning model, whose novelty lies in the approach followed for data pre-processing. Finally, we propose context-based sanitization to replace the malicious part of the code with sanitized code. Numerical experiments conducted on various datasets show that the best model achieves an accuracy of 99.42%, a precision of 99.81%, and a recall of 99.35%. When compared with other state-of-the-art techniques in this domain, our approach shows on-par or, in the best case, better results in terms of detection speed and accuracy for XSS attacks.
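
    As a rough illustration of context-based sanitization, the sketch below escapes flagged input according to the HTML context it will be inserted into; the context labels and escaping rules are assumptions for illustration and do not reproduce ConvXSS's actual sanitizer.

```python
# Hedged sketch of context-based sanitization: escape flagged input
# according to the output context it lands in (not ConvXSS's sanitizer).
import html
from urllib.parse import quote

def sanitize(value: str, context: str) -> str:
    """Escape `value` for the output context where it will be inserted."""
    if context == "html_body":          # e.g. <p>{value}</p>
        return html.escape(value, quote=False)
    if context == "attribute":          # e.g. <input value="{value}">
        return html.escape(value, quote=True)
    if context == "url":                # e.g. <a href="/search?q={value}">
        return quote(value, safe="")
    raise ValueError(f"unknown context: {context}")

payload = "<script>alert(1)</script>"
print(sanitize(payload, "html_body"))   # angle brackets HTML-escaped
print(sanitize(payload, "url"))         # percent-encoded for a URL parameter
```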