An Evasion Attack against ML-based Phishing URL Detectors
Background: Over the years, Machine Learning Phishing URL classification
(MLPU) systems have gained tremendous popularity to detect phishing URLs
proactively. Despite this vogue, the security vulnerabilities of MLPUs remain
mostly unknown. Aim: To address this concern, we conduct a study to understand
the test time security vulnerabilities of the state-of-the-art MLPU systems,
aiming at providing guidelines for the future development of these systems.
Method: In this paper, we propose an evasion attack framework against MLPU
systems. To achieve this, we first develop an algorithm to generate adversarial
phishing URLs. We then reproduce 41 MLPU systems and record their baseline
performance. Finally, we simulate an evasion attack to evaluate these MLPU
systems against our generated adversarial URLs. Results: In comparison to
previous works, our attack is: (i) effective as it evades all the models with
an average success rate of 66% and 85% for famous (such as Netflix, Google) and
less popular phishing targets (e.g., Wish, JBHIFI, Officeworks) respectively;
(ii) realistic as it requires only 23ms to produce a new adversarial URL
variant that is available for registration with a median cost of only
$11.99/year. We also found that popular online services such as Google
SafeBrowsing and VirusTotal are unable to detect these URLs. (iii) We find that
adversarial training (a successful defence against evasion attacks) does not
significantly improve the robustness of these systems, as it decreases the
success rate of our attack by only 6% on average across all the models. (iv)
Further, we identify the security vulnerabilities of the considered MLPU
systems. Our findings lead to promising directions for future research.
Conclusion: Our study not only illustrates vulnerabilities in MLPU systems but
also highlights implications for future study towards assessing and improving
these systems.
Comment: Draft for ACM TOP
High Accuracy Phishing Detection Based on Convolutional Neural Networks
The persistent growth in phishing and the rising volume of phishing websites have led to individuals and organizations worldwide becoming increasingly exposed to various cyber-attacks. Consequently, more effective phishing detection is required for improved cyber defence. In this paper, we present a deep learning-based approach to enable high accuracy detection of phishing sites. The proposed approach uses convolutional neural networks (CNNs) to distinguish genuine sites from phishing sites. We evaluate the models using a dataset obtained from 6,157 genuine and 4,898 phishing websites. Based on the results of extensive experiments, our CNN-based models proved to be highly effective in detecting unknown phishing sites. Furthermore, the CNN-based approach performed better than traditional machine learning classifiers evaluated on the same dataset, reaching a 98.2% phishing detection rate with an F1-score of 0.976. The method presented in this paper compares favourably to the state of the art in deep learning-based phishing website detection.
Malicious URL Website Detection using Selective Hyper Feature Link Stability based on Soft-Max Deep Featured Convolution Neural Network
Web resources span many domains, each exposing Uniform Resource Locators (URLs) to different users. As the amount of information on the Internet grows, hackers carry out malicious activities through malicious websites reached via URL sub-links, and information theft is increasing accordingly. To analyze web features and identify malicious webpages with a deep learning approach, we propose a Selective Hyper Feature Link Stability Rate (SHFLSR) method based on a Soft-max Deep Featured Convolution Neural Network (SmDFCNN), which detects malicious websites from the actions performed and their feature responses. First, the URL Signature Frame Rate (USFR) is estimated to verify domain-specific hosting. Then link stability is confirmed from the post-response rate using the HyperLink Stability Post-Response State (LSPRS). Features are selected according to the Spectral Successive Domain Propagation Rate (S2DPR) and used to train a deep neural classifier, a Deep Featured Convolution Neural Network (DFCNN) with a logically defined Softmax-Logical Activator (SmLA). The proposed system achieves high performance by detecting malicious URLs based on the behavioral response of the domain, improving the detection rate, prediction rate, and classifier performance.
A method based on hierarchical spatiotemporal features for trojan traffic detection
Trojans are currently among the most threatening network attacks. HTTP-based
Trojans, in particular, account for a considerable proportion of them.
Moreover, as the network environment becomes more complex, HTTP-based Trojans are
more concealed than others. At present, many intrusion detection systems (IDSs)
find it increasingly difficult to detect such Trojan traffic effectively due to the
inherent shortcomings of the methods used and outdated training
data. Classical anomaly detection and traditional machine learning-based
(TML-based) anomaly detection depend heavily on expert knowledge to
extract features manually, which is difficult to do for HTTP-based
Trojan traffic detection. Deep learning-based (DL-based) anomaly detection has
been locally applied to IDSs, but it cannot be transferred directly to HTTP-based
Trojan traffic detection. To solve this problem, in this paper, we
propose a neural network detection model (HSTF-Model) based on hierarchical
spatiotemporal features of traffic. Meanwhile, we combine deep learning
algorithms with expert knowledge through feature encoders and statistical
characteristics to improve the self-learning ability of the model. Experiments
indicate that the F1 score of HSTF-Model can reach 99.4% on real traffic. In addition, we
present a dataset, BTHT, consisting of HTTP-based benign and Trojan traffic to
facilitate related research in the field.
Comment: 8 pages, 7 figures
An adaptive anomaly request detection framework based on dynamic web application profiles
A web application firewall is a highly effective tool for protecting the application and database layers of websites from attacks. This paper proposes a new web application firewall deployment method based on the Dynamic Web Application Profiling (DWAP) analysis technique, in which the firewall is deployed based on an analysis of website access data. DWAP is improved to integrate deeply into the structure of the website so that the anomaly detection system is better tailored to each website, thereby improving its ability to detect abnormal requests. To improve the compatibility of the web application firewall with the protected objects, the proposed system consists of two parts whose main tasks are: i) detecting abnormal access to the web application (WA); ii) semi-automatically updating the abnormal access detection system with attack data during WA access. This new method is applicable to real-time detection systems, where updating with new attack data is essential because web attacks are increasingly complex and sophisticated.
ConvXSS: a deep learning-based smart ICT framework against code injection attacks for HTML5 web applications in sustainable smart city infrastructure
In this paper, we propose ConvXSS, a novel deep learning approach for the detection of XSS and code injection attacks, followed by context-based sanitization of the malicious code if the model detects any in the application. First, we briefly discuss XSS and code injection attacks that might pose a threat to sustainable smart cities. We also discuss various approaches previously proposed for the detection and alleviation of these attacks, along with their respective limitations. We then propose our deep learning model, whose novelty lies in the approach followed for data pre-processing. Finally, we propose context-based sanitization to replace the malicious part of the code with sanitized code. In numerical experiments conducted on various datasets, the best model achieves an accuracy of 99.42%, a precision of 99.81%, and a recall of 99.35%. Compared with other state-of-the-art techniques in this domain, our approach shows results that are on par or, in the best case, better in terms of detection speed and accuracy for XSS attacks.
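The ConvXSS abstract reports precision and recall but no F1-score, while other entries in this listing report F1. As a rough way to compare them, the sketch below derives the F1-score implied by the reported precision and recall using the standard harmonic-mean definition. The function name `f1_score` is illustrative and not taken from the paper, and the sketch assumes both figures refer to the same best model and the same test set.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall, both given as fractions in [0, 1]."""
    if precision + recall == 0:
        # By convention, F1 is taken as 0 when both inputs are 0.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Figures quoted in the abstract above, converted from percentages to fractions.
reported_precision = 0.9981
reported_recall = 0.9935

implied_f1 = f1_score(reported_precision, reported_recall)
print(f"Implied F1: {implied_f1:.4f}")
```

Under these assumptions the implied F1-score is about 0.9958. It is a derived figure, not one reported by the authors.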