Predicting Phishing Websites using Neural Network trained with Back-Propagation
Phishing is increasing dramatically with the development of modern technologies and global computer networks. This results in the loss of customers’ confidence in e-commerce and online banking, financial damages, and identity theft. Phishing is a fraudulent effort that aims to acquire sensitive information from users, such as credit card credentials and social security numbers. In this article, we propose a model for predicting phishing attacks based on an Artificial Neural Network (ANN). A Feed Forward Neural Network trained by the Back Propagation algorithm is developed to classify websites as phishing or legitimate. The suggested model shows high tolerance for noisy data, fault tolerance, and high prediction accuracy with respect to false positive and false negative rates.
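As a rough illustration of the kind of model this abstract describes, the following minimal sketch trains a feed-forward network with one hidden layer by back-propagation. The three binary features, the toy data set, the hidden-layer size, and the learning rate are all assumptions made for illustration; the abstract does not specify the paper's actual features or architecture.

```python
import math
import random


def sigmoid(z):
    """Logistic activation used for both the hidden and the output layer."""
    return 1.0 / (1.0 + math.exp(-z))


class TinyFeedForwardNet:
    """One-hidden-layer network with a single sigmoid output (hypothetical sizes)."""

    def __init__(self, n_in, n_hidden, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        """Return (hidden activations, output probability) for one input vector."""
        h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        y = sigmoid(sum(w * hi for w, hi in zip(self.w2, h)) + self.b2)
        return h, y

    def train_step(self, x, target, lr):
        """One back-propagation update for the squared error 0.5 * (y - target)**2."""
        h, y = self.forward(x)
        # Error signal at the output unit (chain rule through the sigmoid).
        delta_out = (y - target) * y * (1.0 - y)
        # Propagate the error back to the hidden units, using the weights
        # as they were before this update.
        delta_hidden = [delta_out * w * hi * (1.0 - hi)
                        for w, hi in zip(self.w2, h)]
        for j, hj in enumerate(h):
            self.w2[j] -= lr * delta_out * hj
        self.b2 -= lr * delta_out
        for j, dj in enumerate(delta_hidden):
            for i, xi in enumerate(x):
                self.w1[j][i] -= lr * dj * xi
            self.b1[j] -= lr * dj

    def mean_loss(self, data):
        return sum(0.5 * (self.forward(x)[1] - t) ** 2 for x, t in data) / len(data)


# Toy data (assumption): three made-up binary URL features per site;
# label 1 = phishing, 0 = legitimate.
TOY_DATA = [
    ([0, 0, 1], 0), ([0, 0, 0], 0),
    ([1, 0, 1], 1), ([0, 1, 1], 1),
    ([1, 1, 0], 1), ([1, 0, 0], 1),
]

net = TinyFeedForwardNet(n_in=3, n_hidden=4, seed=0)
loss_before = net.mean_loss(TOY_DATA)
for _ in range(2000):
    for x, t in TOY_DATA:
        net.train_step(x, t, lr=0.5)
loss_after = net.mean_loss(TOY_DATA)
print(f"mean loss before: {loss_before:.4f}, after: {loss_after:.4f}")
```

A site would then be classified as phishing when the output of `net.forward(x)` exceeds a chosen threshold such as 0.5; that threshold is also an assumption here.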
An Evasion Attack against ML-based Phishing URL Detectors
Background: Over the years, Machine Learning Phishing URL classification
(MLPU) systems have gained tremendous popularity to detect phishing URLs
proactively. Despite this vogue, the security vulnerabilities of MLPUs remain
mostly unknown. Aim: To address this concern, we conduct a study to understand
the test time security vulnerabilities of the state-of-the-art MLPU systems,
aiming at providing guidelines for the future development of these systems.
Method: In this paper, we propose an evasion attack framework against MLPU
systems. To achieve this, we first develop an algorithm to generate adversarial
phishing URLs. We then reproduce 41 MLPU systems and record their baseline
performance. Finally, we simulate an evasion attack to evaluate these MLPU
systems against our generated adversarial URLs. Results: In comparison to
previous works, our attack is: (i) effective as it evades all the models with
an average success rate of 66% and 85% for famous (such as Netflix, Google) and
less popular phishing targets (e.g., Wish, JBHIFI, Officeworks) respectively;
(ii) realistic as it requires only 23ms to produce a new adversarial URL
variant that is available for registration with a median cost of only
$11.99/year. We also found that popular online services such as Google
SafeBrowsing and VirusTotal are unable to detect these URLs. (iii) We find that
adversarial training (a successful defence against evasion attacks) does not
significantly improve the robustness of these systems as it decreases the
success rate of our attack by only 6% on average for all the models. (iv)
Further, we identify the security vulnerabilities of the considered MLPU
systems. Our findings lead to promising directions for future research.
Conclusion: Our study not only illustrates vulnerabilities in MLPU systems but
also highlights implications for future study towards assessing and improving
these systems.
Comment: Draft for ACM TOP
Detecting and characterizing lateral phishing at scale
We present the first large-scale characterization of lateral phishing attacks, based on a dataset of 113 million employee-sent emails from 92 enterprise organizations. In a lateral phishing attack, adversaries leverage a compromised enterprise account to send phishing emails to other users, benefiting from both the implicit trust and the information in the hijacked user's account. We develop a classifier that finds hundreds of real-world lateral phishing emails, while generating under four false positives per million employee-sent emails. Drawing on the attacks we detect, as well as a corpus of user-reported incidents, we quantify the scale of lateral phishing, identify several thematic content and recipient targeting strategies that attackers follow, illuminate two types of sophisticated behaviors that attackers exhibit, and estimate the success rate of these attacks. Collectively, these results expand our mental models of the 'enterprise attacker' and shed light on the current state of enterprise phishing attacks.
Phishing Detection Using Natural Language Processing and Machine Learning
Phishing emails are a primary mode of entry for attackers into an organization. A successful phishing attempt leads to unauthorized access to sensitive information and systems. However, automatically identifying phishing emails is often difficult since many phishing emails have composite features such as body text and metadata that are nearly indistinguishable from valid emails. This paper presents a novel machine learning-based framework, the DARTH framework, that characterizes and combines multiple models, with one model for each composite feature, to enable the accurate identification of phishing emails. The framework analyses each composite feature independently, using Natural Language Processing (NLP) and neural network-based techniques, and combines the results of these analyses to classify the emails as malicious or legitimate. Utilizing the framework on more than 150,000 emails and training data from multiple sources, including the authors’ emails and phishtank.com, resulted in a precision (the ratio of correctly identified malicious observations to all observations predicted as malicious) of 99.97% with an f-score of 99.98%, accurately identifying phishing emails 99.98% of the time. Utilizing multiple machine learning techniques combined in an ensemble approach across a range of composite features yields highly accurate identification of phishing emails.
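The precision defined in this abstract, and the f-score reported alongside it, can be computed from confusion-matrix counts. The short sketch below shows one way to do so, assuming the f-score is the usual F1 measure (the harmonic mean of precision and recall). The function name and the counts are hypothetical and are not taken from the paper.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 from confusion-matrix counts.

    tp: malicious emails correctly flagged as malicious
    fp: legitimate emails wrongly flagged as malicious
    fn: malicious emails that were missed
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        f1 = 0.0
    else:
        f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts, for illustration only.
p, r, f1 = precision_recall_f1(tp=90, fp=10, fn=30)
print(f"precision={p:.4f} recall={r:.4f} f1={f1:.4f}")
```

Because F1 is a harmonic mean, it is high only when both precision and recall are high, so a score close to the precision implies a similarly high recall.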
BERT4ETH: A Pre-trained Transformer for Ethereum Fraud Detection
As various forms of fraud proliferate on Ethereum, it is imperative to
safeguard against these malicious activities to protect susceptible users from
being victimized. While current studies solely rely on graph-based fraud
detection approaches, it is argued that they may not be well-suited for dealing
with highly repetitive, skew-distributed and heterogeneous Ethereum
transactions. To address these challenges, we propose BERT4ETH, a universal
pre-trained Transformer encoder that serves as an account representation
extractor for detecting various fraud behaviors on Ethereum. BERT4ETH features
the superior modeling capability of Transformer to capture the dynamic
sequential patterns inherent in Ethereum transactions, and addresses the
challenges of pre-training a BERT model for Ethereum with three practical and
effective strategies, namely repetitiveness reduction, skew alleviation and
heterogeneity modeling. Our empirical evaluation demonstrates that BERT4ETH
outperforms state-of-the-art methods with significant enhancements in terms of
the phishing account detection and de-anonymization tasks. The code for
BERT4ETH is available at: https://github.com/git-disl/BERT4ETH.
Comment: the Web conference (WWW) 202
Honey Sheets: What Happens to Leaked Google Spreadsheets?
Cloud-based documents are inherently valuable, due to the volume and nature
of sensitive personal and business content stored in them. Despite the
importance of such documents to Internet users, there are still large gaps in
the understanding of what cybercriminals do when they illicitly get access to
them by, for example, compromising the account credentials they are associated
with. In this paper, we present a system able to monitor user activity on
Google spreadsheets. We populated 5 Google spreadsheets with fake bank account
details and fake funds transfer links. Each spreadsheet was configured to
report details of accesses and clicks on links back to us. To study how people
interact with these spreadsheets in case they are leaked, we posted unique
links pointing to the spreadsheets on a popular paste site. We then monitored
activity in the accounts for 72 days, and observed 165 accesses in total. We
were able to observe interesting modifications to these spreadsheets performed
by illicit accesses. For instance, we observed deletion of some fake bank
account information, in addition to insults and warnings that some visitors
entered in some of the spreadsheets. Our preliminary results show that our
system can be used to shed light on cybercriminal behavior with regard to
leaked online documents.