Know Your Phish: Novel Techniques for Detecting Phishing Sites and Their Targets
Phishing is a major problem on the Web. Despite the significant attention it has received over the years, there has been no definitive solution. While the state-of-the-art solutions have reasonably good performance, they require a large amount of training data and are not adept at detecting phishing attacks against new targets. In this paper, we begin with two core observations: (a) although phishers try to make a phishing webpage look similar to its target, they do not have unlimited freedom in structuring the phishing webpage, and (b) a webpage can be characterized by a small set of key terms, and how these key terms are used in different parts of a webpage differs between legitimate and phishing webpages. Based on these observations, we develop a phishing detection system with several notable properties: it requires very little training data, scales well to much larger test data, is language-independent, fast, resilient to adaptive attacks, and implemented entirely on the client side. In addition, we developed a target identification component that can identify the target website that a phishing webpage is attempting to mimic. The target detection component is faster than previously reported systems and can help minimize false positives in our phishing detection system.
Peer reviewed.
DeltaPhish: Detecting Phishing Webpages in Compromised Websites
The large-scale deployment of modern phishing attacks relies on the automatic
exploitation of vulnerable websites in the wild, to maximize profit while
hindering attack traceability, detection and blacklisting. To the best of our
knowledge, this is the first work that specifically leverages this adversarial
behavior for detection purposes. We show that phishing webpages can be
accurately detected by highlighting HTML code and visual differences with
respect to other (legitimate) pages hosted within a compromised website. Our
system, named DeltaPhish, can be installed as part of a web application
firewall, to detect the presence of anomalous content on a website after
compromise, and eventually prevent access to it. DeltaPhish is also robust
against adversarial attempts in which the HTML code of the phishing page is
carefully manipulated to evade detection. We empirically evaluate it on more
than 5,500 webpages collected in the wild from compromised websites, showing
that it is capable of detecting more than 99% of phishing webpages, while
misclassifying less than 1% of legitimate pages. We further show that the
detection rate remains higher than 70% even under very sophisticated attacks
carefully designed to evade our system.
Comment: Preprint version of the work accepted at ESORICS 201
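As a rough illustration of comparing a suspect page's HTML with a legitimate page hosted on the same site, the sketch below computes the Jaccard similarity between the sets of tag names used by two pages. This single feature and the helper names `tag_set` and `jaccard` are assumptions made for illustration; the abstract does not specify the features or classifier that DeltaPhish actually uses.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Collects the names of all opening tags seen in a document."""

    def __init__(self):
        super().__init__()
        self.tags = set()

    def handle_starttag(self, tag, attrs):
        self.tags.add(tag)


def tag_set(html):
    """Return the set of tag names used in an HTML string."""
    collector = TagCollector()
    collector.feed(html)
    return collector.tags


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical pages: a legitimate home page and a suspect page on the same host.
home = "<html><body><nav></nav><article><p>News</p></article></body></html>"
suspect = "<html><body><form><input><input></form></body></html>"

# A low value means the suspect page is structured unlike the rest of the site.
similarity = jaccard(tag_set(home), tag_set(suspect))
```

A real system would combine many such HTML and visual features and feed them to a trained classifier rather than relying on one similarity score.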
Predicting Phishing Websites using Neural Network trained with Back-Propagation
Phishing is increasing dramatically with the development of modern technologies and global computer networks. This results in the loss of customers' confidence in e-commerce and online banking, financial damages, and identity theft. Phishing is a fraudulent effort that aims to acquire sensitive information from users, such as credit card credentials and social security numbers. In this article, we propose a model for predicting phishing attacks based on an Artificial Neural Network (ANN). A Feed Forward Neural Network trained by the Back Propagation algorithm is developed to classify websites as phishing or legitimate. The suggested model handles noisy data well, is fault tolerant, and achieves high prediction accuracy with respect to false positive and false negative rates.
An Ideal Approach for Detection and Prevention of Phishing Attacks
Phishing is a treacherous attempt to steal personal information from internet users, such as bank account details, credit card information, social security numbers, employment details, and online shopping account passwords. Phishing, or stealing of sensitive information on the web, has dealt a major blow to Internet security in recent times. These attacks use spurious emails or websites designed to fool users into divulging personal financial data by emulating the trusted brands of well-known banks, e-commerce and credit card companies. In this paper, we propose a phishing detection and prevention approach combining URL-based and webpage-similarity-based detection. URL-based phishing detection involves extracting the actual URL (to which the website is actually directed) and the visual URL (which is visible to the user). The LinkGuard algorithm is used to analyze the two URLs, and depending on its result the procedure proceeds to the next phase. If phishing is not detected, or only a possibility of phishing is predicted, in URL-based detection, the algorithm proceeds to the visual-similarity-based detection. A novel technique to visually compare a suspicious page with the legitimate one is presented.
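The comparison of a link's visual URL with its actual URL can be sketched as follows. This is a minimal illustration of the idea only, not the LinkGuard algorithm itself: the helpers `host_of` and `mismatched_links` are hypothetical, and the rule (flag a link when the visible text names one host and the `href` points to another) is a simplifying assumption.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class AnchorCollector(HTMLParser):
    """Records (href, visible text) for every <a> element that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None
        self._text = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def host_of(text):
    """Return the hostname if text looks like a URL or bare host, else None."""
    if not text or any(ch.isspace() for ch in text):
        return None
    candidate = text if "://" in text else "http://" + text
    try:
        host = urlparse(candidate).hostname
    except ValueError:
        return None
    return host if host and "." in host else None


def mismatched_links(html):
    """List (visible text, href) pairs whose visual and actual hosts differ."""
    collector = AnchorCollector()
    collector.feed(html)
    flagged = []
    for href, text in collector.links:
        visual, actual = host_of(text), host_of(href)
        if visual and actual and visual != actual:
            flagged.append((text, href))
    return flagged


# Hypothetical page: the first link shows one host but points to another.
page = (
    '<a href="http://198.51.100.7/login">www.bank.example</a>'
    '<a href="https://www.bank.example/help">Help</a>'
)
flagged = mismatched_links(page)
```

Links whose visible text is not URL-like, such as "Help", are skipped here; a fuller detector would need further checks for those.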
PhishWHO: Phishing webpage detection via identity keywords extraction and target domain name finder
This paper proposes a phishing detection technique based on the difference between the target and actual
identities of a webpage. The proposed phishing detection approach, called PhishWHO, can be divided
into three phases. The first phase extracts identity keywords from the textual contents of the website,
where a novel weighted URL tokens system based on the N-gram model is proposed. The second phase
finds the target domain name by using a search engine, and the target domain name is selected based on
identity-relevant features. In the final phase, a 3-tier identity matching system is proposed to determine
the legitimacy of the query webpage. The overall experimental results suggest that the proposed system
outperforms the conventional phishing detection methods considered.
A Phishing Webpage Detection Method Based on Stacked Autoencoder and Correlation Coefficients
Phishing is a kind of cyber-attack that targets naive online users by tricking them into revealing sensitive information. Many anti-phishing solutions have been proposed to date, such as blacklist or whitelist, heuristic-based, and machine learning-based methods. However, online users are still being trapped into revealing sensitive information on phishing websites. In this paper, we propose a novel phishing webpage detection model based on features extracted from the URL, the HTML source code, and third-party services to represent the basic characteristics of phishing webpages; the model uses a deep learning method, the Stacked Autoencoder (SAE), to detect phishing webpages. To bring the features to the same order of magnitude, three kinds of normalization methods are adopted. In particular, a method that calculates correlation coefficients between the weight matrices of the SAE is proposed to determine the optimal width of the hidden layers, and it shows high computational efficiency and feasibility. Based on testing with a set of phishing and benign webpages, the model using SAE achieves the best performance when compared to other algorithms such as Naive Bayes (NB), Support Vector Machine (SVM), Convolutional Neural Networks (CNN), and Recurrent Neural Networks (RNN). This indicates that the proposed detection model is promising and can be applied effectively to phishing detection.
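The two numerical ingredients mentioned above, feature normalization and a correlation coefficient between weight matrices, can be sketched as below. The abstract names neither the three normalization methods nor the exact correlation measure, so min-max scaling and the Pearson coefficient over flattened matrices are assumptions, and the feature values and weights are made up for illustration.

```python
import math


def min_max(values):
    """Scale values linearly into [0, 1]; constant input maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences.

    Assumes neither sequence is constant (otherwise the variance is zero).
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def flatten(matrix):
    """Turn a list-of-rows matrix into a flat list of weights."""
    return [w for row in matrix for w in row]


# Hypothetical raw feature (URL length) brought into a common range.
url_lengths = [12, 54, 230]
scaled = min_max(url_lengths)

# Hypothetical weight matrices of equal shape; w_b is w_a scaled by 2,
# so the two are perfectly correlated.
w_a = [[0.1, 0.2], [0.3, 0.4]]
w_b = [[0.2, 0.4], [0.6, 0.8]]
r = pearson(flatten(w_a), flatten(w_b))
```

How such coefficients are turned into a choice of hidden-layer width is specific to the paper and is not reproduced here.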
High Accuracy Phishing Detection Based on Convolutional Neural Networks
The persistent growth in phishing and the rising volume of phishing websites have led to individuals and organizations worldwide becoming increasingly exposed to various cyber-attacks. Consequently, more effective phishing detection is required for improved cyber defence. Hence, in this paper we present a deep learning-based approach to enable high accuracy detection of phishing sites. The proposed approach utilizes convolutional neural networks (CNN) for high accuracy classification to distinguish genuine sites from phishing sites. We evaluate the models using a dataset obtained from 6,157 genuine and 4,898 phishing websites. Based on the results of extensive experiments, our CNN-based models proved to be highly effective in detecting unknown phishing sites. Furthermore, the CNN-based approach performed better than traditional machine learning classifiers evaluated on the same dataset, reaching a 98.2% phishing detection rate with an F1-score of 0.976. The method presented in this paper compares favourably to the state of the art in deep learning based phishing website detection.
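For readers unfamiliar with the reported metrics, the sketch below shows how a detection rate (recall on the phishing class) and an F1-score are derived from confusion-matrix counts. The function name `detection_metrics` is hypothetical and the counts are illustrative only; they are not the paper's results.

```python
def detection_metrics(tp, fp, fn):
    """Return (recall, precision, f1) for the phishing class.

    tp: phishing sites correctly flagged
    fp: genuine sites wrongly flagged as phishing
    fn: phishing sites that were missed
    """
    recall = tp / (tp + fn)        # detection rate
    precision = tp / (tp + fp)
    f1 = 2 * precision * recall / (precision + recall)
    return recall, precision, f1


# Illustrative counts (not taken from the paper).
recall, precision, f1 = detection_metrics(tp=90, fp=10, fn=10)
```

Because F1 is the harmonic mean of precision and recall, it drops quickly when either one is low, which is why it is often reported alongside the detection rate.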