5 research outputs found
Detecting Phishing Websites Using Associative Classification
Phishing is a criminal technique that employs both social engineering and technical subterfuge to steal consumers' personal identity data and financial account credentials. A phishing website aims to steal victims' personal information by luring them to a fake webpage that looks like the genuine page of a legitimate bank or company and asks them to enter personal information such as their username, account number, password, or credit card number. The main goal of this paper is to investigate the potential use of automated data mining techniques for detecting phishing websites, a complex problem, in order to protect users from being deceived or hacked through the theft of their personal information and passwords, which can lead to catastrophic consequences. Experiments were conducted on phishing data sets using common associative classification algorithms (MCAR and CBA) and traditional learning approaches, and were compared in terms of classification accuracy. The results show that the MCAR and CBA algorithms outperformed SVM and algorithms. Keywords: Phishing Websites, Data Mining, Associative Classification, Machine Learning
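As a rough illustration of the associative classification idea mentioned in the abstract above, the following sketch mines class association rules of the form "feature values -> class" using support and confidence thresholds, then classifies with the first matching rule. It is a toy under stated assumptions, not the MCAR or CBA algorithm: the feature names (ip_in_url, https), the data, the thresholds and the function names are all hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each row is a website described by categorical features.
ROWS = [
    {"ip_in_url": "yes", "https": "no"},
    {"ip_in_url": "yes", "https": "no"},
    {"ip_in_url": "yes", "https": "yes"},
    {"ip_in_url": "no", "https": "yes"},
    {"ip_in_url": "no", "https": "yes"},
    {"ip_in_url": "no", "https": "no"},
]
LABELS = ["phishing", "phishing", "phishing",
          "legitimate", "legitimate", "legitimate"]


def mine_class_rules(rows, labels, min_support=0.3, min_confidence=0.8, max_len=2):
    """Return rules (itemset, label, support, confidence), best rules first."""
    n = len(rows)
    itemset_counts = Counter()  # how often an itemset occurs at all
    rule_counts = Counter()     # how often an itemset occurs together with a label
    for row, label in zip(rows, labels):
        items = sorted(row.items())
        for size in range(1, max_len + 1):
            for itemset in combinations(items, size):
                itemset_counts[itemset] += 1
                rule_counts[(itemset, label)] += 1

    rules = []
    for (itemset, label), count in rule_counts.items():
        support = count / n                        # share of all rows matching the rule
        confidence = count / itemset_counts[itemset]  # share of itemset rows with this label
        if support >= min_support and confidence >= min_confidence:
            rules.append((itemset, label, support, confidence))

    # Rank by confidence, then support, then prefer shorter (more general) rules.
    rules.sort(key=lambda r: (-r[3], -r[2], len(r[0])))
    return rules


def classify(rules, row, default):
    """Apply the highest-ranked rule whose itemset is contained in the row."""
    items = set(row.items())
    for itemset, label, _support, _confidence in rules:
        if items.issuperset(itemset):
            return label
    return default


RULES = mine_class_rules(ROWS, LABELS)
for itemset, label, support, confidence in RULES:
    print(dict(itemset), "->", label, round(support, 2), round(confidence, 2))
print(classify(RULES, {"ip_in_url": "yes", "https": "yes"}, default="legitimate"))
```

Ranking rules by confidence and then support is one common convention for ordering class association rules; real associative classifiers add pruning and more careful rule selection that this sketch leaves out.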
Tutorial and Critical Analysis of Phishing Websites Methods
The Internet has become an essential component of our everyday social and financial activities. It is important not only for individual users but also for organizations, because organizations that offer online trading can achieve a competitive edge by serving clients worldwide. The Internet makes it possible to reach customers all over the globe without any marketplace restrictions and with effective use of e-commerce. As a result, the number of customers who rely on the Internet to make purchases is increasing dramatically. Hundreds of millions of dollars are transferred through the Internet every day, and this amount of money tempts fraudsters to carry out their fraudulent operations. Hence, Internet users may be vulnerable to different types of web threats, which may cause financial damage, identity theft, loss of private information, damage to brand reputation and loss of customers' confidence in e-commerce and online banking. As a result, the suitability of the Internet for commercial transactions becomes doubtful. Phishing is a form of web threat defined as the art of impersonating the website of an honest enterprise with the aim of obtaining users' confidential credentials such as usernames, passwords and social security numbers. In this article, the phishing phenomenon is discussed in detail. In addition, we present a survey of state-of-the-art research on such attacks. Moreover, we aim to identify up-to-date developments in phishing and its precautionary measures, and to provide a comprehensive study and evaluation of this research in order to identify the gap that still exists in this area. This research focuses mostly on web-based phishing detection methods rather than email-based detection methods.
Phishing website detection using intelligent data mining techniques. Design and development of an intelligent association classification mining fuzzy based scheme for phishing website detection with an emphasis on E-banking.
Phishing techniques have grown not only in number but also in sophistication. Phishers have
many approaches and tactics at their disposal for conducting a well-designed phishing attack. The
targets of phishing attacks, mainly online banking consumers and payment service providers,
face substantial financial loss and a loss of trust in Internet-based services. To overcome
these problems, there is an urgent need for solutions that combat phishing attacks.
Detecting phishing websites is a complex task that requires significant expert knowledge and
experience. So far, various solutions have been proposed and developed to address these
problems. Most of these approaches cannot decide dynamically whether a site is in fact a
phishing site, which gives rise to a large number of false positives. This is mainly due to
limitations of the previously proposed approaches, such as reliance on fixed blacklist and
whitelist databases, the absence of human intelligence and expert knowledge, poor scalability
and poor timeliness.
In this research we investigated and developed an intelligent fuzzy-based
classification system for e-banking phishing website detection. The main aim of the proposed
system is to protect users from phishers' deception tricks by giving them the ability
to assess the legitimacy of websites. The proposed intelligent phishing detection system
employed a Fuzzy Logic (FL) model with association classification mining algorithms. The
approach combined the capability of fuzzy reasoning to measure imprecise and dynamic
phishing features with the capability to classify the phishing fuzzy rules.
Different phishing experiments, which cover all phishing attacks, motivations and deception
behaviour techniques, have been conducted to address all phishing concerns. A layered fuzzy
structure has been constructed for all gathered and extracted phishing website features and
patterns. These have been divided into 6 criteria and distributed across 3 layers, based on their
attack type. To reduce human knowledge intervention, different classification and association
algorithms have been implemented to generate fuzzy phishing rules automatically, which are then
integrated into the fuzzy inference engine for the final phishing detection.
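To make the fuzzy reasoning step above more concrete, here is a minimal single-layer sketch that fuzzifies two risk indicators, fires a small rule base and defuzzifies the result into one phishing risk score. The two criteria names are taken from the abstract; everything else is an assumption for illustration: the [0, 1] input scale, the triangular membership functions, the hand-written rule table (the thesis generates its rules automatically) and the weighted-average defuzzification. The thesis itself uses 6 criteria arranged in 3 layers, which this sketch does not reproduce.

```python
def triangular(x, a, b, c):
    """Triangular membership function: 0 at a and c, peak of 1 at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def fuzzify(x):
    """Map a crisp indicator in [0, 1] to memberships in three fuzzy sets."""
    return {
        "low": triangular(x, -0.5, 0.0, 0.5),
        "medium": triangular(x, 0.0, 0.5, 1.0),
        "high": triangular(x, 0.5, 1.0, 1.5),
    }


# Hypothetical rule base: (URL & Domain Identity, Security & Encryption) -> verdict.
RULES = {
    ("low", "low"): "legitimate",
    ("low", "medium"): "legitimate",
    ("low", "high"): "suspicious",
    ("medium", "low"): "legitimate",
    ("medium", "medium"): "suspicious",
    ("medium", "high"): "phishing",
    ("high", "low"): "suspicious",
    ("high", "medium"): "phishing",
    ("high", "high"): "phishing",
}

# Assumed crisp value represented by each verdict, used for defuzzification.
CENTERS = {"legitimate": 0.0, "suspicious": 0.5, "phishing": 1.0}


def phishing_risk(url_domain_risk, security_risk):
    """Return a risk score in [0, 1] from two crisp risk indicators in [0, 1]."""
    url_sets = fuzzify(url_domain_risk)
    sec_sets = fuzzify(security_risk)

    strengths = {verdict: 0.0 for verdict in CENTERS}
    for (url_term, sec_term), verdict in RULES.items():
        firing = min(url_sets[url_term], sec_sets[sec_term])  # fuzzy AND
        strengths[verdict] = max(strengths[verdict], firing)  # fuzzy OR across rules

    total = sum(strengths.values())
    if total == 0.0:
        return 0.5  # no rule fired (inputs outside [0, 1]); fall back to "suspicious"
    return sum(strengths[v] * CENTERS[v] for v in CENTERS) / total


print(phishing_risk(0.75, 0.75))
```

For inputs halfway between "medium" and "high" on both criteria, the "suspicious" and "phishing" verdicts fire with equal strength, so the score lands between their two centres.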
Experimental results demonstrated the ability of the learning approach to identify all
relevant fuzzy rules from the training data set. A comparative study and analysis showed that
the proposed learning approach has a higher degree of predictive and detective capability than
existing models. Experiments also showed the significance of some important phishing criteria, such as
URL & Domain Identity and Security & Encryption, to the final phishing detection rate.
Finally, our proposed intelligent phishing website detection system was developed, tested and
validated by incorporating the scheme into a web-based plug-in phishing toolbar. The results
obtained are promising and showed that our intelligent fuzzy-based classification detection
system can provide effective help for real-time phishing website detection. The toolbar
successfully recognized and detected approximately 92% of the phishing websites selected from
our test data set, avoiding many misclassified websites and false phishing alarms.
An Ensemble Self-Structuring Neural Network Approach to Solving Classification Problems with Virtual Concept Drift and its Application to Phishing Websites
Classification in data mining is one of the well-known tasks that aim to construct a
classification model from a labelled input data set. Most classification models are
devoted to a static environment where the complete training data set is presented to the
classification algorithm. This data set is assumed to cover all information needed to
learn the pertinent concepts (rules and patterns) related to how to classify unseen
examples to predefined classes. However, in dynamic (non-stationary) domains, the set
of features (input data attributes) may change over time. For instance, some features
that are considered significant at time T_i might become useless or irrelevant at time T_{i+j}.
This situation results in a phenomenon called Virtual Concept Drift. Yet, the set of
features that are dropped at time T_{i+j} might become significant again in the
future. Such a situation results in the so-called Cyclical Concept Drift, which is a direct
result of what is frequently called the catastrophic forgetting dilemma. Catastrophic forgetting
happens when the learning of new knowledge completely removes the previously
learned knowledge.
Phishing is a dynamic classification problem where a virtual concept drift might occur.
Yet, the virtual concept drift that occurs in phishing might be guided by some
malevolent intelligent agent rather than occurring naturally. One reason why phishers
keep changing the combination of features when creating phishing websites might be that
they are able to interpret the anti-phishing tool and thus pick a new set of
features that can circumvent it. However, besides its generalisation capability, fault
tolerance and strong ability to learn, a Neural Network (NN) classification model is
considered a black box. Hence, even if someone has the skills to hack into an NN-based
classification model, they might find it difficult to interpret and understand how the NN
processes the input data in order to produce the final decision (assign a class value).
In this thesis, we investigate the problem of virtual concept drift by proposing a
framework that can keep pace with the continuous changes in the input features. The
proposed framework has been applied to the phishing website classification problem, and
it shows competitive results with respect to various evaluation measures (Harmonic
Mean (F1-score), precision, accuracy, etc.) when compared to several other data mining
techniques. The framework creates an ensemble of classifiers (a group of classifiers) and it
offers a balance between stability (maintaining previously learned knowledge) and
plasticity (learning knowledge from the newly offered training data set). Hence, the
framework can also handle the cyclical concept drift. The classifiers that constitute the
ensemble are created using an improved Self-Structuring Neural Networks algorithm
(SSNN). Traditionally, NN modelling techniques rely on trial and error, which is a
tedious and time-consuming process. The SSNN simplifies structuring NN classifiers
with minimum intervention from the user. The framework evaluates the ensemble
whenever a new data set chunk is collected. If the overall accuracy of the combined
results from the ensemble drops significantly, a new classifier is created using the SSNN
and added to the ensemble. Overall, the experimental results show that the proposed
framework affords a balance between stability and plasticity and can effectively handle
virtual concept drift when applied to the phishing website classification problem. Most
of the chapters of this thesis have been subject to publication
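The chunk-by-chunk monitoring loop described in this abstract can be sketched as follows. This is a simplified illustration, not the thesis framework: FeatureVoteLearner is a deliberately trivial stand-in for the SSNN, the accuracy-weighted vote and the "re-weight existing members before adding a new one" step are assumptions added here to show the stability and plasticity trade-off, and the max_drop threshold and the toy chunks are hypothetical.

```python
from collections import Counter, defaultdict


class FeatureVoteLearner:
    """Stand-in learner (NOT the SSNN): each feature value votes for the class
    it most often co-occurred with, weighted by how pure that association was."""

    def fit(self, rows, labels):
        self.default = Counter(labels).most_common(1)[0][0]
        counts = defaultdict(Counter)
        for row, label in zip(rows, labels):
            for index, value in enumerate(row):
                counts[(index, value)][label] += 1
        self.table = {}
        for key, counter in counts.items():
            label, hits = counter.most_common(1)[0]
            self.table[key] = (label, hits / sum(counter.values()))
        return self

    def predict_one(self, row):
        scores = Counter()
        for index, value in enumerate(row):
            if (index, value) in self.table:
                label, confidence = self.table[(index, value)]
                scores[label] += confidence
        return scores.most_common(1)[0][0] if scores else self.default


class DriftAwareEnsemble:
    """Adds a classifier when accuracy on a new chunk drops by more than max_drop."""

    def __init__(self, make_learner, max_drop=0.1):
        self.make_learner = make_learner
        self.max_drop = max_drop
        self.members, self.weights, self.baseline = [], [], None

    def predict_one(self, row):
        scores = Counter()
        for member, weight in zip(self.members, self.weights):
            scores[member.predict_one(row)] += weight
        return scores.most_common(1)[0][0]

    def accuracy(self, rows, labels):
        hits = sum(self.predict_one(r) == y for r, y in zip(rows, labels))
        return hits / len(labels)

    def _reweight(self, rows, labels):
        # Weight each member by its own accuracy on the latest chunk.
        self.weights = [
            sum(m.predict_one(r) == y for r, y in zip(rows, labels)) / len(labels)
            for m in self.members
        ]

    def _dropped(self, rows, labels):
        return self.baseline - self.accuracy(rows, labels) > self.max_drop

    def process_chunk(self, rows, labels):
        """Evaluate on a new labelled chunk; return True if a member was added."""
        added = False
        if not self.members:
            self.members.append(self.make_learner().fit(rows, labels))
            added = True
        elif self._dropped(rows, labels):
            self._reweight(rows, labels)      # stability: try old knowledge first
            if self._dropped(rows, labels):   # plasticity: learn the new chunk
                self.members.append(self.make_learner().fit(rows, labels))
                added = True
        self._reweight(rows, labels)
        self.baseline = self.accuracy(rows, labels)
        return added


# Hypothetical chunks: in A the first feature is informative, in B only the second is.
CHUNK_A = ([(1, 0), (0, 0), (1, 0), (0, 0)], ["phishing", "legitimate"] * 2)
CHUNK_B = ([(1, 1), (1, 0), (1, 1), (1, 0)], ["phishing", "legitimate"] * 2)

ensemble = DriftAwareEnsemble(FeatureVoteLearner)
for name, (rows, labels) in [("A", CHUNK_A), ("B", CHUNK_B), ("A again", CHUNK_A)]:
    added = ensemble.process_chunk(rows, labels)
    print(name, "new member added:", added, "members:", len(ensemble.members))
```

In this toy run the second chunk triggers a new member, while the return of the first chunk is handled by re-weighting the retained members, which mirrors the cyclical drift case the abstract mentions.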