1,583 research outputs found

Intelligent Security for Phishing Online using Adaptive Neuro Fuzzy Systems

Author: Barraclough Phoebe
Fehringer Gerhard
Publication venue: 'The Science and Information Organization'
Publication date: 01/01/2017

Anti-phishing detection solutions employed in industry use blacklist-based approaches to achieve low false-positive rates, but blacklist approaches utilize website URLs only. This study analyses and combines phishing emails and phishing web-forms in a single framework, which allows feature extraction and feature model construction. The outcome should classify between phishing, suspicious and legitimate, and detect emerging phishing attacks accurately. The intelligent online phishing security approach is based on machine learning techniques, using an Adaptive Neuro-Fuzzy Inference System and a combination of sources from which features are extracted. An experiment was performed using the two-fold cross-validation method to measure the system’s accuracy. The intelligent phishing security approach achieved a higher accuracy. The finding indicates that the feature model from combined sources can detect phishing websites with a higher accuracy. This paper contributes to the phishing field a feature model that combines sources in a single framework. The implication is that phishing attacks evolve rapidly; therefore, regular updates and staying ahead of phishing strategies are the way forward.
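
The two-fold cross-validation mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the authors' experimental setup: the helper names (`two_fold_indices`, `cross_validate`, `fit_majority`, `predict_majority`), the shuffling seed and the majority-label baseline are all assumptions made for the example, and the sketch assumes at least two samples.

```python
import random
from collections import Counter

def two_fold_indices(n, seed=0):
    """Shuffle indices 0..n-1 and return two (train, test) pairs with the halves swapped."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    half = n // 2
    first, second = idx[:half], idx[half:]
    return [(first, second), (second, first)]

def cross_validate(samples, labels, fit, predict, seed=0):
    """Average accuracy over the two folds; every sample is tested exactly once."""
    scores = []
    for train, test in two_fold_indices(len(samples), seed):
        model = fit([samples[i] for i in train], [labels[i] for i in train])
        correct = sum(predict(model, samples[i]) == labels[i] for i in test)
        scores.append(correct / len(test))
    return sum(scores) / len(scores)

# Hypothetical baseline: always predict the most common training label.
def fit_majority(train_samples, train_labels):
    return Counter(train_labels).most_common(1)[0][0]

def predict_majority(model, sample):
    return model

if __name__ == "__main__":
    samples = list(range(8))          # placeholder feature values
    labels = ["phishing"] * 8         # placeholder labels
    print(cross_validate(samples, labels, fit_majority, predict_majority))
```

Any classifier can be plugged in through the `fit` and `predict` arguments; the baseline is only there to make the sketch runnable.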

Northumbria University Research Portal

Crossref

High Accuracy Phishing Detection Based on Convolutional Neural Networks

Author: Alzaylaee Mohammed K.
Yerima Suleiman Y.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/02/2020

The persistent growth in phishing and the rising volume of phishing websites have led to individuals and organizations worldwide becoming increasingly exposed to various cyber-attacks. Consequently, more effective phishing detection is required for improved cyber defence. Hence, in this paper we present a deep learning-based approach to enable high accuracy detection of phishing sites. The proposed approach utilizes convolutional neural networks (CNN) for high accuracy classification to distinguish genuine sites from phishing sites. We evaluate the models using a dataset obtained from 6,157 genuine and 4,898 phishing websites. Based on the results of extensive experiments, our CNN-based models proved to be highly effective in detecting unknown phishing sites. Furthermore, the CNN-based approach performed better than traditional machine learning classifiers evaluated on the same dataset, reaching a 98.2% phishing detection rate with an F1-score of 0.976. The method presented in this paper compares favourably to the state of the art in deep learning-based phishing website detection.
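
For readers unfamiliar with the F1-score reported above, the following sketch shows how precision, recall and F1 are derived from confusion-matrix counts. The counts are hypothetical and are not taken from the paper; reading "detection rate" as recall on the phishing class is also an assumption.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) for the positive (phishing) class."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

if __name__ == "__main__":
    # Hypothetical counts: 90 phishing sites caught, 10 false alarms, 30 missed.
    p, r, f1 = precision_recall_f1(tp=90, fp=10, fn=30)
    print(f"precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
```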

arXiv.org e-Print Archive

Crossref

De Montfort University Open Research Archive

A Survey of Website Phishing Detection Techniques

Author: Zinal Shukla, Kirtirajsinh Zala, Riddhi
Publication venue: Auricle Global Society of Education and Research
Publication date: 31/01/2018

This article surveys the literature on website phishing detection. Web phishing lures the user into interacting with a fake website. The main objective of this attack is to steal sensitive information from the user. The attacker creates a website that looks like the original website, which allows the attacker to obtain sensitive information such as usernames, passwords and credit card details. This paper aims to survey many of the recently proposed website phishing detection techniques. A high-level overview of various types of phishing detection techniques is also presented.

International Journal on Future Revolution in Computer Science & Communication Engineering

Intelligent phishing website detection system using fuzzy techniques.

Author: Aburrous Maher R.
Dahal Keshav P.
Hossain M. Alamgir
Thabatah F.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2008

Phishing websites are forged web pages created by malicious people to mimic the web pages of real websites and to defraud people of their personal information. Detecting and identifying phishing websites is a complex and dynamic problem involving many factors and criteria. Because of the subjective considerations and ambiguities involved in detection, a Fuzzy Logic (FL) model can be a more effective tool for assessing and identifying phishing websites than traditional tools, since it offers a more natural way of dealing with quality factors rather than exact values. In this paper, we present a novel approach to overcome the 'fuzziness' in traditional website phishing risk assessment and propose an intelligent, resilient and effective model for detecting phishing websites. The proposed model is based on FL operators, which are used to characterize the website phishing factors and indicators as fuzzy variables, and produces six measures and criteria of website phishing attack dimensions with a layer structure. Our experimental results showed the significance and importance of the phishing website criteria (URL & Domain Identity) represented by layer one, and the varying influence of the phishing characteristic layers on the final phishing website rate.
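
To illustrate what treating phishing indicators as fuzzy variables can look like, here is a minimal sketch using triangular membership functions with min/max as the fuzzy AND/OR operators. It is not the authors' model: the membership shapes, the two indicator scores (`url_score`, `domain_score`, each assumed to lie in [0, 1]) and the three rules are hypothetical.

```python
def triangular(x, a, b, c):
    """Triangular membership: 0 at or outside a and c, rising to 1 at b (requires a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

def fuzzify(score):
    """Map a crisp indicator score in [0, 1] to degrees of 'low', 'medium' and 'high'."""
    return {
        "low": triangular(score, -0.5, 0.0, 0.5),
        "medium": triangular(score, 0.0, 0.5, 1.0),
        "high": triangular(score, 0.5, 1.0, 1.5),
    }

def rule_strengths(url_score, domain_score):
    """Fire three hypothetical rules; AND is min, OR is max."""
    url, dom = fuzzify(url_score), fuzzify(domain_score)
    return {
        "phishing": min(url["high"], dom["high"]),        # both indicators high
        "legitimate": min(url["low"], dom["low"]),        # both indicators low
        "suspicious": max(url["medium"], dom["medium"]),  # either indicator medium
    }

if __name__ == "__main__":
    print(rule_strengths(url_score=0.75, domain_score=1.0))
```

A full system would add a defuzzification step to turn the rule strengths into a single risk rate; that step is omitted here.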

Northumbria University Research Portal

Bradford Scholars

Dynamic Rule Covering Classification in Data Mining with Cyber Security Phishing Application

Author: Qabajeh Issa Mohammad
Publication venue: Centre for Computational Intelligence
Publication date: 01/05/2017

Data mining is the process of discovering useful patterns from datasets using intelligent techniques to help users make certain decisions. A typical data mining task is classification, which involves predicting a target variable known as the class in previously unseen data based on models learnt from an input dataset. Covering is a well-known classification approach that derives models with If-Then rules. Covering methods, such as PRISM, have predictive performance competitive with other classical classification techniques such as greedy, decision tree and associative classification. Therefore, Covering models are appropriate decision-making tools and users favour them when making decisions. Despite the use of the Covering approach in data processing for different classification applications, it is also acknowledged that this approach suffers from the noticeable drawback of inducing massive numbers of rules, making the resulting model large and unmanageable by users. This issue is attributed to the way Covering techniques induce the rules: they keep adding items to the rule’s body, despite the limited data coverage (number of training instances that the rule classifies), until the rule has zero error. This excessive learning overfits the training dataset and also limits the applicability of Covering models in decision making, because managers normally prefer a summarised set of knowledge that they are able to control and comprehend rather than high-maintenance models. In practice, there should be a trade-off between the number of rules offered by a classification model and its predictive performance. Another issue associated with Covering models is the overlapping of training data among the rules, which happens when a rule’s classified data are discarded during the rule discovery phase. Unfortunately, the impact of a rule’s removed data on other potential rules is not considered by this approach.
However, when training data linked with a rule are removed, both the frequency and rank of other rules’ items that appeared in the removed data are affected. The impacted rules should maintain their true rank and frequency in a dynamic manner during the rule discovery phase rather than just keeping the initial frequency computed from the original input dataset. In response to the aforementioned issues, a new dynamic learning technique based on Covering and rule induction, which we call Enhanced Dynamic Rule Induction (eDRI), is developed. eDRI has been implemented in Java and embedded in the WEKA machine learning tool. The developed algorithm incrementally discovers the rules using primarily frequency and rule strength thresholds. In practice, these thresholds limit the search space for both items and potential rules by discarding any with insufficient data representation as early as possible, resulting in an efficient training phase. More importantly, eDRI substantially cuts down the number of scans of the training examples by continuously updating potential rules’ frequency and strength parameters in a dynamic manner whenever a rule is inserted into the classifier. In particular, for each derived rule, eDRI adjusts on the fly the frequencies and ranks of the remaining potential rules’ items, specifically for those that appeared within the deleted training instances of the derived rule. This gives a more realistic model with minimal rule redundancy, and makes the process of rule induction efficient and dynamic rather than static. Moreover, the proposed technique minimises the classifier’s number of rules at preliminary stages by stopping learning when any rule does not meet the rule strength threshold, therefore minimising overfitting and ensuring a manageable classifier.
Lastly, the eDRI prediction procedure not only prioritises using the best ranked rule for class forecasting of test data but also restricts the use of the default class rule, thus reducing the number of misclassifications. The aforementioned improvements guarantee classification models of smaller size that do not overfit the training dataset, while maintaining their predictive performance. The eDRI derived models particularly benefit users taking key business decisions, since they can provide a rich knowledge base to support their decision making. This is because these models are highly accurate, easy to understand, controllable and robust, i.e. flexible enough to be amended without drastic change. eDRI applicability has been evaluated on the hard problem of phishing detection. Phishing normally involves creating a well-designed fake website that closely resembles an existing trusted business website, aiming to trick users and illegally obtain their credentials, such as login information, in order to access their financial assets. The experimental results against large phishing datasets revealed that eDRI is highly useful as an anti-phishing tool, since it derived models of manageable size when compared with other traditional techniques without hindering the classification performance. Further evaluation results using several other classification datasets from different domains obtained from University of California Data Repository have corroborated eDRI’s competitive performance with respect to accuracy, number of knowledge representation, training time and items space reduction. This makes the proposed technique not only efficient in inducing rules but also effective.
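
The covering idea described in this abstract (grow an If-Then rule, remove the training instances it covers, then recompute item frequencies on what is left) can be sketched as below. This is a simplified illustration and not the eDRI algorithm or its WEKA implementation: the function names, the threshold defaults (`min_freq`, `min_strength`), the tie-breaking and the toy dataset are all assumptions.

```python
from collections import Counter

def covers(conditions, features):
    """True if every (attribute, value) condition of the rule holds for the instance."""
    return all(features.get(a) == v for a, v in conditions.items())

def grow_rule(rows, min_freq, min_strength):
    """Add one item at a time until the rule reaches min_strength; None if it cannot."""
    conditions, label, covered = {}, None, rows
    while True:
        match, total = Counter(), Counter()
        for features, y in covered:
            for a, v in features.items():
                if a in conditions:
                    continue
                total[(a, v)] += 1                 # instances containing the item
                if label is None or y == label:
                    match[(a, v, y)] += 1          # ...that also carry the class
        candidates = [(n / total[(a, v)], n, a, v, y)
                      for (a, v, y), n in match.items() if n >= min_freq]
        if not candidates:
            return None                            # no item is frequent enough
        strength, _, a, v, y = max(candidates, key=lambda c: (c[0], c[1]))
        conditions[a], label = v, y
        covered = [(f, t) for f, t in covered if f.get(a) == v]
        if strength >= min_strength:
            return conditions, label

def learn_rules(rows, min_freq=2, min_strength=0.8):
    """Covering loop: frequencies are recomputed on the remaining data after each rule."""
    remaining, rules = list(rows), []
    while remaining:
        rule = grow_rule(remaining, min_freq, min_strength)
        if rule is None:
            break                                  # stop early instead of overfitting
        rules.append(rule)
        remaining = [(f, y) for f, y in remaining if not covers(rule[0], f)]
    return rules

if __name__ == "__main__":
    # Hypothetical toy data: (features, class).
    data = [({"has_ip": "yes", "https": "no"}, "phishing"),
            ({"has_ip": "yes", "https": "no"}, "phishing"),
            ({"has_ip": "yes", "https": "yes"}, "phishing"),
            ({"has_ip": "no", "https": "yes"}, "legitimate"),
            ({"has_ip": "no", "https": "yes"}, "legitimate"),
            ({"has_ip": "no", "https": "no"}, "legitimate")]
    for conditions, label in learn_rules(data):
        print(conditions, "->", label)
```

Raising `min_freq` or `min_strength` yields fewer rules, which mirrors the trade-off between model size and predictive performance that the abstract discusses.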

De Montfort University Open Research Archive

Intelligent Phishing Detection Scheme Using Deep Learning Algorithms

Author: Adebowale Moruf A.
Hossain Mohammed Alamgir
Lwin Khin T.
Publication venue: 'Emerald'
Publication date: 04/06/2020

Purpose: Phishing attacks have evolved in recent years due to high-tech-enabled economic growth worldwide. The rise in all types of fraud loss in 2019 has been attributed to the increase in deception scams and impersonation, as well as to sophisticated online attacks such as phishing. The global impact of phishing attacks will continue to intensify, and thus, a more efficient phishing detection method is required to protect online user activities. To address this need, this study focussed on the design and development of a deep learning-based phishing detection solution that leveraged the uniform resource locator (URL) and website content such as images, text and frames. Design/methodology/approach: Deep learning techniques are efficient for natural language and image classification. In this study, the convolutional neural network (CNN) and the long short-term memory (LSTM) algorithm were used to build a hybrid classification model named the intelligent phishing detection system (IPDS). To build the proposed model, the CNN and LSTM classifiers were trained using 1 million URLs and over 10,000 images. Then, the sensitivity of the proposed model was determined by considering various factors such as the type of feature, number of misclassifications and split issues. Findings: An extensive experimental analysis was conducted to evaluate and compare the effectiveness of the IPDS in detecting phishing web pages and phishing attacks when applied to large data sets. The results showed that the model achieved an accuracy rate of 93.28% and an average detection time of 25 s. Originality/value: A hybrid approach combining the CNN and LSTM deep learning methods was used in this research work. The combination of CNN and LSTM was used to handle a large data set and to achieve higher classifier prediction performance.
Hence, combining the two methods leads to a better result with less training time for the LSTM and CNN architecture, while using the image, frame and text features as a hybrid for our model detection. The hybrid features and IPDS classifier for phishing detection were the novelty of this study to the best of the authors' knowledge.

Anglia Ruskin Research

Develop a Hybrid Classification using an Ensemble Model for Phishing Website Detection

Author: K Subashini
V Narmatha
Publication venue: Auricle Global Society of Education and Research
Publication date: 07/10/2023

Solutions to threats posed by technical and social vulnerabilities must be found to secure the web interface. Social engineering attacks frequently use phishing as one of their vectors. The importance of promptly detecting phishing attacks has increased. The classifier model was constructed using publicly accessible data from trustworthy and phishing websites. A variety of methods were used to extract relevant features to build the model. Machine Learning algorithms can reliably identify phishing attacks before a user experiences any harm. To identify phishing attacks on websites, this study presents a novel ensemble model. In this paper, the Artificial Neural Network (ANN) and the Random Forest Classifier (RFC) are used in an ensemble method along with the Support Vector Machine (SVM). Compared to previous studies, this ensemble method detects website phishing attacks more accurately and efficiently. According to experimental findings, the proposed system detects phishing attacks 97.3% of the time.
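
One common way to combine three classifiers such as an ANN, a random forest and an SVM is hard majority voting over their predicted labels. The abstract does not state how the ensemble is combined, so majority voting is an assumption here, and the per-model predictions below are hypothetical placeholders rather than outputs of trained models.

```python
from collections import Counter

def majority_vote(predictions):
    """predictions: one list of labels per model, all of equal length.
    Returns the most common label for each sample."""
    return [Counter(votes).most_common(1)[0][0] for votes in zip(*predictions)]

if __name__ == "__main__":
    # Hypothetical labels for four websites (1 = phishing, 0 = legitimate).
    ann_pred = [1, 0, 1, 1]
    rfc_pred = [1, 1, 0, 1]
    svm_pred = [0, 0, 1, 1]
    print(majority_vote([ann_pred, rfc_pred, svm_pred]))
```

With an odd number of models and two classes, a tie cannot occur, which is one reason three-model ensembles are a convenient choice for binary phishing detection.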

International Journal on Recent and Innovation Trends in Computing and Communication

Tutorial and Critical Analysis of Phishing Websites Methods

Author: Fadi Thabtah
Lee McCluskey
Rami M. Mohammad
Publication venue: 'Elsevier BV'
Publication date: 13/05/2015

The Internet has become an essential component of our everyday social and financial activities. The Internet is important not only for individual users but also for organizations, because organizations that offer online trading can achieve a competitive edge by serving worldwide clients. The Internet facilitates reaching customers all over the globe without any marketplace restrictions and with effective use of e-commerce. As a result, the number of customers who rely on the Internet to make purchases is increasing dramatically. Hundreds of millions of dollars are transferred through the Internet every day. This amount of money tempts fraudsters to carry out their fraudulent operations. Hence, Internet users may be vulnerable to different types of web threats, which may cause financial damage, identity theft, loss of private information, brand reputation damage and loss of customers’ confidence in e-commerce and online banking. Therefore, the suitability of the Internet for commercial transactions becomes doubtful. Phishing is considered a form of web threat that is defined as the art of impersonating the website of an honest enterprise with the aim of obtaining users’ confidential credentials such as usernames, passwords and social security numbers. In this article, the phishing phenomenon is discussed in detail. In addition, we present a survey of the state-of-the-art research on such attacks. Moreover, we aim to recognize the up-to-date developments in phishing and its precautionary measures and provide a comprehensive study and evaluation of this research to identify the gaps that remain in this area. This research mostly focuses on web-based phishing detection methods rather than email-based detection methods.

Crossref

University of Huddersfield Repository

Huddersfield Research Portal

Phishing Detection using Base Classifier and Ensemble Technique

Author: Pal Rekha
Pal Saurabh
Pandey Manish Ranjan
Pandey Mithilesh Kumar
Shahi Shantanu
Shukla Arvind Kumar
Publication venue: Auricle Global Society of Education and Research
Publication date: 07/10/2023

Phishing attacks continue to pose a significant threat in today's digital landscape, with both individuals and organizations falling victim to these attacks on a regular basis. One of the primary methods used to carry out phishing attacks is the use of phishing websites, which are designed to look like legitimate sites in order to trick users into giving away their personal information, including sensitive data such as credit card details and passwords. This research paper proposes a model that utilizes several benchmark classifiers, including LR, Bagging, RF, K-NN, DT, SVM, and AdaBoost, to identify and classify phishing websites, evaluated on accuracy, precision, recall, F1-score, and the confusion matrix. Additionally, a meta-learner and stacking model were combined to identify phishing websites in existing systems. The proposed ensemble learning approach using stack-based meta-learners proved to be highly effective in identifying both legitimate and phishing websites, achieving an accuracy rate of up to 97.19%, with precision, recall, and F1 scores of 97%, 98%, and 98%, respectively. Thus, it is recommended that ensemble learning, particularly with stacking and its meta-learner variations, be implemented to detect and prevent phishing attacks and other digital cyber threats.

International Journal on Recent and Innovation Trends in Computing and Communication