Search CORE

687 research outputs found

HTMLPhish: Enabling Phishing Web Page Detection by Applying Deep Learning Techniques on HTML Analysis

Author: Chen Yingke
Opara Chidimma
Wei Bo
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2020
Field of study

Recently, the development and implementation of phishing attacks require little technical skills and costs. This uprising has led to an ever-growing number of phishing attacks on the World Wide Web. Consequently, proactive techniques to fight phishing attacks have become extremely necessary. In this paper, we propose HTMLPhish, a deep learning based datadriven end-to-end automatic phishing web page classification approach. Specifically, HTMLPhish receives the content of the HTML document of a web page and employs Convolutional Neural Networks (CNNs) to learn the semantic dependencies in the textual contents of the HTML. The CNNs learn appropriate feature representations from the HTML document embeddings without extensive manual feature engineering. Furthermore, our proposed approach of the concatenation of the word and character embeddings allows our model to manage new features and ensure easy extrapolation to test data. We conduct comprehensive experiments on a dataset of more than 50,000 HTML documents that provides a distribution of phishing to benign web pages obtainable in the real-world that yields over 93% Accuracy and True Positive Rate. Also, HTMLPhish is a completely language-independent and client-side strategy which can, therefore, conduct web page phishing detection regardless of the textual language

arXiv.org e-Print Archive

Northumbria Research Link

Crossref

Lancaster E-Prints

Sensor Signal and Information Processing II [Editorial]

Author: Gao Bin
Woo Wai Lok
Publication venue: 'MDPI AG'
Publication date: 04/07/2020
Field of study

This Special Issue compiles a set of innovative developments on the use of sensor signals and information processing. In particular, these contributions report original studies on a wide variety of sensor signals including wireless communication, machinery, ultrasound, imaging, and internet data, and information processing methodologies such as deep learning, machine learning, compressive sensing, and variational Bayesian. All these devices have one point in common: These algorithms have incorporated some form of computational intelligence as part of their core framework in problem solving. They have the capacity to generalize and discover knowledge for themselves, learning to learn new information whenever unseen data are captured

Multidisciplinary Digital Publishing Institute

Northumbria Research Link

Sensor Signal and Information Processing II

Author
Publication venue: 'MDPI AG'
Publication date: 01/05/2021
Field of study

In the current age of information explosion, newly invented technological sensors and software are now tightly integrated with our everyday lives. Many sensor processing algorithms have incorporated some forms of computational intelligence as part of their core framework in problem solving. These algorithms have the capacity to generalize and discover knowledge for themselves and learn new information whenever unseen data are captured. The primary aim of sensor processing is to develop techniques to interpret, understand, and act on information contained in the data. The interest of this book is in developing intelligent signal processing in order to pave the way for smart sensors. This involves mathematical advancement of nonlinear signal processing theory and its applications that extend far beyond traditional techniques. It bridges the boundary between theory and application, developing novel theoretically inspired methodologies targeting both longstanding and emergent signal processing applications. The topic ranges from phishing detection to integration of terrestrial laser scanning, and from fault diagnosis to bio-inspiring filtering. The book will appeal to established practitioners, along with researchers and students in the emerging field of smart sensors processing

Directory of Open Access Books (DOAB)

A Machine Learning Technique for Detection of Social Media Fake News

Author: Arowolo Micheal Olaolu
Misra Sanjay
Ogundokun Oluwaseun Ogundokun
Publication venue
Publication date: 01/01/2023
Field of study

publishedVersio

IFE Brage

Look Before You Leap: Detecting Phishing Web Pages by Exploiting Raw URL And HTML Characteristics

Author: Chen Yingke
Opara Chidimma
wei Bo.
Publication venue
Publication date: 05/11/2020
Field of study

Cybercriminals resort to phishing as a simple and cost-effective medium to perpetrate cyber-attacks on today's Internet. Recent studies in phishing detection are increasingly adopting automated feature selection over traditional manually engineered features. This transition is due to the inability of existing traditional methods to extrapolate their learning to new data. To this end, in this paper, we propose WebPhish, a deep learning technique using automatic feature selection extracted from the raw URL and HTML of a web page. This approach is the first of its kind, which uses the concatenation of URL and HTML embedding feature vectors as input into a Convolutional Neural Network model to detect phishing attacks on web pages. Extensive experiments on a real-world dataset yielded an accuracy of 98 percent, outperforming other state-of-the-art techniques. Also, WebPhish is a client-side strategy that is completely language-independent and can conduct lightweight phishing detection regardless of the web page's textual language

arXiv.org e-Print Archive

Teeside University's Research Repository

Campus Safety Data Gathering, Classification, and Ranking Based on Clery-Act Reports

Author: Abo Elenin Walaa F
Publication venue: Digital Commons@Georgia Southern
Publication date: 01/01/2023
Field of study

Most existing campus safety rankings are based on criminal incident history with minimal or no consideration of campus security conditions and standard safety measures. Campus safety information published by universities/colleges is usually conceptual/qualitative and not quantitative and are based-on criminal records of these campuses. Thus, no explicit and trusted ranking method for these campuses considers the level of compliance with the standard safety measures. A quantitative safety measure is important to compare different campuses easily and to learn about specific campus safety conditions. In this thesis, we utilize Clery-Act reports of campuses to automatically analyze their safety conditions and generate a safety rank based on these reports. We first provide a survey of campus safety and security measures. We utilize our survey results to provide an automated data-gathering method for capturing standard campus safety data from Clery-act reports. We then utilize the collected information to classify existing campuses based on their safety conditions. Our research model is also capable to predict the safety rank of campuses based on their Clery-Act report by comparing it to existing Clery-Act reports of other campuses and reported rank on public resources. Our research on this thesis uses a number of languages, tools, and technologies such as Python, shell scripts, text conversion, data mining, spreadsheets, and others. We provide a detailed description of our research work on this topic, explain our research methodology, and finally describe our findings and results. This research contributes to the automated campus safety data generation, classification, and ranking

Georgia Southern University: Digital Commons@Georgia Southern

Recommended from our members

Phishing websites detection by using optimized stacking ensemble model

Author: Abdulkarem Mohammed B
Al-Hadhrami T
Al-Sarem M
Alreshidi A
Alshammari MT
Ghaleb Al-Mekhlafi Z
Saeed F
Sarheed Alshammari T
Publication venue: 'Computers, Materials and Continua (Tech Science Press)'
Publication date: 01/01/2022
Field of study

Phishing attacks are security attacks that do not affect only individuals’ or organizations’ websites but may affect Internet of Things (IoT) devices and networks. IoT environment is an exposed environment for such attacks. Attackers may use thingbots software for the dispersal of hidden junk emails that are not noticed by users. Machine and deep learning and other methods were used to design detection methods for these attacks. However, there is still a need to enhance detection accuracy. Optimization of an ensemble classification method for phishing website (PW) detection is proposed in this study. A Genetic Algorithm (GA) was used for the proposed method optimization by tuning several ensemble Machine Learning (ML) methods parameters, including Random Forest (RF), AdaBoost (AB), XGBoost (XGB), Bagging (BA), GradientBoost (GB), and LightGBM (LGBM). These were accomplished by ranking the optimized classifiers to pick out the best classifiers as a base for the proposed method. A PW dataset that is made up of 4898 PWs and 6157 legitimate websites (LWs) was used for this study's experiments. As a result, detection accuracy was enhanced and reached 97.16 percent

Nottingham Trent Institutional Repository (IRep)

Recommended from our members

An optimized stacking ensemble model for phishing websites detection

Author: Al-Hadhrami T
Al-Mekhlafi ZG
Al-Sarem M
Alreshidi A
Alshammari MT
Alshammari TS
Mohammed BA
Saeed F
Publication venue: 'MDPI AG'
Publication date: 28/05/2021
Field of study

Security attacks on legitimate websites to steal users’ information, known as phishing attacks, have been increasing. This kind of attack does not just affect individuals’ or organisations’ websites. Although several detection methods for phishing websites have been proposed using machine learning, deep learning, and other approaches, their detection accuracy still needs to be enhanced. This paper proposes an optimized stacking ensemble method for phishing website detection. The optimisation was carried out using a genetic algorithm (GA) to tune the parameters of several ensemble machine learning methods, including random forests, AdaBoost, XGBoost, Bagging, GradientBoost, and LightGBM. The optimized classifiers were then ranked, and the best three models were chosen as base classifiers of a stacking ensemble method. The experiments were conducted on three phishing website datasets that consisted of both phishing websites and legitimate websites—the Phishing Websites Data Set from UCI (Dataset 1); Phishing Dataset for Machine Learning from Mendeley (Dataset 2, and Datasets for Phishing Websites Detection from Mendeley (Dataset 3). The experimental results showed an improvement using the optimized stacking ensemble method, where the detection accuracy reached 97.16%, 98.58%, and 97.39% for Dataset 1, Dataset 2, and Dataset 3, respectivel

Nottingham Trent Institutional Repository (IRep)

Applications in security and evasions in machine learning : a survey

Author: Borrego Iglesias Carlos
Jhaveri Rutvij
Sagar Ramani
Publication venue: 'MDPI AG'
Publication date: 01/01/2020
Field of study

In recent years, machine learning (ML) has become an important part to yield security and privacy in various applications. ML is used to address serious issues such as real-time attack detection, data leakage vulnerability assessments and many more. ML extensively supports the demanding requirements of the current scenario of security and privacy across a range of areas such as real-time decision-making, big data processing, reduced cycle time for learning, cost-efficiency and error-free processing. Therefore, in this paper, we review the state of the art approaches where ML is applicable more effectively to fulfill current real-world requirements in security. We examine different security applications' perspectives where ML models play an essential role and compare, with different possible dimensions, their accuracy results. By analyzing ML algorithms in security application it provides a blueprint for an interdisciplinary research area. Even with the use of current sophisticated technology and tools, attackers can evade the ML models by committing adversarial attacks. Therefore, requirements rise to assess the vulnerability in the ML models to cope up with the adversarial attacks at the time of development. Accordingly, as a supplement to this point, we also analyze the different types of adversarial attacks on the ML models. To give proper visualization of security properties, we have represented the threat model and defense strategies against adversarial attack methods. Moreover, we illustrate the adversarial attacks based on the attackers' knowledge about the model and addressed the point of the model at which possible attacks may be committed. Finally, we also investigate different types of properties of the adversarial attacks

Multidisciplinary Digital Publishing Institute

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

RECERCAT

Diposit Digital de Documents de la UAB