Secure entity authentication

Author: Dou Zuochao
Publication venue: Digital Commons @ NJIT
Publication date: 31/05/2018

According to Wikipedia, authentication is the act of confirming the truth of an attribute of a single piece of data claimed true by an entity. Specifically, entity authentication is the process by which an agent in a distributed system gains confidence in the identity of a communicating partner (Bellare et al.). Legacy password authentication is still the most popular method; however, it suffers from many limitations, such as compromise through social engineering, dictionary attacks, or database leaks. To address the security concerns of legacy password-based authentication, many new authentication factors have been introduced, such as PINs (Personal Identification Numbers) delivered through out-of-band channels, human biometrics, and hardware tokens. However, each of these authentication factors has its own inherent weaknesses and security limitations. For example, phishing is still effective even when PINs are delivered through out-of-band channels. In this dissertation, three types of secure entity authentication schemes are developed to alleviate the weaknesses and limitations of existing authentication mechanisms: (1) an end-user authentication scheme based on Network Round-Trip Time (NRTT) to complement location-based authentication mechanisms; (2) an Apache Hadoop authentication mechanism based on Trusted Platform Module (TPM) technology; and (3) a web server authentication mechanism for phishing detection with a new detection factor, NRTT. In the first work, a new authentication factor based on NRTT is presented. Two research challenges (i.e., the secure measurement of NRTT and network instabilities) are addressed to show that NRTT can be used to uniquely and securely identify login locations and hence can support location-based web authentication mechanisms.
The experiments and analysis show that NRTT has superior usability, deployability, security, and performance properties compared to state-of-the-art web authentication factors. In the second work, departing from the Kerberos-centric approach, an authentication framework for Hadoop that utilizes Trusted Platform Module (TPM) technology is proposed. It is proven that pushing security down to the hardware level, in conjunction with software techniques, provides better protection than software-only solutions. The proposed approach provides significant security guarantees against insider threats that manipulate the execution environment without the consent of legitimate clients. Extensive experiments are conducted to validate the performance and the security properties of the proposed approach. Moreover, the correctness and the security guarantees are formally proved via Burrows-Abadi-Needham (BAN) logic. In the third work, together with a phishing victim identification algorithm, NRTT is used as a new phishing detection feature to improve the detection accuracy of existing phishing detection approaches. State-of-the-art phishing detection methods fall into two categories: heuristics and blacklists. The experiments show that combining NRTT with existing heuristics can improve the overall detection accuracy while maintaining a low false positive rate. In the future, to develop a more robust and efficient phishing detection scheme, it is paramount for phishing detection approaches to carefully select features that strike the right balance between detection accuracy and robustness in the face of potential manipulation. In addition, leveraging Deep Learning (DL) algorithms to improve the performance of phishing detection schemes could be a viable alternative to traditional machine learning algorithms (e.g., SVM, LR), especially when handling complex and large-scale datasets.
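The idea of using NRTT as a location factor can be illustrated with a minimal sketch. The abstract does not describe the dissertation's measurement protocol, so everything below is an assumption made for illustration: the reference points (`ref-a`, `ref-b`, `ref-c`), the enrolled profile, the use of the median to damp network instability, and the `tolerance_ms` threshold are all hypothetical.

```python
from statistics import median

def summarize(samples_ms):
    """Collapse repeated RTT samples into one value; the median damps jitter."""
    return median(samples_ms)

def matches_profile(enrolled_ms, measured_samples_ms, tolerance_ms=5.0):
    """Return True if every reference point's median RTT is within
    tolerance_ms of the value recorded at enrollment.

    enrolled_ms: dict mapping a (hypothetical) reference point to its enrolled RTT.
    measured_samples_ms: dict mapping the same reference points to lists of
    RTT samples taken at login time.
    """
    for ref, enrolled in enrolled_ms.items():
        samples = measured_samples_ms.get(ref)
        if not samples:
            # A missing measurement cannot confirm the location.
            return False
        if abs(summarize(samples) - enrolled) > tolerance_ms:
            return False
    return True

# Hypothetical enrolled profile and two login attempts (values in milliseconds).
enrolled = {"ref-a": 12.0, "ref-b": 48.0, "ref-c": 95.0}
same_location = {
    "ref-a": [11.5, 12.4, 30.1],   # one outlier, absorbed by the median
    "ref-b": [47.0, 49.2, 48.3],
    "ref-c": [96.0, 94.1, 95.5],
}
other_location = {
    "ref-a": [60.2, 61.0, 59.8],
    "ref-b": [20.1, 19.7, 21.0],
    "ref-c": [140.0, 139.2, 141.1],
}

print(matches_profile(enrolled, same_location))
print(matches_profile(enrolled, other_location))
```

This sketch ignores the secure-measurement challenge the abstract names; an attacker who can delay or spoof probes would defeat a check this simple.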

Digital Commons @ New Jersey Institute of Technology (NJIT)

Fast Detection of Zero-Day Phishing Websites Using Machine Learning

Author: Nagunwa Thomas
Publication date: 01/06/2022

The recent global growth in the number of internet users and online applications has led to a massive volume of personal data transactions taking place over the internet. To gain access to the valuable data and services involved, and to use them for various malicious activities, attackers lure users to phishing websites that steal user credentials and other personal data required to impersonate their victims. Sophisticated phishing toolkits and flux networks are increasingly being used by attackers to create and host phishing websites, respectively, in order to increase the number of phishing attacks and evade detection. This has resulted in an increase in the number of new (zero-day) phishing websites. Anti-malware software and web browsers’ anti-phishing filters are widely used to detect phishing websites, thus preventing users from falling victim to phishing. However, these solutions mostly rely on blacklists of known phishing websites. In these techniques, the time lag between the creation of a new phishing website and its being reported as malicious leaves a window during which users are exposed to zero-day phishing websites. This has contributed to a global increase in the number of successful phishing attacks in recent years. To address this shortcoming, this research proposes three Machine Learning (ML)-based approaches for fast and highly accurate prediction of zero-day phishing websites using novel sets of prediction features. The first approach uses a novel set of 26 features based on URL structure and on webpage structure and contents to predict zero-day phishing webpages that collect users’ personal data. The other two approaches detect zero-day phishing webpages, through their hostnames, that are hosted in Fast Flux Service Networks (FFSNs) and Name Server IP Flux Networks (NSIFNs). These networks consist of frequently changing machines hosting malicious websites and their authoritative name servers, respectively.
The machines provide a layer of protection to the actual service hosts against blacklisting in order to prolong the active life span of the services. Consequently, the websites in these networks become more harmful than those hosted in normal networks. To address them, our second proposed approach predicts zero-day phishing hostnames hosted in FFSNs using a novel set of 56 features based on DNS, network, and host characteristics of the hosting networks. Our last approach predicts zero-day phishing hostnames hosted in NSIFNs using a novel set of 11 features based on DNS and host characteristics of the hosting networks. The feature set in each approach is evaluated using 11 ML algorithms, achieving high prediction performance with most of the algorithms. This indicates the relevance and robustness of the feature sets for their respective detection tasks. The feature sets also perform well against data collected over a later time period without retraining, indicating their long-term effectiveness in detecting the websites. The approaches use highly diversified feature sets, which are expected to enhance resistance to various detection evasion tactics. The measured prediction times of the first and third approaches are sufficiently low for potential use in real-time protection of users. This thesis also introduces a multi-class classification technique for evaluating the feature sets in the second and third approaches. The technique predicts each of the hostname types as an independent outcome, thus enabling experts to use type-specific measures in taking down the phishing websites. Lastly, highly accurate methods for labelling hostnames based on the number of changes of IP addresses of authoritative name servers, monitored over a specific period of time, are proposed.
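The labelling idea in the last sentence, counting how often the IP addresses of a hostname's authoritative name servers change during a monitoring window, can be sketched as follows. This is a rough illustration, not the thesis's method: the snapshot format, the `min_changes` threshold, and the label names `"flux"` and `"stable"` are assumptions, and the addresses come from documentation-reserved ranges.

```python
def count_ip_changes(snapshots):
    """Count how often the observed IP set differs between consecutive snapshots.

    snapshots: list of lists of IP address strings, one list per monitoring
    interval, in chronological order.
    """
    changes = 0
    for previous, current in zip(snapshots, snapshots[1:]):
        if set(previous) != set(current):
            changes += 1
    return changes

def label_hostname(snapshots, min_changes=3):
    """Label a hostname from its name-server IP history.

    min_changes is a hypothetical threshold chosen only for this example.
    """
    return "flux" if count_ip_changes(snapshots) >= min_changes else "stable"

# Hypothetical monitoring data for two hostnames.
steady_history = [["192.0.2.1"], ["192.0.2.1"], ["192.0.2.1"], ["192.0.2.1"]]
rotating_history = [
    ["198.51.100.1"],
    ["198.51.100.7"],
    ["203.0.113.4"],
    ["203.0.113.9"],
]

print(label_hostname(steady_history))
print(label_hostname(rotating_history))
```

Comparing sets rather than lists means a reordering of the same addresses between snapshots is not counted as a change.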

BCU Open Access