853 research outputs found

    Characterizing Phishing Threats with Natural Language Processing

    Spear phishing is a widespread concern in the modern network security landscape, but there are few metrics that measure the extent to which reconnaissance is performed on phishing targets. Spear phishing emails closely match the expectations of the recipient, based on details of their experiences and interests, making them a popular propagation vector for harmful malware. In this work we use Natural Language Processing techniques to investigate a specific real-world phishing campaign and quantify attributes that indicate a targeted spear phishing attack. Our phishing campaign data sample comprises 596 emails, all containing a web bug and a Curriculum Vitae (CV) PDF attachment, sent to our institution by a foreign IP space. The campaign was found to exclusively target specific demographics within our institution. Performing a semantic similarity analysis between the senders' CV attachments and the recipients' LinkedIn profiles, we conclude with high statistical certainty ($p < 10^{-4}$) that the attachments contain targeted rather than randomly selected material. Latent Semantic Analysis further demonstrates that individuals who were a primary focus of the campaign received CVs that are highly topically clustered. These findings differentiate this campaign from one that leverages random spam. Comment: This paper has been accepted for publication by the IEEE Conference on Communications and Network Security in September 2015 at Florence, Italy. Copyright may be transferred without notice, after which this version may no longer be accessible.
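
    The abstract above does not spell out how the similarity test was run. As a rough illustration only, the sketch below pairs each attachment with its recipient's profile, scores the pair with a bag-of-words cosine similarity, and estimates a p-value by shuffling the pairing. The function names, the whitespace tokenizer, and the permutation test itself are assumptions made for this example, not the authors' method.

```python
import math
import random
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine similarity between whitespace-tokenized term-count vectors."""
    a = Counter(text_a.lower().split())
    b = Counter(text_b.lower().split())
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def permutation_p_value(attachments, profiles, trials=2000, seed=0):
    """Hypothetical test: is the mean similarity of the observed
    (attachment, profile) pairing higher than that of random pairings?"""
    rng = random.Random(seed)
    n = len(attachments)

    def mean_similarity(order):
        total = sum(
            cosine_similarity(attachments[i], profiles[j])
            for i, j in enumerate(order)
        )
        return total / n

    observed = mean_similarity(range(n))
    shuffled = list(range(n))
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)
        if mean_similarity(shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (hits + 1) / (trials + 1)
```

    A small p-value here would suggest that the attachments match their recipients more closely than chance pairing would, which is the kind of evidence the abstract reports.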

    Knowledge modeling of phishing emails

    This dissertation investigates whether malicious phishing emails are detected better when a meaningful representation of the email bodies is available. The natural language processing theory of Ontological Semantics Technology is used for its ability to model the knowledge representation present in the email messages. Known good and phishing emails were analyzed and their meaning representations fed into machine learning binary classifiers. Unigram language models of the same emails were used as a baseline for comparing the performance of the meaningful data. The end results show that a binary classifier trained on meaningful data detects phishing emails better than a unigram language model binary classifier, at least for some of the selected machine learning algorithms.

    A Phishing Webpage Detection Method Based on Stacked Autoencoder and Correlation Coefficients

    Phishing is a kind of cyber-attack that targets naive online users by tricking them into revealing sensitive information. Many anti-phishing solutions have been proposed to date, such as blacklist or whitelist, heuristic-based, and machine learning-based methods. However, online users are still being tricked into revealing sensitive information on phishing websites. In this paper, we propose a novel phishing webpage detection model that represents the basic characteristics of phishing webpages with features extracted from the URL, the HTML source code, and third-party services, and that uses a deep learning method, the Stacked Autoencoder (SAE), to detect phishing webpages. To bring the features to the same order of magnitude, three kinds of normalization methods are adopted. In particular, a method that calculates correlation coefficients between the weight matrices of the SAE is proposed to determine the optimal width of the hidden layers, and it shows high computational efficiency and feasibility. In tests on a set of phishing and benign webpages, the model using SAE achieves the best performance when compared to other algorithms such as Naive Bayes (NB), Support Vector Machine (SVM), Convolutional Neural Networks (CNN), and Recurrent Neural Networks (RNN). This indicates that the proposed detection model is promising and can be applied effectively to phishing detection.

    Parameter optimization for intelligent phishing detection using Adaptive Neuro-Fuzzy

    Phishing attacks have been growing rapidly in the past few years. As a result, a number of approaches have been proposed to address the problem. Despite the various approaches proposed, such as feature-based and blacklist-based methods using machine learning techniques, there is still a lack of accurate, real-time solutions. Most approaches that apply machine learning techniques require parameters to be tuned for the problem at hand, but parameters are difficult to tune to a desirable output. This study presents a parameter tuning framework that uses an adaptive neuro-fuzzy inference system with comprehensive data to maximize system performance. Extensive experiments were conducted. During ten-fold cross-validation, the data is split into training and testing pairs and parameters are set according to the desirable output, achieving 98.74% accuracy. Our results demonstrate higher performance compared to other results in the field. This paper contributes new comprehensive data and a novel parameter tuning method, and applies a new algorithm in a new field. The implication is that an adaptive neuro-fuzzy system with effective data and proper parameter tuning can enhance system performance. The outcome will provide new knowledge in the field.
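
    The ten-fold cross-validation mentioned above splits the data into ten training and testing pairs so that every sample is tested exactly once. The sketch below shows one common way to generate such splits; the function name `k_fold_splits`, the shuffling, and the seed are assumptions for illustration and are not taken from the paper.

```python
import random


def k_fold_splits(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) pairs for k-fold cross-validation.

    Hypothetical helper: indices are shuffled once, dealt into k folds of
    near-equal size, and each fold serves as the test set exactly once.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


if __name__ == "__main__":
    for fold_number, (train, test) in enumerate(k_fold_splits(25, k=10), start=1):
        print(fold_number, len(train), len(test))
```

    In a tuning workflow, a model would be fitted on each training split with a candidate parameter setting and scored on the matching test split, and the ten scores would then be averaged.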

    Intelligent phishing detection parameter framework for E-banking transactions based on Neuro-fuzzy

    Phishing attacks have become more sophisticated in web-based transactions. As a result, various solutions have been developed to tackle the problem. Such solutions include feature-based and blacklist-based approaches that apply machine learning algorithms. However, there is still a lack of accurate, real-time solutions. Most machine learning algorithms are parameter driven, but the parameters are difficult to tune to a desirable output. In line with Jiang and Ma’s findings, this study presents a parameter tuning framework that uses a neuro-fuzzy system with comprehensive features in order to maximize system performance. The neuro-fuzzy system was chosen because it has the ability to generate fuzzy rules from given features and to learn new features. Extensive experiments were conducted using different feature-sets, two cross-validation methods, a hybrid method, and different parameters, and achieved 98.4% accuracy. Our results demonstrate high performance compared to other results in the field. As a contribution, we introduce a novel parameter tuning framework based on a neuro-fuzzy system with six feature-sets, and we identify different numbers of membership functions, different numbers of epochs, and different sizes of feature-sets on a single platform. Parameter tuning based on a neuro-fuzzy system with comprehensive features can enhance system performance in real time. The outcome will provide guidance to researchers who are using similar techniques in the field. It will decrease difficulties and increase confidence in the process of tuning parameters on a given problem.

    Innovations of Phishing Defense: The Mechanism, Measurement and Defense Strategies

    Nowadays, social engineering is considered to be one of the most overwhelming threats in the field of cyber security. Social engineers, who deceive people by using their personal appeal through cunning communication, do not rely on finding vulnerabilities to break into cyberspace as traditional hackers do. Instead, they engage in deceptive communication with the victims that often enables them to gain confidential information, such as credentials, and so compromise cyber security. Phishing has become one of the most commonly used social engineering methods in daily life. Since the attacker does not rely on technical vulnerabilities, social engineering attacks, especially phishing attacks, cannot be tackled using cyber security tools such as firewalls and IDSs (Intrusion Detection Systems). What is more, the increased popularity of social media has further complicated the problem by making available an abundance of information that can be used against the victims. The objective of this paper is to propose a new framework that characterizes the behavior of the phishing attack, and a comprehensive model for describing awareness, measurement, and defense of phishing-based attacks. To be specific, we propose a hybrid multi-layer model using Natural Language Processing (NLP) techniques for defending against phishing attacks. The model opens a new prospect for detecting a potential attacker who is trying to manipulate the victim into revealing confidential information.

    Deceptive Previews: A Study of the Link Preview Trustworthiness in Social Platforms

    Social media has become a primary means of content and information sharing, thanks to its speed and simplicity. In this scenario, link previews play the important role of giving users a meaningful first glance, summarizing the content of the shared webpage through its title, description, and image. In our work, we analyzed the preview-rendering process, observing how it can be misused to obtain benign-looking previews for malicious links. A concrete use case of this research is the spread of phishing and spam, considering targeted attacks in addition to large-scale campaigns. We designed a set of experiments for 20 social media platforms, including social networks and instant messenger applications, and found that most of the platforms follow their own preview design and format, sometimes providing partial information. Four of these platforms allow preview crafting so as to hide the malicious target even from a tech-savvy user, and we found that it is possible to create misleading previews for the remaining 16 platforms when an attacker can register their own domain. We also observe that 18 social media platforms employ neither active nor passive countermeasures against the spread of known malicious links or software, and that existing cross-checks on malicious URLs can be bypassed through client-side and server-side redirections. To conclude, we suggest seven recommendations covering the spectrum of our findings, to improve the overall preview-rendering mechanism and increase users’ overall trust in social media platforms.