194 research outputs found

    Unbiased phishing detection using domain name based features

    Get PDF
    2018 Summer.Includes bibliographical references.Internet users are coming under a barrage of phishing attacks of increasing frequency and sophistication. While these attacks have been remarkably resilient against the vast range of defenses proposed by academia, industry, and research organizations, machine learning approaches appear to be a promising one in distinguishing between phishing and legitimate websites. There are three main concerns with existing machine learning approaches for phishing detection. The first concern is there is neither a framework, preferably open-source, for extracting feature and keeping the dataset updated nor an updated dataset of phishing and legitimate website. The second concern is the large number of features used and the lack of validating arguments for the choice of the features selected to train the machine learning classifier. The last concern relates to the type of datasets used in the literature that seems to be inadvertently biased with respect to the features based on URL or content. In this thesis, we describe the implementation of our open-source and extensible framework to extract features and create up-to-date phishing dataset. With having this framework, named Fresh-Phish, we implemented 29 different features that we used to detect whether a given website is legitimate or phishing. We used 26 features that were reported in related work and added 3 new features and created a dataset of 6,000 websites with these features of which 3,000 were malicious and 3,000 were genuine and tested our approach. Using 6 different classifiers we achieved the accuracy of 93% which is a reasonable high in this field. To address the second and third concerns, we put forward the intuition that the domain name of phishing websites is the tell-tale sign of phishing and holds the key to successful phishing detection. We focus on this aspect of phishing websites and design features that explore the relationship of the domain name to the key elements of the website. Our work differs from existing state-of-the-art as our feature set ensures that there is minimal or no bias with respect to a dataset. Our learning model trains with only seven features and achieves a true positive rate of 98% and a classification accuracy of 97%, on sample dataset. Compared to the state-of-the-art work, our per data instance processing and classification is 4 times faster for legitimate websites and 10 times faster for phishing websites. Importantly, we demonstrate the shortcomings of using features based on URLs as they are likely to be biased towards dataset collection and usage. We show the robustness of our learning algorithm by testing our classifiers on unknown live phishing URLs and achieve a higher detection accuracy of 99.7% compared to the earlier known best result of 95% detection rate

    An Evasion Attack against ML-based Phishing URL Detectors

    Full text link
    Background: Over the year, Machine Learning Phishing URL classification (MLPU) systems have gained tremendous popularity to detect phishing URLs proactively. Despite this vogue, the security vulnerabilities of MLPUs remain mostly unknown. Aim: To address this concern, we conduct a study to understand the test time security vulnerabilities of the state-of-the-art MLPU systems, aiming at providing guidelines for the future development of these systems. Method: In this paper, we propose an evasion attack framework against MLPU systems. To achieve this, we first develop an algorithm to generate adversarial phishing URLs. We then reproduce 41 MLPU systems and record their baseline performance. Finally, we simulate an evasion attack to evaluate these MLPU systems against our generated adversarial URLs. Results: In comparison to previous works, our attack is: (i) effective as it evades all the models with an average success rate of 66% and 85% for famous (such as Netflix, Google) and less popular phishing targets (e.g., Wish, JBHIFI, Officeworks) respectively; (ii) realistic as it requires only 23ms to produce a new adversarial URL variant that is available for registration with a median cost of only $11.99/year. We also found that popular online services such as Google SafeBrowsing and VirusTotal are unable to detect these URLs. (iii) We find that Adversarial training (successful defence against evasion attack) does not significantly improve the robustness of these systems as it decreases the success rate of our attack by only 6% on average for all the models. (iv) Further, we identify the security vulnerabilities of the considered MLPU systems. Our findings lead to promising directions for future research. Conclusion: Our study not only illustrate vulnerabilities in MLPU systems but also highlights implications for future study towards assessing and improving these systems.Comment: Draft for ACM TOP

    Phishing in email and instant messaging

    Get PDF
    Abstract. Phishing is a constantly evolving threat in the world of information security that affects everyone, no matter if you’re a retail worker or the head of IT in a large organisation. Because of this, this thesis aims to give the reader a good overview of what phishing is, and due to its prevalence in email and instant messaging, focuses on educating the reader on common signs and techniques used in phishing in the aforementioned forms of communication. The chosen research method is literature review, as it is the ideal choice for presenting an overview of a larger subject. As a result of the research, many common phishing signs and techniques in both email and instant messaging are presented. Some of these signs include strange senders, fake domain names and spellings mistakes. With this thesis, anyone looking to improve their understanding about phishing can do so in a way that is easy to understand. Some suggestions for future research are also presented based on this thesis’ shortcomings, namely the lack of studies on phishing in instant messaging

    Computer Crime and Identity theft

    Get PDF
    The problem at hand is the increased amount of vulnerabilities and security hazards for individuals engaging in e-commerce, business transactions over the World Wide Web. Since the majority of people aren\u27t paying their bills by mailing in their payment to the vendor, they pay for the items they purchase online, which makes them open to hackers and social engineering attacks. They place their credit card/debit card numbers, their phone number and home address, and even their birth date information on company websites. All these security vulnerabilities make the risk of identity theft increasingly high. Identity theft is when an individual\u27s personal (confidential) information, such as social security or account numbers, is stolen and used against them

    SUBJECT MATTER EXPERTS’ FEEDBACK ON EXPERIMENTAL PROCEDURES TO MEASURE USER’S JUDGMENT ERRORS IN SOCIAL ENGINEERING ATTACKS

    Get PDF
    Distracted users can fail to correctly distinguish the differences between legitimate and malicious emails or search engine results. Mobile phone users can have a more challenging time identifying malicious content due to the smaller screen size and the limited security features in mobile phone applications. Thus, the main goal of this research study was to design, develop, and validate a set of field experiments to assess user’s judgment when exposed to two types of simulated social engineering attacks: phishing and Potentially Malicious Search Engine Results (PMSER), based on the interaction of the environment (distracting vs. non-distracting) and type of device used (mobile vs. computer). In this paper, we provide the results from the Delphi methodology research we conducted using an expert panel consisting of 28 cybersecurity Subject Matter Experts (SMEs) who participated, out of 60 cybersecurity experts invited. Half of the SMEs were with over 10 years of experience in cybersecurity, the rest around five years. SMEs were asked to validate two sets of experimental tasks (phishing & PMSER) as specified in RQ1. The SMEs were then asked to identify physical and Audio/Visual (A/V) environmental factors for distracting and non-distracting environments. About 50% of the SMEs found that an airport was the most distracting environment for mobile phone and computer users. About 35.7% of the SMEs also found that a home environment was the least distracting environment for users, with an office setting coming into a close second place. About 67.9% of the SMEs chose “all” for the most distracting A/V distraction level, which included continuous background noise, visual distractions, and distracting/loud music. About 46.4% of the SMEs chose “all” for the least distracting A/V level, including a quiet environment, relaxing background music, and no visual distractions. The SMEs were then asked to evaluate a randomization table. This was important for RQ2 to set up the eight experimental protocols to maintain the validity of the proposed experiment. About 89.3% indicated a strong consensus that we should keep the randomization as it is. The SMEs were also asked whether we should keep, revise, or replace the number of questions for each mini-IQ test to three questions each. About 75% of the SMEs responded that we should keep the number of mini-IQ questions to three. Finally, the SMEs were asked to evaluate the proposed procedures for the pilot testing and experimental research phases conducted in the future. About 96.4% of the SMEs selected to keep the first pilot testing procedure. For second and third pilot testing procedures, the SMEs responded with an 89.3% strong consensus to keep the procedures. For the first experimental procedure, a strong consensus of 92.9% of the SMEs recommended keeping the procedure. Finally, for the third experimental procedure, there was an 85.7% majority to keep the procedure. The expert panel was used to validate the proposed experimental procedures and recommended adjustments. The conclusions, study limitations, and recommendations for future research are discussed

    Towards an Assessment of Judgment Errors in Social Engineering Attacks Due to Environment and Device Type

    Get PDF
    Phishing continues to be a significant invasive threat to computer and mobile device users. Cybercriminals continuously develop new phishing schemes using email, and malicious search engine links to gather personal information of unsuspecting users. This information is used for financial gains through identity theft schemes or draining financial accounts of victims. Users are often distracted and fail to fully process the phishing attacks then unknowingly fall victim to the scam until much later. Users operating mobile phones and computers are likely to make judgment errors when making decisions in distracting environments due to cognitive overload. Distracted users can fail to correctly distinguish the differences between legitimate and malicious emails or search engine results. Mobile phone users can have even a harder time identifying malicious content due to the smaller screen size and the limited security features in mobile phone applications. Thus, the main goal of this work-in-progress research study is to design, develop, and validate a set of field experiments to assess users judgment when exposed to two types of simulated social engineering attacks (phishing & possibly malicious search engine results (PMSER)), based on the interaction of the kind of environment (distracting vs. non-distracting) and type of device used (mobile vs. computer). In this paper, we outlines the Delphi methodology phase that this study will take using an expert panel to validate the proposed experimental procedures and recommend further steps for the empirical testing. The conclusions, study limitations and recommendations for future research are discussed. Keywords: Cybersecurity, social engineering, judgment error in cybersecurity, phishing email mitigation, distracting environment

    Experimental Study to Assess the Role of Environment and Device Type on the Success of Social Engineering Attacks: The Case of Judgment Errors

    Get PDF
    Phishing continues to be an invasive threat to computer and mobile device users. Cybercriminals continuously develop new phishing schemes using e-mail and malicious search engine links to gather the personal information of unsuspecting users. This information is used for financial gains through identity theft schemes or draining victims\u27 financial accounts. Many users of varying demographic backgrounds fall victim to phishing schemes at one time or another. Users are often distracted and fail to process the phishing attempts fully, then unknowingly fall victim to the scam until much later. Users operating mobile phones and computers are likely to make judgment errors when making decisions in distracting environments due to cognitive overload. Distracted users cannot distinguish between legitimate and malicious emails or search engine results correctly. Mobile phone users can have a harder time distinguishing malicious content due to the smaller screen size and the limited security features in mobile phone applications. The main goal of this research study was to design, develop, and validate experimental settings to empirically test if there are significant mean differences in users’ judgment when: exposed to two types of simulated social engineering attacks (phishing & Potentially Malicious Search Engine Results (PMSER)), based on the interaction of the kind of environment (distracting vs. non-distracting) and type of device used (mobile vs. computer). This research used field experiments to test whether users are more likely to fall for phishing schemes in a distracting environment while using mobile phones or desktop/laptop computers. The second phase included a pilot test with 10 participants testing the Subject Matter Experts (SME) validated tasks and measures. The third phase included the delivery of the validated tasks and measures that were revised through the pilot testing phase with 68 participants. The results of the first phase have SME validated two sets of experimental tasks and eight experimental protocols to assess the measures of users’ judgment when exposed to two types of simulated social engineering attacks (phishing & PMSER) in two kinds of environments (distracting vs. non-distracting) and two types of devices (mobile phone vs. computer). The second phase results, the phishing mini-IQ test results, do not follow what was initially indicated in prior literature. Specifically, it was surprising to learn that the non-distracting environment results for the Phishing IQ tests were overall lower than those of distracting environment, which is counter to what was envisioned. These Phishing IQ test results may be assumed to be because, during the distracting environment, the participants were monitored over zoom to enable the distracting sound file. In contrast, in the non-distracting environment, they have marked the selections independently and may have rushed to identify the phishing samples. In contrast, PMSER detection on a computer outperformed mobile devices. It is suspected that these results are more accurate as individuals’ familiarity with PMSER is much lower. Their habituation to such messages is more deficient, causing them to pay closer attention and be more precise in their detections. A two-way Analysis of Variance (ANOVA) was conducted on the results. While it appears that some variations do exist, none of the comparisons were significant for Phishing IQ tests by environment (F=3.714, p=0.061) or device type (F=0.380, p=0.541), and PMSER IQ tests by environment (F=1.383, p=0.247) or device type (F=0.228, p=0.636). The results for the final phase showed there were no significant differences among both groups for Phishing and PMSER (F=0.985, p=0.322) and PMSER (F=3.692, p=0.056) using a two-way ANOVA. The two-way ANOVA results also showed significant differences among both groups for Phishing and PMSER vs. Device Type and Environment, Phishing (F=3.685, p=0.013), PMSER (F=1.629, p=0.183). A two-way ANOVA was evaluated for significant differences between groups. The results of the two-way ANOVA showed there were significant differences among both groups for Phishing and PMSER vs. Device Type and Environment. Phishing (F=3.685, p=0.013), PMSER (F=1.629, p=0.183). The p-values of the F-test for the Phishing IQ vs. Device Type and Environment were lower than the .05 level of significance. The two-way Analysis of Covariance (ANCOVA) results showed significant differences between Phishing vs. Environment and Device Type plus PMSER vs. Environment and Device Type. Specifically, the Education covariate for Table 32(F=3.930, p=0.048), Table 33(F=3.951, p=0.048), Table 34(F=10.429, p=0.001), and Table 35(F=10.329, p=0.001) was lower than the .05 level of significance
    • …
    corecore