7 research outputs found

    Unsupervised authorship analysis of phishing webpages

    Authorship analysis of phishing websites enables the investigation of phishing attacks beyond basic analysis. In authorship analysis, salient features from documents are used to determine properties of the author, such as which of a set of candidate authors wrote a given document. In unsupervised authorship analysis, the aim is to group documents such that all documents by one author are grouped together. Applied to cyber-attacks, this reveals the size and scope of attacks from specific groups, which in turn allows investigators to focus their attention on specific attacking groups rather than trying to profile multiple independent attackers. In this paper, we analyse phishing websites using the current state-of-the-art unsupervised authorship analysis method, NUANCE. The results indicate that the application produces clusters that correlate strongly with authorship, as evaluated using expert knowledge and external information, and that it improves over a previous approach with known flaws. © 2012 IEEE
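
    The abstract does not spell out NUANCE's features or algorithm, so the following is only a minimal sketch of the general idea of unsupervised authorship clustering, assuming scikit-learn is available; the character n-gram features and the k-means step are illustrative stand-ins, not the paper's method.

    # Hypothetical sketch: group webpage sources so that pages by one
    # author tend to land in the same cluster. Character n-grams are a
    # common stylometric feature; NUANCE's real features may differ.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.cluster import KMeans

    def cluster_by_style(pages, n_groups):
        vec = TfidfVectorizer(analyzer="char", ngram_range=(3, 5))
        X = vec.fit_transform(pages)
        return KMeans(n_clusters=n_groups, n_init=10).fit_predict(X)

    labels = cluster_by_style(["<html>...kit A...</html>",
                               "<html>...kit B...</html>"], n_groups=2)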

    Client Side Script Phishing Attacks Detection Method using Active Content Popularity Monitoring

    A phisher can attack through client-side scripts by means of deceptive content that affects a large share of online users. Malicious users steal a variety of sensitive information from financial organizations by running anonymous client-side scripts as part of a phishing attack. Much of the time, the consumer ignores the associated scripts and popup windows, which in turn run a set of malicious processes and send the sensitive information to remote sites. To secure consumers by limiting client-side scripts, an effective Client Side Script Phishing Attack Detection (CSSPAD) method is proposed to detect client-side script phishing attacks. The proposed method is based on Active Content Popularity Monitoring (ACPM) and client-script classification. It categorizes client-side scripts according to a mixture of factors, such as the quantity of information being transferred by the script and the parent information the script accesses. The proposed method computes the active time of the script, the amount of data transferred, and the popularity of the webpage.
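
    The abstract names the monitored factors but not how they are combined, so the sketch below is purely illustrative: the field names, thresholds, and weights are invented for the example and are not taken from the CSSPAD/ACPM method.

    # Hypothetical scoring of a client-side script from the three factors
    # the abstract mentions: active time, data transferred, and webpage
    # popularity. All weights and cut-offs here are made up.
    from dataclasses import dataclass

    @dataclass
    class ScriptObservation:
        active_time_s: float    # how long the script stayed active
        bytes_sent: int         # data the script transferred out
        page_popularity: float  # 0..1; higher = more widely visited page

    def suspicion_score(obs: ScriptObservation) -> float:
        # Long-running, data-heavy scripts on unpopular pages score high.
        exfil = min(obs.bytes_sent / 10_000, 1.0)
        longevity = min(obs.active_time_s / 60.0, 1.0)
        return 0.5 * exfil + 0.3 * longevity + 0.2 * (1.0 - obs.page_popularity)

    flagged = suspicion_score(ScriptObservation(120.0, 50_000, 0.01)) > 0.6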

    A Multi-Pronged Approach to Phishing Email Detection

    Phishing emails are a nuisance and a growing threat worldwide, causing loss of time, effort, and money. In this era of online communication and electronic data exchange, every individual connected to the Internet faces the danger of phishing attacks. Typically, benign-looking emails are used as the attack vectors, tricking users into revealing sensitive information such as login credentials and credit-card details. Since every email carries important information in its header, this thesis describes ways of capturing this information for successful classification of phishing emails. Moreover, the phisher has total control over the email body and subject, but little control over the header after the email leaves the sender's domain, unless the phisher is sophisticated and spends a lot of time crafting the attack, which reduces the payoff and may even backfire or yield mixed results. This thesis is a consolidated account of various systems designed to combat phishing emails from different dimensions, with the email header as the main area of focus. Techniques such as n-gram analysis, machine learning, and network port scanning are used to extract useful features from the emails, and the thesis shows that the classes of features used in these systems are very effective in distinguishing phishing emails from legitimate ones. Using different real datasets from varied domains, it highlights the robustness of the methods presented. Some methods, such as the header-domain analysis, obtain detection rates as high as 99.9% with false positive rates as low as 0.1%. These approaches have the advantage and flexibility that they can easily be combined with other existing methods, in addition to being used in standalone mode.
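
    As a rough illustration of the header-plus-n-gram idea described above (not the thesis's actual pipeline), the sketch below parses a few header fields with Python's standard email module and feeds character n-grams to a naive Bayes classifier from scikit-learn; the selected fields, sample emails, and labels are invented for the example.

    # Hypothetical header-based classifier; the chosen fields and n-gram
    # range are illustrative choices, not the thesis's feature set.
    from email import message_from_string
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.naive_bayes import MultinomialNB

    def header_text(raw_email: str) -> str:
        # Concatenate selected header fields into one analysable string.
        msg = message_from_string(raw_email)
        fields = ("From", "Reply-To", "Return-Path", "Received", "Subject")
        return " ".join(msg.get(f, "") or "" for f in fields)

    raw_emails = [
        "From: support@bank.example\nSubject: verify your account now\n\nbody",
        "From: friend@mail.example\nSubject: lunch tomorrow?\n\nbody",
    ]
    labels = [1, 0]  # 1 = phishing, 0 = legitimate (toy labels)

    vec = CountVectorizer(analyzer="char", ngram_range=(2, 4))
    X = vec.fit_transform([header_text(e) for e in raw_emails])
    clf = MultinomialNB().fit(X, labels)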

    Authorship Attribution of Source Code: A Language-Agnostic Approach and Applicability in Software Engineering

    Authorship attribution of source code has been an established research topic for several decades. State-of-the-art results for the authorship attribution problem look promising for the software engineering field, where they could be applied to detect plagiarized code and prevent legal issues. With this study, we first introduce a language-agnostic approach to authorship attribution of source code. Two machine learning models based on our approach match or improve over state-of-the-art results, originally achieved by language-specific approaches, on existing datasets for code in C++, Python, and Java. After that, we discuss limitations of existing synthetic datasets for authorship attribution and propose a data collection approach that delivers datasets which better reflect aspects important for potential practical use in software engineering. In particular, we discuss the concept of work context and its importance for authorship attribution. Finally, we demonstrate that the high accuracy of authorship attribution models on existing datasets drops drastically when they are evaluated on more realistic data. We conclude the paper by outlining next steps in the design and evaluation of authorship attribution models that could bring the research efforts closer to practical use.
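
    The paper's own models are not detailed in this abstract; the sketch below only illustrates what a language-agnostic baseline can look like, assuming scikit-learn. Treating source files as plain character streams avoids any per-language parser; the tiny corpus and author names are invented.

    # Hypothetical language-agnostic attribution baseline: character
    # n-grams work on any language's source text without parsing it.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.pipeline import make_pipeline

    attributor = make_pipeline(
        TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 5)),
        LogisticRegression(max_iter=1000),
    )

    sources = ["int main() { return 0; }", "def main():\n    return 0"]
    authors = ["alice", "bob"]  # toy labelled corpus
    attributor.fit(sources, authors)
    print(attributor.predict(["int add(int a, int b) { return a + b; }"]))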

    The Effect of Code Obfuscation on Authorship Attribution of Binary Computer Files

    In many forensic investigations, questions linger regarding the identity of the authors of a software specimen. Research has identified methods for the attribution of binary files that have not been obfuscated, but a significant percentage of malicious software is obfuscated in an effort to hide both the details of its origin and its true intent. Little research has been done on analyzing obfuscated code for attribution. In part, the reason for this gap in the research is that deobfuscation of an unknown program is a challenging task. Further, the additional transformation of the executable file introduced by the obfuscator modifies or removes features from the original executable that would have been used in the author attribution process. Existing research has demonstrated good success in attributing the authorship of an executable file of unknown provenance using methods based on static analysis of the specimen file. With the addition of file obfuscation, static analysis of files becomes difficult and time-consuming, and in some cases may lead to inaccurate findings. This paper presents a novel process for authorship attribution using dynamic analysis methods. A software-emulated system was fully instrumented to become a test harness for a specimen of unknown provenance, allowing for supervised control, monitoring, and trace data collection during execution. This trace data was used as input to a supervised machine learning algorithm trained to identify stylometric differences in the specimen under test and provide predictions on who wrote the specimen. The specimen files were also analyzed for authorship using static analysis methods to compare prediction accuracies with those gathered from this new, dynamic-analysis-based method. Experiments indicate that this new method can provide better accuracy of author attribution for files of unknown provenance, especially in the case where the specimen file has been obfuscated.
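
    The abstract does not specify which trace events or features were collected, so the sketch below is only a generic illustration of the dynamic-analysis idea: turning an execution trace (here a made-up sequence of call names) into n-gram counts that a supervised classifier could consume.

    # Hypothetical trace featurization; the call names below are invented
    # and stand in for whatever the instrumented emulator actually logs.
    from collections import Counter

    def trace_features(calls, n=2):
        # Count n-grams of consecutive calls as stylometric features.
        return Counter(tuple(calls[i:i + n]) for i in range(len(calls) - n + 1))

    trace_a = ["open", "read", "write", "close"]
    trace_b = ["open", "write", "write", "close"]
    print(trace_features(trace_a))  # e.g. {('open', 'read'): 1, ...}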

    Automatically determining phishing campaigns using the USCAP methodology

    Phishing fraudsters attempt to create an environment which looks and feels like a legitimate institution, while at the same time attempting to bypass the filters and suspicions of their targets. This is a difficult compromise for the phishers and presents a weakness in the process of conducting this fraud. In this research, a methodology is presented that looks at the differences that occur between phishing websites from an authorship analysis perspective and is able to determine the different phishing campaigns undertaken by phishing groups. The methodology is named USCAP, for Unsupervised SCAP; it builds on the SCAP methodology from supervised authorship analysis and extends it to unsupervised learning problems. The phishing website source code is examined to generate a model that gives the size and scope of each of the recognized phishing campaigns. The USCAP methodology marks the first time that phishing websites have been clustered by campaign in an automatic and reliable way, in contrast to previous methods, which relied on costly expert analysis of phishing websites. Evaluation of these clusters indicates that each cluster is strongly consistent, with high stability and reliability when analyzed using new information about the attacks, such as the dates on which the attacks occurred. The clusters found are indicative of different phishing campaigns, presenting a step towards an automated phishing authorship analysis methodology. © 2010 IEEE
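
    USCAP's clustering step is not described in this abstract, but SCAP itself is commonly defined as comparing profiles of a document's most frequent character n-grams; the sketch below shows that profile comparison under this usual definition, with parameters (n = 4, L = 500) and sample snippets chosen purely for illustration.

    # SCAP-style profiles: keep the L most frequent character n-grams of
    # each document; similarity is the size of the profile intersection.
    # Parameter values and the sample snippets are illustrative only.
    from collections import Counter

    def scap_profile(source: str, n: int = 4, L: int = 500) -> set:
        grams = Counter(source[i:i + n] for i in range(len(source) - n + 1))
        return {g for g, _ in grams.most_common(L)}

    def profile_similarity(a: set, b: set) -> int:
        return len(a & b)  # larger intersection = more similar style

    p1 = scap_profile("document.location = 'http://evil.example/a';")
    p2 = scap_profile("document.location = 'http://evil.example/b';")
    print(profile_similarity(p1, p2))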

    Emissions Trading in Practice : A Handbook on Design and Implementation

    Currently, about 46 national jurisdictions and 35 cities, states, and regions, representing almost a quarter of global greenhouse gas (GHG) emissions, are putting a price on carbon as a central component of their efforts to reduce emissions and place their growth trajectories on a more sustainable footing. An increasing number of these jurisdictions are approaching carbon pricing through the design and implementation of Emissions Trading Systems (ETS). As of 2021, ETSs were operating across four continents in 38 countries, 18 states or provinces, and six cities, covering over 40 percent of global gross domestic product (GDP), and additional systems are under development. This handbook sets out a 10-step process for designing and implementing an ETS. These steps are interdependent, and the choices made at each step will have important repercussions for decisions at the other steps; in practice, the process of ETS design will be iterative rather than linear. The need to adjust and adapt policies over time is reflected in this update of the handbook, which was first released in 2016. New insights, approaches, and designs have proliferated, adjusting the way ETSs operate and furthering our understanding of them.