10,010 research outputs found

On the Feasibility of Malware Authorship Attribution

Author: A Rahimian
C Kruegel
DE Knuth
DI Holmes
EH Spafford
F Can
G Frantzeskou
I Krsul
J Ferrante
M Fowler
N Pržulj
N Rosenblum
S Alrabaee
S Burrows
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 10/01/2017

The security community often needs to discover the authorship of malware binaries, either for digital forensic analysis of malware corpora or for thwarting live threats of malware invasion. Such discovery may be possible because of stylistic features inherent to software code written by human programmers. Existing studies of authorship attribution for general-purpose software mainly focus on source code and typically rely on features of programming style and environment. However, those features critically depend on the availability of the program source code, which is usually not the case when dealing with malware binaries. Such program binaries often do not retain many semantic or stylistic features due to the compilation process. Therefore, authorship attribution for malware binaries, based on features and styles that survive the compilation process, is challenging. This paper surveys the state of the art in this literature. Further, we analyze the features involved in those techniques. By using a case study, we identify features that can survive the compilation process. Finally, we analyze existing works on binary authorship attribution and study their applicability to real malware binaries.
Comment: FPS 201

arXiv.org e-Print Archive

Crossref

The Effect of Code Obfuscation on Authorship Attribution of Binary Computer Files

Author: Hendrikse Steven
Publication venue: NSUWorks
Publication date: 01/01/2017

In many forensic investigations, questions linger regarding the identity of the authors of a software specimen. Research has identified methods for the attribution of binary files that have not been obfuscated, but a significant percentage of malicious software has been obfuscated in an effort to hide both the details of its origin and its true intent. Little research has been done on analyzing obfuscated code for attribution. In part, the reason for this gap in the research is that deobfuscation of an unknown program is a challenging task. Further, the additional transformation of the executable file introduced by the obfuscator modifies or removes features from the original executable that would have been used in the author attribution process. Existing research has demonstrated good success in attributing the authorship of an executable file of unknown provenance using methods based on static analysis of the specimen file. With the addition of file obfuscation, static analysis of files becomes difficult and time-consuming and, in some cases, may lead to inaccurate findings. This paper presents a novel process for authorship attribution using dynamic analysis methods. A software-emulated system was fully instrumented to become a test harness for a specimen of unknown provenance, allowing for supervised control, monitoring, and trace data collection during execution. This trace data was used as input to a supervised machine learning algorithm trained to identify stylometric differences in the specimen under test and to predict who wrote the specimen. The specimen files were also analyzed for authorship using static analysis methods, so that their prediction accuracies could be compared with those of the new dynamic-analysis-based method. Experiments indicate that this new method can provide better accuracy of author attribution for files of unknown provenance, especially when the specimen file has been obfuscated.

NSU Works

Identifying Authorship Style in Malicious Binaries: Techniques, Challenges & Datasets

Author: Cavallaro L
Gray J
Sgandurra D
Publication venue: 'Center for Open Science'
Publication date: 18/01/2021

Attributing a piece of malware to its creator typically requires threat intelligence. Binary attribution increases the level of difficulty, as it mostly relies upon the ability to disassemble binaries to identify authorship style. Our survey explores malicious author style and the adversarial techniques these authors use to remain anonymous. We examine the adversarial impact on the state-of-the-art methods. We identify key findings and explore the open research challenges. To mitigate the lack of ground truth datasets in this domain, we publish alongside this survey the largest and most diverse meta-information dataset of 15,660 malware samples labeled to 164 threat actor groups.

arXiv.org e-Print Archive

UCL Discovery

The Writing for Healing and Transformation Project

Author: Osborn Heather Elizabeth
Publication venue: Research Commons at Kutztown University
Publication date: 22/03/2021

As a qualitative action research study, the purpose of The Writing for Healing and Transformation Project was to facilitate more inclusive writing strategies and to promote individual and collective healing on issues of social suffering and oppression (Kleinman, Das, & Lock, 1997; Pennebaker & Smyth, 2016) for diverse students at a community college located in the northeastern United States. The 18 participants in the study included students in my English II literature and composition course. The theoretical framework encompassed Pennebaker’s (2016) “writing for healing” paradigm, advocating the use of expressivist writing and “social suffering theory,” examining how power structures affect social problems (Kleinman, Das, & Lock, 1997). As an intervention, course readings included literature with social suffering themes. Postmodernism and Poststructural Feminism were also central theoretical components of the study, introducing the use of the semiotic strategies of translingualism and multimodalities to examine teaching strategies. The intended results were to engage students as agents of community caregiving for social healing through the publication of a charity book on a social suffering theme chosen by the students and to facilitate inclusive and alternative methods of rhetorical expression. The data collected included a recorded book theme discussion, the students’ submissions for the book, and semi-structured interviews with three participants. Using open coding, the results demonstrated a number of benefits to students, including increased confidence and poststructural shifts in thinking and writing. Book submissions exhibited a variety of rhetorical styles and semiotic strategies, along with defined solutions for healing on social suffering topics.

Research Commons Kutztown University

BinGold: Towards robust binary analysis by extracting the semantics of binary code as semantic flow graphs (SFGs)

Author: Alrabaee Saed
Debbabi Mourad
Wang Lingyu
Publication venue: The Author(s). Published by Elsevier Ltd.
Publication date: 07/08/2016

Binary analysis is useful in many practical applications, such as the detection of malware or vulnerable software components. However, our survey of the literature shows that most existing binary analysis tools and frameworks rely on assumptions about specific compilers and compilation settings. It is well known that techniques such as refactoring and light obfuscation can significantly alter the structure of code, even for simple programs. Applying such techniques or changing the compiler and compilation settings can significantly affect the accuracy of available binary analysis tools, which severely limits their practicability, especially when applied to malware. To address these issues, we propose a novel technique that extracts the semantics of binary code in terms of both data and control flow. Our technique allows more robust binary analysis because the extracted semantics of the binary code are generally immune to light obfuscation, refactoring, and changes of compiler or compilation settings. Specifically, we apply data-flow analysis to extract the semantic flow of the registers as well as the semantic components of the control flow graph, which are then synthesized into a novel representation called the semantic flow graph (SFG). Subsequently, various properties, such as reflexive, symmetric, antisymmetric, and transitive relations, are extracted from the SFG and applied to binary analysis. We implement our system in a tool called BinGold and evaluate it against thirty binary code applications. Our evaluation shows that BinGold successfully determines the similarity between binaries, yielding results that are highly robust against light obfuscation and refactoring. In addition, we demonstrate the application of BinGold to two important binary analysis tasks: binary code authorship attribution and the detection of clone components across program executables. The promising results suggest that BinGold can be used to enhance existing techniques, making them more robust and practical.
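
The relation properties named in this abstract (reflexive, symmetric, antisymmetric, transitive) can be illustrated on a small directed graph. The sketch below is only a generic check of those four properties over an edge set; it is not BinGold's implementation, and the function name `relation_properties` and the toy register-flow graph are hypothetical, introduced purely for illustration.

```python
from itertools import product


def relation_properties(nodes, edges):
    """Report which basic relation properties hold for a directed graph.

    `nodes` is an iterable of node labels and `edges` an iterable of
    (source, target) pairs. Illustrative helper, not part of any tool.
    """
    edge_set = set(edges)
    # Reflexive: every node has a self-loop.
    reflexive = all((n, n) in edge_set for n in nodes)
    # Symmetric: every edge has its reverse.
    symmetric = all((b, a) in edge_set for (a, b) in edge_set)
    # Antisymmetric: no two distinct nodes are linked in both directions.
    antisymmetric = all(a == b or (b, a) not in edge_set for (a, b) in edge_set)
    # Transitive: a->b and b->d together imply a->d.
    transitive = all(
        (a, d) in edge_set
        for (a, b), (c, d) in product(edge_set, repeat=2)
        if b == c
    )
    return {
        "reflexive": reflexive,
        "symmetric": symmetric,
        "antisymmetric": antisymmetric,
        "transitive": transitive,
    }


# Hypothetical toy graph: value flows r1 -> r2 -> r3, plus a direct r1 -> r3 edge.
nodes = {"r1", "r2", "r3"}
edges = {("r1", "r2"), ("r2", "r3"), ("r1", "r3")}
props = relation_properties(nodes, edges)
print(props)
```

On this toy graph only the antisymmetric and transitive properties hold, since there are no self-loops and no reversed edges.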

Elsevier - Publisher Connector

SHIELD: Thwarting Code Authorship Attribution

Author: Abuhamad Mohammed
Jung Changhun
Mohaisen David
Nyang DaeHun
Publication date: 25/04/2023

Authorship attribution has become increasingly accurate, posing a serious privacy risk for programmers who wish to remain anonymous. In this paper, we introduce SHIELD to examine the robustness of different code authorship attribution approaches against adversarial code examples. We define four attacks on attribution techniques, which include targeted and non-targeted attacks, and realize them using adversarial code perturbation. We experiment with a dataset of 200 programmers from the Google Code Jam competition to validate our methods, targeting six state-of-the-art authorship attribution methods that adopt a variety of techniques for extracting authorship traits from source code, including RNN, CNN, and code stylometry. Our experiments demonstrate the vulnerability of current authorship attribution methods to adversarial attacks. For the non-targeted attack, the attack success rate exceeds 98.5%, accompanied by a degradation of the identification confidence that exceeds 13%. For the targeted attacks, we show the possibility of impersonating a programmer using targeted adversarial perturbations, with a success rate ranging from 66% to 88% for different authorship attribution techniques under several adversarial scenarios.
Comment: 12 pages, 13 figures

arXiv.org e-Print Archive

The Power of Debt: Identity and Collective Action in the Age of Finance

Author: Appel Hannah
Kline Caitlin
Whitley Sa
Publication venue: eScholarship, University of California
Publication date: 01/01/2019

The Debt Collective is organized around the possibility for radical action within and against finance capitalism. In the wake of the 2008 financial crisis it was often hard to understand how finance intersected with the everyday lives of ordinary (and especially poor) people. But in the years since, it has become increasingly clear that mass indebtedness, from the mortgage crisis to student debt, criminal “justice” fines and fees to municipal austerity, is a direct effect of the conflict between debt payments generating value as securitized investments (mortgage backed securities, municipal bond offerings) vs. their role in providing shelter, food, and the ability to merely get by. In the age of finance, debt has become an immersive, systemic problem; but it is one that, in its ubiquity, may hold the seeds of its own solutions. As oil tycoon JP Getty famously quipped: “If you owe the bank $100 that’s your problem. If you owe the bank $100 million, that’s the bank’s problem.” Student debt alone stands today at 1.5 trillion dollars. Together, arguably, we can be the banks’ problem. This is the provocation of the Debt Collective: how can we reframe debt from an issue of isolation and shame to a platform for collective action and political mobilization?

eScholarship - University of California