2,332 research outputs found

    Identifying Authorship Style in Malicious Binaries: Techniques, Challenges & Datasets

    Get PDF
    Attributing a piece of malware to its creator typically requires threat intelligence. Binary attribution is more difficult still, as it relies largely on the ability to disassemble binaries to identify authorship style. Our survey explores malicious authorship style and the adversarial techniques authors use to remain anonymous. We examine the adversarial impact on state-of-the-art methods, identify key findings, and explore the open research challenges. To mitigate the lack of ground-truth datasets in this domain, we publish alongside this survey the largest and most diverse meta-information dataset, comprising 15,660 malware samples labeled to 164 threat actor groups.

    Designing for Irrelevance

    Get PDF
    My job title is ‘designer’, but I’m reluctant to describe myself as a designer for a number of reasons: first, because the practice has a lot to answer for; and second, because I don’t do a whole lot of design. I help groups of people to collaborate and converse their way through problems towards solutions—activating a latent capability for design in people as they think and work differently, together. The sense of agency that accompanies this is intoxicating. This work can produce strategies, systems, and services, as well as spaces, objects, and graphics. The awareness that design can shape both our (intangible) experiences and our (tangible) environments—and that, as a mode of thinking, it can be accessible, inclusive, and participatory—shifts it from a practice to a stance. In this sense, is design a choice that we make to perceive and move through the world in a contextual and intentional way? What does this mean for the practice of design? I respond to these questions by reflecting on my experience of participating in the Indonesia Australia Design Futures project.

    Adversarial Attacks on Code Models with Discriminative Graph Patterns

    Full text link
    Pre-trained language models of code are now widely used in various software engineering tasks such as code generation, code completion, and vulnerability detection. This widespread use, in turn, raises security and reliability concerns. One important threat is adversarial attacks, which can lead to erroneous predictions and significantly degrade model performance on downstream tasks. Current adversarial attacks on code models usually adopt fixed sets of program transformations, such as variable renaming and dead code insertion, leading to limited attack effectiveness. To address these challenges, we propose a novel adversarial attack framework, GraphCodeAttack, to better evaluate the robustness of code models. Given a target code model, GraphCodeAttack automatically mines important code patterns that can influence the model's decisions and uses them to perturb the structure of the model's input code. To do so, GraphCodeAttack uses a set of input source programs to probe the model's outputs and identifies the discriminative AST patterns that influence the model's decisions. GraphCodeAttack then selects appropriate AST patterns, concretizes the selected patterns as attacks, and inserts them as dead code into the model's input program. To effectively synthesize attacks from AST patterns, GraphCodeAttack uses a separate pre-trained code model to fill in the ASTs with concrete code snippets. We evaluate the robustness of two popular code models, CodeBERT and GraphCodeBERT, against our approach on three tasks: authorship attribution, vulnerability prediction, and clone detection. The experimental results suggest that our approach significantly outperforms state-of-the-art approaches such as CARROT and ALERT in attacking code models.
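
    The abstract gives only a high-level description of the mining and insertion steps. As a rough illustration (not the authors' implementation), the sketch below mines class-discriminative parent-child AST node pairs from a model's predictions and appends a concretized snippet as dead code; the `predict` callable and the `snippet` argument are assumed stand-ins for the target code model and the fill-in model, respectively.

        # Simplified sketch in the spirit of discriminative AST-pattern mining and
        # dead-code insertion; not the GraphCodeAttack implementation.
        import ast
        from collections import Counter

        def ast_patterns(source: str) -> Counter:
            """Count parent->child node-type pairs as crude structural patterns."""
            counts = Counter()
            for parent in ast.walk(ast.parse(source)):
                for child in ast.iter_child_nodes(parent):
                    counts[(type(parent).__name__, type(child).__name__)] += 1
            return counts

        def discriminative_patterns(samples, predict):
            """Rank patterns whose frequency differs most across predicted classes.
            `predict` is any callable mapping source code to a label (assumed)."""
            per_class = {}
            for src in samples:
                per_class.setdefault(predict(src), Counter()).update(ast_patterns(src))
            scores = Counter()
            for label, counts in per_class.items():
                others = Counter()
                for other_label, other_counts in per_class.items():
                    if other_label != label:
                        others.update(other_counts)
                for pattern, n in counts.items():
                    scores[pattern] = max(scores[pattern], n - others.get(pattern, 0))
            return [pattern for pattern, _ in scores.most_common()]

        def insert_dead_code(source: str, snippet: str) -> str:
            """Append the concretized pattern inside a never-taken branch so the
            program's behaviour is unchanged."""
            guarded = "if False:\n" + "\n".join("    " + line for line in snippet.splitlines())
            return source + "\n" + guarded + "\n"

    In the framework itself, a separate pre-trained code model concretizes the selected AST patterns into snippets; the `snippet` argument above merely stands in for that output.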

    SHIELD: Thwarting Code Authorship Attribution

    Full text link
    Authorship attribution has become increasingly accurate, posing a serious privacy risk for programmers who wish to remain anonymous. In this paper, we introduce SHIELD to examine the robustness of different code authorship attribution approaches against adversarial code examples. We define four attacks on attribution techniques, including targeted and non-targeted attacks, and realize them using adversarial code perturbation. We experiment with a dataset of 200 programmers from the Google Code Jam competition to validate our methods against six state-of-the-art authorship attribution methods that adopt a variety of techniques for extracting authorship traits from source code, including RNNs, CNNs, and code stylometry. Our experiments demonstrate the vulnerability of current authorship attribution methods to adversarial attacks. For the non-targeted attack, the attack success rate exceeds 98.5%, accompanied by a degradation of identification confidence exceeding 13%. For the targeted attacks, we show the possibility of impersonating a programmer using targeted adversarial perturbations, with a success rate ranging from 66% to 88% for different authorship attribution techniques under several adversarial scenarios. Comment: 12 pages, 13 figures.
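
    As a concrete illustration of the general attack surface (not SHIELD's own algorithm), the sketch below greedily applies semantics-preserving identifier renamings and keeps a rename only if it lowers the attributor's confidence in the true author; the `attributor(source) -> {author: probability}` interface is an assumption for illustration.

        # Toy untargeted attack via semantics-preserving identifier renaming;
        # real attacks use richer perturbations than renaming alone.
        # Requires Python 3.9+ for ast.unparse.
        import ast
        import builtins
        import itertools

        def rename_identifier(source: str, old: str, new: str) -> str:
            """Rename every bare occurrence of one identifier."""
            class Renamer(ast.NodeTransformer):
                def visit_Name(self, node):
                    if node.id == old:
                        node.id = new
                    return node
            return ast.unparse(Renamer().visit(ast.parse(source)))

        def untargeted_attack(source, true_author, attributor, budget=10):
            """Greedily keep renames that reduce confidence in the true author."""
            names = {n.id for n in ast.walk(ast.parse(source))
                     if isinstance(n, ast.Name) and not hasattr(builtins, n.id)}
            fresh = (f"var_{i}" for i in itertools.count())
            adversarial = source
            for old, new in zip(sorted(names), fresh):
                if budget == 0:
                    break
                trial = rename_identifier(adversarial, old, new)
                if attributor(trial)[true_author] < attributor(adversarial)[true_author]:
                    adversarial, budget = trial, budget - 1
            return adversarial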

    Robin: A Novel Method to Produce Robust Interpreters for Deep Learning-Based Code Classifiers

    Full text link
    Deep learning has been widely used in source code classification tasks, such as classifying code by functionality, code authorship attribution, and vulnerability detection. Unfortunately, the black-box nature of deep learning makes it hard to interpret and understand why a classifier (i.e., classification model) makes a particular prediction on a given example. This lack of interpretability (or explainability) may have hindered adoption by practitioners because it is not clear when they should or should not trust a classifier's prediction. The lack of interpretability has motivated a number of studies in recent years. However, existing methods are neither robust nor able to cope with out-of-distribution examples. In this paper, we propose a novel method, dubbed Robin, to produce robust interpreters for a given deep learning-based code classifier. The key idea behind Robin is a novel hybrid structure combining an interpreter and two approximators, while leveraging the ideas of adversarial training and data augmentation. Experimental results show that, on average, the interpreter produced by Robin achieves 6.11% higher fidelity (evaluated on the classifier), 67.22% higher fidelity (evaluated on the approximator), and 15.87x higher robustness than the three existing interpreters we evaluated. Moreover, the interpreter is 47.31% less affected by out-of-distribution examples than LEMNA. Comment: To be published in the 38th IEEE/ACM International Conference on Automated Software Engineering (ASE 2023).
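
    Fidelity, as used above, is commonly measured by checking whether the classifier's prediction survives when only the features an interpreter marks as important are retained. A minimal sketch of that style of evaluation follows, with `classifier` and `interpreter` as assumed callables rather than Robin's actual code.

        # Generic feature-masking fidelity check for a code-classifier interpreter;
        # `classifier(tokens) -> label` and `interpreter(tokens, top_k) -> indices`
        # are assumed interfaces for illustration.
        def fidelity(classifier, interpreter, examples, k=10, mask_token="<unk>"):
            """Fraction of examples whose prediction is unchanged when all tokens
            except the interpreter's top-k important ones are masked."""
            preserved = 0
            for tokens in examples:
                original = classifier(tokens)
                important = set(interpreter(tokens, top_k=k))
                reduced = [tok if i in important else mask_token
                           for i, tok in enumerate(tokens)]
                preserved += int(classifier(reduced) == original)
            return preserved / len(examples)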

    Generative adversarial copy machines

    Get PDF
    This essay explores the redistribution of expressive agency across human artists and non-human entities that inevitably occurs when artificial intelligence (AI) becomes involved in creative processes. In doing so, my focus is not on a ‘becoming-creative’ of AI in an anthropocentric sense of the term. Rather, my central argument is as follows: if AI systems are (or will be) capable of generating outputs that can satisfy requirements by which creativity is currently being evaluated, validated, and valorised, then AI inevitably disturbs prevailing aesthetic and ontological assumptions concerning anthropocentrically framed ideals of the artist figure, the work of art, and the idea of creativity as such. I will elaborate this argument by way of a close reading of Generative Adversarial Network (GAN) technology and its uses in AI art, alongside examples of ownership claims and disputes involving GAN-style AI art. Overall, the discussion links to cultural theories of AI, relevant legal theory, and posthumanist thought. It is across these contexts that I will reframe GAN systems, even when their ‘artistic’ outputs can be interpreted with reference to the concept of the singular author figure, as ‘Generative Adversarial Copy Machines.’ Ultimately, I want to propose that the disturbances effected by AI in artistic practices can pose a critical challenge to the integrity of cultural ownership models – specifically: intellectual property (IP) enclosures – which rely on an anthropocentric conceptualisation of authorship.

    OpenML: networked science in machine learning

    Full text link
    Many sciences have made significant breakthroughs by adopting online tools that help organize, structure, and mine information that is too detailed to be printed in journals. In this paper, we introduce OpenML, a place for machine learning researchers to share and organize data in fine detail, so that they can work more effectively, be more visible, and collaborate with others to tackle harder problems. We discuss how OpenML relates to other examples of networked science and what benefits it brings for machine learning research, individual scientists, as well as students and practitioners. Comment: 12 pages, 10 figures.
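
    For readers who want to try the platform programmatically, the OpenML Python client exposes shared datasets directly; the snippet below follows the client's documented interface, though exact names and return types can vary between client versions.

        # Fetching a shared dataset through the OpenML Python client
        # (`pip install openml`); API names follow the client's documented
        # interface and may differ across versions.
        import openml

        dataset = openml.datasets.get_dataset(61)   # dataset 61 is the classic 'iris'
        X, y, categorical, attribute_names = dataset.get_data(
            target=dataset.default_target_attribute)
        print(dataset.name, X.shape, sorted(set(y)))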

    Artificial intelligence and UK national security: Policy considerations

    Get PDF
    RUSI was commissioned by GCHQ to conduct an independent research study into the use of artificial intelligence (AI) for national security purposes. The aim of this project is to establish an independent evidence base to inform future policy development regarding national security uses of AI. The findings are based on in-depth consultation with stakeholders from across the UK national security community, law enforcement agencies, private sector companies, academic and legal experts, and civil society representatives. This was complemented by a targeted review of existing literature on the topic of AI and national security. The research has found that AI offers numerous opportunities for the UK national security community to improve the efficiency and effectiveness of existing processes. AI methods can rapidly derive insights from large, disparate datasets and identify connections that would otherwise go unnoticed by human operators. However, in the context of national security and the powers given to UK intelligence agencies, the use of AI could give rise to additional privacy and human rights considerations which would need to be assessed within the existing legal and regulatory framework. For this reason, enhanced policy and guidance are needed to ensure that the privacy and human rights implications of national security uses of AI are reviewed on an ongoing basis as new analysis methods are applied to data.

    Dos and Don'ts of Machine Learning in Computer Security

    Get PDF
    With the growing processing power of computing systems and the increasing availability of massive datasets, machine learning algorithms have led to major breakthroughs in many different areas. This development has influenced computer security, spawning a series of work on learning-based security systems, such as for malware detection, vulnerability discovery, and binary code analysis. Despite great potential, machine learning in security is prone to subtle pitfalls that undermine its performance and render learning-based systems potentially unsuitable for security tasks and practical deployment. In this paper, we take a critical look at this problem. First, we identify common pitfalls in the design, implementation, and evaluation of learning-based security systems. We conduct a study of 30 papers from top-tier security conferences within the past 10 years, confirming that these pitfalls are widespread in the current security literature. In an empirical analysis, we further demonstrate how individual pitfalls can lead to unrealistic performance and interpretations, obstructing the understanding of the security problem at hand. As a remedy, we propose actionable recommendations to support researchers in avoiding or mitigating the pitfalls where possible. Furthermore, we identify open problems when applying machine learning in security and provide directions for further research. Comment: to appear at USENIX Security Symposium 2022.
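
    One pitfall of the kind discussed above that recurs throughout this literature is evaluating detectors on artificially balanced data: the same error rates translate into very different precision once the deployment base rate is realistic. The worked example below uses illustrative numbers, not figures from the paper, to make that gap concrete.

        # Base-rate illustration: a detector with 95% TPR and 1% FPR looks strong
        # on a balanced test set but produces mostly false alarms when malicious
        # samples are rare in deployment. Numbers are illustrative only.
        def precision(tpr, fpr, base_rate):
            true_positives = tpr * base_rate
            false_positives = fpr * (1 - base_rate)
            return true_positives / (true_positives + false_positives)

        print(f"balanced test set (50% malicious):  {precision(0.95, 0.01, 0.5):.1%}")
        print(f"deployment-like (0.1% malicious):   {precision(0.95, 0.01, 0.001):.1%}")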