10 research outputs found

    TSM: Measuring the Enticement of Honeyfiles with Natural Language Processing

    Get PDF
    Honeyfile deployment is a useful breach detection method in cyber deception that can also inform defenders about the intent and interests of intruders and malicious insiders. A key property of a honeyfile, enticement, is the extent to which the file can attract an intruder to interact with it. We introduce a novel metric, Topic Semantic Matching (TSM), which uses topic modelling to represent files in the repository and semantic matching in an embedding vector space to compare honeyfile text and topic words robustly. We also present a honeyfile corpus created with different Natural Language Processing (NLP) methods. Experiments show that TSM is effective in inter-corpus comparisons and is a promising tool to measure the enticement of honeyfiles. TSM is the first measure to use NLP techniques to quantify the enticement of honeyfile content that compares the essential topical content of local contexts to honeyfiles and is robust to paraphrasing

    HoneyCode: Automating Deceptive Software Repositories with Deep Generative Models

    Get PDF
    We propose HoneyCode, an architecture for the generation of synthetic software repositories for cyber deception. The synthetic repositories have the characteristics of real software, including language features, file names and extensions, but contain no real intellectual property. The fake software can be used as a honeypot or form part of a deceptive environment. Existing approaches to software repository generation lack scalability due to reliance on hand-crafted structures for specific languages. Our approach is language agnostic and learns the underlying representations of repository structures, filenames and file content through a novel Tree Recurrent Network (TRN) and two recurrent networks (RNN) respectively. Each stage of the sequential generation process utilises features from prior steps, which increases the honey repository’s authenticity and consistency. Experiments show TRN generates tree samples that reduce degree mean maximal distance (MMD) by 90-92% and depth MMD by 75-86% to a held out test data set in comparison to recent deep graph generators and a baseline random tree generator. In addition, our RNN models generate convincing filenames with authentic syntax and realistic file content

    Modelling Direct Messaging Networks with Multiple Recipients for Cyber Deception

    Full text link
    Cyber deception is emerging as a promising approach to defending networks and systems against attackers and data thieves. However, despite being relatively cheap to deploy, the generation of realistic content at scale is very costly, due to the fact that rich, interactive deceptive technologies are largely hand-crafted. With recent improvements in Machine Learning, we now have the opportunity to bring scale and automation to the creation of realistic and enticing simulated content. In this work, we propose a framework to automate the generation of email and instant messaging-style group communications at scale. Such messaging platforms within organisations contain a lot of valuable information inside private communications and document attachments, making them an enticing target for an adversary. We address two key aspects of simulating this type of system: modelling when and with whom participants communicate, and generating topical, multi-party text to populate simulated conversation threads. We present the LogNormMix-Net Temporal Point Process as an approach to the first of these, building upon the intensity-free modeling approach of Shchur et al. to create a generative model for unicast and multi-cast communications. We demonstrate the use of fine-tuned, pre-trained language models to generate convincing multi-party conversation threads. A live email server is simulated by uniting our LogNormMix-Net TPP (to generate the communication timestamp, sender and recipients) with the language model, which generates the contents of the multi-party email threads. We evaluate the generated content with respect to a number of realism-based properties, that encourage a model to learn to generate content that will engage the attention of an adversary to achieve a deception outcome

    Three Decades of Deception Techniques in Active Cyber Defense -- Retrospect and Outlook

    Full text link
    Deception techniques have been widely seen as a game changer in cyber defense. In this paper, we review representative techniques in honeypots, honeytokens, and moving target defense, spanning from the late 1980s to the year 2021. Techniques from these three domains complement with each other and may be leveraged to build a holistic deception based defense. However, to the best of our knowledge, there has not been a work that provides a systematic retrospect of these three domains all together and investigates their integrated usage for orchestrated deceptions. Our paper aims to fill this gap. By utilizing a tailored cyber kill chain model which can reflect the current threat landscape and a four-layer deception stack, a two-dimensional taxonomy is developed, based on which the deception techniques are classified. The taxonomy literally answers which phases of a cyber attack campaign the techniques can disrupt and which layers of the deception stack they belong to. Cyber defenders may use the taxonomy as a reference to design an organized and comprehensive deception plan, or to prioritize deception efforts for a budget conscious solution. We also discuss two important points for achieving active and resilient cyber defense, namely deception in depth and deception lifecycle, where several notable proposals are illustrated. Finally, some outlooks on future research directions are presented, including dynamic integration of different deception techniques, quantified deception effects and deception operation cost, hardware-supported deception techniques, as well as techniques developed based on better understanding of the human element.Comment: 19 page

    DECEPTION BASED TECHNIQUES AGAINST RANSOMWARES: A SYSTEMATIC REVIEW

    Get PDF
    Ransomware is the most prevalent emerging business risk nowadays. It seriously affects business continuity and operations. According to Deloitte Cyber Security Landscape 2022, up to 4000 ransomware attacks occur daily, while the average number of days an organization takes to identify a breach is 191. Sophisticated cyber-attacks such as ransomware typically must go through multiple consecutive phases (initial foothold, network propagation, and action on objectives) before accomplishing its final objective. This study analyzed decoy-based solutions as an approach (detection, prevention, or mitigation) to overcome ransomware. A systematic literature review was conducted, in which the result has shown that deception-based techniques have given effective and significant performance against ransomware with minimal resources. It is also identified that contrary to general belief, deception techniques mainly involved in passive approaches (i.e., prevention, detection) possess other active capabilities such as ransomware traceback and obstruction (thwarting), file decryption, and decryption key recovery. Based on the literature review, several evaluation methods are also analyzed to measure the effectiveness of these deception-based techniques during the implementation process

    Deep Down the Rabbit Hole: On References in Networks of Decoy Elements

    Full text link
    Deception technology has proven to be a sound approach against threats to information systems. Aside from well-established honeypots, decoy elements, also known as honeytokens, are an excellent method to address various types of threats. Decoy elements are causing distraction and uncertainty to an attacker and help detecting malicious activity. Deception is meant to be complementing firewalls and intrusion detection systems. Particularly insider threats may be mitigated with deception methods. While current approaches consider the use of multiple decoy elements as well as context-sensitivity, they do not sufficiently describe a relationship between individual elements. In this work, inter-referencing decoy elements are introduced as a plausible extension to existing deception frameworks, leading attackers along a path of decoy elements. A theoretical foundation is introduced, as well as a stochastic model and a reference implementation. It was found that the proposed system is suitable to enhance current decoy frameworks by adding a further dimension of inter-connectivity and therefore improve intrusion detection and prevention

    Modeling Deception for Cyber Security

    Get PDF
    In the era of software-intensive, smart and connected systems, the growing power and so- phistication of cyber attacks poses increasing challenges to software security. The reactive posture of traditional security mechanisms, such as anti-virus and intrusion detection systems, has not been sufficient to combat a wide range of advanced persistent threats that currently jeopardize systems operation. To mitigate these extant threats, more ac- tive defensive approaches are necessary. Such approaches rely on the concept of actively hindering and deceiving attackers. Deceptive techniques allow for additional defense by thwarting attackers’ advances through the manipulation of their perceptions. Manipu- lation is achieved through the use of deceitful responses, feints, misdirection, and other falsehoods in a system. Of course, such deception mechanisms may result in side-effects that must be handled. Current methods for planning deception chiefly portray attempts to bridge military deception to cyber deception, providing only high-level instructions that largely ignore deception as part of the software security development life cycle. Con- sequently, little practical guidance is provided on how to engineering deception-based techniques for defense. This PhD thesis contributes with a systematic approach to specify and design cyber deception requirements, tactics, and strategies. This deception approach consists of (i) a multi-paradigm modeling for representing deception requirements, tac- tics, and strategies, (ii) a reference architecture to support the integration of deception strategies into system operation, and (iii) a method to guide engineers in deception mod- eling. A tool prototype, a case study, and an experimental evaluation show encouraging results for the application of the approach in practice. Finally, a conceptual coverage map- ping was developed to assess the expressivity of the deception modeling language created.Na era digital o crescente poder e sofisticação dos ataques cibernéticos apresenta constan- tes desafios para a segurança do software. A postura reativa dos mecanismos tradicionais de segurança, como os sistemas antivírus e de detecção de intrusão, não têm sido suficien- tes para combater a ampla gama de ameaças que comprometem a operação dos sistemas de software actuais. Para mitigar estas ameaças são necessárias abordagens ativas de defesa. Tais abordagens baseiam-se na ideia de adicionar mecanismos para enganar os adversários (do inglês deception). As técnicas de enganação (em português, "ato ou efeito de enganar, de induzir em erro; artimanha usada para iludir") contribuem para a defesa frustrando o avanço dos atacantes por manipulação das suas perceções. A manipula- ção é conseguida através de respostas enganadoras, de "fintas", ou indicações erróneas e outras falsidades adicionadas intencionalmente num sistema. É claro que esses meca- nismos de enganação podem resultar em efeitos colaterais que devem ser tratados. Os métodos atuais usados para enganar um atacante inspiram-se fundamentalmente nas técnicas da área militar, fornecendo apenas instruções de alto nível que ignoram, em grande parte, a enganação como parte do ciclo de vida do desenvolvimento de software seguro. Consequentemente, há poucas referências práticas em como gerar técnicas de defesa baseadas em enganação. Esta tese de doutoramento contribui com uma aborda- gem sistemática para especificar e desenhar requisitos, táticas e estratégias de enganação cibernéticas. Esta abordagem é composta por (i) uma modelação multi-paradigma para re- presentar requisitos, táticas e estratégias de enganação, (ii) uma arquitetura de referência para apoiar a integração de estratégias de enganação na operação dum sistema, e (iii) um método para orientar os engenheiros na modelação de enganação. Uma ferramenta protó- tipo, um estudo de caso e uma avaliação experimental mostram resultados encorajadores para a aplicação da abordagem na prática. Finalmente, a expressividade da linguagem de modelação de enganação é avaliada por um mapeamento de cobertura de conceitos

    Automating the Generation of Enticing Text Content for High-Interaction Honeyfiles

    Get PDF
    While advanced defenders have successfully used honeyfiles to detect unauthorized intruders and insider threats for more than 30 years, the complexity associated with adaptively devising enticing content has limited their diffusion. This paper presents four new designs for automating the construction of honeyfile content. The new designs select a document from the target directory as a template and employ word transposition and substitution based on parts of speech tagging and n-grams collected from both the target directory and the surrounding file system. These designs were compared to previous methods using a new theory to quantitatively evaluate honeyfile enticement. The new designs were able to successfully mimic the content from the target directory, whilst minimizing the introduction of material from other sources. The designs may also hold potential to match many of the characteristics of nearby documents, whilst minimizing the replication of copyrighted or classified material from documents they are protectin
    corecore