1,377 research outputs found

    SAFE: Self-Attentive Function Embeddings for Binary Similarity

    Get PDF
    The binary similarity problem consists in determining if two functions are similar by only considering their compiled form. Advanced techniques for binary similarity recently gained momentum as they can be applied in several fields, such as copyright disputes, malware analysis, vulnerability detection, etc., and thus have an immediate practical impact. Current solutions compare functions by first transforming their binary code in multi-dimensional vector representations (embeddings), and then comparing vectors through simple and efficient geometric operations. However, embeddings are usually derived from binary code using manual feature extraction, that may fail in considering important function characteristics, or may consider features that are not important for the binary similarity problem. In this paper we propose SAFE, a novel architecture for the embedding of functions based on a self-attentive neural network. SAFE works directly on disassembled binary functions, does not require manual feature extraction, is computationally more efficient than existing solutions (i.e., it does not incur in the computational overhead of building or manipulating control flow graphs), and is more general as it works on stripped binaries and on multiple architectures. We report the results from a quantitative and qualitative analysis that show how SAFE provides a noticeable performance improvement with respect to previous solutions. Furthermore, we show how clusters of our embedding vectors are closely related to the semantic of the implemented algorithms, paving the way for further interesting applications (e.g. semantic-based binary function search).Comment: Published in International Conference on Detection of Intrusions and Malware, and Vulnerability Assessment (DIMVA) 201

    What Makes Good Open-Vocabulary Detector: A Disassembling Perspective

    Full text link
    Open-vocabulary detection (OVD) is a new object detection paradigm, aiming to localize and recognize unseen objects defined by an unbounded vocabulary. This is challenging since traditional detectors can only learn from pre-defined categories and thus fail to detect and localize objects out of pre-defined vocabulary. To handle the challenge, OVD leverages pre-trained cross-modal VLM, such as CLIP, ALIGN, etc. Previous works mainly focus on the open vocabulary classification part, with less attention on the localization part. We argue that for a good OVD detector, both classification and localization should be parallelly studied for the novel object categories. We show in this work that improving localization as well as cross-modal classification complement each other, and compose a good OVD detector jointly. We analyze three families of OVD methods with different design emphases. We first propose a vanilla method,i.e., cropping a bounding box obtained by a localizer and resizing it into the CLIP. We next introduce another approach, which combines a standard two-stage object detector with CLIP. A two-stage object detector includes a visual backbone, a region proposal network (RPN), and a region of interest (RoI) head. We decouple RPN and ROI head (DRR) and use RoIAlign to extract meaningful features. In this case, it avoids resizing objects. To further accelerate the training time and reduce the model parameters, we couple RPN and ROI head (CRR) as the third approach. We conduct extensive experiments on these three types of approaches in different settings. On the OVD-COCO benchmark, DRR obtains the best performance and achieves 35.8 Novel AP50_{50}, an absolute 2.8 gain over the previous state-of-the-art (SOTA). For OVD-LVIS, DRR surpasses the previous SOTA by 1.9 AP50_{50} in rare categories. We also provide an object detection dataset called PID and provide a baseline on PID

    Intergenerational Test Generation for Natural Language Processing Applications

    Full text link
    The development of modern NLP applications often relies on various benchmark datasets containing plenty of manually labeled tests to evaluate performance. While constructing datasets often costs many resources, the performance on the held-out data may not properly reflect their capability in real-world application scenarios and thus cause tremendous misunderstanding and monetary loss. To alleviate this problem, in this paper, we propose an automated test generation method for detecting erroneous behaviors of various NLP applications. Our method is designed based on the sentence parsing process of classic linguistics, and thus it is capable of assembling basic grammatical elements and adjuncts into a grammatically correct test with proper oracle information. We implement this method into NLPLego, which is designed to fully exploit the potential of seed sentences to automate the test generation. NLPLego disassembles the seed sentence into the template and adjuncts and then generates new sentences by assembling context-appropriate adjuncts with the template in a specific order. Unlike the taskspecific methods, the tests generated by NLPLego have derivation relations and different degrees of variation, which makes constructing appropriate metamorphic relations easier. Thus, NLPLego is general, meaning it can meet the testing requirements of various NLP applications. To validate NLPLego, we experiment with three common NLP tasks, identifying failures in four state-of-art models. Given seed tests from SQuAD 2.0, SST, and QQP, NLPLego successfully detects 1,732, 5301, and 261,879 incorrect behaviors with around 95.7% precision in three tasks, respectively

    Malware detection based on call graph similarities

    Get PDF
    S rostoucím množstvím škodlivých souborů se stalo využití strojového učení pro jejich detekci nezbytností. Autoři škodlivých souborů vytváří důmyslnější programy, aby překonali stále se zlepšující antivirovou ochranu. Windows OS zůstává nejčastějším cílem útoků. Viry se často šíří ve formátu Portable Executable (PE). PE soubory mohou být zkoumány pomocí metod statické analýzy, které se hodí pro zpracovávání velkého množství dat. Mnoho antivirových systémů disassembluje soubory a zkoumá jejich kód, který nabízí vhled do funkcionality souboru. Assembly kód je členěn do funkcí. Vztahy mezi funkcemi zachycuje graf volání funkcí (GVF). Tento graf byl zkoumán v literatuře a jeho struktura byla využita k hledání podobností mezi soubory. V poslední době začaly být úspěšně využívány grafové neuronové sítě (GNN) ke zpracování těchto grafů. V naší práci zkoumáme různé druhy a architektury GNN a vzájemně je porovnáváme. Po tom, co vybereme nejlepší GNN model, ho srovnáme s modelem, který nevyužívá grafovou strukturu GVF, abychom zjistili zda tato struktura zlepšuje klasifikační modely. Naši studii provádíme na velkém datasetu o více než 5 milionech PE souborů.Machine learning-powered malware detection systems became a necessity to fight the rising volume of malware. Malware authors create more sophisticated programs to overcome always improving antivirus engines. Windows OS remains the most targeted system, and the malicious payload commonly comes in the Portable executable (PE) file format. PE files can be analyzed with the static analysis methods, which are suitable for processing large amounts of data. Many engines disassemble binaries and study the code, which carries valuable insight into binary behavior. The assembly code is divided into functions that carry the functionality. The relations between functions form a Function Call Graph (FCG). FCG has been studied in the literature, and the graph structure was employed to find similarities between files. Recently, Graph Neural Networks (GNNs) have been adapted to work upon FCGs and are claimed to be performing well. In this work, we study and compare different GNN models and their architectures. After selecting the best GNN model, we compare it with a non-structural model to verify if an FCG structure improves classification models. We perform our empirical study on a large dataset of more than 5 million PE files

    ShapeStacks: Learning Vision-Based Physical Intuition for Generalised Object Stacking

    Full text link
    Physical intuition is pivotal for intelligent agents to perform complex tasks. In this paper we investigate the passive acquisition of an intuitive understanding of physical principles as well as the active utilisation of this intuition in the context of generalised object stacking. To this end, we provide: a simulation-based dataset featuring 20,000 stack configurations composed of a variety of elementary geometric primitives richly annotated regarding semantics and structural stability. We train visual classifiers for binary stability prediction on the ShapeStacks data and scrutinise their learned physical intuition. Due to the richness of the training data our approach also generalises favourably to real-world scenarios achieving state-of-the-art stability prediction on a publicly available benchmark of block towers. We then leverage the physical intuition learned by our model to actively construct stable stacks and observe the emergence of an intuitive notion of stackability - an inherent object affordance - induced by the active stacking task. Our approach performs well even in challenging conditions where it considerably exceeds the stack height observed during training or in cases where initially unstable structures must be stabilised via counterbalancing.Comment: revised version to appear at ECCV 201
    corecore