SAFE: Self-Attentive Function Embeddings for Binary Similarity
The binary similarity problem consists in determining if two functions are
similar by only considering their compiled form. Advanced techniques for binary
similarity recently gained momentum as they can be applied in several fields,
such as copyright disputes, malware analysis, vulnerability detection, etc.,
and thus have an immediate practical impact. Current solutions compare
functions by first transforming their binary code into multi-dimensional vector
representations (embeddings), and then comparing vectors through simple and
efficient geometric operations. However, embeddings are usually derived from
binary code using manual feature extraction, which may fail to consider
important function characteristics, or may consider features that are not
important for the binary similarity problem. In this paper we propose SAFE, a
novel architecture for the embedding of functions based on a self-attentive
neural network. SAFE works directly on disassembled binary functions, does not
require manual feature extraction, is computationally more efficient than
existing solutions (i.e., it does not incur the computational overhead of
building or manipulating control flow graphs), and is more general as it works
on stripped binaries and on multiple architectures. We report the results from
a quantitative and qualitative analysis that show how SAFE provides a
noticeable performance improvement with respect to previous solutions.
Furthermore, we show how clusters of our embedding vectors are closely related
to the semantics of the implemented algorithms, paving the way for further
interesting applications (e.g., semantic-based binary function search).
Comment: Published in International Conference on Detection of Intrusions and
Malware, and Vulnerability Assessment (DIMVA) 201
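The abstract above says that function embeddings are compared "through simple and efficient geometric operations" but does not name one. As a minimal sketch, assuming cosine similarity is the chosen operation (a common choice for embedding comparison, not confirmed by the abstract), the comparison step could look like this. The vectors `emb_a`, `emb_b`, and `emb_c` are hypothetical placeholders, not outputs of SAFE.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical embeddings, for illustration only (not produced by SAFE).
emb_a = [1.0, 0.0, 1.0]
emb_b = [2.0, 0.0, 2.0]  # same direction as emb_a
emb_c = [0.0, 1.0, 0.0]  # orthogonal to emb_a

score_same_direction = cosine_similarity(emb_a, emb_b)
score_orthogonal = cosine_similarity(emb_a, emb_c)
```

Under this assumption, two functions would be judged similar when the score is close to 1; any decision threshold would have to be tuned on labelled data.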
What Makes Good Open-Vocabulary Detector: A Disassembling Perspective
Open-vocabulary detection (OVD) is a new object detection paradigm, aiming to
localize and recognize unseen objects defined by an unbounded vocabulary. This
is challenging since traditional detectors can only learn from pre-defined
categories and thus fail to detect and localize objects outside the pre-defined
vocabulary. To handle this challenge, OVD leverages pre-trained cross-modal
vision-language models (VLMs), such as CLIP and ALIGN. Previous works mainly
focus on the open-vocabulary classification part, with less attention paid to
the localization part. We argue that for a good OVD detector, both
classification and localization should be studied in parallel for the novel
object categories. We show in this work that improving localization and
improving cross-modal classification complement each other and jointly make a
good OVD detector. We analyze three families of
OVD methods with different design emphases. We first propose a vanilla
method, i.e., cropping a bounding box obtained by a localizer, resizing it, and
feeding it into CLIP. We next introduce another approach, which combines a standard
two-stage object detector with CLIP. A two-stage object detector includes a
visual backbone, a region proposal network (RPN), and a region of interest
(RoI) head. We decouple the RPN and RoI head (DRR) and use RoIAlign to extract
meaningful features, which avoids resizing objects. To further reduce training
time and the number of model parameters, we couple the RPN and RoI head (CRR)
as the third approach. We conduct extensive experiments on these
three types of approaches in different settings. On the OVD-COCO benchmark, DRR
obtains the best performance and achieves 35.8 Novel AP, an absolute 2.8
gain over the previous state-of-the-art (SOTA). For OVD-LVIS, DRR surpasses the
previous SOTA by 1.9 AP in rare categories. We also provide an object
detection dataset called PID and a baseline on it.
Intergenerational Test Generation for Natural Language Processing Applications
The development of modern NLP applications often relies on various benchmark
datasets containing plenty of manually labeled tests to evaluate performance.
Constructing such datasets is often costly, and performance on held-out data
may still not properly reflect an application's capability in real-world
scenarios, which can cause serious misunderstanding and monetary loss. To
alleviate this problem, we propose an automated test
generation method for detecting erroneous behaviors of various NLP
applications. Our method is designed based on the sentence parsing process of
classic linguistics, and thus it is capable of assembling basic grammatical
elements and adjuncts into a grammatically correct test with proper oracle
information. We implement this method in NLPLego, which is designed to fully
exploit the potential of seed sentences to automate the test generation.
NLPLego disassembles the seed sentence into the template and adjuncts and then
generates new sentences by assembling context-appropriate adjuncts with the
template in a specific order. Unlike task-specific methods, the tests
generated by NLPLego have derivation relations and different degrees of
variation, which makes constructing appropriate metamorphic relations easier.
Thus, NLPLego is general, meaning it can meet the testing requirements of
various NLP applications. To validate NLPLego, we experiment with three common
NLP tasks, identifying failures in four state-of-the-art models. Given seed tests
from SQuAD 2.0, SST, and QQP, NLPLego successfully detects 1,732, 5,301, and
261,879 incorrect behaviors with around 95.7% precision in three tasks,
respectively.
Malware detection based on call graph similarities
Machine learning-powered malware detection systems have become a necessity to fight the rising volume of malware. Malware authors create ever more sophisticated programs to overcome ever-improving antivirus engines. Windows OS remains the most targeted system, and the malicious payload commonly comes in the Portable Executable (PE) file format. PE files can be analyzed with static analysis methods, which are suitable for processing large amounts of data. Many engines disassemble binaries and study the code, which carries valuable insight into binary behavior. The assembly code is divided into functions that carry the functionality. The relations between functions form a Function Call Graph (FCG). The FCG has been studied in the literature, and its graph structure has been employed to find similarities between files.
Recently, Graph Neural Networks (GNNs) have been adapted to work on FCGs and are reported to perform well. In this work, we study and compare different GNN models and their architectures. After selecting the best GNN model, we compare it with a non-structural model to verify whether the FCG structure improves classification models. We perform our empirical study on a large dataset of more than 5 million PE files.
ShapeStacks: Learning Vision-Based Physical Intuition for Generalised Object Stacking
Physical intuition is pivotal for intelligent agents to perform complex
tasks. In this paper we investigate the passive acquisition of an intuitive
understanding of physical principles as well as the active utilisation of this
intuition in the context of generalised object stacking. To this end, we
provide a simulation-based dataset featuring 20,000 stack configurations
composed of a variety of elementary geometric primitives, richly annotated
with respect to semantics and structural stability. We train visual classifiers for
binary stability prediction on the ShapeStacks data and scrutinise their
learned physical intuition. Due to the richness of the training data, our
approach also generalises favourably to real-world scenarios, achieving
state-of-the-art stability prediction on a publicly available benchmark of
block towers. We then leverage the physical intuition learned by our model to
actively construct stable stacks and observe the emergence of an intuitive
notion of stackability - an inherent object affordance - induced by the active
stacking task. Our approach performs well even in challenging conditions where
it considerably exceeds the stack height observed during training or in cases
where initially unstable structures must be stabilised via counterbalancing.
Comment: revised version to appear at ECCV 201