Malware Visualization and Similarity via Tracking Binary Execution Path
Computer systems are now used throughout society, and malicious code designed to take over systems and perform harmful actions is continuously being created. Such code sometimes appears in entirely new forms, but in many cases it is a modification of existing malicious code. Because far more malicious code is generated than humans can analyze, research on efficient detection, classification, and analysis is essential. There are two main ways to analyze malicious code. Static analysis identifies malicious behavior by examining the structure of the code or specific binary patterns at the code level. Dynamic analysis uses virtualization tools to build an environment in a virtual machine and executes the malicious code there to observe its behavior. The method used in this paper is a static analysis technique. Dynamic analysis can yield a lot of information, but it has the disadvantage that a sample can be analyzed properly only when the environment it expects is matched. Because the method proposed in this paper tracks and analyzes the execution stream of the code, it remains a static analysis, but the effect of dynamic analysis can be expected. The core idea of this paper is to express the malicious code as a 25 × 25 pixel image using 25 selected API categories. The interaction and frequency of the APIs are rendered as a 25 × 25 pixel image based on a matrix of RGB values. When analyzing malicious code, the Euclidean distance is computed on the generated images to measure color similarity, and the similarity of malicious behavior between samples is calculated from the final Euclidean distance value.
When the similarity calculated by the proposed method was compared with the similarity calculated by an existing similarity calculation method, the proposed method's value was 5-10% higher on average. The proposed method takes a long time to produce results because it analyzes each sample, visualizes it, and calculates the similarity of the visualized samples. It therefore takes a lot of time to analyze a huge number of malicious codes. Follow-up studies should make it possible to analyze large amounts of malware, and improvements are needed to study how accuracy varies with the size of the data set
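The image-and-distance idea in this abstract can be illustrated with a minimal Python sketch. The abstract does not say how the matrix is filled or how counts map to RGB, so everything specific here is an assumption: the function names `build_image`, `to_rgb`, and `similarity` are hypothetical, the matrix counts transitions between consecutive API categories, and the red/blue color map and the normalization to a 0..1 score are invented for illustration.

```python
import math

N = 25  # number of API categories, as stated in the abstract


def to_rgb(level):
    """Hypothetical color map: frequent transitions are red, rare ones blue."""
    v = round(255 * level)
    return (v, 0, 255 - v)


def build_image(calls):
    """Turn a sequence of API category indices (0..24) into an N x N RGB grid.

    Assumption: cell (a, b) counts how often category a is followed by b.
    """
    counts = [[0] * N for _ in range(N)]
    for a, b in zip(calls, calls[1:]):
        counts[a][b] += 1
    peak = max(max(row) for row in counts) or 1  # avoid division by zero
    return [[to_rgb(c / peak) for c in row] for row in counts]


def similarity(img_a, img_b):
    """Map the Euclidean distance between two images to a 0..1 score."""
    total = 0
    for row_a, row_b in zip(img_a, img_b):
        for pix_a, pix_b in zip(row_a, row_b):
            total += sum((x - y) ** 2 for x, y in zip(pix_a, pix_b))
    # Upper bound on the distance: every channel of every pixel differs by 255.
    max_dist = math.sqrt(N * N * 3 * 255 ** 2)
    return 1 - math.sqrt(total) / max_dist


sample_a = build_image([0, 1, 2, 1])
sample_b = build_image([3, 4, 3, 4])
score = similarity(sample_a, sample_b)
```

Identical call sequences give a score of exactly 1, and the score falls as the two transition matrices diverge.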
Malicious cryptography techniques for unreversable (malicious or not) binaries
Fighting against computer malware requires a mandatory step of reverse
engineering. As soon as the code has been disassembled/decompiled (including a
dynamic analysis step), there is a hope to understand what the malware actually
does and to implement a means of detection. This also applies to the protection
of software whenever one wishes to analyze it. In this paper, we show how to
armour code in such a way that reverse engineering techniques (static and
dynamic) are absolutely impossible by combining malicious cryptography
techniques developed in our laboratory and new types of programming (k-ary
codes). Suitable encryption algorithms, combined with new cryptanalytic
approaches to ease the protection of (malicious or not) binaries, make it
possible to provide both total code armouring and large scale polymorphic
features at the same time. A simple 400 Kb of executable code is enough to
produce a binary code
and around mutated forms natively while going far beyond the old
concept of decryptor. Comment: 17 pages, 2 figures, accepted for presentation at H2HC'1
PowerDrive: Accurate De-Obfuscation and Analysis of PowerShell Malware
PowerShell is nowadays a widely-used technology to administrate and manage
Windows-based operating systems. However, it is also extensively used by
malware vectors to execute payloads or drop additional malicious contents.
As with other scripting languages used by malware, PowerShell attacks are
challenging to analyze due to the extensive use of multiple obfuscation layers,
which make the real malicious code hard to unveil. To the best of our
knowledge, a comprehensive solution for properly de-obfuscating such attacks is
currently missing. In this paper, we present PowerDrive, an open-source, static
and dynamic multi-stage de-obfuscator for PowerShell attacks. PowerDrive
instruments the PowerShell code to progressively de-obfuscate it by showing the
analyst the employed obfuscation steps. We used PowerDrive to successfully
analyze thousands of PowerShell attacks extracted from various malware vectors
and executables. The attained results show interesting patterns used by
attackers to devise their malicious scripts. Moreover, we provide a taxonomy of
behavioral models adopted by the analyzed codes and a comprehensive list of the
malicious domains contacted during the analysis
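To make "multiple obfuscation layers" concrete, here is a minimal Python sketch of peeling one common layer type: Base64 over UTF-16LE text, the encoding PowerShell's `-EncodedCommand` option expects. This is not PowerDrive's implementation, which the abstract describes only at a high level. The helper names `encode_layer` and `peel_layers` are hypothetical, and real samples mix many other obfuscation techniques that this sketch ignores.

```python
import base64
import binascii


def encode_layer(text):
    """Wrap text in one Base64/UTF-16LE layer (used here only to build a demo)."""
    return base64.b64encode(text.encode("utf-16-le")).decode("ascii")


def peel_layers(payload, max_layers=10):
    """Return every intermediate form, from the outermost layer to the innermost.

    Stops as soon as the current text is no longer valid Base64 over UTF-16LE.
    """
    layers = [payload]
    for _ in range(max_layers):
        try:
            raw = base64.b64decode(layers[-1], validate=True)
            decoded = raw.decode("utf-16-le")
        except (binascii.Error, UnicodeDecodeError, ValueError):
            break  # not another encoded layer; assume we reached the script
        layers.append(decoded)
    return layers


inner = "Write-Host hi"
wrapped = encode_layer(encode_layer(inner))
steps = peel_layers(wrapped)
```

Keeping each intermediate form, rather than only the final script, mirrors the abstract's point about showing the analyst the obfuscation steps that were applied.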
Analyzing Dynamic Code: A Sound Abstract Interpreter for Evil Eval
Dynamic languages, such as JavaScript, employ string-to-code primitives to turn dynamically generated text into executable code at run-time. These features make standard static analysis extremely hard if not impossible, because its essential data structures, i.e., the control-flow graph and the system of recursive equations associated with the program to analyze, are themselves dynamically mutating objects. Nevertheless, assembling code at run-time by manipulating strings, such as by eval in JavaScript, has always been strongly discouraged, since it is often recognized that "eval is evil," leading static analyzers to ignore such statements or their effects. Unfortunately, the lack of formal approaches to analyzing string-to-code statements provides a perfect habitat for malicious code, which is surely evil and does not respect good-practice rules: it can hide malicious intent in strings to be converted to code, making static analyses blind to the real malicious aim of the code. Hence, the need to handle string-to-code statements by approximating what they can execute, thereby allowing the analysis to continue (even in the presence of dynamically generated program statements) with an acceptable degree of precision, should be clear. To reach this goal, we propose a static analysis allowing us to collect string values and to soundly over-approximate and analyze the code potentially executed by a string-to-code statement
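A toy Python sketch can show what "collecting string values" and "soundly over-approximating" the argument of eval means. It is far simpler than the analysis the abstract proposes: the abstract domain assumed here is just a finite set of strings with a `TOP` element for "any string", and the names `const`, `join`, and `concat` are hypothetical.

```python
from itertools import product

TOP = None  # abstract value meaning "could be any string"


def const(s):
    """Abstract value of a string literal."""
    return frozenset({s})


def join(a, b):
    """Merge the values reaching a point from two control-flow branches."""
    if a is TOP or b is TOP:
        return TOP
    return a | b


def concat(a, b):
    """Abstract string concatenation: every left value with every right value."""
    if a is TOP or b is TOP:
        return TOP
    return frozenset(x + y for x, y in product(a, b))


# Models the JavaScript fragment:
#   f = cond ? "alert" : "log";  eval(f + "(1)");
f = join(const("alert"), const("log"))
eval_arg = concat(f, const("(1)"))
```

Here `eval_arg` contains both `alert(1)` and `log(1)`. The result is sound because every string the program could pass to eval is in the set, and an analyzer can then parse and analyze each candidate instead of skipping the eval statement.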
Classifying malicious windows executables using anomaly based detection
A malicious executable is broadly defined as any program or piece of code designed to cause damage to a system or the information it contains, or to prevent the system from being used in a normal manner. A generic term used to describe any kind of malicious software is Malware, which includes Viruses, Worms, Trojans, Backdoors, Root-kits, Spyware and Exploits. Anomaly detection is a technique which builds a statistical profile of the normal and malicious data and classifies unseen data based on these two profiles.
A detection system is presented here which is anomaly based and focuses on the Windows® platform. Several file infection techniques were studied to understand which particular features in the executable binary are more susceptible to being used for malicious code propagation. A framework is presented for collecting data for both static (non-execution based) and dynamic (execution based) analysis of the malicious executables. Two specific features are extracted using static analysis and explained in detail: Windows API calls (from the Import Address Table of the Portable Executable Header) and the hex byte frequency count (collected using the Hexdump utility). The dynamic analysis features that were extracted are briefly mentioned, and the major challenges faced in using this data are explained. Classification results using Support Vector Machines for anomaly detection are shown for the two static analysis features. Experimental results show classification accuracy of up to 94% for new, previously unseen executables
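The hex byte frequency feature can be sketched in a few lines of Python. The abstract says the counts were collected with the Hexdump utility; computing them directly from the file's bytes, as done here, is a stand-in. The function name `byte_frequencies` and the normalization to relative frequencies are assumptions.

```python
from collections import Counter


def byte_frequencies(data):
    """Return a 256-element vector: the share of each byte value in `data`."""
    counts = Counter(data)  # iterating over bytes yields integers 0..255
    total = len(data) or 1  # avoid division by zero for empty input
    return [counts.get(value, 0) / total for value in range(256)]


# Tiny demo on the first bytes of a made-up buffer starting with "MZ".
features = byte_frequencies(b"MZ\x00\x00")
```

A fixed-length vector like this can be fed to a classifier such as a Support Vector Machine, regardless of the size of the executable.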
Malicious Source Code Detection Using Transformer
Open source code is considered a common practice in modern software
development. However, reusing other code allows bad actors to access a wide
developers' community, hence the products that rely on it. Those attacks are
categorized as supply chain attacks. Recent years saw a growing number of
supply chain attacks that leverage open source during software development,
relaying the download and installation procedures, whether automatic or manual.
Over the years, many approaches have been invented for detecting vulnerable
packages. However, it is uncommon to detect malicious code within packages.
Those detection approaches can be broadly categorized as analyses that use
(dynamic) and do not use (static) code execution. Here, we introduce Malicious
Source code Detection using Transformers (MSDT) algorithm. MSDT is a novel
static analysis based on a deep learning method that detects real-world code
injection cases to source code packages. In this study, we used MSDT and a
dataset with over 600,000 different functions to embed various functions and
applied a clustering algorithm to the resulting vectors, detecting the
malicious functions by detecting the outliers. We evaluated MSDT's performance
by conducting extensive experiments and demonstrated that our algorithm is
capable of detecting functions that were injected with malicious code with
precision@k values of up to 0.909
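The evaluation idea (rank functions by how much of an outlier their embedding is, then measure precision@k) can be sketched in Python. The abstract does not name the clustering algorithm, so distance from the centroid of all embeddings is used here as a simple stand-in outlier score. The names `rank_by_outlier_score` and `precision_at_k` and the toy data are hypothetical.

```python
import math


def rank_by_outlier_score(vectors):
    """Return indices of `vectors`, most outlying first.

    Assumption: the outlier score is the distance to the overall centroid.
    """
    dim = len(vectors[0])
    centroid = [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]
    scores = [math.dist(v, centroid) for v in vectors]
    return sorted(range(len(vectors)), key=lambda i: scores[i], reverse=True)


def precision_at_k(ranked_labels, k):
    """Fraction of the top-k ranked items that are truly malicious."""
    if k <= 0:
        raise ValueError("k must be positive")
    return sum(ranked_labels[:k]) / k


# Toy embeddings: three similar benign functions and one injected function.
embeddings = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0)]
is_malicious = [False, False, False, True]

order = rank_by_outlier_score(embeddings)
ranked = [is_malicious[i] for i in order]
p_at_1 = precision_at_k(ranked, 1)
p_at_2 = precision_at_k(ranked, 2)
```

In this toy data the injected function ranks first, so precision@1 is 1.0 and precision@2 is 0.5.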