400 research outputs found

    Analysis and Concealment of Malware in an Adversarial Environment

    Get PDF
    Nowadays, users and devices are rapidly growing, and there is a massive migration of data and infrastructure from physical systems to virtual ones. Moreover, people are always connected and deeply dependent on information and communications. Thanks to the massive growth of Internet of Things applications, this phenomenon also affects everyday objects such as home appliances and vehicles. This extensive interconnection implies a significant rate of potential security threats for systems, devices, and virtual identities. For this reason, malware detection and analysis is one of the most critical security topics. The used detection strategies are well suited to analyze and respond to potential threats, but they are vulnerable and can be bypassed under specific conditions. In light of this scenario, this thesis highlights the existent detection strategies and how it is possible to deceive them using malicious contents concealment strategies, such as code obfuscation and adversarial attacks. Moreover, the ultimate goal is to explore new viable ways to detect and analyze embedded malware and study the feasibility of generating adversarial attacks. In line with these two goals, in this thesis, I present two research contributions. The first one proposes a new viable way to detect and analyze the malicious contents inside Microsoft Office documents (even when concealed). The second one proposes a study about the feasibility of generating Android malicious applications capable of bypassing a real-world detection system. Firstly, I present Oblivion, a static and dynamic system for large-scale analysis of Office documents with embedded (and most of the time concealed) malicious contents. Oblivion performs instrumentation of the code and executes the Office documents in a virtualized environment to de-obfuscate and reconstruct their behavior. In particular, Oblivion can systematically extract embedded PowerShell and non-PowerShell attacks and reconstruct the employed obfuscation strategies. This research work aims to provide a scalable system that allows analysts to go beyond simple malware detection by performing a real, in-depth inspection of macros. To evaluate the system, a large-scale analysis of more than 40,000 Office documents has been performed. The attained results show that Oblivion can efficiently de-obfuscate malicious macro-files by revealing a large corpus of PowerShell and non-PowerShell attacks in a short amount of time. Then, the focus is on presenting an Android adversarial attack framework. This research work aims to understand the feasibility of generating adversarial samples specifically through the injection of Android system API calls only. In particular, the constraints necessary to generate actual adversarial samples are discussed. To evaluate the system, I employ an interpretability technique to assess the impact of specific API calls on the evasion. It is also assessed the vulnerability of the used detection system against mimicry and random noise attacks. Finally, it is proposed a basic implementation to generate concrete and working adversarial samples. The attained results suggest that injecting system API calls could be a viable strategy for attackers to generate concrete adversarial samples. This thesis aims to improve the security landscape in both the research and industrial world by exploring a hot security topic and proposing two novel research works about embedded malware. The main conclusion of this research experience is that systems and devices can be secured with the most robust security processes. At the same time, it is fundamental to improve user awareness and education in detecting and preventing possible attempts of malicious infections

    On the Feasibility of Malware Authorship Attribution

    Full text link
    There are many occasions in which the security community is interested to discover the authorship of malware binaries, either for digital forensics analysis of malware corpora or for thwarting live threats of malware invasion. Such a discovery of authorship might be possible due to stylistic features inherent to software codes written by human programmers. Existing studies of authorship attribution of general purpose software mainly focus on source code, which is typically based on the style of programs and environment. However, those features critically depend on the availability of the program source code, which is usually not the case when dealing with malware binaries. Such program binaries often do not retain many semantic or stylistic features due to the compilation process. Therefore, authorship attribution in the domain of malware binaries based on features and styles that will survive the compilation process is challenging. This paper provides the state of the art in this literature. Further, we analyze the features involved in those techniques. By using a case study, we identify features that can survive the compilation process. Finally, we analyze existing works on binary authorship attribution and study their applicability to real malware binaries.Comment: FPS 201

    Adversarial Machine Learning for the Protection of Legitimate Software

    Get PDF
    Obfuscation is the transforming a given program into one that is syntactically different but semantically equivalent. This new obfuscated program now has its code and/or data changed so that they are hidden and difficult for attackers to understand. Obfuscation is an important security tool and used to defend against reverse engineering. When applied to a program, different transformations can be observed to exhibit differing degrees of complexity and changes to the program. Recent work has shown, by studying these side effects, one can associate patterns with different transformations. By taking this into account and attempting to profile these unique side effects, it is possible to create a classifier using machine learning which can analyze transformed software and identifies what transformation was used to put it in its current state. This has the effect of weakening the security of obfuscating transformations used to protect legitimate software. In this research, we explore options to increase the robustness of obfuscation against attackers who utilize machine learning, particular those who use it to identify the type of obfuscation being employed. To accomplish this, we segment our research into three stages. For the first stage, we implement a suite of classifiers that are used to xiv identify the obfuscation used in samples. These establish a baseline for determining the effectiveness of our proposed defenses and make use of three varied feature sets. For the second stage, we explore methods to evade detection by the classifiers. To accomplish this, attacks setup using the principles of adversarial machine learning are carried out as evasion attacks. These attacks take an obfuscated program and make subtle changes to various aspects that will cause it to be mislabeled by the classifiers. The changes made to the programs affect features looked at by our classifiers, focusing mainly on the number and distribution of opcodes within the program. A constraint of these changes is that the program remains semantically unchanged. In addition, we explore a means of algorithmic dead code insertion in to achieve comparable results against a broader range of classifiers. In the third stage, we combine our attack strategies and evaluate the effect of our changes on the strength of obfuscating transformations. We also propose a framework to implement and automate these and other measures. We the following contributions: 1. An evaluation of the effectiveness of supervised learning models at labeling obfuscated transformations. We create these models using three unique feature sets: Code Images, Opcode N-grams, and Gadgets. 2. Demonstration of two approaches to algorithmic dummy code insertion designed to improve the stealth of obfuscating transformations against machine learning: Adversarial Obfuscation and Opcode Expansion 3. A unified version of our two defenses capable of achieving effectiveness against a broad range of classifiers, while also demonstrating its impact on obfuscation metrics

    Design of secure and robust cognitive system for malware detection

    Full text link
    Machine learning based malware detection techniques rely on grayscale images of malware and tends to classify malware based on the distribution of textures in graycale images. Albeit the advancement and promising results shown by machine learning techniques, attackers can exploit the vulnerabilities by generating adversarial samples. Adversarial samples are generated by intelligently crafting and adding perturbations to the input samples. There exists majority of the software based adversarial attacks and defenses. To defend against the adversaries, the existing malware detection based on machine learning and grayscale images needs a preprocessing for the adversarial data. This can cause an additional overhead and can prolong the real-time malware detection. So, as an alternative to this, we explore RRAM (Resistive Random Access Memory) based defense against adversaries. Therefore, the aim of this thesis is to address the above mentioned critical system security issues. The above mentioned challenges are addressed by demonstrating proposed techniques to design a secure and robust cognitive system. First, a novel technique to detect stealthy malware is proposed. The technique uses malware binary images and then extract different features from the same and then employ different ML-classifiers on the dataset thus obtained. Results demonstrate that this technique is successful in differentiating classes of malware based on the features extracted. Secondly, I demonstrate the effects of adversarial attacks on a reconfigurable RRAM-neuromorphic architecture with different learning algorithms and device characteristics. I also propose an integrated solution for mitigating the effects of the adversarial attack using the reconfigurable RRAM architecture.Comment: arXiv admin note: substantial text overlap with arXiv:2104.0665

    A Deep-dive into Cryptojacking Malware: From an Empirical Analysis to a Detection Method for Computationally Weak Devices

    Get PDF
    Cryptojacking is an act of using a victim\u27s computation power without his/her consent. Unauthorized mining costs extra electricity consumption and decreases the victim host\u27s computational efficiency dramatically. In this thesis, we perform an extensive research on cryptojacking malware from every aspects. First, we present a systematic overview of cryptojacking malware based on the information obtained from the combination of academic research papers, two large cryptojacking datasets of samples, and numerous major attack instances. Second, we created a dataset of 6269 websites containing cryptomining scripts in their source codes to characterize the in-browser cryptomining ecosystem by differentiating permissioned and permissionless cryptomining samples. Third, we introduce an accurate and efficient IoT cryptojacking detection mechanism based on network traffic features that achieves an accuracy of 99%. Finally, we believe this thesis will greatly expand the scope of research and facilitate other novel solutions in the cryptojacking domain

    Evaluation Methodologies in Software Protection Research

    Full text link
    Man-at-the-end (MATE) attackers have full control over the system on which the attacked software runs, and try to break the confidentiality or integrity of assets embedded in the software. Both companies and malware authors want to prevent such attacks. This has driven an arms race between attackers and defenders, resulting in a plethora of different protection and analysis methods. However, it remains difficult to measure the strength of protections because MATE attackers can reach their goals in many different ways and a universally accepted evaluation methodology does not exist. This survey systematically reviews the evaluation methodologies of papers on obfuscation, a major class of protections against MATE attacks. For 572 papers, we collected 113 aspects of their evaluation methodologies, ranging from sample set types and sizes, over sample treatment, to performed measurements. We provide detailed insights into how the academic state of the art evaluates both the protections and analyses thereon. In summary, there is a clear need for better evaluation methodologies. We identify nine challenges for software protection evaluations, which represent threats to the validity, reproducibility, and interpretation of research results in the context of MATE attacks

    Detecting derivative malware samples using deobfuscation-assisted similarity analysis

    Get PDF
    The overwhelming popularity of PHP as a hosting platform has made it the language of choice for developers of Remote Access Trojans (RATs or web shells) and other malicious software. These shells are typically used to compromise and monetise web platforms by providing the attacker with basic remote access to the system, including _le transfer, command execution, network reconnaissance, and database connectivity. Once infected, compromised systems can be used to defraud users by hosting phishing sites, performing Distributed Denial of Service attacks, or serving as anonymous platforms for sending spam or other malfeasance. The vast majority of these threats are largely derivative, incorporating core capabilities found in more established RATs such as c99 and r57. Authors of malicious software routinely produce new shell variants by modifying the behaviours of these ubiquitous RATs, either to add desired functionality or to avoid detection by signature-based detection systems. Once these modified shells are eventually identified (or additional functionality is required), the process of shell adaptation begins again. The end result of this iterative process is a web of separate but related shell variants, many of which are at least partially derived from one of the more popular and influential RATs. In response to the problem outlined above, the author set out to design and implement a system capable of circumventing common obfuscation techniques and identifying derivative malware samples in a given collection. To begin with, a decoder component was developed to syntactically deobfuscate and normalise PHP code by detecting and reversing idiomatic obfuscation constructs, and to apply uniform formatting conventions to all system inputs. A unified malware analysis framework, called Viper, was then extended to create a modular similarity analysis system comprised of individual feature extraction modules, modules responsible for batch processing, a matrix module for comparing sample features, and two visualisation modules capable of generating visual representations of shell similarity. The principal conclusion of the research was that the deobfuscation performed by the decoder component prior to analysis dramatically improved the observed levels of similarity between test samples. This in turn allowed the modular similarity analysis system to identify derivative clusters (or families) within a large collection of shells more accurately. Techniques for isolating and re-rendering these clusters were also developed and demonstrated to be effective at increasing the amount of detail available for evaluating the relative magnitudes of the relationships within each cluster

    Analysis avoidance techniques of malicious software

    Get PDF
    Anti Virus (AV) software generally employs signature matching and heuristics to detect the presence of malicious software (malware). The generation of signatures and determination of heuristics is dependent upon an AV analyst having successfully determined the nature of the malware, not only for recognition purposes, but also for the determination of infected files and startup mechanisms that need to be removed as part of the disinfection process. If a specimen of malware has not been previously extensively analyzed, it is unlikely to be detected by AV software. In addition, malware is becoming increasingly profit driven and more likely to incorporate stealth and deception techniques to avoid detection and analysis to remain on infected systems for a myriad of nefarious purposes. Malware extends beyond the commonly thought of virus or worm, to customized malware that has been developed for specific and targeted miscreant purposes. Such customized malware is highly unlikely to be detected by AV software because it will not have been previously analyzed and a signature will not exist. Analysis in such a case will have to be conducted by a digital forensics analyst to determine the functionality of the malware. Malware can employ a plethora of techniques to hinder the analysis process conducted by AV and digital forensics analysts. The purpose of this research has been to answer three research questions directly related to the employment of these techniques as: 1. What techniques can malware use to avoid being analyzed? 2. How can the use of these techniques be detected? 3. How can the use of these techniques be mitigated
    corecore