6 research outputs found

    Studying JavaScript Security Through Static Analysis

    As the Internet keeps on growing, so does the interest of malicious actors. While the Internet has become widespread and popular to interconnect billions of people, this interconnectivity also simplifies the spread of malicious software. Specifically, JavaScript has become a popular attack vector, as it enables attackers to stealthily exploit bugs and further vulnerabilities to compromise the security and privacy of Internet users.
In this thesis, we approach these issues by proposing several systems to statically analyze real-world JavaScript code at scale. First, we focus on the detection of malicious JavaScript samples. To this end, we propose two learning-based pipelines, which leverage syntactic, control-flow, and data-flow based features to distinguish benign from malicious inputs. Subsequently, we evaluate the robustness of such static malicious JavaScript detectors in an adversarial setting. For this purpose, we introduce a generic camouflage attack, which consists of rewriting malicious samples to reproduce existing benign syntactic structures. Finally, we consider vulnerable browser extensions. In particular, we abstract an extension's source code at the semantic level, including control, data, and message flows, and pointer analysis, to detect suspicious data flows from and toward an extension's privileged context. Overall, we report on 184 Chrome extensions that attackers could exploit to, e.g., execute arbitrary code in a victim's browser.
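The kind of static feature extraction such a learning-based pipeline builds on can be sketched as follows. This is a minimal illustration only: a regex tokenizer stands in for a real JavaScript parser, and the token-bigram features are a simplified stand-in for the syntactic, control-flow, and data-flow features the thesis describes.

```python
import re
from collections import Counter

# Simplified lexer; a real detector would walk a proper AST instead.
TOKEN_RE = re.compile(r"[A-Za-z_$][\w$]*|\d+|[{}()\[\];,.=+\-*/<>!&|]")

def lexical_features(js_source: str) -> Counter:
    """Map a script to frequencies of consecutive token pairs (bigrams)."""
    tokens = TOKEN_RE.findall(js_source)
    return Counter(zip(tokens, tokens[1:]))

# Two toy inputs; a classifier would be trained on many such vectors.
benign = lexical_features("var x = 1; console.log(x);")
suspicious = lexical_features("eval(unescape('%65%76%69%6c'));")
print(benign != suspicious)  # the two scripts yield different feature vectors
```

A trained model (e.g., a random forest over such vectors) would then score unseen scripts; the camouflage attack mentioned above works precisely by rewriting a malicious sample until its feature vector matches a benign one.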

    Evaluation Methodologies in Software Protection Research

    Man-at-the-end (MATE) attackers have full control over the system on which the attacked software runs, and try to break the confidentiality or integrity of assets embedded in the software. Both companies and malware authors want to prevent such attacks. This has driven an arms race between attackers and defenders, resulting in a plethora of different protection and analysis methods. However, it remains difficult to measure the strength of protections, because MATE attackers can reach their goals in many different ways and a universally accepted evaluation methodology does not exist. This survey systematically reviews the evaluation methodologies of papers on obfuscation, a major class of protections against MATE attacks. For 572 papers, we collected 113 aspects of their evaluation methodologies, ranging from sample set types and sizes, over sample treatment, to performed measurements. We provide detailed insights into how the academic state of the art evaluates both the protections and the analyses performed on them. In summary, there is a clear need for better evaluation methodologies. We identify nine challenges for software protection evaluations, which represent threats to the validity, reproducibility, and interpretation of research results in the context of MATE attacks.

    The Effect of Code Obfuscation on Authorship Attribution of Binary Computer Files

    In many forensic investigations, questions linger regarding the identity of the authors of the software specimen. Research has identified methods for the attribution of binary files that have not been obfuscated, but a significant percentage of malicious software has been obfuscated in an effort to hide both the details of its origin and its true intent. Little research has been done around analyzing obfuscated code for attribution. In part, the reason for this gap in the research is that deobfuscation of an unknown program is a challenging task. Further, the additional transformation of the executable file introduced by the obfuscator modifies or removes features from the original executable that would have been used in the author attribution process. Existing research has demonstrated good success in attributing the authorship of an executable file of unknown provenance using methods based on static analysis of the specimen file. With the addition of file obfuscation, static analysis of files becomes difficult, time-consuming, and, in some cases, may lead to inaccurate findings. This paper presents a novel process for authorship attribution using dynamic analysis methods. A software-emulated system was fully instrumented to become a test harness for a specimen of unknown provenance, allowing for supervised control, monitoring, and trace data collection during execution. This trace data was used as input into a supervised machine learning algorithm trained to identify stylometric differences in the specimen under test and provide predictions on who wrote the specimen. The specimen files were also analyzed for authorship using static analysis methods to compare prediction accuracies with prediction accuracies gathered from this new, dynamic analysis based method. Experiments indicate that this new method can provide better accuracy of author attribution for files of unknown provenance, especially in the case where the specimen file has been obfuscated.
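The trace-based stylometry described above can be illustrated with a small sketch. The API-call traces and the overlap measure here are hypothetical placeholders: a real harness would record traces from an instrumented emulator, and the classifier would be a trained supervised model rather than a simple profile comparison.

```python
from collections import Counter

def trace_profile(api_trace):
    """Frequency profile of consecutive API-call pairs from a dynamic trace."""
    return Counter(zip(api_trace, api_trace[1:]))

def similarity(p, q):
    """Overlap between two profiles: 0.0 = disjoint, 1.0 = identical."""
    shared = sum(min(p[k], q[k]) for k in p)
    total = max(sum(p.values()), sum(q.values()))
    return shared / total if total else 0.0

# Hypothetical traces, standing in for data collected in the test harness.
author_a = trace_profile(
    ["CreateFile", "WriteFile", "CloseHandle", "CreateFile", "WriteFile"])
unknown = trace_profile(["CreateFile", "WriteFile", "CloseHandle"])
print(round(similarity(author_a, unknown), 2))  # → 0.5
```

Obfuscation that rewrites the binary's static structure leaves such runtime call patterns largely intact, which is the intuition behind preferring dynamic features here.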

    Cost-effective Detection of Drive-by-Download Attacks with Hybrid Client Honeypots

    With the increasing connectivity of and reliance on computers and networks, important aspects of computer systems are under a constant threat. In particular, drive-by-download attacks have emerged as a new threat to the integrity of computer systems. Drive-by-download attacks are client-side attacks that originate from web servers that are visited by web browsers. As a vulnerable web browser retrieves a malicious web page, the malicious web server can push malware to a user's machine that can be executed without their notice or consent. The detection of malicious web pages that exist on the Internet is prohibitively expensive. It is estimated that approximately 150 million malicious web pages that launch drive-by-download attacks exist today. So-called high-interaction client honeypots are devices that are able to detect these malicious web pages, but they are slow and known to miss attacks. Detection of malicious web pages in these quantities with client honeypots would cost millions of US dollars. Therefore, we have designed a more scalable system called a hybrid client honeypot. It consists of lightweight client honeypots, the so-called low-interaction client honeypots, and traditional high-interaction client honeypots. The lightweight low-interaction client honeypots inspect web pages at high speed and forward only likely malicious web pages to the high-interaction client honeypot for a final classification. For the comparison of client honeypots and evaluation of the hybrid client honeypot system, we have chosen a cost-based evaluation method: the true positive cost curve (TPCC). It allows us to evaluate client honeypots against their primary purpose of identification of malicious web pages. We show that costs of identifying malicious web pages with the developed hybrid client honeypot systems are reduced by a factor of nine compared to traditional high-interaction client honeypots.
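The two-stage design can be sketched as a filter pipeline. Both classifiers below are hypothetical stand-ins: classify_fast mimics a low-interaction honeypot's cheap static check, and classify_full mimics the expensive visit in an instrumented high-interaction honeypot.

```python
def classify_fast(url):
    # Cheap static heuristic standing in for a low-interaction honeypot.
    return "exploit" in url or "malware" in url

def classify_full(url):
    # Placeholder for visiting the page in an instrumented browser VM.
    return "exploit" in url

def hybrid_scan(urls):
    suspicious = [u for u in urls if classify_fast(u)]   # high-speed triage
    return [u for u in suspicious if classify_full(u)]   # costly confirmation

print(hybrid_scan(["http://a.example/", "http://b.example/exploit.html"]))
# → ['http://b.example/exploit.html']
```

The cost saving comes from the first stage discarding the vast majority of benign pages, so the slow second stage only ever sees a small, pre-filtered fraction of the input.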
The five main contributions of our work are:
High-Interaction Client Honeypot - The first main contribution of our work is the design and implementation of a high-interaction client honeypot, Capture-HPC. It is an open-source, publicly available client honeypot research platform, which allows researchers and security professionals to conduct research on malicious web pages and client honeypots. Based on our client honeypot implementation and analysis of existing client honeypots, we developed a component model of client honeypots. This model allows researchers to agree on the object of study, allows for focus on specific areas within the object of study, and provides a framework for communication of research around client honeypots.
True Positive Cost Curve - As mentioned above, we have chosen a cost-based evaluation method to compare and evaluate client honeypots against their primary purpose of identification of malicious web pages: the true positive cost curve. It takes into account the unique characteristics of client honeypots - speed, detection accuracy, and resource cost - and provides a simple, cost-based mechanism to evaluate and compare client honeypots in an operating environment. As such, the TPCC provides a foundation for improving client honeypot technology. The TPCC is the second main contribution of our work.
Mitigation of Risks to the Experimental Design with HAZOP - Mitigation of risks to internal and external validity of the experimental design using a hazard and operability (HAZOP) study is the third main contribution. This methodology addresses risks to intent (internal validity) as well as generalizability of results beyond the experimental setting (external validity) in a systematic and thorough manner.
Low-Interaction Client Honeypots - Malicious web pages are usually part of a malware distribution network that consists of several servers that are involved as part of the drive-by-download attack. Development and evaluation of classification methods that assess whether a web page is part of a malware distribution network is the fourth main contribution.
Hybrid Client Honeypot System - The fifth main contribution is the hybrid client honeypot system. It incorporates the mentioned classification methods in the form of a low-interaction client honeypot and a high-interaction client honeypot into a hybrid client honeypot system that is capable of identifying malicious web pages in a cost-effective way on a large scale. The hybrid client honeypot system outperforms a high-interaction client honeypot with identical resources and an identical false positive rate.
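The cost advantage behind the factor-of-nine claim can be illustrated with a back-of-the-envelope cost-per-true-positive calculation. Note that this simplified model and all of its parameters (page counts, scan rates, recall values) are illustrative assumptions, not the thesis's actual TPCC definition or measurements.

```python
def cost_per_true_positive(pages, malicious_rate, pages_per_hour,
                           recall, cost_per_hour):
    """Simplified cost model: machine-hour cost divided by detected pages."""
    hours = pages / pages_per_hour
    true_positives = pages * malicious_rate * recall
    return (hours * cost_per_hour) / true_positives

# High-interaction honeypot alone: slow (50 pages/hour), high recall.
high_only = cost_per_true_positive(1_000_000, 0.001, 50, 0.9, 1.0)

# Hybrid: a fast filter (5000 pages/hour) forwards 5% of pages to the
# slow stage; the pipeline loses a little recall along the way.
filter_hours = 1_000_000 / 5000
slow_hours = (1_000_000 * 0.05) / 50
hybrid = (filter_hours + slow_hours) * 1.0 / (1_000_000 * 0.001 * 0.85)

print(high_only > hybrid)  # the hybrid is cheaper per detection
```

In this toy setting the hybrid's cost per true positive is an order of magnitude lower, mirroring the shape (though not the exact numbers) of the result reported above.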

    Formalization and Detection of Host-Based Code Injection Attacks in the Context of Malware

    Host-Based Code Injection Attacks (HBCIAs) are a technique that malicious software utilizes in order to avoid detection or steal sensitive information. In a nutshell, an HBCIA is a local attack where code is injected across process boundaries and executed in the context of a victim process. Malware employs HBCIAs on several operating systems, including Windows, Linux, and macOS. This thesis investigates the topic of HBCIAs in the context of malware. First, we conduct basic research on this topic. We formalize HBCIAs in the context of malware and show in several measurements, amongst others, the high prevalence of HBCIA-utilizing malware. Second, we present Bee Master, a platform-independent approach to dynamically detect HBCIAs. This approach applies the honeypot paradigm to operating system processes. Bee Master deploys fake processes as honeypots, which are attacked by malicious software. We show that Bee Master reliably detects HBCIAs on Windows and Linux. Third, we present Quincy, a machine learning-based system to detect HBCIAs in post-mortem memory dumps. It utilizes up to 38 features, including memory region sparseness, memory region protection, and the occurrence of HBCIA-related strings. We evaluate Quincy against two contemporary detection systems, Malfind and Hollowfind. This evaluation shows that Quincy outperforms them both, increasing detection performance by more than eight percent.
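Three of the Quincy-style feature families named above (sparseness, protection flags, suspicious strings) can be sketched for a single memory region. This is a simplified illustration under assumed inputs; the string list and the region bytes are invented for the example, and a real system would extract regions from an actual memory dump.

```python
# Illustrative indicators only; Quincy's real feature set has up to 38 features.
SUSPICIOUS_STRINGS = [b"VirtualAllocEx", b"WriteProcessMemory"]

def region_features(data: bytes, protection: str) -> dict:
    """Compute simplified per-region features from a memory dump."""
    zero_fraction = data.count(0) / len(data) if data else 1.0
    return {
        "sparseness": zero_fraction,                       # mostly-zero region?
        "rwx": protection == "rwx",                        # writable + executable
        "hbcia_strings": any(s in data for s in SUSPICIOUS_STRINGS),
    }

# A fabricated region: mostly zeros with one injection-related string.
region = b"\x00" * 90 + b"WriteProcessMemory" + b"\x00" * 2
feats = region_features(region, "rwx")
print(feats["rwx"], feats["hbcia_strings"])  # → True True
```

A classifier trained on such per-region vectors can then flag regions likely to hold injected code, which is the post-mortem counterpart to Bee Master's live honeypot processes.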

    Software similarity and classification

    This thesis analyses software programs in the context of their similarity to other software programs. Applications proposed and implemented include detecting malicious software and discovering security vulnerabilities.
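One common way to measure program similarity, which a thesis in this area would likely build on in more refined form, is a set-overlap measure over instruction n-grams. The mnemonic sequences below are hypothetical disassembly output, and the plain Jaccard index is a deliberately simple stand-in for the thesis's actual similarity measures.

```python
def ngrams(seq, n=3):
    """All length-n windows of a sequence, as a set of tuples."""
    return {tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)}

def jaccard(a, b):
    """Jaccard index over instruction n-gram sets (1.0 = identical)."""
    sa, sb = ngrams(a), ngrams(b)
    return len(sa & sb) / len(sa | sb) if sa | sb else 1.0

# Hypothetical mnemonic sequences from two disassembled binaries.
prog1 = ["push", "mov", "call", "test", "jz", "ret"]
prog2 = ["push", "mov", "call", "test", "jnz", "ret"]
print(round(jaccard(prog1, prog2), 2))  # → 0.33
```

Similarity scores like this support both applications mentioned: clustering unknown binaries near known malware families, and locating code that resembles a known-vulnerable function.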