50 research outputs found

    CryptoKnight:generating and modelling compiled cryptographic primitives

    Get PDF
    Cryptovirological augmentations present an immediate, incomparable threat. Over the last decade, the substantial proliferation of crypto-ransomware has had widespread consequences for consumers and organisations alike. Established preventive measures perform well, however, the problem has not ceased. Reverse engineering potentially malicious software is a cumbersome task due to platform eccentricities and obfuscated transmutation mechanisms, hence requiring smarter, more efficient detection strategies. The following manuscript presents a novel approach for the classification of cryptographic primitives in compiled binary executables using deep learning. The model blueprint, a Dynamic Convolutional Neural Network (DCNN), is fittingly configured to learn from variable-length control flow diagnostics output from a dynamic trace. To rival the size and variability of equivalent datasets, and to adequately train our model without risking adverse exposure, a methodology for the procedural generation of synthetic cryptographic binaries is defined, using core primitives from OpenSSL with multivariate obfuscation, to draw a vastly scalable distribution. The library, CryptoKnight, rendered an algorithmic pool of AES, RC4, Blowfish, MD5 and RSA to synthesise combinable variants which automatically fed into its core model. Converging at 96% accuracy, CryptoKnight was successfully able to classify the sample pool with minimal loss and correctly identified the algorithm in a real-world crypto-ransomware applicatio

    Understanding Android Obfuscation Techniques: A Large-Scale Investigation in the Wild

    Get PDF
    In this paper, we seek to better understand Android obfuscation and depict a holistic view of the usage of obfuscation through a large-scale investigation in the wild. In particular, we focus on four popular obfuscation approaches: identifier renaming, string encryption, Java reflection, and packing. To obtain the meaningful statistical results, we designed efficient and lightweight detection models for each obfuscation technique and applied them to our massive APK datasets (collected from Google Play, multiple third-party markets, and malware databases). We have learned several interesting facts from the result. For example, malware authors use string encryption more frequently, and more apps on third-party markets than Google Play are packed. We are also interested in the explanation of each finding. Therefore we carry out in-depth code analysis on some Android apps after sampling. We believe our study will help developers select the most suitable obfuscation approach, and in the meantime help researchers improve code analysis systems in the right direction

    Catching modern botnets using active integrated evidential reasoning

    Full text link

    DAG-Based Attack and Defense Modeling: Don't Miss the Forest for the Attack Trees

    Full text link
    This paper presents the current state of the art on attack and defense modeling approaches that are based on directed acyclic graphs (DAGs). DAGs allow for a hierarchical decomposition of complex scenarios into simple, easily understandable and quantifiable actions. Methods based on threat trees and Bayesian networks are two well-known approaches to security modeling. However there exist more than 30 DAG-based methodologies, each having different features and goals. The objective of this survey is to present a complete overview of graphical attack and defense modeling techniques based on DAGs. This consists of summarizing the existing methodologies, comparing their features and proposing a taxonomy of the described formalisms. This article also supports the selection of an adequate modeling technique depending on user requirements

    Rethinking Privacy and Security Mechanisms in Online Social Networks

    Get PDF
    With billions of users, Online Social Networks(OSNs) are amongst the largest scale communication applications on the Internet. OSNs enable users to easily access news from local and worldwide, as well as share information publicly and interact with friends. On the negative side, OSNs are also abused by spammers to distribute ads or malicious information, such as scams, fraud, and even manipulate public political opinions. Having achieved significant commercial success with large amount of user information, OSNs do treat the security and privacy of their users seriously and provide several mechanisms to reinforce their account security and information privacy. However, the efficacy of those measures is either not thoroughly validated or in need to be improved. In sight of cyber criminals and potential privacy threats on OSNs, we focus on the evaluations and improvements of OSN user privacy configurations, account security protection mechanisms, and trending topic security in this dissertation. We first examine the effectiveness of OSN privacy settings on protecting user privacy. Given each privacy configuration, we propose a corresponding scheme to reveal the target user\u27s basic profile and connection information starting from some leaked connections on the user\u27s homepage. Based on the dataset we collected on Facebook, we calculate the privacy exposure in each privacy setting type and measure the accuracy of our privacy inference schemes with different amount of public information. The evaluation results show that (1) a user\u27s private basic profile can be inferred with high accuracy and (2) connections can be revealed in a significant portion based on even a small number of directly leaked connections. Secondly, we propose a behavioral-profile-based method to detect OSN user account compromisation in a timely manner. Specifically, we propose eight behavioral features to portray a user\u27s social behavior. A user\u27s statistical distributions of those feature values comprise its behavioral profile. Based on the sample data we collected from Facebook, we observe that each user\u27s activities are highly likely to conform to its behavioral profile while two different user\u27s profile tend to diverge from each other, which can be employed for compromisation detection. The evaluation result shows that the more complete and accurate a user\u27s behavioral profile can be built the more accurately compromisation can be detected. Finally, we investigate the manipulation of OSN trending topics. Based on the dataset we collected from Twitter, we manifest the manipulation of trending and a suspect spamming infrastructure. We then measure how accurately the five factors (popularity, coverage, transmission, potential coverage, and reputation) can predict trending using an SVM classifier. We further study the interaction patterns between authenticated accounts and malicious accounts in trending. at last we demonstrate the threats of compromised accounts and sybil accounts to trending through simulation and discuss countermeasures against trending manipulation

    Using Malware Analysis to Evaluate Botnet Resilience

    Get PDF
    Bos, H.J. [Promotor]Steen, M.R. van [Promotor

    Machine Learning and Security of Non-Executable Files

    Get PDF
    Computer malware is a well-known threat in security which, despite the enormous time and effort invested in fighting it, is today more prevalent than ever. Recent years have brought a surge in one particular type: malware embedded in non-executable file formats, e.g., PDF, SWF and various office file formats. The result has been a massive number of infections, owed primarily to the trust that ordinary computer users have in these file formats. In addition, their feature-richness and implementation complexity have created enormous attack surfaces in widely deployed client software, resulting in regular discoveries of new vulnerabilities. The traditional approach to malware detection – signature matching, heuristics and behavioral profiling – has from its inception been a labor-intensive manual task, always lagging one step behind the attacker. With the exponential growth of computers and networks, malware has become more diverse, wide-spread and adaptive than ever, scaling much faster than the available talent pool of human malware analysts. An automated and scalable approach is needed to fill the gap between automated malware adaptation and manual malware detection, and machine learning is emerging as a viable solution. Its branch called adversarial machine learning studies the security of machine learning algorithms and the special conditions that arise when machine learning is applied for security. This thesis is a study of adversarial machine learning in the context of static detection of malware in non-executable file formats. It evaluates the effectiveness, efficiency and security of machine learning applications in this context. To this end, it introduces 3 data-driven detection methods developed using very large, high quality datasets. PJScan detects malicious PDF files based on lexical properties of embedded JavaScript code and is the fastest method published to date. SL2013 extends its coverage to all PDF files, regardless of JavaScript presence, by analyzing the hierarchical structure of PDF logical building blocks and demonstrates excellent performance in a novel long-term realistic experiment. Finally, Hidost generalizes the hierarchical-structure-based feature set to become the first machine-learning-based malware detector operating on multiple file formats. In a comprehensive experimental evaluation on PDF and SWF, it outperforms other academic methods and commercial antivirus systems in detection effectiveness. Furthermore, the thesis presents a framework for security evaluation of machine learning classifiers in a case study performed on an independent PDF malware detector. The results show that the ability to manipulate a part of the classifier’s feature set allows a malicious adversary to disguise malware so that it appears benign to the classifier with a high success rate. The presented methods are released as open-source software.Schadsoftware ist eine gut bekannte Sicherheitsbedrohung. Trotz der enormen Zeit und des Aufwands die investiert werden, um sie zu beseitigen, ist sie heute weiter verbreitet als je zuvor. In den letzten Jahren kam es zu einem starken Anstieg von Schadsoftware, welche in nicht-ausführbaren Dateiformaten, wie PDF, SWF und diversen Office-Formaten, eingebettet ist. Die Folge war eine massive Anzahl von Infektionen, ermöglicht durch das Vertrauen, das normale Rechnerbenutzer in diese Dateiformate haben. Außerdem hat die Komplexität und Vielseitigkeit dieser Dateiformate große Angriffsflächen in weitverbreiteter Klient-Software verursacht, und neue Sicherheitslücken werden regelmäßig entdeckt. Der traditionelle Ansatz zur Erkennung von Schadsoftware – Mustererkennung, Heuristiken und Verhaltensanalyse – war vom Anfang an eine äußerst mühevolle Handarbeit, immer einen Schritt hinter den Angreifern zurück. Mit dem exponentiellen Wachstum von Rechenleistung und Netzwerkgeschwindigkeit ist Schadsoftware diverser, zahlreicher und schneller-anpassend geworden als je zuvor, doch die Verfügbarkeit von menschlichen Schadsoftware-Analysten kann nicht so schnell skalieren. Ein automatischer und skalierbarer Ansatz ist gefragt, und maschinelles Lernen tritt als eine brauchbare Lösung hervor. Ein Bereich davon, Adversarial Machine Learning, untersucht die Sicherheit von maschinellen Lernverfahren und die besonderen Verhältnisse, die bei der Anwendung von machinellem Lernen für Sicherheit entstehen. Diese Arbeit ist eine Studie von Adversarial Machine Learning im Kontext statischer Schadsoftware-Erkennung in nicht-ausführbaren Dateiformaten. Sie evaluiert die Wirksamkeit, Leistungsfähigkeit und Sicherheit von maschinellem Lernen in diesem Kontext. Zu diesem Zweck stellt sie 3 datengesteuerte Erkennungsmethoden vor, die alle auf sehr großen und diversen Datensätzen entwickelt wurden. PJScan erkennt bösartige PDF-Dateien anhand lexikalischer Eigenschaften von eingebettetem JavaScript-Code und ist die schnellste bisher veröffentliche Methode. SL2013 erweitert die Erkennung auf alle PDF-Dateien, unabhängig davon, ob sie JavaScript enthalten, indem es die hierarchische Struktur von logischen PDF-Bausteinen analysiert. Es zeigt hervorragende Leistung in einem neuen, langfristigen und realistischen Experiment. Schließlich generalisiert Hidost den auf hierarchischen Strukturen basierten Merkmalsraum und wurde zum ersten auf maschinellem Lernen basierten Schadsoftware-Erkennungssystem, das auf mehreren Dateiformaten anwendbar ist. In einer umfassenden experimentellen Evaulierung auf PDF- und SWF-Formaten schlägt es andere akademische Methoden und kommerzielle Antiviren-Lösungen bezüglich Erkennungswirksamkeit. Überdies stellt diese Doktorarbeit ein Framework für Sicherheits-Evaluierung von auf machinellem Lernen basierten Klassifikatoren vor und wendet es in einer Fallstudie auf eine unabhängige akademische Schadsoftware-Erkennungsmethode an. Die Ergebnisse zeigen, dass die Fähigkeit, nur einen Teil von Features, die ein Klasifikator verwendet, zu manipulieren, einem Angreifer ermöglicht, Schadsoftware in Dateien so einzubetten, dass sie von der Erkennungsmethode mit hoher Erfolgsrate als gutartig fehlklassifiziert wird. Die vorgestellten Methoden wurden als Open-Source-Software veröffentlicht
    corecore