10 research outputs found
Online Accumulation: Reconstruction of Worm Propagation Path
Abstract. Knowledge of the worm origin is necessary to forensic analysis, and knowledge of the initial causal flows supports diagnosis of how network defenses were breached. Fast and accurate online tracing network worm during its propagation, help to detect worm origin and the earliest infected nodes, and is essential for large-scale worm containment. This paper introduces the Accumulation Algorithm which can efficiently tracing worm origin and the initial propagation paths, and presents an improved online Accumulation Algorithm using sliding detection windows. We also analyzes and verifies their detection accuracy and containment efficacy through simulation experiments in large scale network. Results indicate that the online Accumulation Algorithm can accurately tracing worms and efficiently containing their propagation in an approximately real-time manner
Forensic identification and detection of hidden and obfuscated malware
The revolution in online criminal activities and malicious software (malware) has posed a serious challenge in malware forensics. Malicious attacks have become more organized and purposefully directed. With cybercrimes escalating to great heights in quantity as well as in sophistication and stealth, the main challenge is to detect hidden and obfuscated malware. Malware authors use a variety of obfuscation methods and specialized stealth techniques of information hiding to embed malicious code, to infect systems and to thwart any attempt to detect them, specifically with the use of commercially available anti-malware engines. This has led to the situation of zero-day attacks, where malware inflict systems even with existing security measures. The aim of this thesis is to address this situation by proposing a variety of novel digital forensic and data mining techniques to automatically detect hidden and obfuscated malware. Anti-malware engines use signature matching to detect malware where signatures are generated by human experts by disassembling the file and selecting pieces of unique code. Such signature based detection works effectively with known malware but performs poorly with hidden or unknown malware. Code obfuscation techniques, such as packers, polymorphism and metamorphism, are able to fool current detection techniques by modifying the parent code to produce offspring copies resulting in malware that has the same functionality, but with a different structure. These evasion techniques exploit the drawbacks of traditional malware detection methods, which take current malware structure and create a signature for detecting this malware in the future. However, obfuscation techniques aim to reduce vulnerability to any kind of static analysis to the determent of any reverse engineering process. Furthermore, malware can be hidden in file system slack space, inherent in NTFS file system based partitions, resulting in malware detection that even more difficult.Doctor of Philosoph
A structured approach to malware detection and analysis in digital forensics investigation
A thesis submitted to the University of Bedfordshire in partial fulfilment of the requirement for the degree of PhDWithin the World Wide Web (WWW), malware is considered one of the most serious threats to system security with complex system issues caused by malware and spam. Networks and systems can be accessed and compromised by various types of malware, such as viruses, worms, Trojans, botnet and rootkits, which compromise systems through coordinated attacks. Malware often uses anti-forensic techniques to avoid detection and investigation. Moreover, the results of investigating such attacks are often ineffective and can create barriers for obtaining clear evidence due to the lack of sufficient tools and the immaturity of forensics methodology. This research addressed various complexities faced by investigators in the detection and analysis of malware. In this thesis, the author identified the need for a new approach towards malware detection that focuses on a robust framework, and proposed a solution based on an extensive literature review and market research analysis. The literature review focussed on the different trials and techniques in malware detection to identify the parameters for developing a solution design, while market research was carried out to understand the precise nature of the current problem. The author termed the new approaches and development of the new framework the triple-tier centralised online real-time environment (tri-CORE) malware analysis (TCMA). The tiers come from three distinctive phases of detection and analysis where the entire research pattern is divided into three different domains. The tiers are the malware acquisition function, detection and analysis, and the database operational function. This framework design will contribute to the field of computer forensics by making the investigative process more effective and efficient. By integrating a hybrid method for malware detection, associated limitations with both static and dynamic methods are eliminated. This aids forensics experts with carrying out quick, investigatory processes to detect the behaviour of the malware and its related elements. The proposed framework will help to ensure system confidentiality, integrity, availability and accountability. The current research also focussed on a prototype (artefact) that was developed in favour of a different approach in digital forensics and malware detection methods. As such, a new Toolkit was designed and implemented, which is based on a simple architectural structure and built from open source software that can help investigators develop the skills to critically respond to current cyber incidents and analyses
Application of information theory and statistical learning to anomaly detection
In today\u27s highly networked world, computer intrusions and other attacks area constant threat. The detection of such attacks, especially attacks that are new or previously unknown, is important to secure networks and computers. A major focus of current research efforts in this area is on anomaly detection.;In this dissertation, we explore applications of information theory and statistical learning to anomaly detection. Specifically, we look at two difficult detection problems in network and system security, (1) detecting covert channels, and (2) determining if a user is a human or bot. We link both of these problems to entropy, a measure of randomness information content, or complexity, a concept that is central to information theory. The behavior of bots is low in entropy when tasks are rigidly repeated or high in entropy when behavior is pseudo-random. In contrast, human behavior is complex and medium in entropy. Similarly, covert channels either create regularity, resulting in low entropy, or encode extra information, resulting in high entropy. Meanwhile, legitimate traffic is characterized by complex interdependencies and moderate entropy. In addition, we utilize statistical learning algorithms, Bayesian learning, neural networks, and maximum likelihood estimation, in both modeling and detecting of covert channels and bots.;Our results using entropy and statistical learning techniques are excellent. By using entropy to detect covert channels, we detected three different covert timing channels that were not detected by previous detection methods. Then, using entropy and Bayesian learning to detect chat bots, we detected 100% of chat bots with a false positive rate of only 0.05% in over 1400 hours of chat traces. Lastly, using neural networks and the idea of human observational proofs to detect game bots, we detected 99.8% of game bots with no false positives in 95 hours of traces. Our work shows that a combination of entropy measures and statistical learning algorithms is a powerful and highly effective tool for anomaly detection
Recommended from our members
Security awareness of computer users: A game based learning approach
This thesis was submitted for the degree of Doctor of Philosophy and awarded by Brunel University.The research reported in this thesis focuses on developing a framework for game design to protect computer users against phishing attacks. A comprehensive literature review was conducted to understand the research domain, support the proposed research work and identify the research gap to fulfil the contribution to knowledge. Two studies and one theoretical design were carried out to achieve the aim of this research reported in this thesis. A quantitative approach was used in the first study while engaging both quantitative and qualitative approaches in the second study. The first study reported in this thesis was focused to investigate the key elements that should be addressed in the game design framework to avoid phishing attacks. The proposed game design framework was aimed to enhance the user avoidance behaviour through motivation to thwart phishing attack. The results of this study revealed that perceived threat, safeguard effectiveness, safeguard cost, self-efficacy, perceived severity and perceived susceptibility elements should be incorporated into the game design framework for computer users to avoid phishing attacks through their motivation. The theoretical design approach was focused on designing a mobile game to educate computer users against phishing attacks. The elements of the framework were addressed in the mobile game design context. The main objective of the proposed mobile game design was to teach users how to identify phishing website addresses (URLs), which is one of many ways of identifying a phishing attack. The mobile game prototype was developed using MIT App inventor emulator. In the second study, the formulated game design framework was evaluated through the deployed mobile game prototype on a HTC One X touch screen smart phone. Then a discussion is reported in this thesis investigating the effectiveness of the developed mobile game prototype compared to traditional online learning to thwart phishing threats. Finally, the research reported in this thesis found that the mobile game is somewhat effective in enhancing the user’s phishing awareness. It also revealed that the participants who played the mobile game were better able to identify fraudulent websites compared to the participants who read the website without any training. Therefore, the research reported in this thesis determined that perceived threat, safeguard effectiveness, safeguard cost, self-efficacy, perceived threat and perceived susceptibility elements have a significant impact on avoidance behaviour through motivation to thwart phishing attacks as addressed in the game design framework
On countermeasures of worm attacks over the Internet
Worm attacks have always been considered dangerous threats to the Internet since they can
infect a large number of computers and consequently cause large-scale service disruptions and
damage. Thus, research on modeling worm attacks, and defenses against them, have become
vital to the field of computer and network security. This dissertation intends to systematically
study two classes of countermeasures against worm attacks, known as traffic-based
countermeasure and non-traffic based countermeasure. Traffic-based countermeasures are those
whose means are limited to monitoring, collecting, and analyzing the traffic generated by worm
attacks. Non-traffic based countermeasures do not have such limitations.
For the traffic-based countermeasures, we first consider the worm attack that adopts feedback
loop-control mechanisms which make its overall propagation traffic behavior similar to
background non-worm traffic and circumvent the detection. We also develop a novel spectrumbased
scheme to achieve highly effective detection performance against such attacks. We then
consider worm attacks that perform probing traffic in a stealthy manner to obtain the location infrastructure of a defense system and introduce an information-theoretic based framework to
obtain the limitations of such attacks and develop corresponding countermeasures.
For the non-traffic based countermeasures, we first consider new unseen worm attacks and
develop the countermeasure based on mining the dynamic signature of worm programs’ run-time
execution. We then consider a generic worm attack that dynamically changes its propagation
patterns and develops integrated countermeasures based on the attacker’s contradicted
objectives. Lastly, we consider the real-world system setting with multiple incoming worm
attacks that collaborate by sharing the history of their interactions with the defender and develop
a generic countermeasure based on establishing the defender’s reputation of toughness in its
repeated interactions with multiple incoming attackers to optimize the long-term defense
performance.
This dissertation research has broad impacts on Internet worm research since this work is
fundamental, practical and extensible. Our developed framework can be used by researchers to
understand key features of other forms of new worm attacks and develop countermeasures
against them
Machine Learning and Security of Non-Executable Files
Computer malware is a well-known threat in security which, despite the enormous time and effort invested in fighting it, is today more prevalent than ever. Recent years have brought a surge in one particular type: malware embedded in non-executable file formats, e.g., PDF, SWF and various office file formats. The result has been a massive number of infections, owed primarily to the trust that ordinary computer users have in these file formats. In addition, their feature-richness and implementation complexity have created enormous attack surfaces in widely deployed client software, resulting in regular discoveries of new vulnerabilities.
The traditional approach to malware detection – signature matching, heuristics and behavioral profiling – has from its inception been a labor-intensive manual task, always lagging one step behind the attacker. With the exponential growth of computers and networks, malware has become more diverse, wide-spread and adaptive than ever, scaling much faster than the available talent pool of human malware analysts. An automated and scalable approach is needed to fill the gap between automated malware adaptation and manual malware detection, and machine learning is emerging as a viable solution. Its branch called adversarial machine learning studies the security of machine learning algorithms and the special conditions that arise when machine learning is applied for security.
This thesis is a study of adversarial machine learning in the context of static detection of malware in non-executable file formats. It evaluates the effectiveness, efficiency and security of machine learning applications in this context. To this end, it introduces 3 data-driven detection methods developed using very large, high quality datasets. PJScan detects malicious PDF files based on lexical properties of embedded JavaScript code and is the fastest method published to date. SL2013 extends its coverage to all PDF files, regardless of JavaScript presence, by analyzing the hierarchical structure of PDF logical building blocks and demonstrates excellent performance in a novel long-term realistic experiment. Finally, Hidost generalizes the hierarchical-structure-based feature set to become the first machine-learning-based malware detector operating on multiple file formats. In a comprehensive experimental evaluation on PDF and SWF, it outperforms other academic methods and commercial antivirus systems in detection effectiveness.
Furthermore, the thesis presents a framework for security evaluation of machine learning classifiers in a case study performed on an independent PDF malware detector. The results show that the ability to manipulate a part of the classifier’s feature set allows a malicious adversary to disguise malware so that it appears benign to the classifier with a high success rate. The presented methods are released as open-source software.Schadsoftware ist eine gut bekannte Sicherheitsbedrohung. Trotz der enormen Zeit und des Aufwands die investiert werden, um sie zu beseitigen, ist sie heute weiter verbreitet als je zuvor. In den letzten Jahren kam es zu einem starken Anstieg von Schadsoftware, welche in nicht-ausführbaren Dateiformaten, wie PDF, SWF und diversen Office-Formaten, eingebettet ist. Die Folge war eine massive Anzahl von Infektionen, ermöglicht durch das Vertrauen, das normale Rechnerbenutzer in diese Dateiformate haben. Außerdem hat die Komplexität und Vielseitigkeit dieser Dateiformate große Angriffsflächen in weitverbreiteter Klient-Software verursacht, und neue Sicherheitslücken werden regelmäßig entdeckt.
Der traditionelle Ansatz zur Erkennung von Schadsoftware – Mustererkennung, Heuristiken und Verhaltensanalyse – war vom Anfang an eine äußerst mühevolle Handarbeit, immer einen Schritt hinter den Angreifern zurück. Mit dem exponentiellen Wachstum von Rechenleistung und Netzwerkgeschwindigkeit ist Schadsoftware diverser, zahlreicher und schneller-anpassend geworden als je zuvor, doch die Verfügbarkeit von menschlichen Schadsoftware-Analysten kann nicht so schnell skalieren. Ein automatischer und skalierbarer Ansatz ist gefragt, und maschinelles Lernen tritt als eine brauchbare Lösung hervor. Ein Bereich davon, Adversarial Machine Learning, untersucht die Sicherheit von maschinellen Lernverfahren und die besonderen Verhältnisse, die bei der Anwendung von machinellem Lernen für Sicherheit entstehen.
Diese Arbeit ist eine Studie von Adversarial Machine Learning im Kontext statischer Schadsoftware-Erkennung in nicht-ausführbaren Dateiformaten. Sie evaluiert die Wirksamkeit, Leistungsfähigkeit und Sicherheit von maschinellem Lernen in diesem Kontext. Zu diesem Zweck stellt sie 3 datengesteuerte Erkennungsmethoden vor, die alle auf sehr großen und diversen Datensätzen entwickelt wurden. PJScan erkennt bösartige PDF-Dateien anhand lexikalischer Eigenschaften von eingebettetem JavaScript-Code und ist die schnellste bisher veröffentliche Methode. SL2013 erweitert die Erkennung auf alle PDF-Dateien, unabhängig davon, ob sie JavaScript enthalten, indem es die hierarchische Struktur von logischen PDF-Bausteinen analysiert. Es zeigt hervorragende Leistung in einem neuen, langfristigen und realistischen Experiment. Schließlich generalisiert Hidost den auf hierarchischen Strukturen basierten Merkmalsraum und wurde zum ersten auf maschinellem Lernen basierten Schadsoftware-Erkennungssystem, das auf mehreren Dateiformaten anwendbar ist. In einer umfassenden experimentellen Evaulierung auf PDF- und SWF-Formaten schlägt es andere akademische Methoden und kommerzielle Antiviren-Lösungen bezüglich Erkennungswirksamkeit.
Überdies stellt diese Doktorarbeit ein Framework für Sicherheits-Evaluierung von auf machinellem Lernen basierten Klassifikatoren vor und wendet es in einer Fallstudie auf eine unabhängige akademische Schadsoftware-Erkennungsmethode an. Die Ergebnisse zeigen, dass die Fähigkeit, nur einen Teil von Features, die ein Klasifikator verwendet, zu manipulieren, einem Angreifer ermöglicht, Schadsoftware in Dateien so einzubetten, dass sie von der Erkennungsmethode mit hoher Erfolgsrate als gutartig fehlklassifiziert wird. Die vorgestellten Methoden wurden als Open-Source-Software veröffentlicht
Social informatics
5th International Conference, SocInfo 2013, Kyoto, Japan, November 25-27, 2013, Proceedings</p