10 research outputs found

    Harnessing Predictive Models for Assisting Network Forensic Investigations of DNS Tunnels

    In recent times, DNS tunneling techniques have been used for malicious purposes; however, network security mechanisms struggle to detect them. Network forensic analysis has been proven effective, but is slow and effort-intensive, as Network Forensics Analysis Tools struggle to deal with undocumented or new network tunneling techniques. In this paper, we present a machine learning approach, based on feature subsets of network traffic evidence, to aid forensic analysis by automating the inference of protocols carried within DNS tunneling techniques. We explore four network protocols, namely HTTP, HTTPS, FTP, and POP3. Three features are extracted from the DNS tunneled traffic: IP packet length, DNS Query Name Entropy, and DNS Query Name Length. We benchmark the performance of four classification models, i.e., decision trees, support vector machines, k-nearest neighbours, and neural networks, on a data set of DNS tunneled traffic. Classification accuracy of 95% is achieved and the feature set reduces the original evidence data size by 74%. More importantly, our findings provide strong evidence that predictive modeling machine learning techniques can be used to identify network protocols within DNS tunneled traffic in real time with high accuracy from a relatively small feature set, without necessarily infringing on privacy from the outset, nor having to collect complete DNS Tunneling sessions.
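The three features named in the abstract above can be illustrated with a minimal sketch. The abstract does not state how the entropy is computed, so character-level Shannon entropy (in bits per character) is an assumption here, and the function names `shannon_entropy` and `query_name_features` are hypothetical, not taken from the paper.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Character-level Shannon entropy in bits per character (assumed definition)."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    # H = -sum(p * log2(p)) over the relative frequency p of each character.
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def query_name_features(query_name: str, ip_packet_length: int) -> dict:
    """Build the three-feature record for one DNS query (hypothetical layout)."""
    return {
        "ip_packet_length": ip_packet_length,
        "dns_query_name_length": len(query_name),
        "dns_query_name_entropy": shannon_entropy(query_name),
    }


if __name__ == "__main__":
    # Encoded tunnel payloads tend to look more random than ordinary hostnames.
    print(query_name_features("www.example.com", 74))
    print(query_name_features("a9f3k2zq8x1m7c4v.t.example.com", 120))
```

A record of this shape keeps only two numbers derived from the query name plus the packet length, which is consistent with the abstract's point that the feature set is much smaller than the full capture.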

    Detect Kernel-Mode Rootkits via Real Time Logging & Controlling Memory Access

    Modern malware and spyware platforms attack existing antivirus solutions and even Microsoft PatchGuard. To protect users and business systems, new technologies provided by Intel and AMD CPUs may be applied. To deal with the new malware, we propose monitoring and controlling access to memory in real time using Intel VT-x with EPT. We have validated this concept by developing MemoryMonRWX, a bare-metal hypervisor. MemoryMonRWX is able to track and trap all types of memory access: read, write, and execute. MemoryMonRWX also has the following competitive advantages: fine-grained analysis and support for multi-core CPUs and 64-bit Windows 10. MemoryMonRWX is able to protect critical kernel memory areas even when PatchGuard has been disabled by malware. Its main innovative features are as follows: guaranteed interception of every memory access, resilience, and low performance degradation.

    Towards Automation in Digital Investigations: Seeking Efficiency in Digital Forensics in Mobile and Cloud Environments

    Cybercrime and related malicious activity in our increasingly digital world have become more prevalent and sophisticated, evading traditional security mechanisms. Digital forensics has been proposed to help investigate, understand and eventually mitigate such attacks. The practice of digital forensics, however, is still fraught with various challenges. Some of the most prominent of these challenges include the increasing amounts of data and the diversity of digital evidence sources appearing in digital investigations. Mobile devices and cloud infrastructures are an interesting specimen, as they inherently exhibit these challenging circumstances and are becoming more prevalent in digital investigations today. Additionally, they embody further characteristics such as large volumes of data from multiple sources, dynamic sharing of resources, limited individual device capabilities and the presence of sensitive data. This combined set of circumstances makes digital investigations in mobile and cloud environments particularly challenging. This is not aided by the fact that digital forensics today still involves manual, time-consuming tasks within the processes of identifying evidence, performing evidence acquisition and correlating multiple diverse sources of evidence in the analysis phase. Furthermore, industry-standard tools are largely evidence-oriented, have limited support for evidence integration and only automate certain precursory tasks, such as indexing and text searching. In this study, efficiency, in the form of reducing the time and human labour expended, is sought in digital investigations in highly networked environments through the automation of certain activities in the digital forensic process. To this end, requirements are outlined and an architecture is designed for an automated system that performs digital forensics in highly networked mobile and cloud environments.
Part of the remote evidence acquisition activity of this architecture is built and tested on several mobile devices in terms of speed and reliability. A method for integrating multiple diverse evidence sources in an automated manner, supporting correlation and automated reasoning, is developed and tested. Finally, the proposed architecture is reviewed and enhancements are proposed in order to further automate the architecture by introducing decentralization, particularly within the storage and processing functionality. This decentralization also improves machine-to-machine communication, supporting several digital investigation processes enabled by the architecture through harnessing the properties of various peer-to-peer overlays. Remote evidence acquisition helps to improve the efficiency (time and effort involved) of digital investigations by removing the need for proximity to the evidence. Experiments show that a single-TCP-connection client-server paradigm does not offer the required scalability and reliability for remote evidence acquisition and that a multi-TCP-connection paradigm is required. The automated integration, correlation and reasoning on multiple diverse evidence sources demonstrated in the experiments improve speed and reduce the human effort needed in the analysis phase by removing the need for time-consuming manual correlation. Finally, informed by published scientific literature, the proposed enhancements for further decentralizing the Live Evidence Information Aggregator (LEIA) architecture offer a platform for increased machine-to-machine communication, thereby enabling automation and reducing the need for manual human intervention.

    Advancing Automation in Digital Forensic Investigations

    Digital Forensics is used to aid traditional preventive security mechanisms when they fail to curtail sophisticated and stealthy cybercrime events. The Digital Forensic Investigation process is largely manual in nature, or at best quasi-automated, requiring a highly skilled labour force and involving a sizeable time investment. Industry-standard tools are evidence-centric, automate only a few precursory tasks (e.g., parsing and indexing) and have limited capabilities for integrating multiple evidence sources. Furthermore, these tools are always human-driven. These challenges are exacerbated in the increasingly computerized and highly networked environment of today. Volumes of digital evidence to be collected and analyzed have increased, and so has the diversity of digital evidence sources involved in a typical case. This further handicaps digital forensics practitioners, labs and law enforcement agencies, causing delays in investigations and legal systems due to backlogs of cases. Improved efficiency of the digital investigation process is needed, in terms of increasing the speed and reducing the human effort expended. This study aims at achieving this time and effort reduction by advancing automation within the digital forensic investigation process. Using a Design Science research approach, artifacts are designed and developed to address these practical problems. In summary, the requirements and architecture of a system for automating digital investigations in highly networked environments are designed. The architecture initially focuses on automation of the identification and acquisition of digital evidence, while later versions focus on full automation and self-organization of devices for all phases of the digital investigation process. Part of the remote evidence acquisition capability of this system architecture is implemented as a proof of concept.
The speed and reliability of capturing digital evidence from remote mobile devices over a client-server paradigm are evaluated. A method for the uniform representation and integration of multiple diverse evidence sources, enabling automated correlation, simple reasoning and querying, is developed and tested. This method is aimed at automating the analysis phase of digital investigations. Machine Learning (ML)-based triage methods are developed and tested to evaluate the feasibility and performance of using such techniques to automate the identification of priority digital evidence fragments. Models from these ML methods are evaluated in identifying network protocols within DNS tunneled network traffic. A large dataset is also created for future research in ML-based triage for identifying suspicious processes for memory forensics. From an ex ante evaluation, the designed system architecture enables individual devices to participate in the entire digital investigation process, contributing their processing power towards alleviating the burden on the human analyst. Experiments show that remote evidence acquisition of mobile devices over networks is feasible; however, a single-TCP-connection paradigm scales poorly. A proof-of-concept experiment demonstrates the viability of the automated integration, correlation and reasoning over multiple diverse evidence sources using semantic web technologies. Experimentation also shows that ML-based triage methods can enable prioritization of certain digital evidence sources, for acquisition or analysis, with up to 95% accuracy. The artifacts developed in this study provide concrete ways to enhance automation in the digital forensic investigation process to increase the investigation speed and reduce the amount of costly human intervention needed.

    On the Network Performance of Digital Evidence Acquisition of Small Scale Devices Over Public Networks

    While cybercrime proliferates, becoming more complex and surreptitious on the Internet, the tools and techniques used in performing digital investigations are still largely lagging behind, effectively slowing down law enforcement agencies at large. Real-time remote acquisition of digital evidence over the Internet is still an elusive ideal in the combat against cybercrime. In this paper we briefly describe the architecture of a comprehensive proactive digital investigation system termed the Live Evidence Information Aggregator (LEIA). This system aims at collecting digital evidence from potentially any device in real time over the Internet. Particular focus is placed on the efficiency of the network communication in the evidence acquisition phase, in order to retrieve potentially evidentiary information remotely and with immediacy. Through a proof-of-concept implementation, we demonstrate the live, remote evidence capturing capabilities of such a system on small scale devices, highlighting the necessity for better throughput, envisioned through the use of Peer-to-Peer overlays.

    Improving Distributed Forensics and Incident Response in Loosely Controlled Networked Environments

    Mobile devices and virtualized appliances in the Internet of Things can be end nodes on varying networks owned by different parties over time, while still seamlessly participating in licit or illicit activities. Digital Forensics and Incident Response (DFIR) tools today struggle to perform digital investigations in such loosely controlled networked environments, as they face several challenges including scarcity of resources, availability, trust, privacy, data volumes, velocity and variety. In this paper, we analyze the state of research in DFIR in networked environments, identifying the challenges facing DFIR tools, particularly in loosely controlled network environments. We present the requirements for a system to address these challenges at the various steps of the typical digital investigation methodology. From this, we identify the need for support from Peer-to-Peer (P2P) overlays and discuss their relative merits and drawbacks in order to identify those that would best support DFIR in loosely controlled networked environments. Finally, we incorporate both structured and unstructured P2P overlays in various capacities in our architecture in order to organize devices in loosely controlled networks, using context information, thus enabling efficient capture, analysis and reporting of artifacts of use in digital investigations.

    DNS-Tunneling-JSON-4-Classes.zip

    Data set containing features extracted from 211 DNS Tunneling packet captures. The packet capture samples are classified by the protocols tunneled within the DNS tunnel. The features are stored in JSON files, one for each packet capture. The features in each file include the IP Packet Length, the DNS Query Name Length and the DNS Query Name Entropy. In this "slightly unclean" version of the feature set the DNS Query Name field values are also present, but are not actually necessary. This feature set may be used to perform machine learning techniques on DNS Tunneling traffic to discover new insights without necessarily having to reconstruct and analyze the equivalent full packet captures.

    Android Process Memory String Dumps Dataset

    A dataset containing 2375 samples of Android Process Memory String Dumps. The dataset is broadly composed of 2 classes: "Benign App" Memory Dumps and "Malicious App" Memory Dumps, split into 2 ZIP archives respectively. The ZIP archives are approximately 17GB in size in total; the unzipped contents are approximately 67GB. This dataset is derived from a subset of the APK files originally made freely available for research through the AndroZoo project [1]. The AndroZoo project collected millions of Android applications and scanned them with the VirusTotal online malware scanning service, thereby classifying most of the apps as either malicious or benign at the time of scanning. The process memory dumps in this dataset were generated by running the subset of APK files from the AndroZoo dataset in an Android Emulator, capturing the process memory of the individual process and subsequently extracting only the strings from the process memory dump. This was facilitated by building 2 applications, Coriander and AndroMemDumpBeta, which facilitate the running of apps on Android Emulators and the capturing of process memory, respectively. The source code for these software applications is available on GitHub. The individual samples are labelled with the SHA256 hash filename from the original AndroZoo labeling and the application package names extracted from within the specific APK manifest file. They also contain a time-stamp for when the memory dumping process took place for the specific file. The file extension used is ".dmp" to indicate that the files are memory dumps; however, they only contain strings, and thus can be viewed in any simple text editor. A subset of the first 10000 APK files from the original AndroZoo dataset is also included within this dataset.
The metadata of these APK files is present in the file "AndroZoo-First-10000" and the 2375 Android Apps that are the main subjects of our dataset are extracted from here. Our dataset is intended to be used in furthering our research related to Machine Learning-based Triage for Android Memory Forensics. It has been made openly available in order to foster opportunities for collaboration with other researchers, to enable validation of research results as well as to enhance the body of knowledge in related areas of research. References: [1] K. Allix, T. F. Bissyandé, J. Klein, and Y. Le Traon. AndroZoo: Collecting Millions of Android Apps for the Research Community. Mining Software Repositories (MSR) 2016.
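The step of "extracting only the strings" from a process memory dump can be sketched as follows. The description does not say how AndroMemDumpBeta performs this step, so the approach below is an assumption: it keeps runs of printable ASCII bytes of at least four characters, similar in spirit to the Unix `strings` utility. The function name `extract_strings` and the `min_length` parameter are hypothetical.

```python
import re
from typing import List


def extract_strings(data: bytes, min_length: int = 4) -> List[str]:
    """Return runs of printable ASCII characters found in a raw memory dump.

    Assumption: only bytes 0x20-0x7E count as printable, and runs shorter
    than min_length are discarded as noise.
    """
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_length)
    return [match.decode("ascii") for match in pattern.findall(data)]


if __name__ == "__main__":
    # A small synthetic buffer standing in for a process memory dump.
    sample = b"\x00\x01com.example.app\x00ab\x00http://x.test\xff"
    for text in extract_strings(sample):
        print(text)
```

Output produced this way is plain text, which matches the note above that the ".dmp" files can be opened in any simple text editor.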