
    Data visualisation in digital forensics

    As digital crimes have risen, so has the need for digital forensics. Numerous state-of-the-art tools have been developed to assist digital investigators in conducting proper investigations into digital crimes. However, digital investigations are becoming increasingly complex and time-consuming due to the amount of data involved, and digital investigators can find themselves unable to conduct them in an appropriately efficient and effective manner. This situation has prompted the need for new tools capable of handling such large, complex investigations. Data mining is one such potential tool. It is still relatively unexplored from a digital forensics perspective, but its purpose is to discover new knowledge from data whose dimensionality, complexity or volume is prohibitively large for manual analysis. This study assesses the self-organising map (SOM), a neural network model and data mining technique that could offer tremendous benefits to digital forensics. The focus of this study is to demonstrate how the SOM can help digital investigators make better decisions and conduct the forensic analysis process more efficiently and effectively during a digital investigation. The SOM's visualisation capabilities can not only reveal interesting patterns, but can also serve as a platform for further, interactive analysis. Dissertation (MSc (Computer Science))--University of Pretoria, 2007.
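
The core of the SOM technique discussed above can be sketched in a few lines: a grid of weight vectors is trained so that similar input records land on nearby cells, which is what makes the map usable as a visual clustering surface. The following is a minimal, generic NumPy illustration, not the dissertation's implementation; the grid size, learning rate and decay schedule are arbitrary choices.

```python
import numpy as np

def train_som(data, grid=(5, 5), epochs=100, lr0=0.5, sigma0=2.0, seed=0):
    """Train a small self-organising map on the row vectors in `data`."""
    rng = np.random.default_rng(seed)
    h, w = grid
    weights = rng.random((h, w, data.shape[1]))
    # Grid coordinates, used to compute neighbourhood distances between cells.
    coords = np.array([[i, j] for i in range(h) for j in range(w)]).reshape(h, w, 2)
    for t in range(epochs):
        lr = lr0 * np.exp(-t / epochs)        # decaying learning rate
        sigma = sigma0 * np.exp(-t / epochs)  # shrinking neighbourhood
        for x in data:
            # Best-matching unit: the cell whose weight vector is closest to x.
            d = np.linalg.norm(weights - x, axis=2)
            bmu = np.unravel_index(np.argmin(d), (h, w))
            # Pull every cell towards x, weighted by a Gaussian neighbourhood.
            g = np.exp(-np.sum((coords - np.array(bmu)) ** 2, axis=2)
                       / (2 * sigma ** 2))
            weights += lr * g[..., None] * (x - weights)
    return weights

def map_to_cell(weights, x):
    """Return the grid cell (i, j) that an input vector maps to."""
    d = np.linalg.norm(weights - x, axis=2)
    return np.unravel_index(np.argmin(d), d.shape)
```

After training, plotting which artefacts land in which cell gives exactly the kind of visual overview the study describes: similar records cluster in neighbouring cells, and outliers stand apart.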

    Forensic triage of email network narratives through visualisation

    Purpose – The purpose of this paper is to propose a novel approach that automates the visualisation of both quantitative data (the network) and qualitative data (the content) within emails, to aid the triage of evidence during a forensics investigation. Email remains a key source of evidence during a digital investigation, and a forensics examiner may be required to triage and analyse large email data sets for evidence. Current practice utilises tools and techniques that require a manual trawl through such data, which is a time-consuming process. Design/methodology/approach – This paper applies the methodology to the Enron email corpus, and in particular one key suspect, to demonstrate the applicability of the approach. The resulting visualisations of network narratives are discussed to show how they may be used to triage large evidence data sets. Findings – The network narrative approach enables a forensics examiner to quickly identify relevant evidence within large email data sets. In the case study presented in this paper, the results identify key witnesses, other actors of interest to the investigation and potential sources of further evidence. Practical implications – The implications are for digital forensics examiners and for security investigations that involve email data. The approach posited in this paper demonstrates the triage and visualisation of email network narratives to aid an investigation and identify potential sources of electronic evidence. Originality/value – A number of network visualisation applications are in use; however, none of them enables the combined visualisation of quantitative and qualitative data to provide a view of what the actors are discussing and how this shapes the network in email data sets.
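
The combination of quantitative and qualitative email data can be sketched very simply: count message volume per sender-recipient pair while tallying the content terms exchanged on that edge. The mini-corpus, actor names and stop-word list below are hypothetical, and this is only an illustration of the combined view, not the paper's actual method.

```python
from collections import Counter, defaultdict

# Hypothetical mini-corpus: (sender, recipient, body) triples.
emails = [
    ("alice", "bob", "please review the trading position report"),
    ("alice", "bob", "the trading desk needs the report today"),
    ("bob", "carol", "forwarding the report for approval"),
]

edge_weight = Counter()              # quantitative: message volume per pair
edge_terms = defaultdict(Counter)    # qualitative: what each pair discusses

STOP = {"the", "a", "for", "please", "needs", "today"}  # toy stop-word list
for sender, recipient, body in emails:
    edge = (sender, recipient)
    edge_weight[edge] += 1
    edge_terms[edge].update(w for w in body.lower().split() if w not in STOP)

# One narrative line per edge: volume plus the dominant topic words.
for edge, n in edge_weight.most_common():
    top = [w for w, _ in edge_terms[edge].most_common(2)]
    print(f"{edge[0]} -> {edge[1]}: {n} messages, topics: {', '.join(top)}")
```

In a real triage tool the edge weights would drive the network layout while the per-edge term counts annotate the links, giving the "who talks to whom about what" view the paper targets.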

    A suspect-oriented intelligent and automated computer forensic analysis

    Computer forensics faces a range of challenges due to the widespread use of computing technologies. Examples include the increasing volume of data and devices that need to be analysed in any single case, differing platforms, use of encryption and new technology paradigms (such as cloud computing and the Internet of Things). Automation within forensic tools exists, but only to perform very simple tasks, such as data carving and file signature analysis. Investigators are responsible for undertaking the cognitively challenging and time-consuming process of identifying relevant artefacts. Due to the volume of cyber-dependent (e.g., malware and hacking) and cyber-enabled (e.g., fraud and online harassment) crimes, this results in a large backlog of cases. With the aim of speeding up the analysis process, this paper investigates the role that unsupervised pattern recognition can have in identifying notable artefacts. A study utilising the Self-Organising Map (SOM) to automatically cluster notable artefacts was devised using a series of four cases. Several SOMs were created – a File List SOM containing the metadata of files based upon the file system, and a series of application level SOMs based upon metadata extracted from files themselves (e.g., EXIF data extracted from JPEGs and email metadata extracted from email files). A total of 275 sets of experiments were conducted to determine the viability of clustering across a range of network configurations. The results reveal that more than 93.5% of notable artefacts were grouped within the rank-five clusters in all four cases. The best performance was achieved by using a 10 × 10 SOM where all notables were clustered in a single cell with only 1.6% of the non-notable artefacts (noise) being present, highlighting that SOM-based analysis does have the potential to cluster notable versus noise files to a degree that would significantly reduce the investigation time. 
Whilst clustering has proven successful, operationalising it is still a challenge (for example, how to identify the cluster containing the largest proportion of notables within the case). The paper goes on to propose a process that capitalises upon the SOM and other parameters, such as the timeline, to identify notable artefacts whilst minimising noise files. Overall, based solely upon unsupervised learning, the approach is able to achieve a recall rate of up to 93%. © 2016 Elsevier Ltd.
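
The operational question raised above, which cell holds the largest proportion of notables, can be illustrated in a few lines. The cell coordinates and labels below are invented for illustration; in practice the ground truth is unknown at analysis time, which is precisely why the paper brings in additional parameters such as the timeline.

```python
from collections import Counter

# Hypothetical mapping of artefacts to SOM cells, with ground-truth labels
# (True = notable) available here only to demonstrate the evaluation.
artefacts = (
    [((0, 0), True)] * 3
    + [((3, 2), False)] * 2
    + [((1, 4), True), ((1, 4), False)]
)

notable = Counter()
total = Counter()
for cell, is_notable in artefacts:
    total[cell] += 1
    notable[cell] += is_notable

# Rank cells by the proportion of notable artefacts they hold.
ranked = sorted(total, key=lambda c: notable[c] / total[c], reverse=True)

# Recall achieved by examining only the top-ranked cell.
top = ranked[0]
recall = notable[top] / sum(notable.values())
```

Extending the recall calculation to the top five ranked cells gives the "rank-five cluster" measure reported in the paper's results.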

    A systematic survey of online data mining technology intended for law enforcement

    As an increasing amount of crime takes on a digital aspect, law enforcement bodies must tackle an online environment generating huge volumes of data. With manual inspections becoming increasingly infeasible, law enforcement bodies are optimising online investigations through data-mining technologies. Such technologies must be well designed and rigorously grounded, yet no survey of the online data-mining literature exists which examines their techniques, applications and rigour. This article remedies this gap through a systematic mapping study describing online data-mining literature which visibly targets law enforcement applications, using evidence-based practices in survey-making to produce a replicable analysis which can be methodologically examined for deficiencies.

    Decision Support Elements and Enabling Techniques to Achieve a Cyber Defence Situational Awareness Capability

    This doctoral thesis performs a detailed analysis of the decision elements necessary to improve cyber defence situational awareness, with special emphasis on the perception and comprehension of the analyst in a cybersecurity operations centre (SOC). Two different architectures based on network flow forensics of data streams (NF3) are proposed. The first architecture uses Ensemble Machine Learning techniques, while the second is a Machine Learning variant of greater algorithmic complexity (lambda-NF3) that offers a more robust defence framework against adversarial attacks. Both proposals seek to effectively automate the detection of malware and its subsequent incident management, showing satisfactory results in approximating what has been called a next-generation cognitive computing SOC (NGC2SOC). The supervision and monitoring of events for the protection of an organisation's computer networks must be accompanied by visualisation techniques. In this case, the thesis addresses the generation of three-dimensional representations based on mission-oriented metrics and procedures that use an expert system based on fuzzy logic. Indeed, the state of the art shows serious deficiencies when it comes to implementing cyber defence solutions that reflect the relevance of the mission, resources and tasks of an organisation for a better-informed decision. 
The research work finally provides two key areas to improve decision-making in cyber defence: a solid and complete verification and validation framework to evaluate solution parameters, and the development of a synthetic dataset that unambiguously references the phases of a cyber-attack against the Cyber Kill Chain and MITRE ATT&CK standards. Llopis Sánchez, S. (2023). Decision Support Elements and Enabling Techniques to Achieve a Cyber Defence Situational Awareness Capability [Doctoral thesis]. Universitat Politècnica de València. https://doi.org/10.4995/Thesis/10251/19424
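
As a rough illustration of the fuzzy-logic element mentioned in the abstract, the sketch below fuzzifies a hypothetical mission-impact metric with triangular membership functions and picks the strongest firing set as an alert level. The set boundaries and the single-rule "expert system" are assumptions for illustration only, not the thesis's actual rule base.

```python
def tri(x, a, b, c):
    """Triangular membership function peaking at b over the interval [a, c]."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x < b else (c - x) / (c - b)

def fuzzify(impact):
    """Fuzzify a hypothetical mission-impact metric in [0, 100]."""
    return {
        "low": tri(impact, -1, 0, 50),
        "medium": tri(impact, 25, 50, 75),
        "high": tri(impact, 50, 100, 101),
    }

def alert_level(impact):
    """Single toy rule: the alert level is the strongest firing fuzzy set."""
    degrees = fuzzify(impact)
    return max(degrees, key=degrees.get)
```

A real mission-oriented system would combine many such metrics through a rule base and defuzzification step; this sketch only shows the fuzzification and rule-firing shape of that pipeline.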

    Container and VM Visualization for Rapid Forensic Analysis

    Cloud-hosted software such as virtual machines and containers is notoriously difficult to access, observe, and inspect during ongoing security events. This research describes a new, out-of-band forensic tool for rapidly analyzing cloud-based software. The proposed tool renders two-dimensional visualizations of container contents and virtual machine disk images. The visualizations can be used to identify container/VM contents, pinpoint instances of embedded malware, and find modified code. The proposed tool is compared against other forensic tools in a double-blind experiment, and the results confirm its utility. Implications and future research directions are also described.
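
A common way to build this kind of two-dimensional image visualization is to lay raw bytes out on a grid and map each value to an intensity; regions of zero padding, compressed data or embedded code then show up as visually distinct textures. The sketch below is a generic illustration of that idea, using a cheap ASCII renderer instead of a real image library, and is not the paper's tool.

```python
import math

def byte_grid(data: bytes, width: int = 16):
    """Lay raw bytes out row by row; each value doubles as a grey level (0-255)."""
    rows = math.ceil(len(data) / width)
    grid = [[0] * width for _ in range(rows)]
    for i, b in enumerate(data):
        grid[i // width][i % width] = b
    return grid

def ascii_render(grid, palette=" .:-=+*#%@"):
    """Cheap terminal rendering: brighter byte values map to denser glyphs."""
    step = 256 / len(palette)
    return "\n".join(
        "".join(palette[min(int(v / step), len(palette) - 1)] for v in row)
        for row in grid
    )
```

Replacing the ASCII palette with a greyscale pixel per byte (e.g. via an imaging library) yields the familiar "disk image as picture" view, where malware implants or modified code stand out as anomalous texture.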

    Automated Digital Forensics and Computer Crime Profiling

    Edited version embargoed until 01.12.2017. Full version: access restricted permanently due to third-party copyright restrictions (restriction set on 08.12.2016 by SC, Graduate School). Over the past two decades, technology has developed tremendously, at an almost exponential rate. While this development has served the nation in numerous positive ways, negatives have also emerged, one being computer crime. This criminality has grown so fast as to leave current digital forensic tools lagging behind in terms of development and in their capability to manage such increasing and sophisticated types of crime. In essence, the time taken to analyse a case is huge and increasing, and cases are not fully or properly investigated, resulting in an ever-increasing number of pending and unsolved computer crime cases. Digital forensics has become an essential tool in the fight against computer crime, providing both procedures and tools for the acquisition, examination and analysis of digital evidence. However, the use of technology is expanding at an ever-increasing rate: the number of devices a single user might engage with has grown from one to three or more, the data capacity of those devices reaches far into the terabytes, and the nature of the underlying technology is evolving (for example, the use of cloud services). This presents an enormous challenge for forensic examiners to process and analyse cases in an efficient and effective manner. This thesis focuses upon the examination and analysis phases of the investigative process and considers whether automation of the process is possible. The investigation begins by researching the current state of the art and illustrates the wide range of challenges facing digital forensics investigators when analysing a case. Supported by a survey of forensic researchers and practitioners, key challenges were identified and prioritised.
It was found that 95% of participants believed that the number of forensic investigations would increase in the coming years, with 75% believing that the time consumed by such cases would also increase. With regard to sophistication, 95% of participants expected the complexity and sophistication of digital forensic cases to rise. To this end, an automated, intelligent system that could reduce the investigator's time and cognitive load was identified as a promising solution. A series of experiments was devised around the use of Self-Organising Maps (SOMs), a technique well known for unsupervised clustering of objects. The analysis is performed on a range of file-system and application-level objects (e.g. email, internet activity) across four forensic cases. Experimental evaluations revealed that SOMs are able to successfully separate forensic artefacts from the remaining files. Having established that SOMs are capable of clustering wanted artefacts within a case, a novel algorithm, referred to as the Automated Evidence Profiler (AEP), is proposed to encapsulate the process and further refine artefact identification. The algorithm achieved identification rates of 100% in two of the examined cases and 94% in a third. A novel architecture is proposed to support the algorithm in an operational capacity, incorporating standard forensic techniques such as hashing for known files, file signature analysis and application-level analysis. This provides a mechanism capable of combining the AEP with several other components that filter, prioritise and visualise artefacts of interest to the investigator. The approach, known as the Automated Forensic Examiner (AFE), is capable of identifying potential evidence in a more efficient and effective manner.
The approach was evaluated by a number of experts in the field, and it was unanimously agreed that the chosen research problem was one of great validity. Furthermore, the experts all expressed support for the Automated Forensic Examiner based on the results of the cases analysed.
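
File signature analysis, one of the standard techniques folded into the proposed architecture, amounts to comparing a file's leading magic bytes against its claimed extension. A minimal sketch follows; the signature table covers only a handful of well-known formats and is illustrative, not the thesis's actual implementation.

```python
# Well-known file signatures (magic bytes) and the extensions they imply.
SIGNATURES = {
    b"\xFF\xD8\xFF": "jpg",
    b"\x89PNG\r\n\x1a\n": "png",
    b"%PDF-": "pdf",
    b"PK\x03\x04": "zip",
}

def detect_type(header: bytes):
    """Return the file type implied by the leading bytes, or None if unknown."""
    for magic, ext in SIGNATURES.items():
        if header.startswith(magic):
            return ext
    return None

def extension_mismatch(filename: str, header: bytes) -> bool:
    """Flag files whose claimed extension disagrees with their actual content."""
    detected = detect_type(header)
    claimed = filename.rsplit(".", 1)[-1].lower()
    return detected is not None and detected != claimed
```

In an automated pipeline such mismatches (e.g. a "holiday.jpg" that is really a PDF) are cheap, high-signal indicators of deliberate concealment and are typically surfaced to the investigator early.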

    Unsupervised discovery of relations for analysis of textual data in digital forensics

    This dissertation addresses the problem of analysing digital data in digital forensics. It will be shown that text mining methods can be adapted and applied to digital forensics to aid analysts to more quickly, efficiently and accurately analyse data to reveal truly useful information. Investigators who wish to utilise digital evidence must examine and organise the data to piece together events and facts of a crime. The difficulty with finding relevant information quickly using the current tools and methods is that these tools rely very heavily on background knowledge for query terms and do not fully utilise the content of the data. A novel framework in which to perform evidence discovery is proposed in order to reduce the quantity of data to be analysed, aid the analysts' exploration of the data and enhance the intelligibility of the presentation of the data. The framework combines information extraction techniques with visual exploration techniques to provide a novel approach to performing evidence discovery, in the form of an evidence discovery system. By utilising unrestricted, unsupervised information extraction techniques, the investigator does not require input queries or keywords for searching, thus enabling the investigator to analyse portions of the data that may not have been identified by keyword searches. The evidence discovery system produces text graphs of the most important concepts and associations extracted from the full text to establish ties between the concepts and provide an overview and general representation of the text. Through an interactive visual interface the investigator can explore the data to identify suspects, events and the relations between suspects. Two models are proposed for performing the relation extraction process of the evidence discovery framework. The first model takes a statistical approach to discovering relations based on co-occurrences of complex concepts. 
The second model utilises a linguistic approach using named entity extraction and information extraction patterns. A preliminary study was performed to assess the usefulness of a text mining approach to digital forensics as opposed to the traditional information retrieval approach. It was concluded that the novel approach to text analysis for evidence discovery presented in this dissertation is a viable and promising approach. The preliminary experiment showed that the results obtained from the evidence discovery system, using either of the relation extraction models, are sensible and useful. The approach advocated in this dissertation can therefore be successfully applied to the analysis of textual data for digital forensics. Dissertation (MSc)--University of Pretoria, 2010.
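
The first, statistical relation-extraction model can be illustrated as counting co-occurrences of pre-extracted concepts and keeping frequently co-occurring pairs as candidate relations. The concepts and the threshold below are invented for illustration; the dissertation's actual model operates on complex concepts extracted from full text.

```python
from collections import Counter
from itertools import combinations

# Hypothetical pre-extracted concepts, one list per document or sentence.
documents = [
    ["smith", "wire transfer", "offshore account"],
    ["smith", "offshore account"],
    ["jones", "meeting"],
    ["smith", "wire transfer"],
]

# Count how often each unordered pair of concepts co-occurs.
pair_count = Counter()
for concepts in documents:
    for a, b in combinations(sorted(set(concepts)), 2):
        pair_count[(a, b)] += 1

# Keep pairs that co-occur at least twice as candidate relations.
relations = [pair for pair, n in pair_count.items() if n >= 2]
```

The surviving pairs form the edges of the text graph described above, giving the investigator an overview of which concepts are tied together without requiring any query terms.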

    DARIAH and the Benelux


    A Novel User Oriented Network Forensic Analysis Tool

    In the event of a cybercrime, it is necessary to examine the suspect's digital device(s) in a forensic fashion so that the culprit can be presented in court along with the extracted evidence. However, factors such as the existence and availability of anti-forensic tools and techniques, and the increasing replacement of hard disk drives with solid-state disks, can destroy critical evidence or ruin its integrity. Therefore, having an alternative source of evidence with a lesser chance of being tampered with can be beneficial to the investigation. Organisational network traffic can fit this role, as it is an independent source of evidence and contains a copy of all online user activities. Limitations of the prevailing network traffic analysis techniques, packet-based and flow-based, manifest as challenges in the investigation: the enormous volume and increasingly encrypted nature of traffic, the dynamic nature of the IP addresses of users' devices, and the difficulty of extracting meaningful information from raw traffic. Furthermore, current network forensic tools, unlike sophisticated computer forensic tools, are limited in their ability to offer functionality such as collaborative working, visualisation, reporting and the extraction of meaningful user-level information. These factors increase the complexity of the analysis, and the time and effort required from the investigator. The research goal was to design a system that can assist the investigation by minimising the effects of the aforementioned challenges, thereby reducing the cognitive load on the investigator, which, the researcher believes, can take the investigator one step closer to the culprit. The novelty of this system comes from a newly proposed interaction-based analysis approach, which extracts online user activities from raw network metadata.
The practicality of the novel interaction-based approach was tested with an experimental methodology involving an initial phase in which the researcher sought to identify unique signatures for activities performed on popular Internet applications (BBC, Dropbox, Facebook, Hotmail, Google Docs, Google Search, Skype, Twitter, Wikipedia and YouTube), using the researcher's own network metadata. With signatures obtained, the project moved to the second phase of the experiment, in which a much larger dataset (network traffic collected from 27 users over 2 months) was analysed. The results showed that it is possible to extract unique signatures of online user activities from raw network metadata, although, due to the complexity of the applications, signatures were not found for some activities. The interaction-based approach was able to reduce the data volume by eliminating noise (machine-to-machine communication packets) and to work around the encryption issue by using only the network metadata. A set of system requirements was generated, based on which a web-based, client-server architecture for the proposed system (the User-Oriented Network Forensic Analysis Tool) was designed. The system functions on a case-management premise while minimising the challenges identified earlier. The system architecture led to the development of a functional prototype, and an evaluation of the system by academic experts in the field acted as a feedback mechanism. While the evaluators were satisfied with the system's capability to assist an investigation and meet the requirements, drawbacks such as the inability to analyse real-time traffic and shortcomings in meeting HCI standards were pointed out. Future work will involve automated signature extraction, real-time processing and the facilitation of integrated visualisation.
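
The signature-matching step at the heart of the interaction-based approach can be sketched as testing flow metadata records against per-activity patterns. The signature fields, hosts and byte thresholds below are hypothetical stand-ins for illustration, not the signatures derived in the thesis.

```python
# Hypothetical activity signatures over flow metadata: an activity matches
# when the flow's server name contains the host and the upload volume
# reaches the signature's minimum outbound byte count.
SIGNATURES = [
    {"activity": "file upload", "host": "dropbox.com", "min_out": 100_000},
    {"activity": "status post", "host": "facebook.com", "min_out": 500},
    {"activity": "video watch", "host": "youtube.com", "min_out": 0},
]

def classify_flow(flow):
    """Return the user activities inferred from one flow metadata record."""
    hits = []
    for sig in SIGNATURES:
        if sig["host"] in flow["host"] and flow["bytes_out"] >= sig["min_out"]:
            hits.append(sig["activity"])
    return hits
```

Because only metadata fields (server name, byte counts) are consulted, the matching works even when payloads are encrypted, which is the property the approach relies on.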