69 research outputs found

    Methodology for the Automated Metadata-Based Classification of Incriminating Digital Forensic Artefacts

    The ever-increasing volume of data in digital forensic investigation is one of the most discussed challenges in the field. Usually, most of the file artefacts on seized devices are not pertinent to the investigation. Manually retrieving suspicious files relevant to the investigation is akin to finding a needle in a haystack. In this paper, a methodology for the automatic prioritisation of suspicious file artefacts (i.e., file artefacts that are pertinent to the investigation) is proposed to reduce the manual analysis effort required. This methodology is designed to work in a human-in-the-loop fashion. In other words, it recommends that an artefact is likely to be suspicious rather than giving the final analysis result. A supervised machine learning approach is employed, which leverages the recorded results of previously processed cases. The processes of feature extraction, dataset generation, training, and evaluation are presented in this paper. In addition, a toolkit for data extraction from disk images is outlined, which enables this method to be integrated with the conventional investigation process and to work in an automated fashion.

    Automated Artefact Relevancy Determination from Artefact Metadata and Associated Timeline Events

    The 2020 IEEE International Conference on Cyber Security And Protection Of Digital Services (Cyber Security 2020), Dublin City University, Ireland (held online due to coronavirus outbreak), 15-17 June 2020. Case-hindering, multi-year digital forensic evidence backlogs have become commonplace in law enforcement agencies throughout the world. This is due to an ever-growing number of cases requiring digital forensic investigation, coupled with the growing volume of data to be processed per case. Leveraging previously processed digital forensic cases and their component artefact relevancy classifications creates an opportunity to train automated, artificial intelligence based evidence processing systems. These can significantly aid investigators in the discovery and prioritisation of evidence. This paper presents one approach for file artefact relevancy determination, building on the growing trend towards a centralised, Digital Forensics as a Service (DFaaS) paradigm. This approach enables the use of previously encountered pertinent files to classify newly discovered files in an investigation. Trained models can aid in the detection of these files during the acquisition stage, i.e., during their upload to a DFaaS system. The technique generates a relevancy score for file similarity using each artefact's filesystem metadata and associated timeline events. The approach presented is validated against three experimental usage scenarios.

    ChatGPT for digital forensic investigation: The good, the bad, and the unknown

    The disruptive application of ChatGPT (GPT-3.5, GPT-4) to a variety of domains has become a topic of much discussion in the scientific community and society at large. Large Language Models (LLMs), e.g., BERT, Bard, Generative Pre-trained Transformers (GPTs), LLaMA, etc., have the ability to take instructions, or prompts, from users and generate answers and solutions based on very large volumes of text-based training data. This paper assesses the impact and potential impact of ChatGPT on the field of digital forensics, specifically looking at its latest pre-trained LLM, GPT-4. A series of experiments are conducted to assess its capability across several digital forensic use cases including artefact understanding, evidence searching, code generation, anomaly detection, incident response, and education. Across these topics, its strengths and risks are outlined and a number of general conclusions are drawn. Overall this paper concludes that while there are some potential low-risk applications of ChatGPT within digital forensics, many are either unsuitable at present, since the evidence would need to be uploaded to the service, or they require sufficient knowledge of the topic being asked of the tool to identify incorrect assumptions, inaccuracies, and mistakes. However, to an appropriately knowledgeable user, it could act as a useful supporting tool in some circumstances

    Machine-learning forensics : state of the art in the use of machine-learning techniques for digital forensic investigations within smart environments

    Recently, a worldwide trend towards the adoption of smart environments and automation has been observed across all fields. Smart environments include a wide variety of Internet-of-Things (IoT) devices, so conventional digital forensic investigation (DFI) faces many challenges in such environments. These challenges include data heterogeneity, data distribution, and massive amounts of data, which exceed the capacity of digital forensic (DF) investigators to deal with them within a short period of time. Furthermore, they significantly slow down or even incapacitate the conventional DFI process. With the increasing frequency of digital crimes, better and more sophisticated DFI procedures are desperately needed, particularly in such environments. Since machine-learning (ML) techniques might be a viable option in smart environments, this paper presents the integration of ML into DF by reviewing the most recent papers concerned with the applications of ML in DF, specifically within smart environments. It also explores the potential further use of ML techniques in DF in smart environments to reduce manual effort, as well as what to expect from future ML applications to the conventional DFI process. https://www.mdpi.com/journal/applsci (Computer Science)

    Integrated examination and analysis model for improving mobile cloud forensic investigation

    Advanced forensic techniques have become essential for investigating malicious activities in Cloud-based Mobile Applications (CMA). It is challenging to analyse case-specific evidential artifacts from the Mobile Cloud Computing (MCC) environment under forensically sound conditions. Mobile Cloud Investigation (MCI) encounters many research issues in tracing and fine-tuning the relevant evidential artifacts from the MCC environment. This research proposes an integrated Examination and Analysis (EA) model for a generalised application architecture of CMA deployable on the public cloud to trace the case-specific evidential artifacts. The proposed model effectively validates MCI and enhances the accuracy and speed of the investigation. In this context, the proposed Forensic Examination and Analysis Methodology using Data mining (FED) and Forensic Examination and analysis methodology using Data mining and Optimization (FEDO) models address these issues. FED incorporates key sub-phases such as timeline analysis, hash filtering, data carving, and data transformation to filter out case-specific artifacts. The Long Short-Term Memory (LSTM) assisted forensic methodology decides the amount of potential information to be retained for further investigation and categorizes the forensic evidential artifacts by their relevancy to the crime event. Finally, the FED model constructs the forensic evidence taxonomy and maintains precision and recall above 85% for effective decision-making. FEDO facilitates cloud evidence by examining the key features and indexing the evidence. FEDO incorporates several sub-phases to precisely handle the evidence, such as evidence indexing, cross-referencing, and keyword searching. It analyses the temporal and geographic information and performs cross-referencing to fine-tune the evidence towards the case-specific evidence. FEDO applies the Linearly Decreasing Weight (LDW) strategy based Particle Swarm Optimization (PSO) algorithm to the case-specific evidence to improve the searching capability of the investigation across the massive MCC environment. FEDO delivers an evidence tracing rate of 90%, and thus the integrated EA ensures improved MCI performance.
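The Linearly Decreasing Weight strategy named in this abstract is commonly understood as lowering the PSO inertia weight linearly from a high value to a low value over the run, so that particles explore early and converge late. The abstract gives no parameters or objective, so the following is only a minimal generic sketch: the function names, the bounds `w_max=0.9` and `w_min=0.4`, the acceleration constants, and the `sphere` test objective are all illustrative assumptions, not details of FEDO.

```python
import random


def ldw_inertia(iteration, max_iterations, w_max=0.9, w_min=0.4):
    """Inertia weight falling linearly from w_max (first step) to w_min (last step).

    w_max and w_min are assumed, commonly used defaults, not values from the paper.
    """
    if max_iterations <= 1:
        return w_min
    frac = iteration / (max_iterations - 1)
    return w_max - (w_max - w_min) * frac


def pso_minimise(objective, dim, bounds, n_particles=20,
                 max_iterations=100, c1=2.0, c2=2.0, seed=0):
    """Plain global-best PSO that uses the LDW inertia schedule above."""
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for t in range(max_iterations):
        w = ldw_inertia(t, max_iterations)
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity = inertia + pull to personal best + pull to global best.
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Keep the particle inside the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def sphere(x):
    """Toy objective (sum of squares), standing in for a real search criterion."""
    return sum(v * v for v in x)
```

Calling `pso_minimise(sphere, dim=2, bounds=(-5.0, 5.0))` returns the best position found and its objective value. In an evidence-search setting the objective would instead score candidate evidence, which the abstract does not specify.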

    DeepUAge: Improving Underage Age Estimation Accuracy to Aid CSEM Investigation

    Age is a soft biometric trait that can aid law enforcement in the identification of victims of Child Sexual Exploitation Material (CSEM) creation/distribution. Accurate age estimation of subjects can classify explicit content possession as illegal during an investigation. Automation of this age classification has the potential to expedite content discovery and focus the investigation of digital evidence through the prioritisation of evidence containing CSEM. In recent years, artificial intelligence based approaches for automated age estimation have been created, and many public cloud service providers offer this service on their platforms. The accuracy of these algorithms has been improving over recent years. These existing approaches perform satisfactorily for adult subjects, but perform wholly inadequately for underage subjects. To this end, the largest underage facial age dataset, VisAGe, has been used in this work to train a ResNet50 based deep learning model, DeepUAge, that outperformed the state of the art for age estimation of minors. This paper describes the design and implementation of this model. An evaluation, validation and comparison of the proposed model is performed against existing facial age classifiers, resulting in the best overall performance for underage subjects. Google Cloud Platform Research Credits Program.

    An Empirical Investigation of the Evidence Recovery Process in Digital Forensics

    The widespread use of digital media in committing crimes and the steady increase in their storage capacity have created backlogs at digital forensic labs. The problem is exacerbated in high-profile crimes. In many such cases the judicial proceedings mandate full analysis of the digital media, although doing so is rarely accomplished or practical. Prior studies have proposed different phases for forensic analysis to lessen the backlog issues. However, these phases are not distinctly differentiated, and some proposed solutions may not be practical. This study utilized several past police forensic analyses. Each case was chosen for having five distinct forensic phases, complete with the documented amount of time spent in each phase, along with the number and type of recovered evidence. Data from these cases were empirically analyzed using common descriptive statistical analyses along with linear regression. By using linear regression, we tested the factors that determine the number of recovered evidentiary artifacts. This study provides models by which future forensic analyses could be assessed. It presents distinctive boundaries for each forensics phase, thus eliminating ambiguity in the examination results, while assisting forensic examiners in determining the necessary depth of analysis.
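As a rough illustration of the kind of linear regression this abstract mentions, the sketch below fits a single-predictor ordinary least squares line relating time spent in a phase to the number of artefacts recovered. The abstract does not list the factors, the data, or the model form, so the single predictor, the function name `fit_line`, and the numbers are entirely hypothetical and made up for the example.

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept for one predictor."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Slope is the covariance of x and y divided by the variance of x.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("predictor has no variance")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical, made-up data (not from the study): hours spent in one
# forensic phase and artefacts recovered in that phase, per case.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
artefacts = [3.0, 5.0, 7.0, 9.0, 11.0]

slope, intercept = fit_line(hours, artefacts)
predicted_for_six_hours = slope * 6.0 + intercept
```

A study with several candidate factors would use multiple regression rather than this one-variable form; the sketch only shows the fitting step.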

    Digital Forensics AI: on Practicality, Optimality, and Interpretability of Digital Evidence Mining Techniques

    Digital forensics as a field has progressed alongside technological advancements over the years, just as digital devices have gotten more robust and sophisticated. However, criminals and attackers have devised means for exploiting the vulnerabilities or sophistication of these devices to carry out malicious activities in unprecedented ways. Their belief is that electronic crimes can be committed without identities being revealed or trails being established. Several applications of artificial intelligence (AI) have demonstrated interesting and promising solutions to seemingly intractable societal challenges. This thesis aims to advance the concept of applying AI techniques in digital forensic investigation. Our approach involves experimenting with a complex case scenario in which suspects corresponded by e-mail and deleted, suspiciously, certain communications, presumably to conceal evidence. The purpose is to demonstrate the efficacy of Artificial Neural Networks (ANN) in learning and detecting communication patterns over time, and then predicting the possibility of missing communication(s) along with potential topics of discussion. To do this, we developed a novel approach and included other existing models. The accuracy of our results is evaluated, and their performance on previously unseen data is measured. Second, we proposed conceptualizing the term “Digital Forensics AI” (DFAI) to formalize the application of AI in digital forensics. The objective is to highlight the instruments that facilitate the best evidential outcomes and presentation mechanisms that are adaptable to the probabilistic output of AI models. Finally, we enhanced our notion in support of the application of AI in digital forensics by recommending methodologies and approaches for bridging trust gaps through the development of interpretable models that facilitate the admissibility of digital evidence in legal proceedings

    Adapting Metadata as Supporting Evidence in Digital Forensic Investigations: A Proposed Model

    In today's globalised world, technology is part of the daily activities of many people, in particular forensic investigators, who now draw information that complements the investigation process from the Internet through numerous tools. These tools integrate in different and varied ways to make technology applicable in the judicial system and to make the digital forensic investigation process more effective. The present study derives its importance from metadata, which contributes to digital forensic investigation and achieves an integrative link between informational evidence, investigators, and the digital forensic investigation process. It also considers how informational evidence should be handled to prevent its misuse to obstruct criminal investigations, and reveals the role of the EXIF Viewer Pro program in extracting information from metadata in a way that serves justice. The study aims to clarify the importance of metadata in digital forensic investigation. Accordingly, it adopted the descriptive analytical method and, in its application, used EXIF Viewer Pro to extract metadata from digital images. The study also produced recommendations on what should be taken into consideration, whether by researchers when designing similar solutions or by officials when planning to use them, so that they are highly effective. The most important findings are as follows. Metadata detection software such as EXIF Viewer Pro helps provide strong analysis and evidence that lend certainty and reliability to judgments in criminal cases. In addition, modern metadata extraction software such as EXIF Viewer Pro answers the questions (who, when, how) for the forensic investigator. Modern metadata detection software has dispensed with the geographic location element because of its misuse, and therefore cannot accurately answer the question (where) for the forensic investigator.