Search CORE

2,428 research outputs found

Machine Learning Aided Static Malware Analysis: A Survey and Tutorial

Author: Andrii Shalaginov
D Krishna Sandeep Reddy
Farid Daryabar
Igor Santos
Reinaldo Jose Mangialardo
Smita Naval
Steve Watson
Teuvo Kohonen
Yanfang Ye
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 03/08/2018
Field of study

Malware analysis and detection techniques have been evolving during the last decade as a reflection to development of different malware techniques to evade network-based and host-based security protections. The fast growth in variety and number of malware species made it very difficult for forensics investigators to provide an on time response. Therefore, Machine Learning (ML) aided malware analysis became a necessity to automate different aspects of static and dynamic malware investigation. We believe that machine learning aided static analysis can be used as a methodological approach in technical Cyber Threats Intelligence (CTI) rather than resource-consuming dynamic malware analysis that has been thoroughly studied before. In this paper, we address this research gap by conducting an in-depth survey of different machine learning methods for classification of static characteristics of 32-bit malicious Portable Executable (PE32) Windows files and develop taxonomy for better understanding of these techniques. Afterwards, we offer a tutorial on how different machine learning techniques can be utilized in extraction and analysis of a variety of static characteristic of PE binaries and evaluate accuracy and practical generalization of these techniques. Finally, the results of experimental study of all the method using common data was given to demonstrate the accuracy and complexity. This paper may serve as a stepping stone for future researchers in cross-disciplinary field of machine learning aided malware forensics.Comment: 37 Page

arXiv.org e-Print Archive

Crossref

Quantifying the need for supervised machine learning in conducting live forensic analysis of emergent configurations (ECO) in IoT environments

Author: Al-Dhaqm Arafat
Alawadi Sadi
Choo Kim-Kwang Raymond
Ikuesan Richard A.
Karie Nickson M.
Kebande Victor R.
Publication venue: Edith Cowan University, Research Online, Perth, Western Australia
Publication date: 01/01/2020
Field of study

© 2020 The Author(s) Machine learning has been shown as a promising approach to mine larger datasets, such as those that comprise data from a broad range of Internet of Things devices, across complex environment(s) to solve different problems. This paper surveys existing literature on the potential of using supervised classical machine learning techniques, such as K-Nearest Neigbour, Support Vector Machines, Naive Bayes and Random Forest algorithms, in performing live digital forensics for different IoT configurations. There are also a number of challenges associated with the use of machine learning techniques, as discussed in this paper

Research Online @ ECU

Digitala Vetenskapliga Arkivet - Academic Archive On-line

Malmö University

A Scaling Robust Copy-Paste Tampering Detection for Digital Image Forensics

Author: Dharaskar R.V.
Thakare V.M.
Warbhe Anil Dada
Publication venue: The Authors. Published by Elsevier B.V.
Publication date: 31/12/2016
Field of study

AbstractIt is crucial in image forensics to prove the authenticity of the digital images. Due to the availability of the using sophisticated image editing software programs, anyone can manipulate the images easily. There are various types of digital image manipulation or tampering possible; like image compositing, splicing, copy-paste, etc. In this paper, we propose a passive scaling robust algorithm for the detection of Copy-Paste tampering. Sometimes the copied region of an image is scaled before pasting to some other location in the image. In such cases, the normal Copy-Paste detection algorithm fails to detect the forgeries. We have implemented and used an improved customized Normalized Cross Correlation for detecting highly correlated areas from the image and the image blocks, thereby detecting the tampered regions from an image. The experimental results demonstrate that the proposed approach can be effectively used to detect copy-paste forgeries accurately and is scaling robust

Elsevier - Publisher Connector

Data Mining Methods Applied to a Digital Forensics Task for Supervised Machine Learning

Author: Computational Intelligence in Digital Forensics: Forensic Investigation and Applications Vol. 555, Studies in Computational Intelligence pp 413-428 (2014) (Coordinador)
Riquelme Santos José Cristóbal
Tallón Ballesteros Antonio Javier
Publication venue
Publication date: 01/01/2014
Field of study

Digital forensics research includes several stages. Once we have collected the data the last goal is to obtain a model in order to predict the output with unseen data. We focus on supervised machine learning techniques. This chapter performs an experimental study on a forensics data task for multi-class classification including several types of methods such as decision trees, bayes classifiers, based on rules, artificial neural networks and based on nearest neighbors. The classifiers have been evaluated with two performance measures: accuracy and Cohen’s kappa. The followed experimental design has been a 4-fold cross validation with thirty repetitions for non-deterministic algorithms in order to obtain reliable results, averaging the results from 120 runs. A statistical analysis has been conducted in order to compare each pair of algorithms by means of t-tests using both the accuracy and Cohen’s kappa metrics

idUS. Depósito de Investigación Universidad de Sevilla

MemTri: A Memory Forensics Triage Tool using Bayesian Network and Volatility

Author: Michalas A.
Michalas A.
Murray R.
Murray R.
Publication venue: 'Association for Computing Machinery (ACM)'
Publication date: 01/01/2017
Field of study

This work explores the development of MemTri. A memory forensics triage tool that can assess the likelihood of criminal activity in a memory image, based on evidence data artefacts generated by several applications. Fictitious illegal suspect activity scenarios were performed on virtual machines to generate 60 test memory images for input into MemTri. Four categories of applications (i.e. Internet Browsers, Instant Messengers, FTP Client and Document Processors) are examined for data artefacts located through the use of regular expressions. These identified data artefacts are then analysed using a Bayesian Network, to assess the likelihood that a seized memory image contained evidence of illegal firearms trading activity. MemTri's normal mode of operation achieved a high artefact identification accuracy performance of 95.7% when the applications' processes were running. However, this fell significantly to 60% as applications processes' were terminated. To explore improving MemTri's accuracy performance, a second mode was developed, which achieved more stable results of around 80% accuracy, even after applications processes' were terminated

Crossref

WestminsterResearch

Automatic Labelling and Document Clustering for Forensic Analysis

Author: Ms. Raksha K.Mundhe, Prof. Ankush Maind
Publication venue: 'Auricle Technologies, Pvt., Ltd.'
Publication date: 30/09/2014
Field of study

In computer forensic analysis, retrieved data is in unstructured text, whose analysis by computer examiners is difficult to be performed. In proposed approach the forensic analysis is done very systematically i.e. retrieved data is in unstructured format get particular structure by using high quality well known algorithm and automatic cluster labelling method. Indexing is performed on txt, doc, and pdf file which automatically estimate the number of clusters with automatic labelling to it. In the proposed approach DBSCAN algorithm and K-mean algorithm are used; which makes it very easy to retrieve most relevant information for forensic analysis also the automated methods of analysis are of great interest. In particular, algorithms for clustering documents can facilitate the discovery of new and useful knowledge from the documents under analysis. Two methods are used for document clustering for forensic analysis; the first method uses an x2 test of significance to detect different word usage across categories in the hierarchy which is well suited for testing dependencies when count data is available. The second method selects words which both occur frequently in a cluster and effectively discriminate the given cluster from the other clusters. Finally, we also present and discuss several practical results that can be useful for researchers of forensic analysis

International Journal on Recent and Innovation Trends in Computing and Communication

Hybrid feature selection technique for intrusion detection system

Author: Kamarudin Muhammad Hilmi
Maple Carsten
Watson Tim
Publication venue: 'Inderscience Publishers'
Publication date: 01/02/2019
Field of study

High dimensionality’s problems have make feature selection as one of the most important criteria in determining the efficiency of intrusion detection systems. In this study we have selected a hybrid feature selection model that potentially combines the strengths of both the filter and the wrapper selection procedure. The potential hybrid solution is expected to effectively select the optimal set of features in detecting intrusion. The proposed hybrid model was carried out using correlation feature selection (CFS) together with three different search techniques known as best-first, greedy stepwise and genetic algorithm. The wrapper-based subset evaluation uses a random forest (RF) classifier to evaluate each of the features that were first selected by the filter method. The reduced feature selection on both KDD99 and DARPA 1999 dataset was tested using RF algorithm with ten-fold cross-validation in a supervised environment. The experimental result shows that the hybrid feature selections had produced satisfactory outcome

Warwick Research Archives Portal Repository

Using Visual Capabilities to Improve Efficiency in Computer Forensic Analysis

Author: Forcht Karen A A
Hubbard Joan C
Publication venue: AIS Electronic Library (AISeL)
Publication date: 01/05/2009
Field of study

Computer forensics is the preservation, analysis, and interpretation of computer data. Computer forensics is dependent on the availability of software tools and applications. Such tools are critical components in law enforcement investigations. Due to the diversity of cyber crime and cyber assisted crime, advanced software tools are essential apparatus for typical law enforcement investigators, national security analysts, corporate emergency response teams, civil lawyers, risk management personnel, etc. Typical tools available to investigators are text-based, which are sorely inadequate given the volume of data needing analysis in today’s environment. Many modern tools essentially provide simple GUIs to simplify access to typical textbased commands but the capabilities are essentially the same. For simplicity we continue to refer to these as text-based and command-based in constrast to the visualization tools and associated direct manipulation interfaces we are attempting to develop. The reading of such large volumes of textual information is extremely time-consuming in contrast with the interpretation of images through which the user can interpret large amounts of information simultaneously. Forensic analysts have a growing need for new capabilities to aid in locating files holding evidence of criminal activity. Such capabilities must improve both the efficiency of the analysis process and the identification of additionally hidden files. This paper discusses visualization research that more perceptually and intuitively represents file characteristics. Additionally, we integrate interaction capabilities for more complete exploration, significantly improving analysis efficiency. Finally, we discuss the results of an applied user study designed specifically to measure the efficacy of the developed visualization capabilities in the analysis of computer forensic related data

AIS Electronic Library (AISeL)