9 research outputs found

    Distinct Sector Hashes for Target File Detection

    Using an alternative approach to traditional file hashing, digital forensic investigators can hash individually sampled subject drives on sector boundaries and then check these hashes against a prebuilt database, making it possible to process raw media without reference to the underlying file system.
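The sector-hashing idea above can be sketched in a few lines: hash a raw image on fixed sector boundaries and look each hash up in a prebuilt set of target-file sector hashes. The 512-byte sector size, the choice of MD5, and the function names here are illustrative assumptions, not details taken from the paper.

```python
import hashlib

SECTOR_SIZE = 512  # assumption: classic 512-byte sector boundaries


def sector_hashes(data: bytes, sector_size: int = SECTOR_SIZE) -> list:
    """Hash a byte stream on fixed sector boundaries."""
    return [
        hashlib.md5(data[off:off + sector_size]).hexdigest()
        for off in range(0, len(data), sector_size)
    ]


def scan_drive_image(image: bytes, target_db: set) -> list:
    """Return byte offsets of sectors whose hashes appear in the target database.

    No file-system parsing is needed: the raw media is read sector by sector.
    """
    hits = []
    for i, h in enumerate(sector_hashes(image)):
        if h in target_db:
            hits.append(i * SECTOR_SIZE)
    return hits
```

In use, `target_db` would be built once from the sector hashes of known target files, then reused across many subject drives; the lookup is a constant-time set membership test per sector.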

    Bytewise Approximate Matching: The Good, The Bad, and The Unknown

    Hash functions are established and well known in digital forensics, where they are commonly used for proving integrity and for file identification (i.e., hashing all files on a seized device and comparing the fingerprints against a reference database). With respect to the latter operation, however, an active adversary can easily defeat this approach because traditional hashes are designed to be sensitive to any alteration of the input; the output changes significantly if a single bit is flipped. Researchers therefore developed approximate matching, a relatively new and less prominent area conceived as a more robust counterpart to traditional hashing. Since its conception, the community has constructed numerous algorithms, extensions, and additional applications for this technology, and is still working on novel concepts to improve the status quo. In this survey article, we conduct a high-level review of the existing literature from a non-technical perspective and summarize the existing body of knowledge in approximate matching, with a special focus on bytewise algorithms. Our contribution allows researchers and practitioners to gain an overview of the state of the art of approximate matching so that they may understand the capabilities and challenges of the field. In short, we present the terminology, use cases, classification, requirements, testing methods, algorithms, applications, and a list of primary and secondary literature.
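The contrast the abstract draws can be illustrated with a toy bytewise similarity measure: Jaccard similarity over byte n-gram sets stays high under a one-byte edit, whereas a cryptographic hash changes completely. This is a minimal sketch of the general idea only; it is not ssdeep, sdhash, TLSH, or any algorithm from the survey, and the n-gram length is an arbitrary choice.

```python
def ngrams(data: bytes, n: int = 4) -> set:
    """Byte n-grams used as features; n = 4 is an arbitrary choice."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}


def approx_match_score(a: bytes, b: bytes, n: int = 4) -> float:
    """Jaccard similarity of byte n-gram sets, in [0, 1].

    Unlike a cryptographic hash comparison, a small edit to the input
    only removes a handful of n-grams, so the score degrades gracefully.
    """
    fa, fb = ngrams(a, n), ngrams(b, n)
    if not fa and not fb:
        return 1.0
    return len(fa & fb) / len(fa | fb)
```

A single-byte change to a 40-byte input disturbs only the few n-grams overlapping that byte, so the score stays close to 1; an MD5 or SHA-256 comparison of the same pair would simply report "no match".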

    Accelerating Malware Detection via a Graphics Processing Unit

    Real-time malware analysis requires processing large amounts of data storage to look for suspicious files. This is a time-consuming process that requires a large amount of processing power, often affecting other applications running on a personal computer. This research investigates the viability of using Graphics Processing Units (GPUs), present in many personal computers, to distribute the workload normally processed by the standard Central Processing Unit (CPU). Three experiments are conducted using an industry-standard GPU, the NVIDIA GeForce 9500 GT card. The goal of the first experiment is to find the optimal number of threads per block for calculating MD5 file hashes. The goal of the second experiment is to find the optimal number of threads per block for searching an MD5 hash database for matches. In the third experiment, the size of the executable, the executable type (benign or malicious), and the processing hardware are varied in a full factorial experimental design. The experiment records whether the file is benign or malicious and measures the time required to identify the executable. This information can be used to analyze the performance of GPU hardware against CPU hardware. Experimental results show that a GPU can calculate an MD5 signature hash and scan a database of malicious signatures 82% faster than a CPU for files between 0 - 96 kB. If the file size is increased to 97 - 192 kB, the GPU is 85% faster than the CPU. This demonstrates that the GPU can provide a significant performance increase over a CPU. These results could help achieve faster anti-malware products, faster network intrusion detection system response times, and faster firewall applications.
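The hash-then-lookup pipeline being accelerated here is simple to state on the CPU side: compute an MD5 signature for each file and test it against a set of known-malicious signatures. The sketch below shows that baseline logic only; the paper's CUDA kernels and thread-block tuning are not reproduced, and the function names are illustrative.

```python
import hashlib


def md5_signature(data: bytes) -> str:
    """MD5 fingerprint of a file's contents (the per-file work the GPU parallelizes)."""
    return hashlib.md5(data).hexdigest()


def classify(data: bytes, malicious_db: set) -> str:
    """Label a file by signature lookup against a known-malware database."""
    return "malicious" if md5_signature(data) in malicious_db else "benign"
```

Both steps are embarrassingly parallel across files, which is why mapping them onto many GPU threads (one or more per file or block of data) pays off in the experiments described above.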

    Inferring Previously Uninstalled Applications from Residual Partial Artifacts

    In this paper, we present an approach and experimental results to suggest the past presence of an application after the application has been uninstalled and the system has remained in use. Current techniques rely on the recovery of intact artifacts and traces, e.g., whole files, Windows Registry entries, or log file entries, while our approach requires no intact artifact recovery and leverages trace evidence in the form of residual partial files. In the case of recently uninstalled applications or an instrumented infrastructure, artifacts and traces may be intact and complete. In most cases, however, digital artifacts and traces are altered, destroyed, and disassociated over time due to normal system operation and deliberate obfuscation activity. As a result, analysts are often presented with partial and incomplete artifacts and traces from which defensible conclusions must be drawn. In this work, we match the sectors from a hard disk of interest to a previously constructed catalog of full files captured while various applications were installed, used, and uninstalled. The sectors composing the files in the catalog are not necessarily unique to each file or application, so we use an inverse frequency-weighting scheme to compute the inferential value of matched sectors. Similarly, we compute the fraction of full files associated with each application that is matched, where each file with a sector match is weighted by the fraction of total catalog sectors matched for that file. We compared results using both the sector-weighted and file-weighted values for known ground truth test images and final snapshot images from the M57 Patents Scenario data set. The file-weighted measure was slightly more accurate than the sector-weighted measure, although both identified all of the uninstalled applications in the test images and a high percentage of installed and uninstalled applications in the M57 data set, with minimal false positives for both sets.
The key contribution of our work is the suggestion of uninstalled applications through weighted measurement of residual file fragments. Our experimental results indicate that past application activity can be reliably indicated even after an application has been uninstalled and the host system has been rebooted and used. The rapid and reliable indication of previously uninstalled applications is useful for cyber defense, law enforcement, and intelligence operations.
Keywords: digital forensics; digital artifact; digital trace; partial artifact; residual artifact; uninstalled application
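The inverse frequency-weighting idea described above can be sketched as follows: a matched sector shared by many catalog files carries little inferential value, while a sector unique to one file is strong evidence. This is a simplified illustration of the weighting principle under an assumed catalog layout, not the paper's exact sector-weighted or file-weighted formulas.

```python
from collections import defaultdict


def build_sector_index(catalog: dict) -> dict:
    """catalog: {app_name: {file_name: [sector_hash, ...]}}.

    Returns an index mapping each sector hash to the set of
    (app, file) pairs whose catalog files contain it.
    """
    index = defaultdict(set)
    for app, files in catalog.items():
        for fname, sectors in files.items():
            for h in sectors:
                index[h].add((app, fname))
    return index


def app_scores(drive_sectors: list, catalog: dict) -> dict:
    """Inverse-frequency-weighted evidence per application.

    Each matched sector contributes 1 / (number of catalog files
    containing it), so common sectors carry less inferential value.
    """
    index = build_sector_index(catalog)
    scores = defaultdict(float)
    for h in drive_sectors:
        owners = index.get(h)
        if not owners:
            continue
        w = 1.0 / len(owners)
        for app, _ in owners:
            scores[app] += w
    return dict(scores)
```

A high score for an application whose files are absent from the live file system then suggests the application was previously installed and uninstalled.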

    Massively parallel landscape-evolution modelling using general purpose graphical processing units

    As our expectations of what computer systems can do and our ability to capture data improve, the desire to perform ever more computationally intensive tasks increases. Often these tasks, comprising vast numbers of repeated computations, are highly interdependent on each other: a closely coupled problem. The process of Landscape-Evolution Modelling is an example of such a problem. In order to produce realistic models it is necessary to process landscapes containing millions of data points over time periods extending up to millions of years. This leads to non-tractable execution times, often on the order of years. Researchers therefore seek multiple orders of magnitude reduction in the execution time of these models. The massively parallel programming environment offered through General Purpose Graphical Processing Units offers the potential for multiple orders of magnitude speedup in code execution times. In this paper we demonstrate how the time-dominant parts of a Landscape-Evolution Model can be recoded for a massively parallel architecture, providing two orders of magnitude reduction in execution time.
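The data-parallel structure that makes such recoding possible can be illustrated with a single explicit hillslope-diffusion step, a standard building block of landscape-evolution models: every grid cell updates from its neighbours independently, which is exactly the pattern that maps onto thousands of GPU threads. This is a generic illustrative model with an assumed diffusion coefficient and periodic boundaries, not code from the paper.

```python
import numpy as np


def diffuse_step(z: np.ndarray, kappa: float = 0.1) -> np.ndarray:
    """One explicit hillslope-diffusion update on an elevation grid.

    Each cell's update depends only on its four neighbours, so all
    cells can be computed in parallel (here vectorized with NumPy;
    on a GPU, one thread per cell). Periodic boundaries via np.roll.
    """
    up    = np.roll(z, -1, axis=0)
    down  = np.roll(z,  1, axis=0)
    left  = np.roll(z, -1, axis=1)
    right = np.roll(z,  1, axis=1)
    lap = up + down + left + right - 4.0 * z  # discrete Laplacian
    return z + kappa * lap
```

Stepping such a kernel millions of times over a grid of millions of cells is the "time-dominant part" the abstract refers to; because each step is a uniform per-cell computation, it is a natural candidate for a massively parallel rewrite.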

    Utilizing Graphics Processing Units for Network Anomaly Detection

    This research explores the benefits of using commonly-available graphics processing units (GPUs) to perform classification of network traffic using supervised machine learning algorithms. Two full factorial experiments are conducted using an NVIDIA GeForce GTX 280 graphics card. The goal of the first experiment is to create a baseline for the relative performance of the CPU and GPU implementations of artificial neural network (ANN) and support vector machine (SVM) detection methods under varying loads. The goal of the second experiment is to determine the optimal ensemble configuration for classifying processed packet payloads using the GPU anomaly detector. The GPU ANN achieves speedups of 29x over the CPU ANN. The GPU SVM detection method shows training speedups of 85x over the CPU. The GPU ensemble classification system provides accuracies of 99% when classifying network payload traffic, while achieving speedups of 2-15x over the CPU configurations.
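The ensemble configuration mentioned above combines multiple detectors into one decision; the simplest such combiner is a majority vote over the individual classifiers' labels. This is a generic ensemble sketch under that assumption, not the paper's GPU ANN/SVM ensemble or its actual voting scheme.

```python
def ensemble_predict(classifiers: list, x) -> str:
    """Majority vote over individual detectors.

    Each classifier is any callable mapping an input (e.g., a processed
    packet payload) to a label; the most common label wins.
    """
    votes = [clf(x) for clf in classifiers]
    return max(set(votes), key=votes.count)
```

Because each member classifier evaluates independently, the members themselves are the natural unit of GPU parallelism, which is how an ensemble can stay fast while improving accuracy over any single detector.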