1,865 research outputs found
Machine Learning Aided Static Malware Analysis: A Survey and Tutorial
Malware analysis and detection techniques have been evolving during the last
decade as a reflection to development of different malware techniques to evade
network-based and host-based security protections. The fast growth in variety
and number of malware species made it very difficult for forensics
investigators to provide an on time response. Therefore, Machine Learning (ML)
aided malware analysis became a necessity to automate different aspects of
static and dynamic malware investigation. We believe that machine learning
aided static analysis can be used as a methodological approach in technical
Cyber Threats Intelligence (CTI) rather than resource-consuming dynamic malware
analysis that has been thoroughly studied before. In this paper, we address
this research gap by conducting an in-depth survey of different machine
learning methods for classification of static characteristics of 32-bit
malicious Portable Executable (PE32) Windows files and develop taxonomy for
better understanding of these techniques. Afterwards, we offer a tutorial on
how different machine learning techniques can be utilized in extraction and
analysis of a variety of static characteristic of PE binaries and evaluate
accuracy and practical generalization of these techniques. Finally, the results
of experimental study of all the method using common data was given to
demonstrate the accuracy and complexity. This paper may serve as a stepping
stone for future researchers in cross-disciplinary field of machine learning
aided malware forensics.Comment: 37 Page
Analyzing Social and Stylometric Features to Identify Spear phishing Emails
Spear phishing is a complex targeted attack in which, an attacker harvests
information about the victim prior to the attack. This information is then used
to create sophisticated, genuine-looking attack vectors, drawing the victim to
compromise confidential information. What makes spear phishing different, and
more powerful than normal phishing, is this contextual information about the
victim. Online social media services can be one such source for gathering vital
information about an individual. In this paper, we characterize and examine a
true positive dataset of spear phishing, spam, and normal phishing emails from
Symantec's enterprise email scanning service. We then present a model to detect
spear phishing emails sent to employees of 14 international organizations, by
using social features extracted from LinkedIn. Our dataset consists of 4,742
targeted attack emails sent to 2,434 victims, and 9,353 non targeted attack
emails sent to 5,912 non victims; and publicly available information from their
LinkedIn profiles. We applied various machine learning algorithms to this
labeled data, and achieved an overall maximum accuracy of 97.76% in identifying
spear phishing emails. We used a combination of social features from LinkedIn
profiles, and stylometric features extracted from email subjects, bodies, and
attachments. However, we achieved a slightly better accuracy of 98.28% without
the social features. Our analysis revealed that social features extracted from
LinkedIn do not help in identifying spear phishing emails. To the best of our
knowledge, this is one of the first attempts to make use of a combination of
stylometric features extracted from emails, and social features extracted from
an online social network to detect targeted spear phishing emails.Comment: Detection of spear phishing using social media feature
Investigating visualisation techniques for rapid triage of digital forensic evidence
This study investigates the feasibility of a tool that allows digital forensics (DF) investigators to efficiently triage device datasets during the collection phase of an investigation. This tool utilises data visualisation techniques to display images found in near real-time to the end user. Findings indicate that participants were able to accurately identify contraband material whilst using this tool, however, classification accuracy dropped slightly with larger datasets. Combined with participant feedback, the results show that the proposed triage method is indeed feasible, and this tool provides a solid foundation for the continuation of further work
A Comprehensive Analysis of the Role of Artificial Intelligence and Machine Learning in Modern Digital Forensics and Incident Response
In the dynamic landscape of digital forensics, the integration of Artificial
Intelligence (AI) and Machine Learning (ML) stands as a transformative
technology, poised to amplify the efficiency and precision of digital forensics
investigations. However, the use of ML and AI in digital forensics is still in
its nascent stages. As a result, this paper gives a thorough and in-depth
analysis that goes beyond a simple survey and review. The goal is to look
closely at how AI and ML techniques are used in digital forensics and incident
response. This research explores cutting-edge research initiatives that cross
domains such as data collection and recovery, the intricate reconstruction of
cybercrime timelines, robust big data analysis, pattern recognition,
safeguarding the chain of custody, and orchestrating responsive strategies to
hacking incidents. This endeavour digs far beneath the surface to unearth the
intricate ways AI-driven methodologies are shaping these crucial facets of
digital forensics practice. While the promise of AI in digital forensics is
evident, the challenges arising from increasing database sizes and evolving
criminal tactics necessitate ongoing collaborative research and refinement
within the digital forensics profession. This study examines the contributions,
limitations, and gaps in the existing research, shedding light on the potential
and limitations of AI and ML techniques. By exploring these different research
areas, we highlight the critical need for strategic planning, continual
research, and development to unlock AI's full potential in digital forensics
and incident response. Ultimately, this paper underscores the significance of
AI and ML integration in digital forensics, offering insights into their
benefits, drawbacks, and broader implications for tackling modern cyber
threats
Invesitigation of Malware and Forensic Tools on Internet
Malware is anĀ application that is harmful to your forensic information. Basically, malware analyses is the process of analysing the behaviours of malicious code and then create signatures to detect and defend against it.Malware, such as Trojan horse, Worms and Spyware severely threatens the forensic security. This research observed that although malware and its variants may vary a lot from content signatures, they share some behaviour features at a higher level which are more precise in revealing the real intent of malware. This paper investigates the various techniques of malware behaviour extraction and analysis. In addition, we discuss the implications of malware analysis tools for malware detection based on various techniques
Applying Machine Learning to Advance Cyber Security: Network Based Intrusion Detection Systems
Many new devices, such as phones and tablets as well as traditional computer systems, rely on wireless connections to the Internet and are susceptible to attacks. Two important types of attacks are the use of malware and exploiting Internet protocol vulnerabilities in devices and network systems. These attacks form a threat on many levels and therefore any approach to dealing with these nefarious attacks will take several methods to counter. In this research, we utilize machine learning to detect and classify malware, visualize, detect and classify worms, as well as detect deauthentication attacks, a form of Denial of Service (DoS). This work also includes two prevention mechanisms for DoS attacks, namely a one- time password (OTP) and through the use of machine learning. Furthermore, we focus on an exploit of the widely used IEEE 802.11 protocol for wireless local area networks (WLANs). The work proposed here presents a threefold approach for intrusion detection to remedy the effects of malware and an Internet protocol exploit employing machine learning as a primary tool. We conclude with a comparison of dimensionality reduction methods to a deep learning classifier to demonstrate the effectiveness of these methods without compromising the accuracy of classification
tf-Darshan: Understanding Fine-grained I/O Performance in Machine Learning Workloads
Machine Learning applications on HPC systems have been gaining popularity in
recent years. The upcoming large scale systems will offer tremendous
parallelism for training through GPUs. However, another heavy aspect of Machine
Learning is I/O, and this can potentially be a performance bottleneck.
TensorFlow, one of the most popular Deep-Learning platforms, now offers a new
profiler interface and allows instrumentation of TensorFlow operations.
However, the current profiler only enables analysis at the TensorFlow platform
level and does not provide system-level information. In this paper, we extend
TensorFlow Profiler and introduce tf-Darshan, both a profiler and tracer, that
performs instrumentation through Darshan. We use the same Darshan shared
instrumentation library and implement a runtime attachment without using a
system preload. We can extract Darshan profiling data structures during
TensorFlow execution to enable analysis through the TensorFlow profiler. We
visualize the performance results through TensorBoard, the web-based TensorFlow
visualization tool. At the same time, we do not alter Darshan's existing
implementation. We illustrate tf-Darshan by performing two case studies on
ImageNet image and Malware classification. We show that by guiding optimization
using data from tf-Darshan, we increase POSIX I/O bandwidth by up to 19% by
selecting data for staging on fast tier storage. We also show that Darshan has
the potential of being used as a runtime library for profiling and providing
information for future optimization.Comment: Accepted for publication at the 2020 International Conference on
Cluster Computing (CLUSTER 2020
- ā¦