11,084 research outputs found
Malware Detection Module using Machine Learning Algorithms to Assist in Centralized Security in Enterprise Networks
Malicious software is abundant in a world of innumerable computer users, who
are constantly faced with these threats from various sources like the internet,
local networks and portable drives. Malware is potentially low to high risk and
can cause systems to function incorrectly, steal data and even crash. Malware
may be executable or system library files in the form of viruses, worms,
Trojans, all aimed at breaching the security of the system and compromising
user privacy. Typically, anti-virus software is based on a signature definition
system which keeps updating from the internet and thus keeping track of known
viruses. While this may be sufficient for home-users, a security risk from a
new virus could threaten an entire enterprise network. This paper proposes a
new and more sophisticated antivirus engine that can not only scan files, but
also build knowledge and detect files as potential viruses. This is done by
extracting system API calls made by various normal and harmful executable, and
using machine learning algorithms to classify and hence, rank files on a scale
of security risk. While such a system is processor heavy, it is very effective
when used centrally to protect an enterprise network which maybe more prone to
such threats.Comment: 6 page
BlogForever: D3.1 Preservation Strategy Report
This report describes preservation planning approaches and strategies recommended by the BlogForever project as a core component of a weblog repository design. More specifically, we start by discussing why we would want to preserve weblogs in the first place and what it is exactly that we are trying to preserve. We further present a review of past and present work and highlight why current practices in web archiving do not address the needs of weblog preservation adequately. We make three distinctive contributions in this volume: a) we propose transferable practical workflows for applying a combination of established metadata and repository standards in developing a weblog repository, b) we provide an automated approach to identifying significant properties of weblog content that uses the notion of communities and how this affects previous strategies, c) we propose a sustainability plan that draws upon community knowledge through innovative repository design
Dynamic Analysis of Executables to Detect and Characterize Malware
It is needed to ensure the integrity of systems that process sensitive
information and control many aspects of everyday life. We examine the use of
machine learning algorithms to detect malware using the system calls generated
by executables-alleviating attempts at obfuscation as the behavior is monitored
rather than the bytes of an executable. We examine several machine learning
techniques for detecting malware including random forests, deep learning
techniques, and liquid state machines. The experiments examine the effects of
concept drift on each algorithm to understand how well the algorithms
generalize to novel malware samples by testing them on data that was collected
after the training data. The results suggest that each of the examined machine
learning algorithms is a viable solution to detect malware-achieving between
90% and 95% class-averaged accuracy (CAA). In real-world scenarios, the
performance evaluation on an operational network may not match the performance
achieved in training. Namely, the CAA may be about the same, but the values for
precision and recall over the malware can change significantly. We structure
experiments to highlight these caveats and offer insights into expected
performance in operational environments. In addition, we use the induced models
to gain a better understanding about what differentiates the malware samples
from the goodware, which can further be used as a forensics tool to understand
what the malware (or goodware) was doing to provide directions for
investigation and remediation.Comment: 9 pages, 6 Tables, 4 Figure
On the Reverse Engineering of the Citadel Botnet
Citadel is an advanced information-stealing malware which targets financial
information. This malware poses a real threat against the confidentiality and
integrity of personal and business data. A joint operation was recently
conducted by the FBI and the Microsoft Digital Crimes Unit in order to take
down Citadel command-and-control servers. The operation caused some disruption
in the botnet but has not stopped it completely. Due to the complex structure
and advanced anti-reverse engineering techniques, the Citadel malware analysis
process is both challenging and time-consuming. This allows cyber criminals to
carry on with their attacks while the analysis is still in progress. In this
paper, we present the results of the Citadel reverse engineering and provide
additional insight into the functionality, inner workings, and open source
components of the malware. In order to accelerate the reverse engineering
process, we propose a clone-based analysis methodology. Citadel is an offspring
of a previously analyzed malware called Zeus; thus, using the former as a
reference, we can measure and quantify the similarities and differences of the
new variant. Two types of code analysis techniques are provided in the
methodology, namely assembly to source code matching and binary clone
detection. The methodology can help reduce the number of functions requiring
manual analysis. The analysis results prove that the approach is promising in
Citadel malware analysis. Furthermore, the same approach is applicable to
similar malware analysis scenarios.Comment: 10 pages, 17 figures. This is an updated / edited version of a paper
appeared in FPS 201
Malware Pattern of Life Analysis
Many malware classifications include viruses, worms, trojans, ransomware, bots, adware, spyware, rootkits, file-less downloaders, malvertising, and many more. Each type may share unique behavioral characteristics with its methods of operations (MO), a pattern of behavior so distinctive that it could be recognized as having the same creator. The research shows the extraction of malware methods of operation using the step-by-step process of Artificial-Based Intelligence (ABI) with built-in Density-based spatial clustering of applications with noise (DBSCAN) machine learning to quantify the actions for their similarities, differences, baseline behaviors, and anomalies. The collected data of the research is from the ransomware sample repositories of Malware Bazaar and Virus Share, totaling 1300 live malicious codes ingested into the CAPEv2 malware sandbox, allowing the capture of traces of static, dynamic, and network behavior features. The ransomware features have shown significant activity of varying identified functions used in encryption, file application programming interface (API), and network function calls. During the machine learning categorization phase, there are eight identified clusters that have similar and different features regarding function-call sequencing events and file access manipulation for dropping file notes and writing encryption. Having compared all the clusters using a “supervenn” pictorial diagram, the characteristics of the static and dynamic behavior of the ransomware give the initial baselines for comparison with other variants that may have been added to the collected data for intelligence gathering. The findings provide a novel practical approach for intelligence gathering to address ransomware or any other malware variants’ activity patterns to discern similarities, anomalies, and differences between malware actions under study
- …