99 research outputs found
Detection and Analysis of Drive-by Downloads and Malicious Websites
A drive-by download is a download that occurs without the user's action or
knowledge. It usually triggers an exploit of a vulnerability in a browser to
download an unknown file. The malicious program in the downloaded file
installs itself on the victim's machine. Moreover, the downloaded file can be
camouflaged as an installer that would further install malicious software.
Drive-by downloads are a very good example of the exponential increase in
malicious activity over the Internet and how it affects the daily use of the
web. In this paper, we try to address the problem caused by drive-by downloads
from different standpoints. We provide an in-depth understanding of the
difficulties in dealing with drive-by downloads and suggest appropriate
solutions. We propose machine learning and feature selection solutions to
remedy the drive-by download problem. Experimental results reported 98.2%
precision, 98.2% F-Measure and 97.2% ROC area.
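For readers unfamiliar with the metrics quoted in the abstract above, the following is a minimal sketch of how precision, F-measure, and ROC area are commonly computed. The function names and all counts and scores are hypothetical illustrations, not the paper's data or code.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F-measure from confusion counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def roc_auc(pos_scores, neg_scores):
    """ROC area as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical counts: 90 malicious pages flagged correctly,
# 10 benign pages flagged wrongly, 10 malicious pages missed.
precision, recall, f1 = precision_recall_f1(tp=90, fp=10, fn=10)

# Hypothetical classifier scores for malicious and benign pages.
auc = roc_auc([0.9, 0.8], [0.1, 0.85])
```

The pairwise definition of ROC area is quadratic in the number of samples; it is used here only because it makes the meaning of the metric explicit.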
Evaluation with uncertainty
Experimental uncertainty arises as a consequence of: (1) bias (systematic error), and (2) variance in measurements. Popular evaluation techniques only account for the variance due to sampling of experimental units, and assume the other sources of uncertainty can be ignored. For example, only the uncertainty due to sampling of topics (queries) and sampling of training:test datasets is considered in standard information retrieval (IR) and classifier system evaluation respectively. However, incomplete relevance judgements, assessor disagreement, non-deterministic systems, and measurement bias can also cause uncertainty in these experiments. In this thesis, the impact of other sources of uncertainty on evaluating IR and classification experiments is investigated. The uncertainty due to: (1) incomplete relevance judgements in IR test collections, (2) non-determinism in IR systems / classifiers, and (3) high variance of classifiers is analysed using case studies from distributed information retrieval and information security. The thesis illustrates the importance of reducing and accurately accounting for uncertainty when evaluating complex IR and classifier systems. Novel techniques to (1) reduce uncertainty due to test collection bias in IR evaluation and high classifier variance (overfitting) in detecting drive-by download attacks, (2) account for multidimensional variance due to sampling of IR system instances from non-deterministic IR systems in addition to sampling of topics, and (3) account for repeated measurements due to non-deterministic classification algorithms are introduced.
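The two dimensions of variance mentioned in the abstract above (sampling of topics and sampling of instances of a non-deterministic system) can be illustrated with a small sketch. This is not the thesis's technique; it only separates the spread across instances from the spread across topics for a hypothetical score matrix, and the function name and data are assumptions.

```python
from statistics import mean, variance


def summarize(scores):
    """Summarize a score matrix where scores[i][j] is the effectiveness
    of system instance i (one run of a non-deterministic system) on topic j."""
    instance_means = [mean(row) for row in scores]
    topic_means = [mean(col) for col in zip(*scores)]
    return {
        "grand_mean": mean(instance_means),
        # Sample variance of per-instance means: spread due to non-determinism.
        "var_instances": variance(instance_means),
        # Sample variance of per-topic means: spread due to topic sampling.
        "var_topics": variance(topic_means),
    }


# Hypothetical scores: 2 system instances evaluated on 3 topics.
summary = summarize([
    [0.5, 0.7, 0.9],
    [0.6, 0.8, 1.0],
])
```

Reporting only the topic variance, as standard evaluation does, would hide the instance-level spread that the second key exposes.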
Malware Resistant Data Protection in Hyper-connected Networks: A survey
Data protection is the process of securing sensitive information from being
corrupted, compromised, or lost. A hyperconnected network, on the other hand,
is a computer networking trend in which communication occurs over a network.
However, what about malware? Malware is malicious software meant to penetrate
private data, threaten a computer system, or gain unauthorised network access
without the user's consent. Due to the increasing applications of computers and
dependency on electronically saved private data, malware attacks on sensitive
information have become a dangerous issue for individuals and organizations
across the world. Hence, malware defense is critical for keeping our computer
systems and data protected. Many recent survey articles have focused either on
malware detection systems or on individual attack strategies. To the best
of our knowledge, no survey paper covers malware attack patterns and
defense strategies together. Through this survey, this paper aims to address
this issue by merging diverse malicious attack patterns and machine learning
(ML) based detection models for modern and sophisticated malware. In doing so,
we focus on the taxonomy of malware attack patterns based on four fundamental
dimensions: the primary goal of the attack, method of attack, targeted exposure
and execution process, and types of malware that perform each attack. Detailed
information on malware analysis approaches is also investigated. In addition,
existing malware detection techniques employing feature extraction and ML
algorithms are discussed extensively. Finally, we discuss research
difficulties and unsolved problems, including future research directions.
Comment: 30 pages, 9 figures, 7 tables, nowhere submitted yet
Multi-level analysis of Malware using Machine Learning
Android Security: A Survey of Issues, Malware Penetration, and Defenses
Smartphones have become pervasive due to the availability of office applications, Internet, games, and vehicle guidance using location-based services, apart from conventional services such as voice calls, SMSes, and multimedia services. Android devices have gained huge market share due to the open architecture of Android and the popularity of its application programming interfaces (APIs) in the developer community. The increased popularity of Android devices and the associated monetary benefits attracted malware developers, resulting in a big rise in Android malware apps between 2010 and 2014. Academic researchers and commercial antimalware companies have realized that the conventional signature-based and static analysis methods are vulnerable. In particular, the prevalent stealth techniques, such as encryption, code transformation, and environment-aware approaches, are capable of generating variants of known malware. This has led to the use of behavior-, anomaly-, and dynamic-analysis-based methods. Since a single approach may be ineffective against the advanced techniques, multiple complementary approaches can be used in tandem for effective malware detection. The existing reviews extensively cover smartphone OS security. However, we believe that the security of Android, with particular focus on malware growth, study of antianalysis techniques, and existing detection methodologies, needs extensive coverage. In this survey, we discuss the Android security enforcement mechanisms, threats to the existing security enforcements and related issues, the malware growth timeline between 2010 and 2014, and stealth techniques employed by malware authors, in addition to the existing detection methods. This review gives an insight into the strengths and shortcomings of the known research methodologies and provides researchers and practitioners with a platform for proposing next-generation Android security, analysis, and malware detection techniques.
Ransomware Simulator for In-Depth Analysis and Detection: Leveraging Centralized Logging and Sysmon for Improved Cybersecurity
Abstract
Ransomware attacks have become increasingly prevalent and sophisticated, posing significant threats to organizations and individuals worldwide. To effectively combat these threats,
security professionals must continuously develop and adapt their detection and mitigation
strategies. This master's thesis presents the design and implementation of a ransomware simulator to facilitate an in-depth analysis of ransomware Tactics, Techniques, and Procedures
(TTPs) and to evaluate the effectiveness of centralized logging and Sysmon, including the
latest event types, in detecting and responding to such attacks.
The study explores the advanced capabilities of Sysmon as a logging tool and data source,
focusing on its ability to capture multiple event types, such as file creation, process execution,
and network traffic, as well as the newly added event types. The aim is to demonstrate the
effectiveness of Sysmon in detecting and analyzing malicious activities, with an emphasis on
the latest features. By focusing on the comprehensive aspects of a cyber-attack, the study
showcases the versatility and utility of Sysmon in detecting and addressing various attack
vectors.
The ransomware simulator is developed using a PowerShell script that emulates various
ransomware TTPs and attack scenarios, providing a comprehensive and realistic simulation
of a ransomware attack. Sysmon, a powerful system monitoring tool, is utilized to monitor
and log the activities associated with the simulated attack, including the events generated by
the new Sysmon features. Centralized logging is achieved through the integration of Splunk
Enterprise, a widely used platform for log analysis and management. The collected logs are
then analyzed to identify patterns, indicators of compromise (IoCs), and potential detection
and mitigation strategies.
Through the development of the ransomware simulator and the subsequent analysis of
Sysmon logs, this research contributes to strengthening the security posture of organizations
and improving cybersecurity measures against ransomware threats, with a focus on the latest
Sysmon capabilities. The results demonstrate the importance of monitoring and analyzing
system events to effectively detect and respond to ransomware attacks. This research can serve
as a basis for further exploration of ransomware detection and response strategies, contributing
to the advancement of cybersecurity practices and the development of more robust security
measures against ransomware threats.
Machine learning approaches for malware classification based on hybrid artefacts
Malware can be developed and transformed into various forms to deceive users and evade antivirus and security endpoint detection. Furthermore, if one machine in the network is compromised, it could be used for lateral movement, when malware spreads stealthily without raising an alarm in monitoring systems. Malware attacks pose security threats to modern enterprises and can cause massive financial, reputational, and data loss to major enterprises. Therefore, it is important to detect these attacks effectively to reduce the loss to the minimum level. Current research uses different approaches, including static and dynamic analysis, to detect and analyze malware categories using distinct feature sets, such as imported modules, opcodes, and API calls, which can improve performance in binary and multi-class classification problems.
This thesis proposes a method for identifying and analyzing malware samples via static and dynamic approaches, including memory analysis and consecutive application operation sequences performed in a Windows 10 virtual environment. Standard classifiers and frequently used sequence models are utilized to expose the malware characteristics and improve predictive capabilities. The features used in these algorithms are extracted from the static and dynamic analysis of malware samples, such as the rich header feature, debug information, temporary files, prefetch files, and event logs. Classifier performance is measured using accuracy, F1-score, Mean Absolute Error (MAE), the confusion matrix, and Area under the ROC Curve (AUC). Combining the two feature sets, static file properties and dynamic analysis results, provides the best classification performance regardless of whether feature selection is applied, achieving accuracy and F1-score of 97% on the integrated datasets. For consecutive sequences, concatenating the Gated Recurrent Unit (GRU) and Transformer models yields the highest accuracy, 97%, for Noriben operations, while GRU achieves the maximum accuracy for opcode sequences, 89%.
Identifying Code Injection and Reuse Payloads In Memory Error Exploits
Today's most widely exploited applications are the web browsers and document readers we use every day. The immediate goal of these attacks is to compromise target systems by executing a snippet of malicious code in the context of the exploited application. Technical tactics used to achieve this can be classified as either code injection, wherein malicious instructions are directly injected into the vulnerable program, or code reuse, where bits of existing program code are pieced together to form malicious logic. In this thesis, I present a new code reuse strategy that bypasses existing and up-and-coming mitigations, and two methods for detecting attacks by identifying the presence of code injection or reuse payloads. Fine-grained address space layout randomization efficiently scrambles program code, limiting one's ability to predict the location of useful instructions to construct a code reuse payload. To expose the inadequacy of this exploit mitigation, a technique for "just-in-time" exploitation is developed. This new technique maps memory on-the-fly and compiles a code reuse payload at runtime to ensure it works in a randomized application. The attack also works in the face of all other widely deployed mitigations, as demonstrated with a proof-of-concept attack against Internet Explorer 10 in Windows 8. This motivates the need for detection of such exploits rather than solely relying on prevention. Two new techniques are presented for detecting attacks by identifying the presence of a payload. Code reuse payloads are identified by first taking a memory snapshot of the target application, then statically profiling the memory for chains of code pointers that reuse code to implement malicious logic. Code injection payloads are identified with runtime heuristics by leveraging hardware virtualization for efficient sandboxed execution of all buffers in memory.
Employing both detection methods together to scan program memory takes about a second and produces negligible false positives and false negatives provided that the given exploit is functional and triggered in the target application version. Compared to other strategies, such as the use of signatures, this approach requires relatively little effort spent on maintenance over time and is capable of detecting never-before-seen attacks. Moving forward, one could use these contributions to form the basis of a unique and effective network intrusion detection system (NIDS) to augment existing systems.
Doctor of Philosophy
Proposed Framework to Improving Performance of Familial Classification in Android Malware
Because of the recent developments in hardware and software technologies for mobile phones, people depend on their smartphones more than ever before. Today, people conduct a variety of business, health, and financial transactions on their mobile devices. This trend has caused an influx of mobile applications that require users' sensitive information. As these applications have increased, so too has the number of malicious applications, which may compromise users' sensitive information. Among all smartphone platforms, Android receives major attention from security practitioners and researchers due to the large number of malicious applications. For the past twelve years, Android malicious applications have been clustered into groups for better identification. Characterizing the malware families can improve the detection process and help in understanding malware patterns. However, in the research community, detecting new malware families is a challenge. In this research, a framework is proposed to improve the performance of familial classification in Android malware. The framework is named a Reverse Engineering Framework (RevEng). Within RevEng, applications' permissions were selected and then fed into machine learning algorithms. Through our research, we created a reduced set of permissions using the Extremely Randomized Trees algorithm that achieved high accuracy and a shorter execution time. Furthermore, we conducted two approaches based on the extracted information. The first approach used a binary value representation of the permissions. The second approach used the features' importance: each selected permission was represented by its weight value instead of the binary value used in the first approach. We conducted a comparison between the results of our two approaches and other relevant works. Our approaches achieved better results in both accuracy and time performance with a reduced number of permissions.
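The two permission representations described in the abstract above can be sketched as follows. This is an illustrative assumption rather than the RevEng implementation: the function names, the selected permissions, and the importance weights are all hypothetical, and in the described framework the weights would come from a feature-importance step such as Extremely Randomized Trees.

```python
def binary_vector(app_permissions, selected):
    """First approach: 1 if the app requests the permission, else 0."""
    return [1 if perm in app_permissions else 0 for perm in selected]


def weighted_vector(app_permissions, selected, weights):
    """Second approach: the permission's importance weight if requested, else 0."""
    return [weights[perm] if perm in app_permissions else 0.0 for perm in selected]


# Hypothetical reduced permission set and hypothetical importance weights.
selected = [
    "android.permission.SEND_SMS",
    "android.permission.READ_CONTACTS",
    "android.permission.INTERNET",
]
weights = {
    "android.permission.SEND_SMS": 0.6,
    "android.permission.READ_CONTACTS": 0.3,
    "android.permission.INTERNET": 0.1,
}

# Permissions requested by one hypothetical app.
app = {"android.permission.SEND_SMS", "android.permission.INTERNET"}

binary = binary_vector(app, selected)
weighted = weighted_vector(app, selected, weights)
```

Either vector can then be fed to a classifier; the weighted form keeps the same dimensionality while encoding how informative each requested permission is.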
MalDetConv: Automated Behaviour-based Malware Detection Framework Based on Natural Language Processing and Deep Learning Techniques
The popularity of Windows attracts the attention of hackers/cyber-attackers,
making Windows devices the primary target of malware attacks in recent years.
Several sophisticated malware variants and anti-detection methods have been
significantly enhanced and as a result, traditional malware detection
techniques have become less effective. This work presents MalBehavD-V1, a new
behavioural dataset of Windows Application Programming Interface (API) calls
extracted from benign and malware executable files using the dynamic analysis
approach. In addition, we present MalDetConv, a new automated behaviour-based
framework for detecting both existing and zero-day malware attacks. MalDetConv
uses a text processing-based encoder to transform features of API calls into a
suitable format supported by deep learning models. It then uses a hybrid of
convolutional neural network (CNN) and bidirectional gated recurrent unit
(CNN-BiGRU) automatic feature extractor to select high-level features of the
API calls, which are then fed to a fully connected neural network module for
malware classification. MalDetConv also uses an explainable component that
reveals features that contributed to the final classification outcome, helping
the decision-making process for security analysts. The performance of the
proposed framework is evaluated using our MalBehavD-V1 dataset and other
benchmark datasets. The detection results demonstrate the effectiveness of
MalDetConv over the state-of-the-art techniques with detection accuracy of
96.10%, 95.73%, 98.18%, and 99.93% achieved while detecting unseen malware from
MalBehavD-V1, Allan and John, Brazilian, and Ki-D datasets, respectively. The
experimental results show that MalDetConv is highly accurate in detecting both
known and zero-day malware attacks on Windows devices.
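The text-processing encoding step mentioned in the abstract above, which turns API call sequences into a format a deep learning model can consume, might look like the following minimal sketch. It is not MalDetConv's encoder: the function names, the special tokens, the example traces, and the fixed length are assumptions chosen for illustration.

```python
def build_vocab(sequences):
    """Assign an integer id to each distinct API call name.

    Ids 0 and 1 are reserved for padding and unknown calls.
    """
    vocab = {"<pad>": 0, "<unk>": 1}
    for seq in sequences:
        for call in seq:
            vocab.setdefault(call, len(vocab))
    return vocab


def encode(seq, vocab, max_len):
    """Map API call names to ids, truncate to max_len, and right-pad with 0."""
    ids = [vocab.get(call, vocab["<unk>"]) for call in seq][:max_len]
    return ids + [vocab["<pad>"]] * (max_len - len(ids))


# Hypothetical training traces of Windows API calls.
traces = [
    ["LdrLoadDll", "NtCreateFile", "NtWriteFile"],
    ["NtCreateFile", "NtClose"],
]
vocab = build_vocab(traces)

# A new trace containing a call that was not seen when building the vocabulary.
encoded = encode(["NtCreateFile", "NtClose", "RegOpenKeyExW"], vocab, max_len=5)
```

Fixed-length integer sequences of this kind are the usual input to an embedding layer placed in front of convolutional or recurrent layers.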