348 research outputs found
Learning Fast and Slow: PROPEDEUTICA for Real-time Malware Detection
In this paper, we introduce and evaluate PROPEDEUTICA, a novel methodology
and framework for efficient and effective real-time malware detection,
leveraging the best of conventional machine learning (ML) and deep learning
(DL) algorithms. In PROPEDEUTICA, all software processes in the system start
execution subjected to a conventional ML detector for fast classification. If a
piece of software receives a borderline classification, it is subjected to
further analysis via more performance expensive and more accurate DL methods,
via our newly proposed DL algorithm DEEPMALWARE. Further, we introduce delays
to the execution of software subjected to deep learning analysis as a way to
"buy time" for DL analysis and to rate-limit the impact of possible malware in
the system. We evaluated PROPEDEUTICA with a set of 9,115 malware samples and
877 commonly used benign software samples from various categories for the
Windows OS. Our results show that the false positive rate for conventional ML
methods can reach 20%, and for modern DL methods it is usually below 6%.
However, the classification time for DL can be 100X longer than conventional ML
methods. PROPEDEUTICA improved the detection F1-score from 77.54% (conventional
ML method) to 90.25%, and reduced the detection time by 54.86%. Further, the
percentage of software subjected to DL analysis was approximately 40% on
average. Further, the application of delays in software subjected to ML reduced
the detection time by approximately 10%. Finally, we found and discussed a
discrepancy between the detection accuracy offline (analysis after all traces
are collected) and on-the-fly (analysis in tandem with trace collection). Our
insights show that conventional ML and modern DL-based malware detectors in
isolation cannot meet the needs of efficient and effective malware detection:
high accuracy, low false positive rate, and short classification time.Comment: 17 pages, 7 figure
Malicious code detection in android : the role of sequence characteristics and disassembling methods
The acceptance and widespread use of the Android operating system drew the attention of both legitimate developers and malware authors, which resulted in a significant number of benign and malicious applications available on various online markets. Since the signature-based methods fall short for detecting malicious software effectively considering the vast number of applications, machine learning techniques in this field have also become widespread. In this context, stating the acquired
accuracy values in the contingency tables in malware detection studies has become a popular and efficient method and enabled researchers to evaluate their methodologies comparatively. In this study, we wanted to investigate and emphasize the factors that may affect the accuracy values of the models managed by researchers, particularly the disassembly method and the input data characteristics. Firstly, we developed a model that tackles the malware detection problem from a Natural Language Processing (NLP) perspective using Long Short-Term Memory (LSTM). Then, we experimented with different base units (instruction, basic block, method, and class) and representations of source code obtained from three commonly used disassembling tools (JEB, IDA, and Apktool) and examined the results. Our findings exhibit that the disassembly method and different input representations affect the model results. More specifically, the datasets collected by the Apktool achieved better results compared to the other two disassemblers
Deep Learning Models for Detecting Malware Attacks
Malware is one of the most common and severe cyber-attack today. Malware
infects millions of devices and can perform several malicious activities
including mining sensitive data, encrypting data, crippling system performance,
and many more. Hence, malware detection is crucial to protect our computers and
mobile devices from malware attacks. Deep learning (DL) is one of the emerging
and promising technologies for detecting malware. The recent high production of
malware variants against desktop and mobile platforms makes DL algorithms
powerful approaches for building scalable and advanced malware detection models
as they can handle big datasets. This work explores current deep learning
technologies for detecting malware attacks on the Windows, Linux, and Android
platforms. Specifically, we present different categories of DL algorithms,
network optimizers, and regularization methods. Different loss functions,
activation functions, and frameworks for implementing DL models are presented.
We also present feature extraction approaches and a review of recent DL-based
models for detecting malware attacks on the above platforms. Furthermore, this
work presents major research issues on malware detection including future
directions to further advance knowledge and research in this field.Comment: Revised figures 2 and 3, revised title, remove typos page 1
Protecting Android Devices from Malware Attacks: A State-of-the-Art Report of Concepts, Modern Learning Models and Challenges
Advancements in microelectronics have increased the popularity of mobile devices like
cellphones, tablets, e-readers, and PDAs. Android, with its open-source platform, broad device support,
customizability, and integration with the Google ecosystem, has become the leading operating system for
mobile devices. While Android's openness brings benefits, it has downsides like a lack of official support,
fragmentation, complexity, and security risks if not maintained. Malware exploits these vulnerabilities for
unauthorized actions and data theft. To enhance device security, static and dynamic analysis techniques can
be employed. However, current attackers are becoming increasingly sophisticated, and they are employing
packaging, code obfuscation, and encryption techniques to evade detection models. Researchers prefer
flexible artificial intelligence methods, particularly deep learning models, for detecting and classifying
malware on Android systems. In this survey study, a detailed literature review was conducted to investigate
and analyze how deep learning approaches have been applied to malware detection on Android systems. The
study also provides an overview of the Android architecture, datasets used for deep learning-based detection,
and open issues that will be studied in the future
Enhancing cloud security through the integration of deep learning and data mining techniques: A comprehensive review
Cloud computing is crucial in all areas of data storage and online service delivery. It adds various benefits to the conventional storage and sharing system, such as simple access, on-demand storage, scalability, and cost savings. The employment of its rapidly expanding technologies may give several benefits in protecting the Internet of Things (IoT) and physical cyber systems (CPS) from various cyber threats, with IoT and CPS providing facilities for people in their everyday lives. Because malware (malware) is on the rise and there is no well-known strategy for malware detection, leveraging the cloud environment to identify malware might be a viable way forward. To avoid detection, a new kind of malware employs complex jamming and packing methods. Because of this, it is very hard to identify sophisticated malware using typical detection methods. The article presents a detailed assessment of cloud-based malware detection technologies, as well as insight into understanding the cloud's use in protecting the Internet of Things and critical infrastructure from intrusions. This study examines the benefits and drawbacks of cloud environments in malware detection, as well as presents a methodology for detecting cloud-based malware using deep learning and data extraction and highlights new research on the issues of propagating existing malware. Finally, similarities and variations across detection approaches will be exposed, as well as detection technique flaws. The findings of this work may be utilized to highlight the current issue being tackled in malware research in the future
Visualizing the outcome of dynamic analysis of Android malware with VizMal
Malware detection techniques based on signature extraction require security analysts to manually inspect samples to find evidences of malicious behavior. This time-consuming task received little attention by researchers and practitioners, as most of the effort is on the identification as malware or non-malware of an entire sample. There are no tools for supporting the analyst in identifying when the malicious behavior occurs, given a sample. In this paper we propose VizMal, a tool able to visualize the execution traces of Android applications and to highlight which portions of the traces correspond to a potentially malicious behavior. The aim of VizMal is twofold: assisting the malware analyst during the inspection of an application and pushing the research community to organize and focus its effort on the malicious behavior localization. VizMal is able to discern if the application behavior during each second of execution are legitimate or malicious and to show this information in a simple and understandable way. We validate VizMal experimentally and by means of a user study: the results are promising and confirm that the tool can be useful
Generic Black-Box End-to-End Attack Against State of the Art API Call Based Malware Classifiers
In this paper, we present a black-box attack against API call based machine
learning malware classifiers, focusing on generating adversarial sequences
combining API calls and static features (e.g., printable strings) that will be
misclassified by the classifier without affecting the malware functionality. We
show that this attack is effective against many classifiers due to the
transferability principle between RNN variants, feed forward DNNs, and
traditional machine learning classifiers such as SVM. We also implement GADGET,
a software framework to convert any malware binary to a binary undetected by
malware classifiers, using the proposed attack, without access to the malware
source code.Comment: Accepted as a conference paper at RAID 201
- …