222 research outputs found
Using HTML5 to Prevent Detection of Drive-by-Download Web Malware
The web is experiencing an explosive growth in the last years. New
technologies are introduced at a very fast-pace with the aim of narrowing the
gap between web-based applications and traditional desktop applications. The
results are web applications that look and feel almost like desktop
applications while retaining the advantages of being originated from the web.
However, these advancements come at a price. The same technologies used to
build responsive, pleasant and fully-featured web applications, can also be
used to write web malware able to escape detection systems. In this article we
present new obfuscation techniques, based on some of the features of the
upcoming HTML5 standard, which can be used to deceive malware detection
systems. The proposed techniques have been experimented on a reference set of
obfuscated malware. Our results show that the malware rewritten using our
obfuscation techniques go undetected while being analyzed by a large number of
detection systems. The same detection systems were able to correctly identify
the same malware in its original unobfuscated form. We also provide some hints
about how the existing malware detection systems can be modified in order to
cope with these new techniques.Comment: This is the pre-peer reviewed version of the article: \emph{Using
HTML5 to Prevent Detection of Drive-by-Download Web Malware}, which has been
published in final form at \url{http://dx.doi.org/10.1002/sec.1077}. This
article may be used for non-commercial purposes in accordance with Wiley
Terms and Conditions for Self-Archivin
Machine Learning Interpretability in Malware Detection
The ever increasing processing power of modern computers, as well as the increased availability of large and complex data sets, has led to an explosion in machine learning research. This has led to increasingly complex machine learning algorithms, such as Convolutional Neural Networks, with increasingly complex applications, such as malware detection. Recently, malware authors have become increasingly successful in bypassing traditional malware detection methods, partly due to advanced evasion techniques such as obfuscation and server-side polymorphism. Further, new programming paradigms such as fileless malware, that is malware that exist only in the main memory (RAM) of the infected host, add to the challenges faced with modern day malware detection. This has led security specialists to turn to machine learning to augment their malware detection systems. However, with this new technology comes new challenges. One of these challenges is the need for interpretability in machine learning. Machine learning interpretability is the process of giving explanations of a machine learning model\u27s predictions to humans. Rather than trying to understand everything that is learnt by the model, it is an attempt to find intuitive explanations which are simple enough and provide relevant information for downstream tasks. Cybersecurity analysts always prefer interpretable solutions because of the need to fine tune these solutions. If malware analysts can\u27t interpret the reason behind a misclassification, they will not accept the non-interpretable or black box detector. In this thesis, we provide an overview of machine learning and discuss its roll in cyber security, the challenges it faces, and potential improvements to current approaches in the literature. We showcase its necessity as a result of new computing paradigms by implementing a proof of concept fileless malware with JavaScript. We then present techniques for interpreting machine learning based detectors which leverage n-gram analysis and put forward a novel and fully interpretable approach for malware detection which uses convolutional neural networks. We also define a novel approach for evaluating the robustness of a machine learning based detector
Security of data science and data science for security
In this chapter, we present a brief overview of important topics regarding the connection of data science and security. In the first part, we focus on the security of data science and discuss a selection of security aspects that data scientists should consider to make their services and products more secure. In the second part about security for data science, we switch sides and present some applications where data science plays a critical role in pushing the state-of-the-art in securing information systems. This includes a detailed look at the potential and challenges of applying machine learning to the problem of detecting obfuscated JavaScripts
Novel Attacks and Defenses in the Userland of Android
In the last decade, mobile devices have spread rapidly, becoming more and more part of our everyday lives; this is due to their feature-richness, mobility, and affordable price. At the time of writing, Android is the leader of the market among operating systems, with a share of 76% and two and a half billion active Android devices around the world. Given that such small devices contain a massive amount of our private and sensitive information, the economic interests in the mobile ecosystem skyrocketed. For this reason, not only legitimate apps running on mobile environments have increased dramatically, but also malicious apps have also been on a steady rise. On the one hand, developers of mobile operating systems learned from security mistakes of the past, and they made significant strides in blocking those threats right from the start. On the other hand, these high-security levels did not deter attackers. In this thesis, I present my research contribution about the most meaningful attack and defense scenarios in the userland of the modern Android operating system. I have emphasized "userland'' because attack and defense solutions presented in this thesis are executing in the userspace of the operating system, due to the fact that Android is slightly different from traditional operating systems. After the necessary technical background, I show my solution, RmPerm, in order to enable Android users to better protect their privacy by selectively removing permissions from any app on any Android version. This operation does not require any modification to the underlying operating system because we repack the original application. Then, using again repackaging, I have developed Obfuscapk; it is a black-box obfuscation tool that can work with every Android app and offers a free solution with advanced state of the art obfuscation techniques -- especially the ones used by malware authors. Subsequently, I present a machine learning-based technique that focuses on the identification of malware in resource-constrained devices such as Android smartphones. This technique has a very low resource footprint and does not rely on resources outside the protected device. Afterward, I show how it is possible to mount a phishing attack -- the historically preferred attack vector -- by exploiting two recent Android features, initially introduced in the name of convenience. Although a technical solution to this problem certainly exists, it is not solvable from a single entity, and there is the need for a push from the entire community. But sometimes, even though there exists a solution to a well-known vulnerability, developers do not take proper precautions. In the end, I discuss the Frame Confusion vulnerability; it is often present in hybrid apps, and it was discovered some years ago, but I show how it is still widespread. I proposed a methodology, implemented in the FCDroid tool, for systematically detecting the Frame Confusion vulnerability in hybrid Android apps. The results of an extensive analysis carried out through FCDroid on a set of the most downloaded apps from the Google Play Store prove that 6.63% (i.e., 1637/24675) of hybrid apps are potentially vulnerable to Frame Confusion. The impact of such results on the Android users' community is estimated in 250.000.000 installations of vulnerable apps
DeltaPhish: Detecting Phishing Webpages in Compromised Websites
The large-scale deployment of modern phishing attacks relies on the automatic
exploitation of vulnerable websites in the wild, to maximize profit while
hindering attack traceability, detection and blacklisting. To the best of our
knowledge, this is the first work that specifically leverages this adversarial
behavior for detection purposes. We show that phishing webpages can be
accurately detected by highlighting HTML code and visual differences with
respect to other (legitimate) pages hosted within a compromised website. Our
system, named DeltaPhish, can be installed as part of a web application
firewall, to detect the presence of anomalous content on a website after
compromise, and eventually prevent access to it. DeltaPhish is also robust
against adversarial attempts in which the HTML code of the phishing page is
carefully manipulated to evade detection. We empirically evaluate it on more
than 5,500 webpages collected in the wild from compromised websites, showing
that it is capable of detecting more than 99% of phishing webpages, while only
misclassifying less than 1% of legitimate pages. We further show that the
detection rate remains higher than 70% even under very sophisticated attacks
carefully designed to evade our system.Comment: Preprint version of the work accepted at ESORICS 201
Software for malicious macro detection
The objective of this work is to give a detailed study of the development process of a software tool for the detection of the Emotet virus in Microsoft Office files, Emotet is a virus that has been wreaking havoc mainly in the business environment, from its beginnings as a banking Trojan to nowadays. In fact, this polymorphic family has managed to generate evident, incalculable and global inconveniences in the business activity without discriminating by corporate typology, affecting any company regardless of its size or sector, even entering into government agencies, as well as the citizens themselves as a whole. The existence of two main obstacles for the detection of this virus, constitute an intrinsic reality to it, on the one hand, the obfuscation in its macros and on the other, its polymorphism, are essential pieces of the analysis, focusing our tool in facing precisely two obstacles, descending to the analysis of the macros features and the creation of a neuron network that uses machine learning to recognize the detection patterns and deliberate its malicious nature. With Emotet's in-depth nature analysis, our goal is to draw out a set of features from the malicious macros and build a machine learning model for their detection. After the feasibility study of this project, its design and implementation, the results that emerge endorse the intention to detect Emotet starting only from the static analysis and with the application of machine learning techniques. The detection ratios shown by the tests performed on the final model, present a accuracy of 84% and only 3% of false positives during this detection process.Grado en IngenierĂa Informátic
Recommended from our members
A universal malicious documents static detection framework based on feature generalization
In this study, Portable Document Format (PDF), Word, Excel, Rich Test format (RTF) and image documents are taken as the research objects to study a static and fast method by which to detect malicious documents. Malicious PDF and Word document features are abstracted and extended, which can be used to detect other types of documents. A universal static detection framework for malicious documents based on feature generalization is then proposed. The generalized features include specification check errors, the structure path, code keywords, and the number of objects. The proposed method is verified on two datasets, and is compared with Kaspersky, NOD32, and McAfee antivirus software. The experimental results demonstrate that the proposed method achieves good performance in terms of the detection accuracy, runtime, and scalability. The average F1-score of all types of documents is found to be 0.99, and the average detection time of a document is 0.5926 s, which is at the same level as the compared antivirus software.</jats:p
- …