14 research outputs found

    A survey on current malicious javascript behavior of infected web content in detection of malicious web pages

    In recent years, the rapid growth of cybercrime has become an urgent issue for security authorities. Improvements in web technologies enable attackers to launch web-based attacks and distribute other malicious code easily, without prior expert knowledge. Recently, JavaScript has become the most common attack construction language, as it is the primary browser scripting language and allows developers to build sophisticated client-side interfaces for web applications. This has led to the growth of malicious websites as the main platform for distributing malware or malicious scripts to the user's computer when the user accesses these web pages. Early and timely detection of such threats is vital in order to reduce the damage, which costs billions of dollars every year. A number of approaches have been proposed to detect malicious web pages. However, previous approaches to detecting malicious web pages have generated many false alarms, because sophisticated obfuscation techniques are also used in benign JavaScript code in web pages. Therefore, this paper provides a thorough survey and a detailed understanding of malicious JavaScript code features collected from web content. We conduct a systematic analysis of the usage of different JavaScript features and JavaScript detection techniques and present the most important features of malicious threats in web pages. The analysis is then presented along different dimensions (feature representation, detection technique analysis, and samples of malicious script).

    Using HTML5 to Prevent Detection of Drive-by-Download Web Malware

    The web has experienced explosive growth in recent years. New technologies are introduced at a very fast pace with the aim of narrowing the gap between web-based applications and traditional desktop applications. The results are web applications that look and feel almost like desktop applications while retaining the advantages of originating from the web. However, these advancements come at a price. The same technologies used to build responsive, pleasant and fully-featured web applications can also be used to write web malware able to escape detection systems. In this article we present new obfuscation techniques, based on some of the features of the upcoming HTML5 standard, which can be used to deceive malware detection systems. The proposed techniques have been tested on a reference set of obfuscated malware. Our results show that malware rewritten using our obfuscation techniques goes undetected when analyzed by a large number of detection systems. The same detection systems were able to correctly identify the same malware in its original unobfuscated form. We also provide some hints about how existing malware detection systems can be modified in order to cope with these new techniques.
    Comment: This is the pre-peer reviewed version of the article: Using HTML5 to Prevent Detection of Drive-by-Download Web Malware, which has been published in final form at http://dx.doi.org/10.1002/sec.1077. This article may be used for non-commercial purposes in accordance with Wiley Terms and Conditions for Self-Archiving.

    Classification of cross site scripting web pages using machine learning techniques

    There are many web application threats, such as SQL injection and Cross Site Scripting. According to the OWASP 2013 security report, Cross Site Scripting came in third place. Cross Site Scripting is an attack that targets web applications which lack security countermeasures against untrusted data provided by the user; the attack takes advantage of these web applications because they do not apply any input validation or output sanitization methods. A few previous works have used machine learning to detect cross site scripting attacks by classifying web pages into two classes: malicious or benign. These works used too many features, many of which are irrelevant or noisy because they contribute little to accuracy; such features add complexity and decrease the performance of the classification process. They also used URL features, which are considered unnecessary: the URL is the entry point of the attack but cannot activate it, since all the different kinds of cross site scripting are activated and run inside the HTML source code. In this study, we focus on how to implement feature selection through Information Gain (IG) to select the most significant features, leading to better performance and less execution time. The selected features are used to classify the datasets with three different classifiers to test the performance of these features. The features used in this study were used by previous works; however, with IG feature selection, we selected 14 features as the most significant, and the accuracy obtained using these features was 95.78%, compared to 93.11% when using all features. The recall also improved from 88% when all features were used to 92.33% when using only the 14 selected features.
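
    The Information Gain ranking mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the feature names (has_script_tag, page_length_bucket), the toy labels, and the helper select_top_k are hypothetical, and features are assumed to be discrete.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def information_gain(feature_values, labels):
    """IG = H(labels) - sum over v of P(feature = v) * H(labels | feature = v)."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def select_top_k(features, labels, k):
    """Return the names of the k features with the highest information gain."""
    scores = {name: information_gain(vals, labels) for name, vals in features.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]


# Hypothetical toy data: four pages, two candidate features.
labels = ["malicious", "malicious", "benign", "benign"]
features = {
    "has_script_tag": [1, 1, 0, 0],      # separates the classes perfectly
    "page_length_bucket": [0, 1, 0, 1],  # carries no information about the class
}
top_features = select_top_k(features, labels, k=1)
```

    On this toy data the first feature has an information gain of 1 bit and the second of 0 bits, so only has_script_tag is kept. The study applies the same idea at a larger scale and keeps 14 features.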

    Towards a Feature Rich Model for Predicting Spam Emails containing Malicious Attachments and URLs

    Malicious content in spam emails is increasing in the form of attachments and URLs. Malicious attachments and URLs attempt to deliver software that can compromise the security of a computer. These malicious attachments also try to disguise their content to avoid the virus scanners used by most email services to screen for such risks. Malicious URLs add another layer of disguise, where the email content tries to entice the recipient to click on a URL that links to a malicious Web site or downloads a malicious attachment. In this paper, based on two real world data sets, we present our preliminary research on predicting which spam emails are most likely to contain such highly dangerous content. We propose a rich set of features for the content of emails to capture regularities in emails containing malicious content. We show these features can predict malicious attachments with an area under the precision-recall curve (AUC-PR) of up to 95.2%, and up to 68.1% for URLs. Our work can help reduce reliance on virus scanners and URL blacklists, which often do not update as quickly as the malicious content they attempt to identify. Such methods could reduce the many different resources now needed to identify malicious content.
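
    For readers unfamiliar with the AUC-PR metric reported in this abstract, the sketch below computes average precision, one common step-wise estimator of the area under the precision-recall curve. The abstract does not say which estimator the authors used, so this choice is an assumption, and the labels and scores are made-up toy values.

```python
def average_precision(y_true, scores):
    """Step-wise estimate of the area under the precision-recall curve.

    y_true: 1 for malicious, 0 for benign (toy convention assumed here).
    scores: classifier scores, higher means more likely malicious.
    Tied scores are kept in input order; no tie correction is applied.
    """
    total_pos = sum(y_true)
    if total_pos == 0:
        raise ValueError("need at least one positive example")
    # Rank examples from the highest to the lowest score.
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank  # precision at this recall step
    return precision_sum / total_pos


# Hypothetical toy example: two malicious emails ranked 1st and 3rd.
toy_labels = [1, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.1]
toy_ap = average_precision(toy_labels, toy_scores)  # (1/1 + 2/3) / 2 = 5/6
```

    Unlike accuracy, this metric depends only on how well the malicious class is ranked ahead of the benign class, which is why it is often preferred when malicious examples are rare.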

    DETECTION, DIAGNOSIS AND MITIGATION OF MALICIOUS JAVASCRIPT WITH ENRICHED JAVASCRIPT EXECUTIONS

    Malicious JavaScript has become an important attack vector for software exploitation attacks and poses a severe threat to computer security. In particular, three major classes of problems (malware detection, exploit diagnosis, and exploit mitigation) bring considerable challenges to security researchers. Although considerable research effort has been devoted to these threats, existing approaches have fundamental limitations and thus cannot solve the problems. Existing analysis techniques fall into two general categories: static analysis and dynamic analysis. Static analysis tends to produce inaccurate results (both false positives and false negatives) and is vulnerable to a wide range of obfuscation techniques. Thus, dynamic analysis is steadily gaining popularity for exposing the typical features of malicious JavaScript. However, existing dynamic analysis techniques have limitations such as limited code coverage and incomplete environment setup, leaving a broad attack surface for evading detection. Once a zero-day exploit is captured, it is critical to quickly pinpoint the JavaScript statements that uniquely characterize the exploit and the payload location in the exploit. However, current diagnosis techniques are inadequate because they approach the problem either from a JavaScript perspective and fail to account for “implicit” data flow invisible at the JavaScript level, or from a binary execution perspective and fail to present the JavaScript-level view of the exploit. Although software vendors have deployed techniques like ASLR, sandboxing, etc. to mitigate JavaScript exploits, hacking contests (e.g., PWN2OWN, GeekPWN) have demonstrated that the latest software (e.g., Chrome, IE, Edge, Safari) can still be exploited. An ideal JavaScript exploit mitigation solution should be flexible and allow for deployment without requiring code changes.
    To combat malicious JavaScript, this dissertation addresses these problems through enriched executions, which explore arbitrary paths for detection, preserve JS-binary semantics for diagnosis, and perturb memory with chaff code for mitigation. Firstly, JSForce, a forced execution engine for JavaScript, is proposed and developed to improve the detection results of current malicious JavaScript detection techniques. It drives an arbitrary JavaScript snippet to execute along different paths without any input or environment setup. While increasing code coverage, JSForce tolerates invalid object accesses and introduces no runtime errors during execution. Secondly, JScalpel, a system that utilizes context information from the JavaScript level to perform context-aware binary analysis, is presented for JavaScript exploit diagnosis. In essence, it performs JS-Binary analysis to (1) generate a minimized exploit script, which in turn helps to generate a signature for the exploit, and (2) precisely locate the payload within the exploit. It replaces the malicious payload with a friendly payload and generates a PoV for the exploit. Thirdly, ChaffyScript, a vulnerability-agnostic mitigation system, is introduced to block JavaScript exploits by undermining the memory preparation stage. Specifically, given suspicious JavaScript, ChaffyScript rewrites the code to insert memory perturbation code and then generates semantically equivalent code. JavaScript exploits will fail as a result of the unexpected memory states introduced by the memory perturbation code, while benign JavaScript still behaves as expected, since the memory perturbation code does not change the JavaScript’s original semantics.