
    Using Context to Improve Network-based Exploit Kit Detection

    Today, our computers are routinely compromised while performing seemingly innocuous activities like reading articles on trusted websites (e.g., the NY Times). These compromises are perpetrated via complex interactions involving the advertising networks that monetize these sites. Web-based compromises such as exploit kits are similar to any other scam -- the attacker wants to lure an unsuspecting client into a trap to steal private information or resources -- generating tens of millions of dollars annually. Exploit kits are web-based services specifically designed to capitalize on vulnerabilities in unsuspecting client computers in order to install malware without a user's knowledge. Sadly, it only takes a single successful infection to ruin a user's financial life, or lead to corporate breaches that result in millions of dollars of expense and loss of customer trust. Exploit kits use a myriad of techniques to obfuscate each attack instance, making current network-based defenses such as signature-based network intrusion detection systems far less effective than in years past. Dynamic analysis or honeyclient analysis of these exploits plays a key role in identifying new attacks for signature generation, but provides no means of inspecting end-user traffic on the network to identify attacks in real time. As a result, defenses designed to stop such malfeasance often arrive too late or not at all, resulting in high false positive and false negative (error) rates. To deal with these drawbacks, three new detection approaches are presented. To address the high error rates, a new technique for detecting exploit kit interactions on a network is proposed. The technique capitalizes on the fact that an exploit kit leads its potential victim through a process of exploitation by forcing the browser to download multiple web resources from malicious servers.
This process has an inherent structure that can be captured in HTTP traffic and used to significantly reduce error rates. The approach organizes HTTP traffic into tree-like data structures, and, using a scalable index of exploit kit traces as samples, models the detection process as a subtree similarity search problem. The technique is evaluated on 3,800 hours of web traffic on a large enterprise network, and results show that it reduces false positive rates by four orders of magnitude over current state-of-the-art approaches. While utilizing structure can vastly improve detection rates over current approaches, it does not go far enough in helping defenders detect new, previously unseen attacks. As a result, a new framework that applies dynamic honeyclient analysis directly to network traffic at scale is proposed. The framework captures and stores a configurable window of reassembled HTTP objects network-wide, uses lightweight content rendering to establish the chain of requests leading up to a suspicious event, then serves the initial response content back to the honeyclient in an isolated network. The framework is evaluated on a diverse collection of exploit kits as they evolve over a one-year period. The empirical evaluation suggests that the approach offers significant operational value, and that a single honeyclient can support a campus deployment of thousands of users. While the above approaches attempt to detect exploit kits before they have a chance to infect the client, they cannot protect a client that has already been infected. The final technique detects signs of post-infection behavior by intrusions that abuse the domain name system (DNS) to make contact with an attacker. Contemporary detection approaches utilize the structure of a domain name and require hundreds of DNS messages to detect such malware. As a result, these detection mechanisms cannot detect malware in a timely manner and are susceptible to high error rates.
The final technique, based on sequential hypothesis testing, uses the DNS message patterns of a subset of DNS traffic to detect malware in as few as four DNS messages, and with an orders-of-magnitude reduction in error rates. The results of this work can make a significant operational impact on network security analysis, and open several exciting future directions for network security research.
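    The sequential hypothesis test in the final technique can be sketched with Wald's SPRT. The probabilities below (failed-lookup rates for benign vs. infected hosts) and the error-rate targets are illustrative assumptions, not the parameters used in the thesis:

```python
import math

p0, p1 = 0.1, 0.4           # assumed P(failed lookup) for benign vs. infected hosts
alpha, beta = 0.01, 0.01    # tolerated false-positive / false-negative rates
upper = math.log((1 - beta) / alpha)   # crossing it decides "infected"
lower = math.log(beta / (1 - alpha))   # crossing it decides "benign"

def sprt(observations):
    """observations: booleans, True = failed DNS lookup (e.g., NXDOMAIN)."""
    llr = 0.0
    for i, failed in enumerate(observations, 1):
        # accumulate the log-likelihood ratio of the two hypotheses
        llr += math.log(p1 / p0) if failed else math.log((1 - p1) / (1 - p0))
        if llr >= upper:
            return "infected", i
        if llr <= lower:
            return "benign", i
    return "undecided", len(observations)

print(sprt([True, True, True, True]))  # → ('infected', 4)
```

    With these toy parameters, four consecutive failed lookups suffice for a decision, echoing the "as few as four DNS messages" result without reproducing the thesis's actual model.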

    Bulk Analysis of Malicious PDF Documents

    From 2007 onward, the PDF document has proven to be a successful vector for malware infections, making up 80% of all exploits found by Cisco ScanSafe in 2009 [1]. Creating new PDF documents is very easy, and the volume of PDF documents identified as malicious has grown beyond the capability of security researchers to analyze them by hand. The solution proposed by this thesis is to automatically extract features from the PDF documents to group and classify them, so that similar malware may be identified without manual analysis, thus reducing the workload of the malware analyst. These features may also be studied to identify trends within the PDF documents, such as similar exploits or obfuscation techniques. Our results show that the object graph structure of the PDF document is an effective way to create an initial grouping of malicious PDF documents. Finding similarities in PDF documents reveals further information about a data set. In our first case study, we examine the entire data set to identify large groups of similar PDF documents and make conjectures about their origins. In our second case study, we use a PDF document of known origin to find similar PDF documents within a data set. Through the two case studies, we were able to identify 50.3% of our data set with very little manual analysis of the malicious PDF documents.
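    A minimal sketch of the object-graph grouping idea: parse object cross-references out of a hypothetical, uncompressed PDF byte string, then reduce the edge set to a hash usable as a grouping key. Real malicious PDFs need a proper parser; the regex here is only illustrative:

```python
import re, hashlib

def object_graph(pdf_bytes):
    """Edges (src_obj -> referenced_obj) parsed from raw, uncompressed PDF text."""
    text = pdf_bytes.decode("latin-1")
    edges = set()
    for m in re.finditer(r"(\d+) 0 obj(.*?)endobj", text, re.S):
        src, body = int(m.group(1)), m.group(2)
        for ref in re.findall(r"(\d+) 0 R", body):
            edges.add((src, int(ref)))
    return edges

def fingerprint(edges):
    """Order-independent hash of the reference structure, usable as a grouping key."""
    canon = ",".join(f"{a}>{b}" for a, b in sorted(edges))
    return hashlib.md5(canon.encode()).hexdigest()

doc = b"1 0 obj << /Kids [2 0 R 3 0 R] >> endobj 2 0 obj << >> endobj"
edges = object_graph(doc)
print(sorted(edges))  # → [(1, 2), (1, 3)]
```

    Documents with identical reference structures collapse to the same fingerprint, giving the cheap initial grouping the abstract describes.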

    Studying JavaScript Security Through Static Analysis

    As the Internet keeps on growing, so does the interest of malicious actors. While the Internet has become widespread and popular to interconnect billions of people, this interconnectivity also simplifies the spread of malicious software. Specifically, JavaScript has become a popular attack vector, as it enables attackers to stealthily exploit bugs and further vulnerabilities to compromise the security and privacy of Internet users.
In this thesis, we approach these issues by proposing several systems to statically analyze real-world JavaScript code at scale. First, we focus on the detection of malicious JavaScript samples. To this end, we propose two learning-based pipelines, which leverage syntactic, control-flow, and data-flow based features to distinguish benign from malicious inputs. Subsequently, we evaluate the robustness of such static malicious JavaScript detectors in an adversarial setting. For this purpose, we introduce a generic camouflage attack, which consists of rewriting malicious samples to reproduce existing benign syntactic structures. Finally, we consider vulnerable browser extensions. In particular, we abstract an extension's source code at a semantic level, including control, data, and message flows, and pointer analysis, to detect suspicious data flows from and toward an extension's privileged context. Overall, we report on 184 Chrome extensions that attackers could exploit to, e.g., execute arbitrary code in a victim's browser.
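    The syntactic features behind the detection pipelines can be caricatured with token-type n-grams; the thesis's systems build on full ASTs plus control- and data-flow information, so treat the toy token grammar, the bigram choice, and the cosine comparison below as illustrative stand-ins:

```python
import re
from collections import Counter

def token_types(src):
    """Abstract a JavaScript snippet into a sequence of coarse token types."""
    spec = [("STR", r"'[^']*'|\"[^\"]*\""), ("NUM", r"\d+"),
            ("ID", r"[A-Za-z_$][\w$]*"), ("PUNCT", r"[^\s\w]")]
    pattern = "|".join(f"(?P<{name}>{rx})" for name, rx in spec)
    return [m.lastgroup for m in re.finditer(pattern, src)]

def ngram_features(src, n=2):
    toks = token_types(src)
    return Counter(tuple(toks[i:i + n]) for i in range(len(toks) - n + 1))

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a)
    norm = lambda c: sum(v * v for v in c.values()) ** 0.5
    return dot / (norm(a) * norm(b)) if a and b else 0.0

benign = ngram_features("var x = 1; console.log(x);")
sample = ngram_features("var y = 2; console.log(y);")
print(round(cosine(benign, sample), 2))  # → 1.0 (structurally identical)
```

    In practice such feature vectors would feed a learned classifier rather than a raw similarity threshold.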

    Reeling in Big Phish with a Deep MD5 Net

    Phishing continues to grow as phishers discover new exploits and attack vectors for hosting malicious content; the traditional response using takedowns and blacklists does not appear to impede phishers significantly. A handful of law enforcement projects — for example, the FBI's Digital PhishNet and the Internet Crime Complaint Center (ic3.gov) — have demonstrated that they can collect phishing data in substantial volumes, but these collections have not yet resulted in a significant decline in criminal phishing activity. In this paper, a new system is demonstrated for prioritizing investigative resources to help reduce the time and effort expended examining this particular form of online criminal activity. This research presents a means to correlate phishing websites by showing that certain websites are created by the same phishing kit. Such kits contain the content files needed to create the counterfeit website and often contain additional clues to the identity of their creators. A clustering algorithm is presented that uses collected phishing kits to establish clusters of related phishing websites. The ability to correlate websites provides law enforcement and other stakeholders with a means for prioritizing the allocation of limited investigative resources by identifying frequently repeating phishing offenders.
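    The MD5-based correlation can be sketched by hashing each content file of a recovered kit and greedily grouping websites that share a file hash. The site names, file contents, and the single-shared-hash rule are invented for illustration:

```python
import hashlib

def file_hashes(files):
    """MD5 of each component file's content (dict: name -> bytes)."""
    return {hashlib.md5(data).hexdigest() for data in files.values()}

def cluster_sites(sites, min_shared=1):
    """Greedy grouping: a site joins the first cluster sharing >= min_shared hashes."""
    clusters = []
    for name, hashes in sites.items():
        for cluster in clusters:
            if len(cluster["hashes"] & hashes) >= min_shared:
                cluster["sites"].append(name)
                cluster["hashes"] |= hashes
                break
        else:
            clusters.append({"sites": [name], "hashes": set(hashes)})
    return [c["sites"] for c in clusters]

kit_a = file_hashes({"login.php": b"<?php steal(); ?>", "style.css": b"body{}"})
kit_b = file_hashes({"login.php": b"<?php steal(); ?>", "logo.png": b"PNG"})
kit_c = file_hashes({"index.html": b"<html>other</html>"})
print(cluster_sites({"siteA": kit_a, "siteB": kit_b, "siteC": kit_c}))
# → [['siteA', 'siteB'], ['siteC']]
```

    Sites deployed from the same kit share identical component files, so their hash sets overlap and the sites collapse into one cluster.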

    Analyzing and Defending Against Evolving Web Threats

    The browser has evolved from a simple program that displays static web pages into a continuously-changing platform that is shaping the Internet as we know it today. The fierce competition among browser vendors has led to the introduction of a plethora of features in the past few years. At the same time, it remains the de facto way to access the Internet for billions of users. Because of such rapid evolution and wide popularity, the browser has attracted attackers, who pose new threats to unsuspecting Internet surfers. In this dissertation, I present my work on securing the browser against current and emerging threats. First, I discuss my work on honeyclients, which are tools that identify malicious pages that compromise the browser, and how one can evade such systems. Then, I describe a new system that I built, called Revolver, that automatically tracks the evolution of JavaScript and is capable of identifying evasive web-based malware by finding similarities in JavaScript samples with different classifications. Finally, I present Hulk, a system that automatically analyzes and classifies browser extensions.
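    Revolver's core idea, flagging pairs of scripts that are highly similar yet received different classifications, can be caricatured with a token-level similarity measure; the samples, labels, and 0.8 threshold are invented:

```python
import difflib

def similarity(a, b):
    """Token-sequence similarity in [0, 1] via difflib's ratio."""
    return difflib.SequenceMatcher(None, a.split(), b.split()).ratio()

def suspicious_pairs(samples, threshold=0.8):
    """Flag near-identical scripts whose classifications disagree."""
    flagged = []
    items = list(samples.items())
    for i, (n1, (src1, lab1)) in enumerate(items):
        for n2, (src2, lab2) in items[i + 1:]:
            if lab1 != lab2 and similarity(src1, src2) >= threshold:
                flagged.append((n1, n2))
    return flagged

samples = {
    "s1": ("eval ( unescape ( payload ) )", "malicious"),
    "s2": ("eval ( unescape ( payload2 ) )", "benign"),
    "s3": ("console . log ( hello )", "benign"),
}
print(suspicious_pairs(samples))  # → [('s1', 's2')]
```

    A flagged pair suggests a small mutation flipped the classifier's verdict, which is exactly the kind of evasion worth inspecting.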

    Measuring and Disrupting Malware Distribution Networks: An Interdisciplinary Approach

    Malware Delivery Networks (MDNs) are networks of webpages, servers, computers, and computer files that are used by cybercriminals to proliferate malicious software (or malware) onto victim machines. The business of malware delivery is a complex and multifaceted one that has become increasingly profitable over the last few years. Due to the ongoing arms race between cybercriminals and the security community, cybercriminals constantly evolve and streamline their techniques to beat security countermeasures and avoid disruption to their operations, such as security researchers infiltrating their botnet operations, or law enforcement taking down their infrastructure and arresting those involved. So far, the research community has conducted insightful but isolated studies into the different facets of malicious file distribution. Hence, only a limited picture of the malicious file delivery ecosystem has been provided thus far, leaving many questions unanswered. Using a data-driven and interdisciplinary approach, the purpose of this research is twofold: first, to study and measure the malicious file delivery ecosystem, bringing prior research into context, and to understand precisely how these malware operations respond to security and law enforcement intervention; and second, taking into account the overlapping research efforts of the information security and crime science communities toward preventing cybercrime, to identify mitigation strategies and intervention points to disrupt this criminal economy more effectively.

    Revealing Malicious Contents Hidden In The Internet

    In this age of ubiquitous communication, in which we can stay constantly connected with the rest of the world, we have, for the most part, one particular invention to thank: the Internet. But as the popularity of Internet connectivity grows, the Internet has become a very dangerous place where objects of malicious content and intent can be hidden in plain sight. In this dissertation, we investigate different ways to detect and capture these malicious contents hidden on the Internet. First, we propose an automated system that mimics high-risk browsing activities, such as clicking on suspicious online ads, and as a result collects malicious executable files for further analysis and diagnosis. Using our system, we crawled the Internet and collected a considerable number of malicious executables with very limited resources. Malvertising has been one of the major recent threats to cyber security. Malvertisers apply a variety of evasion techniques to evade detection, whereas the ad networks apply inspection techniques to reveal the malicious ads. However, both the malvertiser and the ad network operate under resource and time constraints. In the second part of this dissertation, we propose a game-theoretic approach to formulate the problem of inspecting the malware inserted by malvertisers into the Web-based advertising system. During malware collection, we used the online multi-AV scanning service VirusTotal to scan and analyze the samples, which can only generate an aggregation of antivirus scan reports; a multi-scanner solution is needed that can accurately determine the maliciousness of a given sample. In the third part of this dissertation, we introduce three theoretical models, which enable us to predict the accuracy levels of different combinations of scanners and determine the optimum configuration of a multi-scanner detection system to achieve maximum accuracy. Malicious communication generated by malware can also reveal its presence.
In the case of botnets, their command and control (C&C) communication is a good candidate. Among the widely used C&C protocols, HTTP is becoming the most preferred. However, detecting HTTP-based C&C packets, which constitute a minuscule portion of everyday HTTP traffic, is a formidable task. In the final part of this dissertation, we present an anomaly-detection-based approach to detect HTTP-based C&C traffic using statistical features of client-generated HTTP request packets and DNS server-generated response packets.
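    The anomaly-detection idea can be sketched with a z-score rule over per-request statistical features. The two features below (URL length, header count) and the baseline values are invented stand-ins for the client-request and DNS-response features the dissertation uses:

```python
import statistics

def fit_baseline(samples):
    """Per-feature (mean, stdev) from a sample of benign requests."""
    cols = list(zip(*samples))
    return [(statistics.mean(c), statistics.stdev(c)) for c in cols]

def is_anomalous(baseline, features, z_max=3.0):
    """Flag a request if any feature lies more than z_max stdevs from the mean."""
    return any(abs(x - mu) / sigma > z_max
               for x, (mu, sigma) in zip(features, baseline) if sigma > 0)

benign = [(40, 8), (45, 9), (38, 8), (50, 10), (42, 9)]  # (url_len, n_headers)
model = fit_baseline(benign)
print(is_anomalous(model, (43, 9)))    # typical request → False
print(is_anomalous(model, (300, 2)))   # beacon-like outlier → True
```

    Because C&C beacons are a tiny fraction of traffic, the baseline is fit on ordinary requests and only the rare outliers are escalated.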

    A taxonomy of phishing research

    Phishing is a widespread threat that has attracted much attention from the security community. A significant amount of research has focused on designing automated mitigation techniques. However, these techniques have largely proven successful only at catching previously witnessed phishing campaigns. Characteristics of phishing emails and web pages have been thoroughly analyzed, but not enough emphasis has been placed on exploring alternate attack vectors. Novel education approaches have been shown to be effective at teaching users to recognize phishing attacks and are adaptable to other kinds of threats. In this thesis, we explore a large body of existing literature on phishing and present a comprehensive taxonomy of the current state of phishing research. With our extensive literature review, we illuminate both areas of phishing research we believe will prove fruitful and areas that seem to be oversaturated.

    Applications in security and evasions in machine learning: a survey

    In recent years, machine learning (ML) has become an important means of providing security and privacy in various applications. ML is used to address serious issues such as real-time attack detection, data-leakage vulnerability assessment, and many more. ML extensively supports the demanding requirements of the current security and privacy landscape across a range of areas such as real-time decision-making, big data processing, reduced cycle time for learning, cost-efficiency, and error-free processing. Therefore, in this paper, we review state-of-the-art approaches where ML can be applied effectively to fulfill current real-world security requirements. We examine different security applications where ML models play an essential role and compare their accuracy results along several dimensions. Analyzing ML algorithms in security applications provides a blueprint for an interdisciplinary research area. Even with the use of current sophisticated technology and tools, attackers can evade ML models by mounting adversarial attacks. Therefore, the need arises to assess the vulnerability of ML models to adversarial attacks at development time. As a supplement to this point, we also analyze the different types of adversarial attacks on ML models. To properly visualize the security properties, we present the threat model and defense strategies against adversarial attack methods. Moreover, we categorize adversarial attacks by the attacker's knowledge of the model and identify the points in the ML pipeline at which attacks may be mounted. Finally, we also investigate different properties of adversarial attacks.
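    For a linear classifier, the evasion problem discussed above has a particularly transparent form: perturb a malicious sample against the weight vector until it crosses the decision boundary. This is a toy analogue of gradient-based adversarial attacks; the weights, bias, and sample are invented:

```python
def score(w, b, x):
    """Linear decision function: positive means 'malicious'."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def evade(w, b, x, step=0.1, max_iter=100):
    """Nudge features along -w (steepest descent on the score) until benign."""
    x = list(x)
    for _ in range(max_iter):
        if score(w, b, x) <= 0:
            break
        for i, wi in enumerate(w):
            x[i] -= step * wi
    return x

w, b = [1.0, 2.0, -0.5], -1.0      # hypothetical model
sample = [2.0, 1.5, 0.0]           # initially classified malicious (score = 4.0)
adv = evade(w, b, sample)
print(score(w, b, sample) > 0, score(w, b, adv) <= 0)  # → True True
```

    Defenses surveyed in the paper aim to make exactly this kind of boundary-crossing perturbation either detectable or expensive for the attacker.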

    On the evolution of digital evidence: novel approaches for cyber investigation

    2012-2013
    Nowadays the Internet is the fulcrum of our world, and the World Wide Web is the key to accessing it. We develop relationships on social networks and entrust sensitive documents to online services. Desktop applications are being replaced by fully-fledged web applications that can be accessed from any device. This is possible thanks to new web technologies that are being introduced at a very fast pace. However, these advances come at a price. Today, the web is the principal means used by cyber-criminals to perform attacks against people and organizations. In a context where information is extremely dynamic and volatile, the fight against cyber-crime is becoming more and more difficult. This work is divided into two main parts, both aimed at fueling research against cybercrime. The first part takes a forensic perspective and exposes serious limitations of current investigation approaches when dealing with modern digital information. In particular, it shows how it is possible to leverage common Internet services in order to forge digital evidence, which can be exploited by a cyber-criminal to claim an alibi. A novel technique to track cyber-criminal activities on the Internet is then proposed, aimed at the acquisition and analysis of information from highly dynamic services such as online social networks. The second part is concerned with the investigation of criminal activities on the web. Aiming to raise awareness of upcoming threats, novel techniques for the obfuscation of web-based attacks are presented. These attacks leverage the same cutting-edge technology used nowadays to build pleasant and fully-featured web applications. Finally, a comprehensive study of today's top menaces on the web, namely exploit kits, is presented. The result of this study has been the design of new techniques and tools that can be employed by modern honeyclients to better identify and analyze these menaces in the wild. [edited by author]