Search CORE

42 research outputs found

Using Malware Analysis to Evaluate Botnet Resilience

Author: Rossow C.
Publication venue: Amsterdam: Vrije Universiteit
Publication date: 01/01/2013
Field of study

Bos, H.J. [Promotor]Steen, M.R. van [Promotor

VU Research Portal

Machine Learning and Security of Non-Executable Files

Author: Šrndić Nedim
Publication venue: Universität Tübingen
Publication date: 01/01/2017
Field of study

Computer malware is a well-known threat in security which, despite the enormous time and effort invested in fighting it, is today more prevalent than ever. Recent years have brought a surge in one particular type: malware embedded in non-executable file formats, e.g., PDF, SWF and various office file formats. The result has been a massive number of infections, owed primarily to the trust that ordinary computer users have in these file formats. In addition, their feature-richness and implementation complexity have created enormous attack surfaces in widely deployed client software, resulting in regular discoveries of new vulnerabilities. The traditional approach to malware detection – signature matching, heuristics and behavioral profiling – has from its inception been a labor-intensive manual task, always lagging one step behind the attacker. With the exponential growth of computers and networks, malware has become more diverse, wide-spread and adaptive than ever, scaling much faster than the available talent pool of human malware analysts. An automated and scalable approach is needed to fill the gap between automated malware adaptation and manual malware detection, and machine learning is emerging as a viable solution. Its branch called adversarial machine learning studies the security of machine learning algorithms and the special conditions that arise when machine learning is applied for security. This thesis is a study of adversarial machine learning in the context of static detection of malware in non-executable file formats. It evaluates the effectiveness, efficiency and security of machine learning applications in this context. To this end, it introduces 3 data-driven detection methods developed using very large, high quality datasets. PJScan detects malicious PDF files based on lexical properties of embedded JavaScript code and is the fastest method published to date. SL2013 extends its coverage to all PDF files, regardless of JavaScript presence, by analyzing the hierarchical structure of PDF logical building blocks and demonstrates excellent performance in a novel long-term realistic experiment. Finally, Hidost generalizes the hierarchical-structure-based feature set to become the first machine-learning-based malware detector operating on multiple file formats. In a comprehensive experimental evaluation on PDF and SWF, it outperforms other academic methods and commercial antivirus systems in detection effectiveness. Furthermore, the thesis presents a framework for security evaluation of machine learning classifiers in a case study performed on an independent PDF malware detector. The results show that the ability to manipulate a part of the classifier’s feature set allows a malicious adversary to disguise malware so that it appears benign to the classifier with a high success rate. The presented methods are released as open-source software.Schadsoftware ist eine gut bekannte Sicherheitsbedrohung. Trotz der enormen Zeit und des Aufwands die investiert werden, um sie zu beseitigen, ist sie heute weiter verbreitet als je zuvor. In den letzten Jahren kam es zu einem starken Anstieg von Schadsoftware, welche in nicht-ausführbaren Dateiformaten, wie PDF, SWF und diversen Office-Formaten, eingebettet ist. Die Folge war eine massive Anzahl von Infektionen, ermöglicht durch das Vertrauen, das normale Rechnerbenutzer in diese Dateiformate haben. Außerdem hat die Komplexität und Vielseitigkeit dieser Dateiformate große Angriffsflächen in weitverbreiteter Klient-Software verursacht, und neue Sicherheitslücken werden regelmäßig entdeckt. Der traditionelle Ansatz zur Erkennung von Schadsoftware – Mustererkennung, Heuristiken und Verhaltensanalyse – war vom Anfang an eine äußerst mühevolle Handarbeit, immer einen Schritt hinter den Angreifern zurück. Mit dem exponentiellen Wachstum von Rechenleistung und Netzwerkgeschwindigkeit ist Schadsoftware diverser, zahlreicher und schneller-anpassend geworden als je zuvor, doch die Verfügbarkeit von menschlichen Schadsoftware-Analysten kann nicht so schnell skalieren. Ein automatischer und skalierbarer Ansatz ist gefragt, und maschinelles Lernen tritt als eine brauchbare Lösung hervor. Ein Bereich davon, Adversarial Machine Learning, untersucht die Sicherheit von maschinellen Lernverfahren und die besonderen Verhältnisse, die bei der Anwendung von machinellem Lernen für Sicherheit entstehen. Diese Arbeit ist eine Studie von Adversarial Machine Learning im Kontext statischer Schadsoftware-Erkennung in nicht-ausführbaren Dateiformaten. Sie evaluiert die Wirksamkeit, Leistungsfähigkeit und Sicherheit von maschinellem Lernen in diesem Kontext. Zu diesem Zweck stellt sie 3 datengesteuerte Erkennungsmethoden vor, die alle auf sehr großen und diversen Datensätzen entwickelt wurden. PJScan erkennt bösartige PDF-Dateien anhand lexikalischer Eigenschaften von eingebettetem JavaScript-Code und ist die schnellste bisher veröffentliche Methode. SL2013 erweitert die Erkennung auf alle PDF-Dateien, unabhängig davon, ob sie JavaScript enthalten, indem es die hierarchische Struktur von logischen PDF-Bausteinen analysiert. Es zeigt hervorragende Leistung in einem neuen, langfristigen und realistischen Experiment. Schließlich generalisiert Hidost den auf hierarchischen Strukturen basierten Merkmalsraum und wurde zum ersten auf maschinellem Lernen basierten Schadsoftware-Erkennungssystem, das auf mehreren Dateiformaten anwendbar ist. In einer umfassenden experimentellen Evaulierung auf PDF- und SWF-Formaten schlägt es andere akademische Methoden und kommerzielle Antiviren-Lösungen bezüglich Erkennungswirksamkeit. Überdies stellt diese Doktorarbeit ein Framework für Sicherheits-Evaluierung von auf machinellem Lernen basierten Klassifikatoren vor und wendet es in einer Fallstudie auf eine unabhängige akademische Schadsoftware-Erkennungsmethode an. Die Ergebnisse zeigen, dass die Fähigkeit, nur einen Teil von Features, die ein Klasifikator verwendet, zu manipulieren, einem Angreifer ermöglicht, Schadsoftware in Dateien so einzubetten, dass sie von der Erkennungsmethode mit hoher Erfolgsrate als gutartig fehlklassifiziert wird. Die vorgestellten Methoden wurden als Open-Source-Software veröffentlicht

Publikationsserver der Universität Tübingen

A Touch of Evil: High-Assurance Cryptographic Hardware from Untrusted Components

Author: Cerulli Andrea
Cvrcek Dan
Danezis George
Klinec Dusan
Mavroudis Vasilios
Svenda Petr
Publication venue
Publication date: 28/10/2017
Field of study

The semiconductor industry is fully globalized and integrated circuits (ICs) are commonly defined, designed and fabricated in different premises across the world. This reduces production costs, but also exposes ICs to supply chain attacks, where insiders introduce malicious circuitry into the final products. Additionally, despite extensive post-fabrication testing, it is not uncommon for ICs with subtle fabrication errors to make it into production systems. While many systems may be able to tolerate a few byzantine components, this is not the case for cryptographic hardware, storing and computing on confidential data. For this reason, many error and backdoor detection techniques have been proposed over the years. So far all attempts have been either quickly circumvented, or come with unrealistically high manufacturing costs and complexity. This paper proposes Myst, a practical high-assurance architecture, that uses commercial off-the-shelf (COTS) hardware, and provides strong security guarantees, even in the presence of multiple malicious or faulty components. The key idea is to combine protective-redundancy with modern threshold cryptographic techniques to build a system tolerant to hardware trojans and errors. To evaluate our design, we build a Hardware Security Module that provides the highest level of assurance possible with COTS components. Specifically, we employ more than a hundred COTS secure crypto-coprocessors, verified to FIPS140-2 Level 4 tamper-resistance standards, and use them to realize high-confidentiality random number generation, key derivation, public key decryption and signing. Our experiments show a reasonable computational overhead (less than 1% for both Decryption and Signing) and an exponential increase in backdoor-tolerance as more ICs are added

arXiv.org e-Print Archive

UCL Discovery

Active Data Collection Techniques to Understand Online Scammers and Cybercriminals

Author: Park Young Sam
Publication venue
Publication date: 01/01/2016
Field of study

Nigerian scam, also known as advance fee fraud or 419 scam, is a prevalent form of online fraudulent activity that causes financial loss to individuals and businesses. Nigerian scam has evolved from simple non-targeted email messages to more sophisticated scams targeted at users of classifieds, dating and other websites. Even though such scams are observed and reported by users frequently, the community’s understanding of Nigerian scams is limited since the scammers operate “underground”. To better understand the underground Nigerian scam ecosystem and seek effective methods to deter Nigerian scam and cybercrime in general, we conduct a series of active and passive measurement studies. Relying upon the analysis and insight gained from the measurement studies, we make four contributions: (1) we analyze the taxonomy of Nigerian scam and derive long-term trends in scams; (2) we provide an insight on Nigerian scam and cybercrime ecosystems and their underground operation; (3) we propose a payment intervention as a potential deterrent to cybercrime operation in general and evaluate its effectiveness; and (4) we offer active and passive measurement tools and techniques that enable in-depth analysis of cybercrime ecosystems and deterrence on them. We first created and analyze a repository of more than two hundred thousand user-reported scam emails, stretching from 2006 to 2014, from four major scam reporting websites. We select ten most commonly observed scam categories and tag 2,000 scam emails randomly selected from our repository. Based upon the manually tagged dataset, we train a machine learning classifier and cluster all scam emails in the repository. From the clustering result, we find a strong and sustained upward trend for targeted scams and downward trend for non-targeted scams. We then focus on two types of targeted scams: sales scams and rental scams targeted users on Craigslist. We built an automated scam data collection system and gathered large-scale sales scam emails. Using the system we posted honeypot ads on Craigslist and conversed automatically with the scammers. Through the email conversation, the system obtained additional confirmation of likely scam activities and collected additional information such as IP addresses and shipping addresses. Our analysis revealed that around 10 groups were responsible for nearly half of the over 13,000 total scam attempts we received. These groups used IP addresses and shipping addresses in both Nigeria and the U.S. We also crawled rental ads on Craigslist, identified rental scam ads amongst the large number of benign ads and conversed with the potential scammers. Through in-depth analysis of the rental scams, we found seven major scam campaigns employing various operations and monetization methods. We also found that unlike sales scammers, most rental scammers were in the U.S. The large-scale scam data and in-depth analysis provide useful insights on how to design effective deterrence techniques against cybercrime in general. We study underground DDoS-for-hire services, also known as booters, and measure the effectiveness of undermining a payment system of DDoS Services. Our analysis shows that the payment intervention can have the desired effect of limiting cybercriminals’ ability and increasing the risk of accepting payments

Digital Repository at the University of Maryland

Towards secure web browsing on mobile devices

Author: Amrutkar Chaitrali Vijay
Publication venue: Georgia Institute of Technology
Publication date: 08/06/2015
Field of study

The Web is increasingly being accessed by portable, multi-touch wireless devices. Despite the popularity of platform-specific (native) mobile apps, a recent study of smartphone usage shows that more people (81%) browse the Web than use native apps (68%) on their phone. Moreover, many popular native apps such as BBC depend on browser-like components (e.g., Webview) for their functionality. The popularity and prevalence of web browsers on modern mobile phones warrants characterizing existing and emerging threats to mobile web browsing, and building solutions for the same. Although a range of studies have focused on the security of native apps on mobile devices, efforts in characterizing the security of web transactions originating at mobile browsers are limited. This dissertation presents three main contributions: First, we show that porting browsers to mobile platforms leads to new vulnerabilities previously not observed in desktop browsers. The solutions to these vulnerabilities require careful balancing between usability and security and might not always be equivalent to those in desktop browsers. Second, we empirically demonstrate that the combination of reduced screen space and an independent selection of security indicators not only make it difficult for experts to determine the security standing of mobile browsers, but actually make mobile browsing more dangerous for average users as they provide a false sense of security. Finally, we experimentally demonstrate the need for mobile specific techniques to detect malicious webpages. We then design and implement kAYO, the first mobile specific static tool to detect malicious webpages in real-time.Ph.D

Scholarly Materials And Research @ Georgia Tech

Stepping Up the Cybersecurity Game: Protecting Online Services from Malicious Activity

Author: Stringhini Gianluca
Publication venue: eScholarship, University of California
Publication date: 01/01/2014
Field of study

The rise in popularity of online services such as social networks,web-based emails, and blogs has made them a popular platform for attackers.Cybercriminals leverage such services to spread spam, malware, and stealpersonal information from their victims.In a typical cybercriminal operation, miscreants first infect their victims' machines with malicious software and have themjoin a botnet, which is a network of compromised computers. In the second step,the infected machines are often leveraged to connect to legitimate onlineservices and perform malicious activities.As a consequence, online services receive activity from both legitimate and malicious users. However, while legitimate users use these services for thepurposes they were designed for, malicious parties exploit them for theirillegal actions, which are often linked to an economic gain. In this thesis, I showthat the way in which malicious users and legitimate ones interact with Internetservices presents differences. I then develop mitigation techniques thatleverage such differences to detect and block malicious parties that misuseInternet services.As examples of this research approach, I first study the problem of spammingbotnets, which are misused to send hundreds of millions of spam emails tomailservers spread across the globe. I show that botmasters typically split alist of victim email addresses among their bots, and that it is possible toidentify bots belonging to the same botnet by enumerating the mailservers thatare contacted by IP addresses over time. I developed a system, calledBotMagnifier, which learns the set of mailservers contacted by the bots belongingto a certain botnet, and finds more bots belonging to that same botnet.I then study the problem of misused accounts on online social networks. I firstlook at the problem of fake accounts that are set up by cybercriminals to spreadmalicious content. I study the modus operandi of the cybercriminalscontrolling such accounts, and I then develop a system to automatically flag asocial network accounts as fake. I then look at the problem of legitimateaccounts getting compromised by miscreants, and I present COMPA, a system thatlearns the typical habits of social network users and considers messages thatdeviate from the learned behavior as possible compromises. As a last example, I present EvilCohort, a system that detects communities ofonline accounts that are accessed by the same botnet. EvilCohort works byclustering together accounts that are accessed by a common set of IP addresses,and can work on any online service that requires the use of accounts (socialnetworks, web-based emails, blogs, etc.)

Ezid

eScholarship - University of California

Identification and Recognition of Remote-Controlled Malware

Author: Dietrich Christian
Publication venue: Universität Mannheim
Publication date: 01/01/2012
Field of study

This thesis encapsulates research on the detection of botnets. First, we design and implement Sandnet, an observation and monitoring infrastructure to study the botnet phenomenon. Using Sandnet, we evaluate detection approaches based on traffic analysis and rogue visual monetization. Therefore, we identify and recognize botnet C&C channels by help of traffic analysis. To a large degree, our clustering and classification leverage the sequence of message lengths per flow. As a result, our implementation, CoCoSpot, proves to reliably detect active C&C communication of a variety of botnet families, even in face of fully encrypted C&C messages. Furthermore, we found a botnet that uses DNS as carrier protocol for its command and control channel. By help of statistical entropy as well as behavioral features, we design and implement a classifier that detects DNS-based C&C, even in mixed network traffic of benign users. Finally, perceptual clustering of Sandnet screenshots enables us to group malware into rogue visual monetization campaigns and study their monetization properties

MAnnheim DOCument Server

Recommended from our members

Identifying and Preventing Large-scale Internet Abuse

Author: Borgolte Kevin
Publication venue: eScholarship, University of California
Publication date: 01/01/2018
Field of study

The widespread access to the Internet and the ubiquity of web-based services make it easy to communicate and interact globally. Unfortunately, the software and protocols implementing the functionality of these services are often vulnerable to attacks. In turn, an attacker can exploit them to compromise, take over, and abuse the services for her own nefarious purposes. In this dissertation, we aim to better understand such attacks, and we develop methods and algorithms to detect and prevent them, which we evaluate on large-scale datasets.First, we detail Meerkat, a system to detect a visible way in which websites are being compromised, namely website defacements. They can inflict significant harm on the websites’ operators through the loss of sales, the loss in reputation, or because of legal ramifications. Meerkat requires no prior knowledge about the websites’ content or their structure, but only the Uniform Resource Identifier (URI) at which they can be reached. By design, Meerkat mimics how a human analyst decides if a website was defaced when viewing it in a browser, by using computer vision techniques. Thus, it tackles the problem of detecting website defacements through their attention-seeking nature, their goal and purpose, rather than code or data artifacts that they might exhibit. In turn, it is much harder for an attacker to evade our system, as she needs to change her modus operandi. When Meerkat detects a website as defaced, the website can automatically be put into maintenance mode or restored to a known good state.An attacker, however, is not limited to abuse a compromised website in a way that is visible to the website’s visitors. Instead, she can misuse the website to infect its visitors with malicious software (malware). Although malware is well studied, identifying malicious websites remains a major challenge in today’s Internet. Second, we introduce Delta, a novel, purely static analysis approach that extracts change-related features between two versions of the same website, uses machine learning to derive a model of website changes, detects if an introduced change was malicious or benign, identifies the underlying infection vector based on clustering, and generates an identifying signature. Furthermore, due to the way Delta clusters campaigns, it can uncover infection campaigns that leverage specific vulnerable applications as a distribution channel, and it can greatly reduce the human labor necessary to uncover the application responsible for a service’s compromise.Third, we investigate the practicality and impact of domain takeover attacks, which an attacker can similarly abuse to spread misinformation or malware, and we present a defense on how such takeover attacks can be rendered toothless. Specifically, the new elasticity of Internet resources, in particular Internet protocol (IP) addresses in the context of Infrastructure-as-a-Service cloud service providers, combined with previously made protocol assumptions can lead to security issues. In Cloud Strife, we show that this dynamic component paired with recent developments in trust-based ecosystems (e.g., Transport Layer Security (TLS) certificates) creates so far unknown attack vectors. For example, a substantial number of stale domain name system (DNS) records points to readily available IP addresses in clouds, yet, they are still actively attempted to be accessed. Often, these records belong to discontinued services that were previously hosted in the cloud. We demonstrate that it is practical, and time and cost-efficient for attackers to allocate the IP addresses to which stale DNS records point. Further considering the ubiquity of domain validation in trust ecosystems, an attacker can impersonate the service by obtaining and using a valid certificate that is trusted by all major operating systems and browsers, which severely increases the attackers’ capabilities. The attacker can then also exploit residual trust in the domain name for phishing, receiving and sending emails, or possibly distributing code to clients that load remote code from the domain (e.g., loading of native code by mobile apps, or JavaScript libraries by websites). To prevent such attacks, we introduce a new authentication method for trust-based domain validation that mitigates staleness issues without incurring additional certificate requester effort by incorporating existing trust into the validation process.Finally, the analyses of Delta, Meerkat, and Cloud Strife have made use of large-scale measurements to assess our approaches’ impact and viability. Indeed, security research in general has made extensive use of exhaustive Internet-wide scans over the recent years, as they can provide significant insights into the state of security of the Internet (e.g., if classes of devices are behaving maliciously, or if they might be insecure and could turn malicious in an instant). However, the address space of the Internet’s core addressing protocol (Internet Protocol version 4; IPv4) is exhausted, and a migration to its successor (Internet Protocol version 6; IPv6), the only accepted long-term solution, is inevitable. In turn, to better understand the security of devices connected to the Internet, in particular Internet of Things devices, it is imperative to include IPv6 addresses in security evaluations and scans. Unfortunately, it is practically infeasible to iterate through the entire IPv6 address space, as it is 296 times larger than the IPv4 address space. Without enumerating hosts prior to scanning, we will be unable to retain visibility into the overall security of Internet-connected devices in the future, and we will be unable to detect and prevent their abuse or compromise. To mitigate this blind spot, we introduce a novel technique to enumerate part of the IPv6 address space by walking DNSSEC-signed IPv6 reverse zones. We show (i) that enumerating active IPv6 hosts is practical without a preferential network position contrary to common belief, (ii) that the security of active IPv6 hosts is currently still lagging behind the security state of IPv4 hosts, and (iii) that unintended default IPv6 connectivity is a major security issue

eScholarship - University of California

Information Leakage Attacks and Countermeasures

Author: Mavroudis Vasilios
Publication venue: UCL (University College London)
Publication date: 28/01/2022
Field of study

The scientific community has been consistently working on the pervasive problem of information leakage, uncovering numerous attack vectors, and proposing various countermeasures. Despite these efforts, leakage incidents remain prevalent, as the complexity of systems and protocols increases, and sophisticated modeling methods become more accessible to adversaries. This work studies how information leakages manifest in and impact interconnected systems and their users. We first focus on online communications and investigate leakages in the Transport Layer Security protocol (TLS). Using modern machine learning models, we show that an eavesdropping adversary can efficiently exploit meta-information (e.g., packet size) not protected by the TLS’ encryption to launch fingerprinting attacks at an unprecedented scale even under non-optimal conditions. We then turn our attention to ultrasonic communications, and discuss their security shortcomings and how adversaries could exploit them to compromise anonymity network users (even though they aim to offer a greater level of privacy compared to TLS). Following up on these, we delve into physical layer leakages that concern a wide array of (networked) systems such as servers, embedded nodes, Tor relays, and hardware cryptocurrency wallets. We revisit location-based side-channel attacks and develop an exploitation neural network. Our model demonstrates the capabilities of a modern adversary but also presents an inexpensive tool to be used by auditors for detecting such leakages early on during the development cycle. Subsequently, we investigate techniques that further minimize the impact of leakages found in production components. Our proposed system design distributes both the custody of secrets and the cryptographic operation execution across several components, thus making the exploitation of leaks difficult

UCL Discovery