Application of evolutionary machine learning in metamorphic malware analysis and detection
Malware detection and analysis have become key issues in recent years. A particularly dangerous class of malware is metamorphic malware, which can modify its own code and hide malicious instructions within normal program code. Current malware detectors are susceptible to metamorphic malware because they are pre-trained to recognize only anticipated versions of code. However, if detectors could be trained on a larger data set that included potential mutant variants, they could be more accurate. The task of finding new evasive variants is challenging, as a vast number of variants might exist. In this research, a two-phase system is proposed. First, a mutation-only Evolutionary Algorithm (EA) is used to search for a diverse set of new malicious mutants that evade detection by existing detection algorithms. While this is shown to be successful, it requires multiple runs of the algorithm to produce multiple variants, with no explicit guarantee of diversity. To address this, a Quality Diversity (QD) algorithm, MAP-Elites, which traverses a high-dimensional search space seeking the best solution at every point of a low-dimensional feature space, is then developed to return a large and diverse repertoire of solutions in a single run. This method produces a larger and more diverse archive of solutions than the mutation-only EA and sheds light on the properties that allow a sample to evade a suite of existing detection engines. Having created a set of evasive and diverse variants, detectors are then trained using classical classification methods (feature-based and sequence-based models), with results showing that classification of metamorphic malware can be improved by augmenting the training data with the diverse set of evolved variant samples.
The work also uses a pretrained Natural Language Processing (NLP) model in a transfer-learning setting to show improved classification of metamorphic malware, again using the evolved variants as part of the training data.
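The core of the second phase is the MAP-Elites algorithm. A minimal sketch of the algorithm on a toy problem may make the mechanics concrete; the objective, the two behavioural features, and all parameters below are illustrative placeholders and stand in for the paper's actual malware mutation operators and evasion measures.

```python
import random

# Minimal MAP-Elites sketch on a toy problem (NOT the paper's malware domain):
# solutions are real-valued vectors, the fitness is a toy objective, and the
# low-dimensional feature descriptor (here: mean and spread of the vector) is
# discretized into a grid. Each cell keeps only its best ("elite") solution,
# so a single run yields a large, diverse archive.

GRID = 10    # cells per feature dimension
DIM = 8      # solution dimensionality
ITERS = 5000

def fitness(x):
    # Toy objective standing in for "evasiveness": prefer values near 0.5.
    return -sum((v - 0.5) ** 2 for v in x)

def descriptor(x):
    # Two behavioural features mapped into grid cells: mean and spread.
    mean = sum(x) / len(x)
    spread = max(x) - min(x)
    return (min(int(mean * GRID), GRID - 1),
            min(int(spread * GRID), GRID - 1))

def mutate(x, sigma=0.1):
    # Mutation-only variation, mirroring the paper's mutation-only EA.
    return [min(1.0, max(0.0, v + random.gauss(0, sigma))) for v in x]

def map_elites(seed=0):
    random.seed(seed)
    archive = {}  # cell -> (fitness, solution)
    # Initialise with random solutions.
    for _ in range(100):
        x = [random.random() for _ in range(DIM)]
        cell, f = descriptor(x), fitness(x)
        if cell not in archive or f > archive[cell][0]:
            archive[cell] = (f, x)
    # Main loop: pick a random elite, mutate it, compete for the child's cell.
    for _ in range(ITERS):
        _, parent = random.choice(list(archive.values()))
        child = mutate(parent)
        cell, f = descriptor(child), fitness(child)
        if cell not in archive or f > archive[cell][0]:
            archive[cell] = (f, child)
    return archive

archive = map_elites()
print(f"{len(archive)} cells filled out of {GRID * GRID}")
```

The key design point, mirrored from the abstract, is that competition happens per feature-space cell rather than globally, so low-fitness but behaviourally distinct solutions survive and diversity comes for free in one run.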
The Huawei and Snowden Questions
This open access book answers two central questions: firstly, is it at all possible to verify electronic equipment procured from untrusted vendors? Secondly, can I build trust into my products in such a way that I support verification by untrusting customers? In separate chapters the book takes readers through the state of the art in fields of computer science that can shed light on these questions. In a concluding chapter it discusses realistic ways forward. In discussions on cyber security, there is a tacit assumption that the manufacturer of equipment will collaborate with the user of the equipment to stop third-party wrongdoers. The Snowden files and recent deliberations on the use of Chinese equipment in the critical infrastructures of western countries have changed this. The discourse in both cases revolves around what malevolent manufacturers can do to harm their own customers, and the importance of the matter is on par with questions of national security. This book is of great interest to ICT and security professionals who need a clear understanding of the two questions posed in the subtitle, and to decision-makers in industry, national bodies and nation states.
Identifying and Preventing Large-scale Internet Abuse
The widespread access to the Internet and the ubiquity of web-based services make it easy to communicate and interact globally. Unfortunately, the software and protocols implementing the functionality of these services are often vulnerable to attacks. In turn, an attacker can exploit them to compromise, take over, and abuse the services for her own nefarious purposes. In this dissertation, we aim to better understand such attacks, and we develop methods and algorithms to detect and prevent them, which we evaluate on large-scale datasets. First, we detail Meerkat, a system to detect a visible way in which websites are being compromised, namely website defacements. They can inflict significant harm on the websites’ operators through the loss of sales, the loss in reputation, or because of legal ramifications. Meerkat requires no prior knowledge about the websites’ content or their structure, but only the Uniform Resource Identifier (URI) at which they can be reached. By design, Meerkat mimics how a human analyst decides if a website was defaced when viewing it in a browser, by using computer vision techniques. Thus, it tackles the problem of detecting website defacements through their attention-seeking nature, their goal and purpose, rather than code or data artifacts that they might exhibit. In turn, it is much harder for an attacker to evade our system, as she needs to change her modus operandi. When Meerkat detects a website as defaced, the website can automatically be put into maintenance mode or restored to a known good state. An attacker, however, is not limited to abuse a compromised website in a way that is visible to the website’s visitors. Instead, she can misuse the website to infect its visitors with malicious software (malware). Although malware is well studied, identifying malicious websites remains a major challenge in today’s Internet.
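Meerkat itself uses learned computer-vision models over rendered screenshots; the average-hash comparison below is a far simpler, purely illustrative stand-in for the general idea of flagging a page whose rendered appearance changed drastically. The 8x8 "screenshots" are toy grayscale matrices, not real captures.

```python
# Illustrative only: compare a perceptual hash of a baseline screenshot
# against the current one, and flag the page when they diverge sharply
# (a defacement replaces most of the visible page, so the visual change
# is large by design).

def average_hash(pixels):
    """Perceptual hash: one bit per pixel, set if the pixel is above the mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return [1 if p > mean else 0 for p in flat]

def hamming(h1, h2):
    return sum(a != b for a, b in zip(h1, h2))

def looks_defaced(baseline, current, threshold=0.25):
    """Flag the page if the hashes differ in more than `threshold` of bits."""
    h1, h2 = average_hash(baseline), average_hash(current)
    return hamming(h1, h2) / len(h1) > threshold

# Toy data: a uniform light page vs. a high-contrast replacement splash.
blank = [[220 for _ in range(8)] for _ in range(8)]
splash = [[255 if (r + c) % 2 == 0 else 0 for c in range(8)] for r in range(8)]
print(looks_defaced(blank, blank))   # -> False (page unchanged)
print(looks_defaced(blank, splash))  # -> True  (appearance changed drastically)
```

As the abstract notes, keying detection to visual appearance rather than code artifacts forces the attacker to change her modus operandi, since an attention-seeking defacement cannot avoid looking different.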
Second, we introduce Delta, a novel, purely static analysis approach that extracts change-related features between two versions of the same website, uses machine learning to derive a model of website changes, detects if an introduced change was malicious or benign, identifies the underlying infection vector based on clustering, and generates an identifying signature. Furthermore, due to the way Delta clusters campaigns, it can uncover infection campaigns that leverage specific vulnerable applications as a distribution channel, and it can greatly reduce the human labor necessary to uncover the application responsible for a service’s compromise. Third, we investigate the practicality and impact of domain takeover attacks, which an attacker can similarly abuse to spread misinformation or malware, and we present a defense that renders such takeover attacks toothless. Specifically, the new elasticity of Internet resources, in particular Internet protocol (IP) addresses in the context of Infrastructure-as-a-Service cloud service providers, combined with previously made protocol assumptions can lead to security issues. In Cloud Strife, we show that this dynamic component paired with recent developments in trust-based ecosystems (e.g., Transport Layer Security (TLS) certificates) creates so far unknown attack vectors. For example, a substantial number of stale domain name system (DNS) records point to readily available IP addresses in clouds, yet clients still actively attempt to access them. Often, these records belong to discontinued services that were previously hosted in the cloud. We demonstrate that it is practical, and time and cost-efficient, for attackers to allocate the IP addresses to which stale DNS records point.
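The measurement idea behind finding such takeover candidates can be sketched as follows: a DNS record is suspicious if it still points into an IaaS provider's address pool, where anyone can allocate addresses, while the original tenant is gone. The CIDR blocks below are made-up placeholders (RFC 5737 test networks), not real provider ranges, and real detection additionally requires verifying that the address is currently unallocated.

```python
import ipaddress

# Hypothetical provider pools; real measurements would use the providers'
# published address ranges.
CLOUD_POOLS = {
    "example-cloud-a": ["203.0.113.0/24"],   # placeholder (TEST-NET-3)
    "example-cloud-b": ["198.51.100.0/24"],  # placeholder (TEST-NET-2)
}

def cloud_provider_for(ip):
    """Return the provider whose pool contains `ip`, or None."""
    addr = ipaddress.ip_address(ip)
    for provider, cidrs in CLOUD_POOLS.items():
        if any(addr in ipaddress.ip_network(c) for c in cidrs):
            return provider
    return None

def takeover_candidates(dns_records):
    """Keep (name, ip) records whose address sits in a cloud pool."""
    return [(name, ip, cloud_provider_for(ip))
            for name, ip in dns_records
            if cloud_provider_for(ip) is not None]

# Toy stale records, as might be harvested from passive DNS.
records = [
    ("old-service.example.com", "203.0.113.17"),
    ("www.example.org", "192.0.2.10"),  # not in any tracked pool
]
print(takeover_candidates(records))
```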
Further considering the ubiquity of domain validation in trust ecosystems, an attacker can impersonate the service by obtaining and using a valid certificate that is trusted by all major operating systems and browsers, which severely increases the attacker’s capabilities. The attacker can then also exploit residual trust in the domain name for phishing, receiving and sending emails, or possibly distributing code to clients that load remote code from the domain (e.g., loading of native code by mobile apps, or JavaScript libraries by websites). To prevent such attacks, we introduce a new authentication method for trust-based domain validation that mitigates staleness issues without incurring additional certificate requester effort, by incorporating existing trust into the validation process. Finally, the analyses of Delta, Meerkat, and Cloud Strife have made use of large-scale measurements to assess our approaches’ impact and viability. Indeed, security research in general has made extensive use of exhaustive Internet-wide scans over the recent years, as they can provide significant insights into the state of security of the Internet (e.g., if classes of devices are behaving maliciously, or if they might be insecure and could turn malicious in an instant). However, the address space of the Internet’s core addressing protocol (Internet Protocol version 4; IPv4) is exhausted, and a migration to its successor (Internet Protocol version 6; IPv6), the only accepted long-term solution, is inevitable. In turn, to better understand the security of devices connected to the Internet, in particular Internet of Things devices, it is imperative to include IPv6 addresses in security evaluations and scans. Unfortunately, it is practically infeasible to iterate through the entire IPv6 address space, as it is 2^96 times larger than the IPv4 address space.
Without enumerating hosts prior to scanning, we will be unable to retain visibility into the overall security of Internet-connected devices in the future, and we will be unable to detect and prevent their abuse or compromise. To mitigate this blind spot, we introduce a novel technique to enumerate part of the IPv6 address space by walking DNSSEC-signed IPv6 reverse zones. We show (i) that enumerating active IPv6 hosts is practical without a preferential network position, contrary to common belief, (ii) that the security of active IPv6 hosts is currently still lagging behind the security state of IPv4 hosts, and (iii) that unintended default IPv6 connectivity is a major security issue.
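A mechanism that makes walking a DNSSEC-signed reverse zone feasible is the NSEC chain: each NSEC record names the next owner name in the zone's canonical order, so repeatedly following the "next" pointers reveals every name in the zone without guessing addresses. The sketch below mocks the zone as a sorted dict; a real implementation issues DNSSEC-enabled DNS queries (e.g., with dnspython), and the abbreviated nibble-reversed names are illustrative.

```python
# Toy signed reverse zone: owner name -> next owner name (the NSEC "next"
# field). Names are nibble-reversed IPv6 suffixes under ip6.arpa,
# abbreviated here for readability.
ZONE_NSEC = {
    "ip6.arpa.": "1.0.0.ip6.arpa.",
    "1.0.0.ip6.arpa.": "5.0.0.ip6.arpa.",
    "5.0.0.ip6.arpa.": "a.f.f.ip6.arpa.",
    "a.f.f.ip6.arpa.": "ip6.arpa.",  # chain wraps back to the zone apex
}

def walk_nsec_chain(apex, nsec_lookup):
    """Enumerate all owner names by following NSEC 'next' pointers."""
    names, current = [], apex
    while True:
        nxt = nsec_lookup(current)
        if nxt == apex:  # wrapped around: the zone is fully enumerated
            return names
        names.append(nxt)
        current = nxt

found = walk_nsec_chain("ip6.arpa.", ZONE_NSEC.__getitem__)
print(found)
```

Each enumerated PTR owner name maps back to a concrete IPv6 address, which is what turns an infeasible brute-force scan of the address space into a targeted scan of known-active hosts.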
Stinging the Predators: A collection of papers that should never have been published
This ebook collects academic papers and conference abstracts that were meant to be so terrible that nobody in their right mind would publish them. All were submitted to journals and conferences to expose weak or non-existent peer review and other exploitative practices. Each paper has a brief introduction. Short essays round out the collection.
Robotics, AI, and Humanity
This open access book examines how recent advances in artificial intelligence (AI) and robotics have elicited widespread debate over their benefits and drawbacks for humanity. The emergent technologies have implications for, among other fields, medicine and health care, employment, transport, manufacturing, agriculture, and armed conflict. While considerable attention has been devoted to robotics/AI applications in each of these domains, a fuller picture of their connections and the possible consequences for our shared humanity seems needed. This volume covers multidisciplinary research, examines current research frontiers in AI/robotics, and considers likely impacts on societal well-being, human–robot relationships, as well as the opportunities and risks for sustainable development and peace. The attendant ethical and religious dimensions of these technologies are addressed, and implications for regulatory policies on the use and future development of AI/robotics technologies are elaborated.
Electronic Literature as Digital Humanities
This book is available as open access through the Bloomsbury Open programme and is available on www.bloomsburycollections.com. Electronic Literature as Digital Humanities: Contexts, Forms & Practices is a volume of essays that provides a detailed account of born-digital literature by artists and scholars who have contributed to its birth and evolution. Rather than offering a prescriptive definition of electronic literature, this book takes an ontological approach through descriptive exploration, treating electronic literature from the perspective of the digital humanities (DH), that is, as an area of scholarship and practice that exists at the juncture between the literary and the algorithmic. The domain of DH is typically segmented into the two seemingly disparate strands of criticism and building, with scholars either studying the synthesis between cultural expression and screens or using technology to make artifacts in themselves. This book regards electronic literature as fundamentally DH in that it synthesizes these two constituents. Electronic Literature as Digital Humanities provides a context for the development of the field, informed by the forms and practices that have emerged throughout the DH moment, and finally, offers resources for others interested in learning more about electronic literature.