485 research outputs found

    Detection of Early-Stage Enterprise Infection by Mining Large-Scale Log Data

    Get PDF
    Recent years have seen the rise of more sophisticated attacks including advanced persistent threats (APTs) which pose severe risks to organizations and governments by targeting confidential proprietary information. Additionally, new malware strains are appearing at a higher rate than ever before. Since many of these malware are designed to evade existing security products, traditional defenses deployed by most enterprises today, e.g., anti-virus, firewalls, intrusion detection systems, often fail at detecting infections at an early stage. We address the problem of detecting early-stage infection in an enterprise setting by proposing a new framework based on belief propagation inspired from graph theory. Belief propagation can be used either with "seeds" of compromised hosts or malicious domains (provided by the enterprise security operation center -- SOC) or without any seeds. In the latter case we develop a detector of C&C communication particularly tailored to enterprises which can detect a stealthy compromise of only a single host communicating with the C&C server. We demonstrate that our techniques perform well on detecting enterprise infections. We achieve high accuracy with low false detection and false negative rates on two months of anonymized DNS logs released by Los Alamos National Lab (LANL), which include APT infection attacks simulated by LANL domain experts. We also apply our algorithms to 38TB of real-world web proxy logs collected at the border of a large enterprise. Through careful manual investigation in collaboration with the enterprise SOC, we show that our techniques identified hundreds of malicious domains overlooked by state-of-the-art security products

    Graph Mining for Cybersecurity: A Survey

    Full text link
    The explosive growth of cyber attacks nowadays, such as malware, spam, and intrusions, caused severe consequences on society. Securing cyberspace has become an utmost concern for organizations and governments. Traditional Machine Learning (ML) based methods are extensively used in detecting cyber threats, but they hardly model the correlations between real-world cyber entities. In recent years, with the proliferation of graph mining techniques, many researchers investigated these techniques for capturing correlations between cyber entities and achieving high performance. It is imperative to summarize existing graph-based cybersecurity solutions to provide a guide for future studies. Therefore, as a key contribution of this paper, we provide a comprehensive review of graph mining for cybersecurity, including an overview of cybersecurity tasks, the typical graph mining techniques, and the general process of applying them to cybersecurity, as well as various solutions for different cybersecurity tasks. For each task, we probe into relevant methods and highlight the graph types, graph approaches, and task levels in their modeling. Furthermore, we collect open datasets and toolkits for graph-based cybersecurity. Finally, we outlook the potential directions of this field for future research

    Command & Control: Understanding, Denying and Detecting - A review of malware C2 techniques, detection and defences

    Full text link
    In this survey, we first briefly review the current state of cyber attacks, highlighting significant recent changes in how and why such attacks are performed. We then investigate the mechanics of malware command and control (C2) establishment: we provide a comprehensive review of the techniques used by attackers to set up such a channel and to hide its presence from the attacked parties and the security tools they use. We then switch to the defensive side of the problem, and review approaches that have been proposed for the detection and disruption of C2 channels. We also map such techniques to widely-adopted security controls, emphasizing gaps or limitations (and success stories) in current best practices.Comment: Work commissioned by CPNI, available at c2report.org. 38 pages. Listing abstract compressed from version appearing in repor

    CAREER: adaptive intrusion detection systems

    Get PDF
    Issued as final reportNational Science Foundation (U.S.

    An Analysis on Network Flow-Based IoT Botnet Detection Using Weka

    Get PDF
    Botnets pose a significant and growing risk to modern networks. Detection of botnets remains an important area of open research in order to prevent the proliferation of botnets and to mitigate the damage that can be caused by botnets that have already been established. Botnet detection can be broadly categorised into two main categories: signature-based detection and anomaly-based detection. This paper sets out to measure the accuracy, false-positive rate, and false-negative rate of four algorithms that are available in Weka for anomaly-based detection of a dataset of HTTP and IRC botnet data. The algorithms that were selected to detect botnets in the Weka environment are J48, naïve Bayes, random forest, and UltraBoost. The dataset was generated using a realistic network environment by The University of New South Wales, Canberra. The findings showed that botnet behaviours from the selected dataset could be detected by Weka with a high degree of accuracy and low false-positive rate. With all features included, the random forest algorithm was found to achieve the highest accuracy with 96.70%, and the algorithm that attained the lowest false-positive rates was also random forest with 0.008. With a reduced feature set of IP addresses and ports, the random forest algorithm attained the highest accuracy and precision and lowest false-positive rate. With only information regarding packets per second being sent and received, J48 was this time the most accurate with its predictions and attained the highest precision

    Advances in Data Mining Knowledge Discovery and Applications

    Get PDF
    Advances in Data Mining Knowledge Discovery and Applications aims to help data miners, researchers, scholars, and PhD students who wish to apply data mining techniques. The primary contribution of this book is highlighting frontier fields and implementations of the knowledge discovery and data mining. It seems to be same things are repeated again. But in general, same approach and techniques may help us in different fields and expertise areas. This book presents knowledge discovery and data mining applications in two different sections. As known that, data mining covers areas of statistics, machine learning, data management and databases, pattern recognition, artificial intelligence, and other areas. In this book, most of the areas are covered with different data mining applications. The eighteen chapters have been classified in two parts: Knowledge Discovery and Data Mining Applications

    CHID : conditional hybrid intrusion detection system for reducing false positives and resource consumption on malicous datasets

    Get PDF
    Inspecting packets to detect intrusions faces challenges when coping with a high volume of network traffic. Packet-based detection processes every payload on the wire, which degrades the performance of network intrusion detection system (NIDS). This issue requires an introduction of a flow-based NIDS that reduces the amount of data to be processed by examining aggregated information of related packets. However, flow-based detection still suffers from the generation of the false positive alerts due to incomplete data input. This study proposed a Conditional Hybrid Intrusion Detection (CHID) by combining the flow-based with packet-based detection. In addition, it is also aimed to improve the resource consumption of the packet-based detection approach. CHID applied attribute wrapper features evaluation algorithms that marked malicious flows for further analysis by the packet-based detection. Input Framework approach was employed for triggering packet flows between the packetbased and flow-based detections. A controlled testbed experiment was conducted to evaluate the performance of detection mechanism’s CHID using datasets obtained from on different traffic rates. The result of the evaluation showed that CHID gains a significant performance improvement in terms of resource consumption and packet drop rate, compared to the default packet-based detection implementation. At a 200 Mbps, CHID in IRC-bot scenario, can reduce 50.6% of memory usage and decreases 18.1% of the CPU utilization without packets drop. CHID approach can mitigate the false positive rate of flow-based detection and reduce the resource consumption of packet-based detection while preserving detection accuracy. CHID approach can be considered as generic system to be applied for monitoring of intrusion detection systems
    • …
    corecore