    Controlled Data Sharing for Collaborative Predictive Blacklisting

    Although sharing data across organizations is often advocated as a promising way to enhance cybersecurity, collaborative initiatives are rarely put into practice owing to confidentiality, trust, and liability challenges. In this paper, we investigate whether collaborative threat mitigation can be realized via a controlled data sharing approach, whereby organizations make informed decisions as to whether or not, and how much, to share. Using appropriate cryptographic tools, entities can estimate the benefits of collaboration and agree on what to share in a privacy-preserving way, without having to disclose their datasets. We focus on collaborative predictive blacklisting, i.e., forecasting attack sources based on one's logs and those contributed by other organizations. We study the impact of different sharing strategies by experimenting on a real-world dataset of two billion suspicious IP addresses collected from Dshield over two months. We find that controlled data sharing yields up to 105% accuracy improvement on average, while also reducing the false positive rate. Comment: A preliminary version of this paper appears in DIMVA 2015. This is the full version. arXiv admin note: substantial text overlap with arXiv:1403.212

    Privacy-Friendly Collaboration for Cyber Threat Mitigation

    Sharing of security data across organizational boundaries has often been advocated as a promising way to enhance cyber threat mitigation. However, collaborative security faces a number of important challenges, including privacy, trust, and liability concerns with the potential disclosure of sensitive data. In this paper, we focus on data sharing for predictive blacklisting, i.e., forecasting attack sources based on past attack information. We propose a novel privacy-enhanced data sharing approach in which organizations estimate collaboration benefits without disclosing their datasets, organize into coalitions of allied organizations, and securely share data within these coalitions. We study how different partner selection strategies affect prediction accuracy by experimenting on a real-world dataset of 2 billion IP addresses and observe up to a 105% prediction improvement. Comment: This paper has been withdrawn as it has been superseded by arXiv:1502.0533

    On Collaborative Predictive Blacklisting

    Collaborative predictive blacklisting (CPB) allows forecasting future attack sources based on logs and alerts contributed by multiple organizations. However, research on CPB has focused only on increasing the number of predicted attacks and has not considered the impact on false positives and false negatives. Moreover, sharing alerts is often hindered by confidentiality, trust, and liability issues, which motivates the need for privacy-preserving approaches to the problem. In this paper, we present a measurement study of state-of-the-art CPB techniques, aiming to shed light on the actual impact of collaboration. To this end, we reproduce and measure two systems: a non-privacy-friendly one that uses a trusted coordinating party with access to all alerts (Soldo et al., 2010) and a peer-to-peer one using privacy-preserving data sharing (Freudiger et al., 2015). We show that, while collaboration boosts the number of predicted attacks, it also yields high false positives, ultimately leading to poor accuracy. This motivates us to present a hybrid approach, using a semi-trusted central entity, aiming to increase utility from collaboration while limiting information disclosure and false positives. This leads to a better trade-off of true and false positive rates, while at the same time addressing privacy concerns. Comment: A preliminary version of this paper appears in ACM SIGCOMM's Computer Communication Review (Volume 48 Issue 5, October 2018). This is the full version.

    Hardening DGA classifiers utilizing IVAP

    Domain Generation Algorithms (DGAs) are used by malware to generate a deterministic set of domains, usually by utilizing a pseudo-random seed. A malicious botmaster can establish connections between their command-and-control center (C&C) and any malware-infected machines by registering domains that will be DGA-generated given a specific seed, rendering traditional domain blacklisting ineffective. Given the nature of this threat, the real-time detection of DGA domains based on incoming DNS traffic is highly important. The use of neural network machine learning (ML) models for this task has been well studied, but there is still substantial room for improvement. In this paper, we propose to use Inductive Venn-Abers predictors (IVAPs) to calibrate the output of existing ML models for DGA classification. The IVAP is a computationally efficient procedure which consistently improves the predictive accuracy of classifiers at the expense of not offering predictions for a small subset of inputs and consuming an additional amount of training data.

    Predictive Cyber Situational Awareness and Personalized Blacklisting: A Sequential Rule Mining Approach

    Cybersecurity adopts data mining for its ability to extract concealed and indistinct patterns in the data, such as for the needs of alert correlation. Inferring common attack patterns and rules from the alerts helps defenders understand the threat landscape and allows for the realization of cyber situational awareness, including the projection of ongoing attacks. In this paper, we explore the use of data mining, namely sequential rule mining, in the analysis of intrusion detection alerts. We employed a dataset of 12 million alerts from 34 intrusion detection systems in 3 organizations gathered in an alert sharing platform, and processed it using our analytical framework. We mine sequential rules that we use to predict security events, which we utilize to create a predictive blacklist. Thus, the recipients of the data from the sharing platform will receive only a small number of alerts of events that are likely to occur instead of a large number of alerts of past events. The predictive blacklist is only 3 % of the size of the raw data, and more than 60 % of its entries are shown to be successful in performing accurate predictions in operational, real-world settings.

    Big Data Blacklisting

    “Big data blacklisting” is the process of categorizing individuals as administratively “guilty until proven innocent” by virtue of suspicious digital data and database screening results. Database screening and digital watchlisting systems are increasingly used to determine who can work, vote, fly, etc. In a big data world, through the deployment of these big data tools, both substantive and procedural due process protections may be threatened in new and nearly invisible ways. Substantive due process rights safeguard fundamental liberty interests. Procedural due process rights prevent arbitrary deprivations by the government of constitutionally protected interests. This Article frames the increasing digital mediation of rights and privileges through government-led big data programs as a constitutional harm under substantive due process, and identifies the obstruction of core liberties with big data tools as rapidly evolving and systemic. To illustrate the mass scale and unprecedented nature of the big data blacklisting phenomenon, this Article undertakes a significant descriptive burden to introduce and contextualize big data blacklisting programs. Through this descriptive effort, this Article explores how a commonality of big data harms may be associated with nonclassified big data programs, such as the No Work List and No Vote List, programs that the government uses to establish or deny an individual's eligibility for certain benefits or rights through database screening. The big data blacklisting harms of big data tools to make eligibility decisions are not, of course, limited to nonclassified programs. This Article also suggests how the same consequences may be at play with classified and semi-classified big data programs such as the Terrorist Watchlist and No Fly List.
    This Article concludes that big data blacklisting harms interfere with and obstruct fundamental liberty interests in a way that now necessitates an evolution of the existing due process jurisprudence.

    Predictive Methods in Cyber Defense: Current Experience and Research Challenges

    Predictive analysis allows next-generation cyber defense that is more proactive than current approaches based on intrusion detection. In this paper, we discuss various aspects of predictive methods in cyber defense and illustrate them on three examples of recent approaches. The first approach uses data mining to extract frequent attack scenarios and uses them to project ongoing cyberattacks. The second approach uses a dynamic network entity reputation score to predict malicious actors. The third approach uses time series analysis to forecast attack rates in the network. This paper presents a unique evaluation of the three distinct methods in a common environment of an intrusion detection alert sharing platform, which allows for a comparison of the approaches and illustrates the capabilities of predictive analysis for current and future research and cybersecurity operations. Our experiments show that all three methods achieved a sufficient technology readiness level for experimental deployment in an operational setting with promising accuracy and usability. Namely, the prediction and projection methods, despite their differences, are highly usable for predictive blacklisting; the first provides a more detailed output, and the second is more extensible. Network security situation forecasting is lightweight and displays very high accuracy, but does not provide details on predicted events.