11 research outputs found

    Mathematical and Statistical Opportunities in Cyber Security

    Get PDF
    The role of mathematics in a complex system such as the Internet has yet to be deeply explored. In this paper, we summarize some of the important and pressing problems in cyber security from the viewpoint of open science environments. We start by posing the question "What fundamental problems exist within cyber security research that can be helped by advanced mathematics and statistics?" Our first and most important assumption is that access to real-world data is necessary to understand large and complex systems like the Internet. Our second assumption is that many proposed cyber security solutions could critically damage both the openness and the productivity of scientific research. After examining a range of cyber security problems, we come to the conclusion that the field of cyber security poses a rich set of new and exciting research opportunities for the mathematical and statistical sciences

    Mathematical and Statistical Opportunities in Cyber Security

    Full text link

    Controlled Data Sharing for Collaborative Predictive Blacklisting

    Get PDF
    Although sharing data across organizations is often advocated as a promising way to enhance cybersecurity, collaborative initiatives are rarely put into practice owing to confidentiality, trust, and liability challenges. In this paper, we investigate whether collaborative threat mitigation can be realized via a controlled data sharing approach, whereby organizations make informed decisions as to whether or not, and how much, to share. Using appropriate cryptographic tools, entities can estimate the benefits of collaboration and agree on what to share in a privacy-preserving way, without having to disclose their datasets. We focus on collaborative predictive blacklisting, i.e., forecasting attack sources based on one's logs and those contributed by other organizations. We study the impact of different sharing strategies by experimenting on a real-world dataset of two billion suspicious IP addresses collected from Dshield over two months. We find that controlled data sharing yields up to 105% accuracy improvement on average, while also reducing the false positive rate.Comment: A preliminary version of this paper appears in DIMVA 2015. This is the full version. arXiv admin note: substantial text overlap with arXiv:1403.212

    The Challenges of Effectively Anonymizing Network Data

    Get PDF
    The availability of realistic network data plays a significant role in fostering collaboration and ensuring U.S. technical leadership in network security research. Unfortunately, a host of technical, legal, policy, and privacy issues limit the ability of operators to produce datasets for information security testing. In an effort to help overcome these limitations, several data collection efforts (e.g., CRAWDAD[14], PREDICT [34]) have been established in the past few years. The key principle used in all of these efforts to assure low-risk, high-value data is that of trace anonymization—the process of sanitizing data before release so that potentially sensitive information cannot be extracted

    Prediction of Malicious Objects in Computer Network and Defense

    Full text link

    Privacy-Friendly Collaboration for Cyber Threat Mitigation

    Full text link
    Sharing of security data across organizational boundaries has often been advocated as a promising way to enhance cyber threat mitigation. However, collaborative security faces a number of important challenges, including privacy, trust, and liability concerns with the potential disclosure of sensitive data. In this paper, we focus on data sharing for predictive blacklisting, i.e., forecasting attack sources based on past attack information. We propose a novel privacy-enhanced data sharing approach in which organizations estimate collaboration benefits without disclosing their datasets, organize into coalitions of allied organizations, and securely share data within these coalitions. We study how different partner selection strategies affect prediction accuracy by experimenting on a real-world dataset of 2 billion IP addresses and observe up to a 105% prediction improvement.Comment: This paper has been withdrawn as it has been superseded by arXiv:1502.0533

    Cyber-security research by ISPs:A NetFlow and DNS Anonymization Policy

    Get PDF

    Effects of network trace sampling methods on privacy and utility metrics

    Get PDF
    Researchers studying computer networks rely on the availability of traffic trace data collected from live production networks. Those choosing to share trace data with colleagues must first remove or otherwise anonymize sensitive information. This process, called sanitization, represents a tradeoff between the removal of information in the interest of identity protection and the preservation of data within the trace that is most relevant to researchers. While several metrics exist to quantify this privacy-utility tradeoff, they are often computationally expensive. Computing these metrics using a sample of the trace, rather than the entire input trace, could potentially save precious time and space resources, provided the accuracy of these values does not suffer. In this paper, we examine several simple sampling methods to discover their effects on measurement of the privacy-utility tradeoff when anonymizing network traces prior to their sharing or publication. After sanitizing a small sample trace collected from the Dartmouth College wireless network, we tested the relative accuracy of a variety of previously implemented packet and flow-sampling methods on a few existing privacy and utility metrics. This analysis led us to conclude that, for our test trace, no single sampling method we examined allowed us to accurately measure the trade-off, and that some sampling methods can produce grossly inaccurate estimates of those values. We were unable to draw conclusions on the use of packet versus flow sampling in these instances
    corecore