9 research outputs found

    Evaluating Third-Party Bad Neighborhood Blacklists for Spam Detection

    The distribution of malicious hosts over the IP address space is far from uniform. In fact, malicious hosts tend to be concentrated in certain portions of the IP address space, forming the so-called Bad Neighborhoods. This phenomenon has previously been exploited to filter Spam by means of Bad Neighborhood blacklists. In this paper, we evaluate how much a network administrator can rely upon different Bad Neighborhood blacklists generated by third-party sources to fight Spam. One could expect that Bad Neighborhood blacklists generated from different sources contain, to a varying degree, disjoint sets of entries. Therefore, we investigate (i) how specific a blacklist is to its source, and (ii) whether different blacklists can be used interchangeably to protect a target from Spam. We analyze five Bad Neighborhood blacklists generated from real-world measurements and study their effectiveness in protecting three production mail servers from Spam. Our findings lead to several operational considerations on how a network administrator could best benefit from Bad Neighborhood-based Spam filtering.

    Controlled Data Sharing for Collaborative Predictive Blacklisting

    Although sharing data across organizations is often advocated as a promising way to enhance cybersecurity, collaborative initiatives are rarely put into practice owing to confidentiality, trust, and liability challenges. In this paper, we investigate whether collaborative threat mitigation can be realized via a controlled data sharing approach, whereby organizations make informed decisions as to whether or not, and how much, to share. Using appropriate cryptographic tools, entities can estimate the benefits of collaboration and agree on what to share in a privacy-preserving way, without having to disclose their datasets. We focus on collaborative predictive blacklisting, i.e., forecasting attack sources based on one's logs and those contributed by other organizations. We study the impact of different sharing strategies by experimenting on a real-world dataset of two billion suspicious IP addresses collected from Dshield over two months. We find that controlled data sharing yields up to a 105% accuracy improvement on average, while also reducing the false positive rate. Comment: A preliminary version of this paper appears in DIMVA 2015. This is the full version. arXiv admin note: substantial text overlap with arXiv:1403.212

    Privacy-Friendly Collaboration for Cyber Threat Mitigation

    Sharing security data across organizational boundaries has often been advocated as a promising way to enhance cyber threat mitigation. However, collaborative security faces a number of important challenges, including privacy, trust, and liability concerns arising from the potential disclosure of sensitive data. In this paper, we focus on data sharing for predictive blacklisting, i.e., forecasting attack sources based on past attack information. We propose a novel privacy-enhanced data sharing approach in which organizations estimate collaboration benefits without disclosing their datasets, organize into coalitions of allied organizations, and securely share data within these coalitions. We study how different partner selection strategies affect prediction accuracy by experimenting on a real-world dataset of 2 billion IP addresses and observe up to a 105% prediction improvement. Comment: This paper has been withdrawn as it has been superseded by arXiv:1502.0533

    Predictive Cyber Situational Awareness and Personalized Blacklisting: A Sequential Rule Mining Approach

    Cybersecurity adopts data mining for its ability to extract concealed and indistinct patterns in data, for example for alert correlation. Inferring common attack patterns and rules from alerts helps defenders understand the threat landscape and enables cyber situational awareness, including the projection of ongoing attacks. In this paper, we explore the use of data mining, namely sequential rule mining, in the analysis of intrusion detection alerts. We use a dataset of 12 million alerts from 34 intrusion detection systems in 3 organizations, gathered in an alert sharing platform, and process it using our analytical framework. We mine sequential rules and use them to predict security events, from which we create a predictive blacklist. Thus, the recipients of the data from the sharing platform receive only a small number of alerts about events that are likely to occur instead of a large number of alerts about past events. The predictive blacklist is only 3% of the size of the raw data, and more than 60% of its entries are shown to yield accurate predictions in operational, real-world settings.

    Analysis of malicious input issues on intelligent systems

    Intelligent systems can facilitate decision making and have been widely applied in various domains. The output of intelligent systems relies on the users' input. However, with the development of web-based interfaces, users can easily provide dishonest input, which affects the accuracy of the generated decision. This dissertation presents three essays that discuss defenses against malicious input in three types of intelligent systems: expert systems, recommender systems, and rating systems. Different methods are proposed in each domain based on the nature of each problem. The first essay addresses the input distortion issue in expert systems. It develops four methods to distinguish liars from truth-tellers, and redesigns the expert systems to control the impact of input distortion by liars. Experimental results show that the proposed methods can lead to better accuracy or lower misclassification cost. The second essay addresses the shilling attack issue in recommender systems. It proposes an integrated Value-based Neighbor Selection (VNS) approach, which aims to select proper neighbors for recommender systems so as to maximize the e-retailer's profit while protecting the system from shilling attacks. Simulations are conducted to demonstrate the effectiveness of the proposed method. The third essay addresses the rating fraud issue in rating systems. It designs a two-phase procedure for rating fraud detection based on temporal analysis of the rating series. Experiments based on real-world data are used to evaluate the effectiveness of the proposed method.

    Super-Scoring? Datengetriebene Sozialtechnologien als neue Bildungsherausforderung (Super-Scoring? Data-Driven Social Technologies as a New Educational Challenge)

    In so-called "scoring", algorithmic procedures assign a numerical value to a person in order to evaluate and influence their behavior. "Super-scoring" practices go further and merge point systems and scales from different areas of life, such as creditworthiness, health behavior, or learning performance. These procedures could develop into a new, overarching governance principle in the digital society. A particularly prominent example is the Social Credit System in China. But scoring practices and digital sociometrics are also gaining importance in Western societies. This open access volume presents current examples of data-driven social steering processes from various countries, discusses their normative foundations and sociopolitical effects, and offers initial education policy recommendations. What is the current state of the relevant practices in China and in Western societies? How should the individual and social consequences be assessed? How is the image of the human being changing, and how should political and enlightenment-oriented education already be responding to it today?