Search CORE

115,152 research outputs found

Assessing Data Usefulness for Failure Analysis in Anonymized System Logs

Author: Ciorba Florina M.
Ghiasvand Siavash
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 01/01/2018

System logs are a valuable source of information for the analysis and understanding of systems behavior for the purpose of improving their performance. Such logs contain various types of information, including sensitive information. Information deemed sensitive can either directly be extracted from system log entries by correlation of several log entries, or can be inferred from the combination of the (non-sensitive) information contained within system logs with other logs and/or additional datasets. The analysis of system logs containing sensitive information compromises data privacy. Therefore, various anonymization techniques, such as generalization and suppression have been employed, over the years, by data and computing centers to protect the privacy of their users, their data, and the system as a whole. Privacy-preserving data resulting from anonymization via generalization and suppression may lead to significantly decreased data usefulness, thus, hindering the intended analysis for understanding the system behavior. Maintaining a balance between data usefulness and privacy preservation, therefore, remains an open and important challenge. Irreversible encoding of system logs using collision-resistant hashing algorithms, such as SHAKE-128, is a novel approach previously introduced by the authors to mitigate data privacy concerns. The present work describes a study of the applicability of the encoding approach from earlier work on the system logs of a production high performance computing system. Moreover, a metric is introduced to assess the data usefulness of the anonymized system logs to detect and identify the failures encountered in the system.

Comment: 11 pages, 3 figures, submitted to 17th IEEE International Symposium on Parallel and Distributed Computin

arXiv.org e-Print Archive

Crossref

edoc

Controlled Data Sharing for Collaborative Predictive Blacklisting

Author: B Applebaum
C Blundo
C Song
D Gusfield
E Cristofaro De
E De Cristofaro
E Kenneally
I Bilogrevic
MJ Freedman
Publication date: 16/04/2015

Although sharing data across organizations is often advocated as a promising way to enhance cybersecurity, collaborative initiatives are rarely put into practice owing to confidentiality, trust, and liability challenges. In this paper, we investigate whether collaborative threat mitigation can be realized via a controlled data sharing approach, whereby organizations make informed decisions as to whether or not, and how much, to share. Using appropriate cryptographic tools, entities can estimate the benefits of collaboration and agree on what to share in a privacy-preserving way, without having to disclose their datasets. We focus on collaborative predictive blacklisting, i.e., forecasting attack sources based on one's logs and those contributed by other organizations. We study the impact of different sharing strategies by experimenting on a real-world dataset of two billion suspicious IP addresses collected from Dshield over two months. We find that controlled data sharing yields up to 105% accuracy improvement on average, while also reducing the false positive rate.

Comment: A preliminary version of this paper appears in DIMVA 2015. This is the full version. arXiv admin note: substantial text overlap with arXiv:1403.212
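The idea of estimating the benefit of collaboration before deciding whether to share can be sketched as follows. The abstract does not say which benefit metric or cryptographic protocol is used, so this sketch makes two loud assumptions: it uses Jaccard similarity of attacker-IP sets as a stand-in metric, and it computes that metric in the clear, whereas the paper relies on cryptographic tools so that datasets are not disclosed. The names `jaccard` and `choose_partners`, the threshold, and the sample addresses are all hypothetical.

```python
from __future__ import annotations


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two sets of attacker IPs."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def choose_partners(own: set[str], others: dict[str, set[str]],
                    threshold: float) -> list[str]:
    """Return the organizations whose estimated benefit meets the threshold.

    Assumption: higher overlap means sharing is more worthwhile. A real
    deployment would obtain this estimate via a privacy-preserving
    protocol rather than by comparing plaintext sets.
    """
    return [name for name, ips in others.items()
            if jaccard(own, ips) >= threshold]


# Invented data using documentation-range addresses.
own = {"192.0.2.1", "192.0.2.2", "192.0.2.3"}
others = {
    "org_b": {"192.0.2.2", "192.0.2.3", "192.0.2.4"},
    "org_c": {"198.51.100.9"},
}

partners = choose_partners(own, others, threshold=0.3)
print(partners)
```

Only the organizations selected this way would then exchange log data, which mirrors the "whether or not, and how much, to share" decision described in the abstract.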

arXiv.org e-Print Archive

Crossref

UCL Discovery

Privacy-Friendly Collaboration for Cyber Threat Mitigation

Author: Brito Alex
De Cristofaro Emiliano
Freudiger Julien
Publication date: 01/03/2017

Sharing of security data across organizational boundaries has often been advocated as a promising way to enhance cyber threat mitigation. However, collaborative security faces a number of important challenges, including privacy, trust, and liability concerns with the potential disclosure of sensitive data. In this paper, we focus on data sharing for predictive blacklisting, i.e., forecasting attack sources based on past attack information. We propose a novel privacy-enhanced data sharing approach in which organizations estimate collaboration benefits without disclosing their datasets, organize into coalitions of allied organizations, and securely share data within these coalitions. We study how different partner selection strategies affect prediction accuracy by experimenting on a real-world dataset of 2 billion IP addresses and observe up to a 105% prediction improvement.

Comment: This paper has been withdrawn as it has been superseded by arXiv:1502.0533

arXiv.org e-Print Archive

CiteSeerX

A Taxonomy for Congestion Control Algorithms in Vehicular Ad Hoc Networks

Author: Keshavarz Hassan
Noor Rafidah Md
Sattari Mohammad Reza Jabbarpour
Publication date: 01/01/2012

One of the main concerns in Vehicular Ad hoc Networks (VANETs) that has attracted researchers' attention is congestion control. Accordingly, many algorithms have been proposed to alleviate the congestion problem, although it is hard to find among them an algorithm appropriate for both applications and safety messages. Safety messages encompass beacons and event-driven messages. Delay and reliability are essential requirements for event-driven messages. In crowded networks where beacon messages are broadcast at high frequency by many vehicles, the Control Channel (CCH), which is used for sending beacons, is easily congested. On the other hand, guaranteeing the reliable and timely delivery of event-driven messages requires a congestion-free control channel. This study therefore seeks a solution to the congestion problem in VANETs by taking a comprehensive look at the existing congestion control algorithms. In addition, a taxonomy for congestion control algorithms in VANETs is presented based on three classes, namely proactive, reactive, and hybrid. Finally, we identify the criteria that a good congestion control algorithm must fulfill.

arXiv.org e-Print Archive

UM Digital Repository