343 research outputs found

    Sharing Computer Network Logs for Security and Privacy: A Motivation for New Methodologies of Anonymization

    Full text link
    Logs are one of the most fundamental resources to any security professional. It is widely recognized by the government and industry that it is both beneficial and desirable to share logs for the purpose of security research. However, the sharing is not happening or not to the degree or magnitude that is desired. Organizations are reluctant to share logs because of the risk of exposing sensitive information to potential attackers. We believe this reluctance remains high because current anonymization techniques are weak and one-size-fits-all--or better put, one size tries to fit all. We must develop standards and make anonymization available at varying levels, striking a balance between privacy and utility. Organizations have different needs and trust other organizations to different degrees. They must be able to map multiple anonymization levels with defined risks to the trust levels they share with (would-be) receivers. It is not until there are industry standards for multiple levels of anonymization that we will be able to move forward and achieve the goal of widespread sharing of logs for security researchers.Comment: 17 pages, 1 figur

    Experiences with a continuous network tracing infrastructure

    Get PDF
    One of the most pressing problems in network research is the lack of long-term trace data from ISPs. The Internet carries an enormous volume and variety of data; mining this data can provide valuable insight into the design and development of new protocols and applications. Although capture cards for high-speed links exist today, actually making the network traffic available for analysis involves more than just getting the packets off the wire, but also handling large and variable traffic loads, sanitizing and anonymizing the data, and coordinating access by multiple users. In this paper we discuss the requirements, challenges, and design of an effective traffic monitoring infrastructure for network research. We describe our experience in deploying and maintaining a multi-user system for continuous trace collection at a large regional ISP. We evaluate the performance of our system and show that it can support sustained collection and processing rates of over 160–300Mbits/s

    Reference models for network trace anonymization

    Get PDF
    Network security research can benefit greatly from testing environments that are capable of generating realistic, repeatable and configurable background traffic. In order to conduct network security experiments on systems such as Intrusion Detection Systems and Intrusion Prevention Systems, researchers require isolated testbeds capable of recreating actual network environments, complete with infrastructure and traffic details. Unfortunately, due to privacy and flexibility concerns, actual network traffic is rarely shared by organizations as sensitive information, such as IP addresses, device identity and behavioral information can be inferred from the traffic. Trace data anonymization is one solution to this problem. The research community has responded to this sanitization problem with anonymization tools that aim to remove sensitive information from network traces, and attacks on anonymized traces that aim to evaluate the efficacy of the anonymization schemes. However there is continued lack of a comprehensive model that distills all elements of the sanitization problem in to a functional reference model.;In this thesis we offer such a comprehensive functional reference model that identifies and binds together all the entities required to formulate the problem of network data anonymization. We build a new information flow model that illustrates the overly optimistic nature of inference attacks on anonymized traces. We also provide a probabilistic interpretation of the information model and develop a privacy metric for anonymized traces. Finally, we develop the architecture for a highly configurable, multi-layer network trace collection and sanitization tool. In addition to addressing privacy and flexibility concerns, our architecture allows for uniformity of anonymization and ease of data aggregation

    α-MON: Anonymized Passive Traffic Monitoring

    Get PDF
    Packet measurements are essential for several applications, such as cyber-security, accounting and troubleshooting. They, however, threaten privacy by exposing sensitive information. Anonymization has been the answer to this challenge, i.e., replacing sensitive information by obfuscated copies. Anonymization of packet traces, however, comes with some drawbacks. First, it reduces the value of data. Second, it requires to consider diverse protocols because information may leak from many non-encrypted fields. Third, it must be performed at high speeds directly at the monitor, to prevent private data from leaking, calling for real-time solutions.We present α-MON, a flexible tool for privacy-preserving packet monitoring. It replicates input packet streams to different consumers while anonymizing values according to flexible policies that cover all protocol layers. Beside classic anonymization mechanisms such as IP address obfuscation, α-MON supports α-anonymization, a novel solution to obfuscate values that can be uniquely traced back to limited sets of users. Differently from classic anonymization approaches, α-anonymity works on a streaming fashion, with zero delay, operating at high-speed links on a packet-by-packet basis. We evaluate α-MON performance using packet traces collected from an ISP network. Results show that it enables α-anonymity in real-time. α-MON is available to the community as an open-source project

    Markov Models for Network-Behavior Modeling and Anonymization

    Get PDF
    Modern network security research has demonstrated a clear need for open sharing of traffic datasets between organizations, a need that has so far been superseded by the challenge of removing sensitive content beforehand. Network Data Anonymization (NDA) is emerging as a field dedicated to this problem, with its main direction focusing on removal of identifiable artifacts that might pierce privacy, such as usernames and IP addresses. However, recent research has demonstrated that more subtle statistical artifacts, also present, may yield fingerprints that are just as differentiable as the former. This result highlights certain shortcomings in current anonymization frameworks -- particularly, ignoring the behavioral idiosyncrasies of network protocols, applications, and users. Recent anonymization results have shown that the extent to which utility and privacy can be obtained is mainly a function of the information in the data that one is aware and not aware of. This paper leverages the predictability of network behavior in our favor to augment existing frameworks through a new machine-learning-driven anonymization technique. Our approach uses the substitution of individual identities with group identities where members are divided based on behavioral similarities, essentially providing anonymity-by-crowds in a statistical mix-net. We derive time-series models for network traffic behavior which quantifiably models the discriminative features of network "behavior" and introduce a kernel-based framework for anonymity which fits together naturally with network-data modeling

    α-MON: Traffic Anonymizer for Passive Monitoring

    Get PDF
    Packet measurements at scale are essential for several applications, such as cyber-security, accounting and troubleshooting. They, however, threaten users’ privacy by exposing sensitive information. Anonymization has been the answer to this challenge, i.e., replacing sensitive information with obfuscated copies. Anonymization of packet traces, however, comes with some challenges and drawbacks. First, it reduces the value of data. Second, it requires to consider diverse protocols because information may leak from many non-encrypted fields. Third, it must be performed at high speeds directly at the monitor, to prevent private data from leaking, calling for real-time solutions. We present , a flexible tool for privacy-preserving packet monitoring. It replicates input packet streams to different consumers while anonymizing protocol fields according to flexible policies that cover all protocol layers. Beside classic anonymization mechanisms such as IP address obfuscation, supports z-anonymization, a novel solution to obfuscate rare values that can be uniquely traced back to limited sets of users. Differently from classic anonymization approaches, works on a streaming fashion, with zero delay, operating at high-speed links on a packet-by-packet basis. We quantify the impact of on traffic measurements, finding that it introduces minimal error when it comes to finding heavy-hitter services. We evaluate performance using packet traces collected from an ISP network and show that it achieves a sustainable rate of 40 Gbit/s on a Commercial Off-the Shelf server. is available to the community as an open-source project

    Virtualized network framework solution to collecting private research data NEMESIS: Network Experimentation and Monitoring in Environments Safely In-Situ

    Get PDF
    The cyber security research realm is plagued with the problem of collecting and using trace data from sources. Methods of anonymizing public data sets have been proven to leak large amounts of private network data. Yet access to private and public trace data is needed, this is the problem that NEMESIS seeks to solve. NEMESIS is a virtual network system level solution to the problem where instead of bringing the data to the experiments one brings the experiments to the data. NEMESIS provides security and isolation that other approaches have not; allowing for filtering and anonymization of trace data as needed. The solution came about from a desire and need to have a system level solution that leveraged and allowed for the usages of the best current technologies, while remaining highly extendible to future needs

    FLAIM: A Multi-level Anonymization Framework for Computer and Network Logs

    Full text link
    FLAIM (Framework for Log Anonymization and Information Management) addresses two important needs not well addressed by current log anonymizers. First, it is extremely modular and not tied to the specific log being anonymized. Second, it supports multi-level anonymization, allowing system administrators to make fine-grained trade-offs between information loss and privacy/security concerns. In this paper, we examine anonymization solutions to date and note the above limitations in each. We further describe how FLAIM addresses these problems, and we describe FLAIM's architecture and features in detail.Comment: 16 pages, 4 figures, in submission to USENIX Lis

    Privacy-safe network trace sharing via secure queries

    Get PDF
    Privacy concerns relating to sharing network traces have traditionally been handled via sanitization, which includes removal of sensitive data and IP address anonymization. We argue that sanitization is a poor solution for data sharing that offers insufficient research utility to users and poor privacy guarantees to data providers. We claim that a better balance in the utility/privacy tradeoff, inherent to network data sharing, can be achieved via a new paradigm we propose: secure queries. In this paradigm, a data owner publishes a query language and an online portal, allowing researchers to submit sets of queries to be run on data. Only certain operations are allowed on certain data fields, and in specific contexts. Query restriction is achieved via the provider’s privacy policy, and enforced by the language’s interpreter. Query results, returned to researchers, consist of aggregate information such as counts, histograms, distributions, etc. and not of individual packets. We discuss why secure queries provide higher privacy guarantees and higher research utility than sanitization, and present a design of the secure query language and a privacy policy
    • …
    corecore