3 research outputs found

    k-ANONYMITY: A MODEL FOR PROTECTING PRIVACY


    The inference problem in multilevel secure databases

    Conventional access control models, such as role-based access control, protect sensitive data from unauthorized disclosure via direct accesses; however, they fail to prevent unauthorized disclosure through indirect accesses. Indirect data disclosure via inference channels occurs when sensitive information can be inferred from nonsensitive data and metadata, which is known as "the inference problem". This problem has drawn much attention from researchers in the database community because of the serious compromise of data security it can cause. It has been studied in four settings, according to where it occurs: statistical databases, multilevel secure databases, data mining, and web-based applications. This thesis investigates previous efforts dedicated to the inference problem in multilevel secure databases and presents the latest findings of our research on this problem. Our contribution includes two methods. One is a dynamic control over this problem, which designs a set of access-key distribution schemes to remove inferences after all inference channels in the database have been identified. The other combines rough sets and entropies to form a computational solution to detect and remove inferences, which for the first time provides an integrated solution to the inference problem. Comparison with previous work has also been carried out, and we have shown that both methods are effective and easy to implement. Since the inference problem is described as a problem of detecting and removing inference channels, this thesis contains two main parts: inference detection techniques and inference removal techniques. In both aspects, some techniques are selectively but extensively examined.
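    The thesis itself is the authority on its detection method; purely as a rough illustration of the entropy side of that idea, the sketch below computes the conditional entropy of a sensitive attribute given nonsensitive ones over a toy table. A value near zero means the nonsensitive attributes (almost) determine the sensitive one, flagging a candidate inference channel. The attribute names, records, and the absence of any threshold are assumptions made for illustration, not details taken from the thesis.

        # Illustrative sketch only: a toy conditional-entropy check for a possible
        # inference channel, in the spirit of (not identical to) the entropy-based
        # detection described above. Attribute names and data are hypothetical.
        from collections import Counter, defaultdict
        from math import log2

        def conditional_entropy(rows, given, target):
            """H(target | given) over a list of dict records."""
            groups = defaultdict(list)
            for r in rows:
                groups[tuple(r[a] for a in given)].append(r[target])
            n = len(rows)
            h = 0.0
            for vals in groups.values():
                p_group = len(vals) / n
                counts = Counter(vals)
                h_group = -sum((c / len(vals)) * log2(c / len(vals)) for c in counts.values())
                h += p_group * h_group
            return h

        # Hypothetical records: 'clearance' is sensitive, the rest are nonsensitive.
        records = [
            {"dept": "ops", "grade": "E5", "clearance": "secret"},
            {"dept": "ops", "grade": "E5", "clearance": "secret"},
            {"dept": "hr",  "grade": "E3", "clearance": "public"},
            {"dept": "hr",  "grade": "E4", "clearance": "public"},
        ]

        # Near-zero conditional entropy means the nonsensitive attributes
        # (almost) determine the sensitive one: a candidate inference channel.
        h = conditional_entropy(records, given=["dept", "grade"], target="clearance")
        print(f"H(clearance | dept, grade) = {h:.3f} bits")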

    Computational disclosure control : a primer on data privacy protection

    Thesis (Ph.D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2001. Includes bibliographical references (leaves 213-216) and index.
    Today's globally networked society places great demand on the dissemination and sharing of person-specific data for many new and exciting uses. When these data are linked together, they provide an electronic shadow of a person or organization that is as identifying and personal as a fingerprint, even when the information contains no explicit identifiers such as name and phone number. Other distinctive data, such as birth date and ZIP code, often combine uniquely and can be linked to publicly available information to re-identify individuals. Producing anonymous data that remain specific enough to be useful is often a very difficult task, and practice today tends either to assume, incorrectly, that confidentiality is maintained when it is not, or to produce data that are practically useless. The goal of the work presented in this book is to explore computational techniques for releasing useful information in such a way that the identity of any individual or entity contained in the data cannot be recognized while the data remain practically useful. I begin by demonstrating ways to learn information about entities from publicly available information. I then provide a formal framework for reasoning about disclosure control and the ability to infer the identities of entities contained within the data. I formally define and present null-map, k-map and wrong-map as models of protection. Each model provides protection by ensuring that released information maps to no, k, or incorrect entities, respectively. The book ends by examining four computational systems that attempt to maintain privacy while releasing electronic information. These systems are: (1) my Scrub System, which locates personally identifying information in letters between doctors and notes written by clinicians; (2) my Datafly II System, which generalizes and suppresses values in field-structured data sets; (3) Statistics Netherlands' μ-Argus System, which is becoming a European standard for producing public-use data; and (4) my k-Similar algorithm, which finds optimal solutions such that data are minimally distorted while still providing adequate protection. By introducing anonymity and quality metrics, I show that Datafly II can overprotect data, Scrub and μ-Argus can fail to provide adequate protection, but k-Similar finds optimal results.
    by Latanya Sweeney. Ph.D.
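    As a rough illustration of the generalization-and-suppression idea behind Datafly-style protection (not Sweeney's actual Datafly II algorithm), the sketch below checks whether a toy table is k-anonymous over a set of quasi-identifiers and shows how truncating ZIP codes can restore k-anonymity. The field names, records, and the ZIP-truncation rule are assumptions made for illustration.

        # Minimal illustrative sketch of the k-anonymity idea behind Datafly-style
        # generalization; field names and the ZIP-truncation rule are assumptions,
        # not the algorithm from the thesis above.
        from collections import Counter

        def is_k_anonymous(rows, quasi_identifiers, k):
            """True if every combination of quasi-identifier values occurs >= k times."""
            counts = Counter(tuple(r[a] for a in quasi_identifiers) for r in rows)
            return all(c >= k for c in counts.values())

        def generalize_zip(rows, digits):
            """Generalize ZIP codes by keeping only the leading `digits` characters."""
            return [{**r, "zip": r["zip"][:digits] + "*" * (5 - digits)} for r in rows]

        # Hypothetical records: birth year and ZIP code act as quasi-identifiers.
        table = [
            {"year": "1970", "zip": "02139", "diagnosis": "flu"},
            {"year": "1970", "zip": "02138", "diagnosis": "cold"},
            {"year": "1971", "zip": "02139", "diagnosis": "flu"},
            {"year": "1971", "zip": "02138", "diagnosis": "cold"},
        ]

        qi = ["year", "zip"]
        print(is_k_anonymous(table, qi, k=2))                    # False: each row is unique
        print(is_k_anonymous(generalize_zip(table, 3), qi, k=2)) # True: ZIPs collapse to 021**

    Generalizing until every quasi-identifier combination occurs at least k times captures the trade-off the abstract describes: too much generalization overprotects the data, too little fails to protect them.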