
    What the Surprising Failure of Data Anonymization Means for Law and Policy

    Paul Ohm is an Associate Professor of Law at the University of Colorado Law School. He writes in the areas of information privacy, computer crime law, intellectual property, and criminal procedure. Through his scholarship and outreach, Professor Ohm is leading efforts to build new interdisciplinary bridges between law and computer science. Before becoming a law professor, he served as a federal prosecutor in the computer crimes unit of the U.S. Department of Justice. Before law school, he worked as a computer programmer and network systems administrator.

    Anonymization and Risk

    Perfect anonymization of data sets that contain personal information has failed. But the process of protecting data subjects in shared information remains integral to privacy practice and policy. While the deidentification debate has been vigorous and productive, there is no clear direction for policy. As a result, the law has been slow to adopt a holistic approach to protecting data subjects when data sets are released to others. Currently, the law is focused on whether an individual can be identified within a given data set. We argue that the best way to move data release policy past the alleged failures of anonymization is to focus on the process of minimizing the risk of reidentification and sensitive attribute disclosure, not on preventing harm. Process-based data release policy, which resembles the law of data security, will help us move past the limitations of focusing on whether data sets have been "anonymized." It draws upon different tactics to protect the privacy of data subjects, including accurate deidentification rhetoric, contracts prohibiting reidentification and sensitive attribute disclosure, data enclaves, and query-based strategies, to match the required protections with the level of risk. By focusing on process, data release policy can better balance privacy and utility where nearly all data exchanges carry some risk.
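
    As one concrete illustration of the query-based strategies this abstract mentions, the following Python sketch answers an aggregate count query with Laplace noise rather than releasing the underlying records. The epsilon value, function names, and sample data are illustrative assumptions, not drawn from the article.

    import random

    def noisy_count(records, predicate, epsilon=0.5):
        """Answer a count query with Laplace(1/epsilon) noise added.

        A count query has sensitivity 1 (adding or removing one person
        changes the true answer by at most 1), so this mechanism satisfies
        epsilon-differential privacy.
        """
        true_count = sum(1 for r in records if predicate(r))
        # The difference of two Exp(epsilon) draws is Laplace(0, 1/epsilon).
        noise = random.expovariate(epsilon) - random.expovariate(epsilon)
        return true_count + noise

    ages = [{"age": a} for a in (23, 35, 41, 52, 67, 29)]
    print(noisy_count(ages, lambda r: r["age"] >= 40))  # true answer is 3, plus noise

    Noisy query answering matches protection to risk in the sense the abstract describes: the curator never releases the data set itself, and the noise scale can be tuned to the sensitivity of the information.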

    Strategies for Protecting Privacy in Open Data and Proactive Disclosure

    In this paper, the authors explore strategies for balancing privacy with transparency in the release of government data and information as part of the growing global open government movement. The issue is important because government data or information may take many forms, may contain many different types of personal information, and may be released in a range of contexts. The legal framework is complex: personal information is typically not released as open data or under access to information regimes; nevertheless, in some cases transparency requirements take precedence over the protection of personal information. The open courts principle, for example, places a strong emphasis on transparency over privacy. The situation is complicated by the availability of technologies that facilitate widespread dissemination of information and that allow for the searching, combining, and mining of information in ways that may permit the reidentification of individuals even within anonymized data sets. This paper identifies a number of strategies designed to assist in identifying whether government data sets or information contain personal information, whether they should be released notwithstanding the presence of the personal information, and what techniques might be used to minimize any possible adverse privacy impacts.

    Reconsidering Anonymization-Related Concepts and the Term "Identification" Against the Backdrop of the European Legal Framework

    Sharing data in biomedical contexts has become increasingly relevant, but privacy concerns set constraints on the free sharing of individual-level data. Data protection law protects only data relating to an identifiable individual, whereas "anonymous" data are free to be used by everybody. Usage of many terms related to anonymization is often inconsistent across domains such as statistics and law. The crucial term "identification" seems especially hard to define, since its definition presupposes the existence of identifying characteristics, leading to some circularity. In this article, we present a discussion of important terms based on a legal perspective that is outlined before we turn to issues surrounding the usage of terms such as unique "identifiers," "quasi-identifiers," and "sensitive attributes." Based on these terms, we have tried to circumvent a circular definition of the term "identification" by making two decisions: first, deciding which (natural) identifier should stand for the individual; second, deciding how to recognize the individual. In addition, we provide an overview of anonymization techniques and methods for preventing re-identification. The discussion of basic notions related to anonymization shows that there is some work to be done in order to achieve a mutual understanding between legal and technical experts concerning some of these notions. Using a dialectical definition process to merge technical and legal perspectives on these terms seems important for enhancing mutual understanding.
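
    To make the terms "quasi-identifier" and "sensitive attribute" concrete, here is a minimal Python sketch of a k-anonymity check, one of the standard anonymization notions the abstract alludes to. The column names, generalized values, and threshold are hypothetical examples, not taken from the article.

    from collections import Counter

    def k_anonymity(records, quasi_identifiers):
        """Return the smallest equivalence-class size over the quasi-identifiers.

        A data set is k-anonymous if every combination of quasi-identifier
        values is shared by at least k records, so no single record can be
        singled out within its class.
        """
        classes = Counter(
            tuple(record[attr] for attr in quasi_identifiers) for record in records
        )
        return min(classes.values())

    # "zip" and "age" serve as quasi-identifiers (already generalized here);
    # "diagnosis" is the sensitive attribute and is left untouched.
    records = [
        {"zip": "802**", "age": "30-39", "diagnosis": "flu"},
        {"zip": "802**", "age": "30-39", "diagnosis": "asthma"},
        {"zip": "803**", "age": "40-49", "diagnosis": "flu"},
        {"zip": "803**", "age": "40-49", "diagnosis": "diabetes"},
    ]
    print(k_anonymity(records, ["zip", "age"]))  # -> 2, i.e. the set is 2-anonymous

    Note that k-anonymity alone does not rule out sensitive attribute disclosure: if an equivalence class is homogeneous in its sensitive attribute, membership in the class already reveals it, which is why refinements such as l-diversity exist.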

    Not So Private

    Federal and state laws have long attempted to strike a balance between protecting patient privacy and health information confidentiality on the one hand and supporting important uses and disclosures of health information on the other. To this end, many health laws restrict the use and disclosure of identifiable health data but support the use and disclosure of de-identified data. The goal of health data de-identification is to prevent or minimize informational injuries to identifiable data subjects while allowing the production of aggregate statistics that can be used for biomedical and behavioral research, public health initiatives, informed health care decision making, and other important activities. Many federal and state laws assume that data are de-identified when direct and indirect demographic identifiers such as names, user names, email addresses, street addresses, and telephone numbers have been removed. An emerging reidentification literature shows, however, that purportedly de-identified data can, and increasingly will, be reidentified. This Article responds to this concern by presenting an original synthesis of illustrative federal and state identification and de-identification laws that expressly or potentially apply to health data; identifying significant weaknesses in these laws in light of the developing reidentification literature; proposing theoretical alternatives to outdated identification and de-identification standards, including alternatives based on the theories of evolving law, non-reidentification, non-collection, non-use, non-disclosure, and non-discrimination; and offering specific, textual amendments to federal and state data protection laws that incorporate these theoretical alternatives.
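
    The reidentification concern this abstract raises can be illustrated with a small Python sketch of a linkage attack, in which records stripped of direct identifiers are matched to a named auxiliary data set on shared quasi-identifiers. Every name, field, and value in the example is invented for illustration.

    # Hypothetical de-identified health records: direct identifiers removed,
    # but quasi-identifiers (zip, birth year, sex) retained.
    health_records = [
        {"zip": "80302", "birth_year": 1975, "sex": "F", "diagnosis": "diabetes"},
        {"zip": "80401", "birth_year": 1988, "sex": "M", "diagnosis": "asthma"},
    ]

    # Hypothetical public records (e.g., a voter roll) that carry names
    # alongside the same quasi-identifiers.
    public_records = [
        {"name": "Jane Doe", "zip": "80302", "birth_year": 1975, "sex": "F"},
        {"name": "John Roe", "zip": "80401", "birth_year": 1988, "sex": "M"},
    ]

    QUASI_IDENTIFIERS = ("zip", "birth_year", "sex")

    def reidentify(deidentified, named):
        """Join the two data sets on quasi-identifiers to recover names."""
        index = {tuple(row[q] for q in QUASI_IDENTIFIERS): row["name"] for row in named}
        return [
            (index.get(tuple(row[q] for q in QUASI_IDENTIFIERS)), row["diagnosis"])
            for row in deidentified
        ]

    print(reidentify(health_records, public_records))
    # -> [('Jane Doe', 'diabetes'), ('John Roe', 'asthma')]

    The point of the sketch is that removing names and contact details leaves the join key intact: any auxiliary data set carrying the same quasi-identifiers restores the link between person and sensitive attribute.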