197,972 research outputs found

    Privacy and Confidentiality in an e-Commerce World: Data Mining, Data Warehousing, Matching and Disclosure Limitation

    Full text link
    The growing expanse of e-commerce and the widespread availability of online databases raise many fears regarding loss of privacy and many statistical challenges. Even with encryption and other nominal forms of protection for individual databases, we still need to protect against the violation of privacy through linkages across multiple databases. These issues parallel those that have arisen and received some attention in the context of homeland security. Following the events of September 11, 2001, there has been heightened attention in the United States and elsewhere to the use of multiple government and private databases for the identification of possible perpetrators of future attacks, as well as an unprecedented expansion of federal government data mining activities, many involving databases containing personal information. We present an overview of some proposals that have surfaced for the search of multiple databases which supposedly do not compromise possible pledges of confidentiality to the individuals whose data are included. We also explore their link to the related literature on privacy-preserving data mining. In particular, we focus on the matching problem across databases and the concept of ``selective revelation'' and their confidentiality implications.Comment: Published at http://dx.doi.org/10.1214/088342306000000240 in the Statistical Science (http://www.imstat.org/sts/) by the Institute of Mathematical Statistics (http://www.imstat.org

    Creation of public use files: lessons learned from the comparative effectiveness research public use files data pilot project

    Get PDF
    In this paper we describe lessons learned from the creation of Basic Stand Alone (BSA) Public Use Files (PUFs) for the Comparative Effectiveness Research Public Use Files Data Pilot Project (CER-PUF). CER-PUF is aimed at increasing access to the Centers for Medicare and Medicaid Services (CMS) Medicare claims datasets through PUFs that: do not require user fees and data use agreements, have been de-identified to assure the confidentiality of the beneficiaries and providers, and still provide substantial analytic utility to researchers. For this paper we define PUFs as datasets characterized by free and unrestricted access to any user. We derive lessons learned from five major project activities: (i) a review of the statistical and computer science literature on best practices in PUF creation, (ii) interviews with comparative effectiveness researchers to assess their data needs, (iii) case studies of PUF initiatives in the United States, (iv) interviews with stakeholders to identify the most salient issues regarding making microdata publicly available, and (v) the actual process of creating the Medicare claims data BSA PUFs

    Watching You: Systematic Federal Surveillance of Ordinary Americans

    Get PDF
    To combat terrorism, Attorney General John Ashcroft has asked Congress to "enhance" the government's ability to conduct domestic surveillance of citizens. The Justice Department's legislative proposals would give federal law enforcement agents new access to personal information contained in business and school records. Before acting on those legislative proposals, lawmakers should pause to consider the extent to which the lives of ordinary Americans already are monitored by the federal government. Over the years, the federal government has instituted a variety of data collection programs that compel the production, retention, and dissemination of personal information about every American citizen. Linked through an individual's Social Security number, these labor, medical, education and financial databases now empower the federal government to obtain a detailed portrait of any person: the checks he writes, the types of causes he supports, and what he says "privately" to his doctor. Despite widespread public concern about preserving privacy, these data collection systems have been enacted in the name of "reducing fraud" and "promoting efficiency" in various government programs. Having exposed most areas of American life to ongoing government scrutiny and recording, Congress is now poised to expand and universalize federal tracking of citizen life. The inevitable consequence of such constant surveillance, however, is metastasizing government control over society. If that happens, our government will have perverted its most fundamental mission and destroyed the privacy and liberty that it was supposed to protect

    The IRS Individual Taxpayer Identification Number: An Operational Guide to the ITIN Program

    Get PDF
    Examines the role of the IRS's ITIN in allowing immigrants and others without Social Security identification to report their income, open savings accounts, and obtain other legal status. Includes a guide to the ITIN application process

    William H. Sorrell, Attorney General of Vermont, et al. v. IMS Health Inc., et al. - Amicus Brief in Support of Petitioners

    Get PDF
    On April 26, 2011, the US Supreme Court will hear oral arguments in the Vermont data mining case, Sorrell v. IMS Health Inc. Respondents claim this is the most important commercial speech case in a decade. Petitioner (the State of Vermont) argues this is the most important medical privacy case since Whalen v. Roe. The is an amicus brief supporting Vermont, written by law professors and submitted on behalf of the New England Journal of Medicin

    Multiple imputation for sharing precise geographies in public use data

    Full text link
    When releasing data to the public, data stewards are ethically and often legally obligated to protect the confidentiality of data subjects' identities and sensitive attributes. They also strive to release data that are informative for a wide range of secondary analyses. Achieving both objectives is particularly challenging when data stewards seek to release highly resolved geographical information. We present an approach for protecting the confidentiality of data with geographic identifiers based on multiple imputation. The basic idea is to convert geography to latitude and longitude, estimate a bivariate response model conditional on attributes, and simulate new latitude and longitude values from these models. We illustrate the proposed methods using data describing causes of death in Durham, North Carolina. In the context of the application, we present a straightforward tool for generating simulated geographies and attributes based on regression trees, and we present methods for assessing disclosure risks with such simulated data.Comment: Published in at http://dx.doi.org/10.1214/11-AOAS506 the Annals of Applied Statistics (http://www.imstat.org/aoas/) by the Institute of Mathematical Statistics (http://www.imstat.org

    PRECEPT:a framework for ethical digital forensics investigations

    Get PDF
    Purpose: Cyber-enabled crimes are on the increase, and law enforcement has had to expand many of their detecting activities into the digital domain. As such, the field of digital forensics has become far more sophisticated over the years and is now able to uncover even more evidence that can be used to support prosecution of cyber criminals in a court of law. Governments, too, have embraced the ability to track suspicious individuals in the online world. Forensics investigators are driven to gather data exhaustively, being under pressure to provide law enforcement with sufficient evidence to secure a conviction. Yet, there are concerns about the ethics and justice of untrammeled investigations on a number of levels. On an organizational level, unconstrained investigations could interfere with, and damage, the organization’s right to control the disclosure of their intellectual capital. On an individual level, those being investigated could easily have their legal privacy rights violated by forensics investigations. On a societal level, there might be a sense of injustice at the perceived inequality of current practice in this domain. This paper argues the need for a practical, ethically-grounded approach to digital forensic investigations, one that acknowledges and respects the privacy rights of individuals and the intellectual capital disclosure rights of organisations, as well as acknowledging the needs of law enforcement. We derive a set of ethical guidelines, then map these onto a forensics investigation framework. We subjected the framework to expert review in two stages, refining the framework after each stage. We conclude by proposing the refined ethically-grounded digital forensics investigation framework. Our treatise is primarily UK based, but the concepts presented here have international relevance and applicability.Design methodology: In this paper, the lens of justice theory is used to explore the tension that exists between the needs of digital forensic investigations into cybercrimes on the one hand, and, on the other, individuals’ rights to privacy and organizations’ rights to control intellectual capital disclosure.Findings: The investigation revealed a potential inequality between the practices of digital forensics investigators and the rights of other stakeholders. That being so, the need for a more ethically-informed approach to digital forensics investigations, as a remedy, is highlighted, and a framework proposed to provide this.Practical Implications: Our proposed ethically-informed framework for guiding digital forensics investigations suggest a way of re-establishing the equality of the stakeholders in this arena, and ensuring that the potential for a sense of injustice is reduced.Originality/value: Justice theory is used to highlight the difficulties in squaring the circle between the rights and expectations of all stakeholders in the digital forensics arena. The outcome is the forensics investigation guideline, PRECEpt: Privacy-Respecting EthiCal framEwork, which provides the basis for a re-aligning of the balance between the requirements and expectations of digital forensic investigators on the one hand, and individual and organizational expectations and rights, on the other
    • …
    corecore