4,739 research outputs found
Using a Remote Access Data Enclave for Data Dissemination
This article describes the approach taken by NORC and NIST to provide remote researcher access to confidential business micro data. We have combined technical, legal and organizational approaches to protect respondent confidential. We have also instituted a number of technical approaches to encourage researchers to provide metadata documentation
Administrative Transaction Data
The value of administrative transaction data, such as financial transactions, credit card purchases, telephone calls, and retail store scanning data, to study social behaviour has long been recognised. Now new types of transactions data made possible by advances in cyber-technology have the potential to further exland social scientists’ research frontier. This chapter discusses the potential for such data to be included in the scientific infrastructure. It discusses new approaches to data dissemination, as well as the privacy and confidentiality issues raised by such data collection. It also discusses the characteristics of an optimal infrastructure to support the scientific analysis of transactions data.transactions data; administrative data; cybertechnology; privacy and confidentiality; virtual organizations
Balancing Access to Data And Privacy. A review of the issues and approaches for the future
Access to sensitive micro data should be provided using remote access data enclaves. These enclaves should be built to facilitate the productive, high-quality usage of microdata. In other words, they should support a collaborative environment that facilitates the development and exchange of knowledge about data among data producers and consumers. The experience of the physical and life sciences has shown that it is possible to develop a research community and a knowledge infrastructure around both research questions and the different types of data necessary to answer policy questions. In sum, establishing a virtual organization approach would provided the research community with the ability to move away from individual, or artisan, science, towards the more generally accepted community based approach. Enclave should include a number of features: metadata documentation capacity so that knowledge about data can be shared; capacity to add data so that the data infrastructure can be augmented; communication capacity, such as wikis, blogs and discussion groups so that knowledge about the data can be deepened and incentives for information sharing so that a community of practice can be built. The opportunity to transform micro-data based research through such a organizational infrastructure could potentially be as far-reaching as the changes that have taken place in the biological and astronomical sciences. It is, however, an open research question how such an organization should be established: whether the approach should be centralized or decentralized. Similarly, it is an open research question as to the appropriate metrics of success, and the best incentives to put in place to achieve success.Methodology for Collecting, Estimating, Organizing Microeconomic Data
Secure Cloud Storage with Client-Side Encryption Using a Trusted Execution Environment
With the evolution of computer systems, the amount of sensitive data to be
stored as well as the number of threats on these data grow up, making the data
confidentiality increasingly important to computer users. Currently, with
devices always connected to the Internet, the use of cloud data storage
services has become practical and common, allowing quick access to such data
wherever the user is. Such practicality brings with it a concern, precisely the
confidentiality of the data which is delivered to third parties for storage. In
the home environment, disk encryption tools have gained special attention from
users, being used on personal computers and also having native options in some
smartphone operating systems. The present work uses the data sealing, feature
provided by the Intel Software Guard Extensions (Intel SGX) technology, for
file encryption. A virtual file system is created in which applications can
store their data, keeping the security guarantees provided by the Intel SGX
technology, before send the data to a storage provider. This way, even if the
storage provider is compromised, the data are safe. To validate the proposal,
the Cryptomator software, which is a free client-side encryption tool for cloud
files, was integrated with an Intel SGX application (enclave) for data sealing.
The results demonstrate that the solution is feasible, in terms of performance
and security, and can be expanded and refined for practical use and integration
with cloud synchronization services
Avoiding disclosure of individually identifiable health information: a literature review
Achieving data and information dissemination without arming anyone is a central task of any entity in charge of collecting data. In this article, the authors examine the literature on data and statistical confidentiality. Rather than comparing the theoretical properties of specific methods, they emphasize the main themes that emerge from the ongoing discussion among scientists regarding how best to achieve the appropriate balance between data protection, data utility, and data dissemination. They cover the literature on de-identification and reidentification methods with emphasis on health care data. The authors also discuss the benefits and limitations for the most common access methods. Although there is abundant theoretical and empirical research, their review reveals lack of consensus on fundamental questions for empirical practice: How to assess disclosure risk, how to choose among disclosure methods, how to assess reidentification risk, and how to measure utility loss.public use files, disclosure avoidance, reidentification, de-identification, data utility
Creation of public use files: lessons learned from the comparative effectiveness research public use files data pilot project
In this paper we describe lessons learned from the creation of Basic Stand Alone (BSA) Public Use Files (PUFs) for the Comparative Effectiveness Research Public Use Files Data Pilot Project (CER-PUF). CER-PUF is aimed at increasing access to the Centers for Medicare and Medicaid Services (CMS) Medicare claims datasets through PUFs that: do not require user fees and data use agreements, have been de-identified to assure the confidentiality of the beneficiaries and providers, and still provide substantial analytic utility to researchers. For this paper we define PUFs as datasets characterized by free and unrestricted access to any user. We derive lessons learned from five major project activities: (i) a review of the statistical and computer science literature on best practices in PUF creation, (ii) interviews with comparative effectiveness researchers to assess their data needs, (iii) case studies of PUF initiatives in the United States, (iv) interviews with stakeholders to identify the most salient issues regarding making microdata publicly available, and (v) the actual process of creating the Medicare claims data BSA PUFs
Administrative Transaction Data
"The value of administrative transaction data, such as financial transactions, credit card purchases, telephone calls, and retail store scanning data, to study social behaviour has long been recognised. Now new types of transactions data made possible by advances in cyber-technology have the potential to further exland social scientists’ research frontier. This chapter discusses the potential for such data to be included in the scientific infrastructure. It discusses new approaches to data dissemination, as well as the privacy and confidentiality issues raised by such data collection. It also discusses the characteristics of an optimal infrastructure to support the scientific analysis of transactions data." [author's abstract
- …