    Privacy preserving linkage and sharing of sensitive data

    2018 Summer. Includes bibliographical references. Sensitive data, such as personal and business information, is collected by many service providers. This data is a rich source of information for research that could benefit individuals, researchers, and service providers. However, because of the sensitivity of such data, privacy concerns, legislation, and conflicts of interest, data holders are reluctant to share their data with others. Data holders typically filter out or obliterate privacy-related sensitive information from their data before sharing it, which limits the utility of the data and affects the accuracy of research. This practice protects individuals' privacy, but it prevents researchers from linking records belonging to the same individual across different sources. The healthcare industry commonly refers to this as the record linkage problem. In this dissertation, our main focus is on designing and implementing efficient privacy-preserving methods that will encourage holders of sensitive information to share their data with researchers without compromising the privacy of their clients or affecting the quality of the research data. The proposed solution should be scalable and efficient for real-world deployments and provide good privacy assurance. While this problem has been investigated before, most of the proposed solutions are partial, inaccurate, or impractical, and therefore leave room for improvement. We have identified several issues and limitations in the state-of-the-art solutions and provide a number of contributions that improve upon them. Our first contribution is the design of a privacy-preserving record linkage protocol that uses a semi-trusted third party. 
The protocol allows a set of data publishers (data holders) who compete with each other to share sensitive information with subscribers (researchers) while preserving the privacy of their clients and without sharing encryption keys. Our second contribution is the design and implementation of a probabilistic privacy-preserving record linkage protocol that accommodates discrepancies and errors in the data, such as typos. This work builds upon the previous work by linking records that are similar, where the similarity range is formally defined. Our third contribution is a protocol that performs information integration and sharing without third-party services. We use garbled-circuit secure computation to design and build a system that performs record linkage between two parties without sharing their data. Our design uses Bloom filters as inputs to the garbled circuits and performs probabilistic record linkage using the Dice coefficient similarity measure. As garbled circuits are known for their expensive computations, we propose new approaches that reduce the computation overhead needed to achieve a given level of privacy. We built a scalable record linkage system using garbled circuits that could be deployed in a distributed computation environment such as the cloud, and evaluated its security and performance. One of the performance issues in linking large datasets is the amount of secure computation needed to compare every pair of records across the linked datasets to find all possible record matches. To reduce the amount of computation, a method known as blocking is used to filter out as many as possible of the record pairs that will not match, and to limit the comparison to a subset of the record pairs (called candidate pairs) that possibly match. 
Most current blocking methods require the parties either to share blocking keys (called block identifiers), extracted from the domain of some record attributes (termed blocking variables), or to share reference data points and group their records around these points using some similarity measure. Though these methods reduce the computation substantially, they leak too much information about the records within each block. To address this, we propose a novel privacy-preserving approximate blocking scheme that allows parties to generate the list of candidate pairs with high accuracy while protecting the privacy of the records in each block. Our scheme is configurable, so that performance and accuracy can be tuned to the required level of privacy. We analyzed the accuracy and privacy of our scheme, implemented a prototype, and experimentally evaluated its accuracy and performance at different levels of privacy
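
The Bloom filter and Dice coefficient comparison mentioned in the abstract above can be illustrated with a minimal plaintext sketch. This is not the dissertation's protocol: there the comparison runs inside garbled circuits, whereas this code only shows the similarity arithmetic in the clear. The function names, the 256-bit filter size, the four hash functions, and the use of unkeyed SHA-256 are all assumptions made for illustration (real deployments typically use keyed hashes).

```python
import hashlib


def bigrams(value):
    """Split a string into padded character bigrams, e.g. 'ab' -> {'_a', 'ab', 'b_'}."""
    padded = "_" + value.lower() + "_"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def bloom_encode(value, size=256, num_hashes=4):
    """Return the set of bit positions that the value's bigrams set in a Bloom filter.

    size and num_hashes are illustrative defaults, not values from the source.
    """
    positions = set()
    for gram in bigrams(value):
        for seed in range(num_hashes):
            # Derive one hash function per seed by prefixing the seed to the bigram.
            digest = hashlib.sha256(f"{seed}:{gram}".encode("utf-8")).digest()
            positions.add(int.from_bytes(digest[:8], "big") % size)
    return positions


def dice(a, b):
    """Dice coefficient of two bit-position sets: 2|A ∩ B| / (|A| + |B|)."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


if __name__ == "__main__":
    smith, smyth, jones = (bloom_encode(n) for n in ("smith", "smyth", "jones"))
    print("smith vs smyth:", round(dice(smith, smyth), 3))
    print("smith vs jones:", round(dice(smith, jones), 3))
```

Names that differ by a typo share most of their bigrams, so their filters overlap heavily and score close to 1, while unrelated names overlap only through hash collisions.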

    Watching You: Systematic Federal Surveillance of Ordinary Americans

    To combat terrorism, Attorney General John Ashcroft has asked Congress to "enhance" the government's ability to conduct domestic surveillance of citizens. The Justice Department's legislative proposals would give federal law enforcement agents new access to personal information contained in business and school records. Before acting on those legislative proposals, lawmakers should pause to consider the extent to which the lives of ordinary Americans already are monitored by the federal government. Over the years, the federal government has instituted a variety of data collection programs that compel the production, retention, and dissemination of personal information about every American citizen. Linked through an individual's Social Security number, these labor, medical, education and financial databases now empower the federal government to obtain a detailed portrait of any person: the checks he writes, the types of causes he supports, and what he says "privately" to his doctor. Despite widespread public concern about preserving privacy, these data collection systems have been enacted in the name of "reducing fraud" and "promoting efficiency" in various government programs. Having exposed most areas of American life to ongoing government scrutiny and recording, Congress is now poised to expand and universalize federal tracking of citizen life. The inevitable consequence of such constant surveillance, however, is metastasizing government control over society. If that happens, our government will have perverted its most fundamental mission and destroyed the privacy and liberty that it was supposed to protect

    Designing the Health-related Internet of Things: Ethical Principles and Guidelines

    The conjunction of wireless computing, ubiquitous Internet access, and the miniaturisation of sensors has opened the door for technological applications that can monitor health and well-being outside of formal healthcare systems. The health-related Internet of Things (H-IoT) increasingly plays a key role in health management by providing real-time tele-monitoring of patients, testing of treatments, actuation of medical devices, and fitness and well-being monitoring. Given its numerous applications and proposed benefits, adoption by medical and social care institutions and consumers may be rapid. However, a host of ethical concerns are also raised that must be addressed. The inherent sensitivity of health-related data being generated and latent risks of Internet-enabled devices pose serious challenges. Users, already in a vulnerable position as patients, face a seemingly impossible task to retain control over their data due to the scale, scope and complexity of systems that create, aggregate, and analyse personal health data. In response, the H-IoT must be designed to be technologically robust and scientifically reliable, while also remaining ethically responsible, trustworthy, and respectful of user rights and interests. To assist developers of the H-IoT, this paper describes nine principles and nine guidelines for ethical design of H-IoT devices and data protocols

    A Study on Privacy Preserving Data Publishing With Differential Privacy

    In the era of digitization, it is important to preserve the privacy of the sensitive information around us, e.g., personal information, private user data held by social communication and video streaming sites and services, the salary information and structure of an organization, and the census and statistical data of a country. Such data can be represented in different formats, such as numerical and categorical data, graph data, and tree-structured data. To prevent these data from being illegally exploited and to protect them from privacy threats, an efficient privacy model must be applied to the sensitive data. There have been a great number of studies on privacy-preserving data publishing over the last decades. Differential Privacy (DP) is one of the state-of-the-art methods for preserving the privacy of a database. However, applying DP to high-dimensional tabular data (numerical and categorical) is challenging in terms of the time, memory, and high-frequency computational units required. A well-known solution is to reduce the dimension of the given database while keeping its originality and preserving the relations among all of its entities. In this thesis, we propose PrivFuzzy, a simple and flexible differentially private method that can publish differentially private data after reducing their original dimension with the help of fuzzy logic. Exploiting fuzzy mapping, PrivFuzzy can (1) reduce database columns and create a new low-dimensional correlated database, (2) inject noise into each attribute to ensure differential privacy on the newly created low-dimensional database, and (3) sample each entry in the database and release a synthesized database. The existing literature shows the difficulty of applying differential privacy to a high-dimensional dataset, which we overcome by proposing a novel fuzzy-based approach (PrivFuzzy). 
By applying our novel fuzzy mapping technique, PrivFuzzy transforms a high-dimensional dataset into an equivalent low-dimensional one without losing any relationship within the dataset. Our experiments with real data and comparison with the existing privacy-preserving models PrivBayes and PrivGene show that PrivFuzzy outperforms existing solutions in terms of strength of privacy preservation, simplicity, and utility. Preserving the privacy of graph-structured data while making part of it available is still one of the major problems in data privacy. Most present models try to solve this issue with complex solutions that mix signal with noise, which makes them ineffective in real-time use and practice. One state-of-the-art solution is to apply differential privacy to queries on graph data and its statistics. The challenge here is to reduce the error at publication time, since the differential privacy mechanism adds a large amount of noise and introduces erroneous results, which reduces the utility of the data. In this thesis, we propose a novel Expectation Maximization (EM) based differentially private model for graph datasets. By applying the EM method iteratively in conjunction with the Laplace mechanism, our model applies differentially private noise to the results of several subgraph queries on a graph dataset. In addition, by selecting a maximal noise level θ, our system can generate noisy results with the expected utility. Comparing with existing models on several subgraph counting queries, we claim that our model generates much less noise than the existing models to achieve the expected utility while still preserving privacy
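
The Laplace mechanism named in the abstract above can be sketched for a single counting query as follows. This is a generic textbook illustration, not the thesis's EM-based model: the function names are hypothetical, and the default sensitivity of 1 is an assumption that holds for a simple count of records but can be much larger for subgraph counting queries.

```python
import random


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample as the difference of two exponential draws."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def private_count(true_count, epsilon, sensitivity=1.0, rng=None):
    """Return true_count plus Laplace noise with scale sensitivity / epsilon.

    sensitivity=1.0 is an illustrative default (adding or removing one record
    changes a simple count by at most 1); it is not taken from the source.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = rng or random.Random()
    return true_count + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random(42)
    for eps in (0.1, 1.0, 10.0):
        # Smaller epsilon means stronger privacy and noisier answers.
        print(eps, [round(private_count(100, eps, rng=rng), 2) for _ in range(3)])
```

The noise scale grows as epsilon shrinks, which is the utility cost the abstract refers to when it says the mechanism adds a large amount of noise.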

    The collection, linking and use of data in biomedical research and health care: ethical issues

    This report takes as its starting point the massive accumulation of data in biomedical research and health care, and the increasing power of data science to extract value by linking and re-using that data, for example in further health or population research. It examines the scientific, policy and economic drivers to exploit these opportunities, and the concerns and potential risks associated with doing so. The faltering ability of conventional information governance measures to keep pace with these developments is identified as a significant problem. The report therefore poses and addresses the following question: "how can we define a set of morally reasonable expectations about the use of data in any given data initiative and what conditions are required to give sufficient confidence that those expectations will be satisfied?" The report sets out a number of general recommendations, including four guiding principles for ethical design and governance of data initiatives. These help to identify specific examples of existing good practice and to make recommendations for improved practice in the use of data in the fields of health care (re-use of NHS records, clinical research, etc.) and population research (biobanks, epidemiological studies, etc.)

    Summary care record early adopter programme: an independent evaluation by University College London.

    Benefits: The main potential benefit of the SCR is considered to be in emergency and unscheduled care settings, especially for people who are unconscious, confused, unsure of their medical details, or unable to communicate effectively in English. Other benefits may include improved efficiency of care and avoidance of hospital admission, but it is too early for potential benefits to be verified or quantified. Progress: As of end April 2008, the SCR of 153,188 patients in the first two Early Adopter sites (Bolton and Bury) had been created. A total of 614,052 patients in four Early Adopter sites had been sent a letter informing them of the programme and their choices for opting out of having a SCR. Staff attitudes and usage: The evaluation found that many NHS staff in Early Adopter sites (which had been selected partly for their keenness to innovate in ICT) were enthusiastic about the SCR and keen to see it up and running, but a significant minority of GPs had chosen not to participate in the programme and others had deferred participation until data quality improvement work was completed. Whilst 80 per cent of patients interviewed were either positive about the idea of having a SCR or "did not mind", others were strongly opposed "on principle". Staff who had attempted to use the SCR when caring for patients felt that the current version was technically immature (describing it as "clunky" and "complicated"), and were looking forward to a more definitive version of the technology. A comparable technology (the Emergency Care Summary) introduced in Scotland two years ago is now working well, and over a million records have been accessed in emergency and out-of-hours care. Patient attitudes and awareness: Having a SCR is optional (people may opt out if they wish, though fewer than one per cent of people in Early Adopter sites have done so) and technical security is said to be high via a system of password protection and strict access controls. 
Nevertheless, the evaluation showed that recent stories about data loss by government and NHS organisations had raised concerns amongst both staff and patients that human fallibility could potentially jeopardise the operational security of the system. Despite an extensive information programme to inform the public in Early Adopter sites about the SCR, many patients interviewed by the UCL team were not aware of the programme at all. This raises important questions about the ethics of an "implied consent" model for creating the SCR. The evaluation recommended that the developers of the SCR should consider a model in which the patient is asked for "consent to view" whenever a member of staff wishes to access their record. Not a single patient interviewed in the evaluation was confident that the SCR would be 100 per cent secure, but they were philosophical about the risks of security breaches. Typically, people said that the potential benefit of a doctor having access to key medical details in an emergency outweighed the small but real risk of data loss due to human or technical error. Even patients whose medical record contained potentially sensitive data such as mental health problems, HIV or drug use were often (though not always) keen to have a SCR and generally trusted NHS staff to treat sensitive data appropriately. However, they and many other NHS patients wanted to be able to control which staff members were allowed to access their record at the point of care. Some doctors, nurses and receptionists, it seems, are trusted to view a person's SCR, whereas others are not, and this is a decision which patients would like to make in real time