Data Leak Detection As a Service: Challenges and Solutions
We describe a network-based data-leak detection (DLD)
technique, the main feature of which is that the detection
does not require the data owner to reveal the content of the
sensitive data. Instead, only a small number of specialized
digests are needed. Our technique – referred to as the fuzzy
fingerprint – can be used to detect accidental data leaks due
to human errors or application flaws. The privacy-preserving
feature of our algorithms minimizes the exposure of sensitive
data and enables the data owner to safely delegate the
detection to others. We describe how cloud providers can offer
their customers data-leak detection as an add-on service
with strong privacy guarantees.
We perform extensive experimental evaluation on the privacy,
efficiency, accuracy and noise tolerance of our techniques.
Our evaluation results under various data-leak scenarios
and setups show that our method can support accurate
detection with a very small number of false alarms, even
when the presentation of the data has been transformed. They
also indicate that the detection accuracy does not degrade
when partial digests are used. We further provide a quantifiable
method to measure the privacy guarantee offered by our
fuzzy fingerprint framework.
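The digest-based detection idea in this abstract can be illustrated with a minimal sketch. This is not the authors' fuzzy fingerprint algorithm: it assumes plain truncated SHA-256 digests over fixed-size byte windows, and the names `shingle_digests` and `leak_score`, the window size, and the sample strings are all hypothetical. It only shows how a detector holding digests, rather than the sensitive data itself, could flag traffic, and how a partial digest set might still suffice.

```python
import hashlib


def shingle_digests(text: str, n: int = 8, bits: int = 32) -> set[int]:
    """Hash every n-byte window of the text to a truncated integer digest."""
    data = text.encode("utf-8")
    digests = set()
    for i in range(max(len(data) - n + 1, 0)):
        h = hashlib.sha256(data[i:i + n]).digest()
        digests.add(int.from_bytes(h[: bits // 8], "big"))
    return digests


def leak_score(released: set[int], traffic: str, n: int = 8, bits: int = 32) -> float:
    """Fraction of the released digests that also occur in the traffic."""
    if not released:
        return 0.0
    seen = shingle_digests(traffic, n, bits)
    return len(released & seen) / len(released)


# Hypothetical sensitive record; only its digests leave the data owner.
secret = "account 4111-1111-1111-1111 belongs to Alice Example"
released = shingle_digests(secret)

# A partial digest set (every other digest) stands in for "partial digests".
partial = set(sorted(released)[::2])

leaky = "POST /upload body=... " + secret + " ..."
clean = "GET /index.html HTTP/1.1 Host: example.org User-Agent: test"

print(leak_score(released, leaky))
print(leak_score(partial, leaky))
print(leak_score(released, clean))
```

In this toy setting the detector sees only integers, not the record; the scheme described in the abstract additionally gives a quantifiable privacy guarantee, which this sketch does not attempt to provide.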
Privacy in the Genomic Era
Genome sequencing technology has advanced at a rapid pace and it is now
possible to generate highly detailed genotypes inexpensively. The collection
and analysis of such data has the potential to support various applications,
including personalized medical services. While the benefits of the genomics
revolution are trumpeted by the biomedical community, the increased
availability of such data has major implications for personal privacy; notably
because the genome has certain essential features, which include (but are not
limited to) (i) an association with traits and certain diseases, (ii)
identification capability (e.g., forensics), and (iii) revelation of family
relationships. Moreover, direct-to-consumer DNA testing increases the
likelihood that genome data will be made available in less regulated
environments, such as the Internet and for-profit companies. The problem of
genome data privacy thus resides at the crossroads of computer science,
medicine, and public policy. While computer scientists have addressed data
privacy for various data types, there has been less attention dedicated to
genomic data. Thus, the goal of this paper is to provide a systematization of
knowledge for the computer science community. In doing so, we address some of
the (sometimes erroneous) beliefs of this field and we report on a survey we
conducted about genome data privacy with biomedical specialists. Then, after
characterizing the genome privacy problem, we review the state-of-the-art
regarding privacy attacks on genomic data and strategies for mitigating such
attacks, as well as contextualizing these attacks from the perspective of
medicine and public policy. This paper concludes with an enumeration of the
challenges for genome data privacy and presents a framework to systematize the
analysis of threats and the design of countermeasures as the field moves
forward.
Secure Similar Sequence Query on Outsourced Genomic Data
The growing availability of genomic data is unlocking research potential in genomic-data analysis. It is of great importance to outsource genomic-analysis tasks to clouds in order to leverage their powerful computational resources over large-scale genomic sequences. However, the remote placement of the data raises personal-privacy concerns, and it is challenging to evaluate data-analysis functions on outsourced genomic data securely and efficiently. In this work, we study the secure similar-sequence-query (SSQ) problem over outsourced genomic data, which has not been fully investigated. To address the challenges of security and efficiency, we propose two protocols in the mixed form, which combine two-party secure secret sharing, garbled circuits, and partially homomorphic encryption and use them to jointly fulfill the secure SSQ function. In addition, our protocols support multi-user queries over a joint genomic data set collected from multiple data owners, making our solution scalable. We formally prove the security of the protocols under the semi-honest adversary model, and theoretically analyze their performance. We use extensive experiments over a real-world dataset on a commercial cloud platform to validate the efficacy of our proposed solution, and demonstrate the performance improvements compared with state-of-the-art works.
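One of the building blocks named in this abstract, two-party secret sharing, can be sketched as follows. This is a generic additive-sharing illustration and not the protocols proposed in the paper; the modulus, the function names, and the example values are assumptions made for the sketch. Each value is split into two random-looking shares, each server adds its shares locally, and only the recombined result reveals the sum.

```python
import secrets

MOD = 2 ** 32  # assumed share modulus for this sketch


def share(value: int) -> tuple[int, int]:
    """Split value into two additive shares modulo MOD."""
    r = secrets.randbelow(MOD)
    return r, (value - r) % MOD


def add_shares(a: int, b: int) -> int:
    """Local addition performed by one server on its own shares."""
    return (a + b) % MOD


def reconstruct(s0: int, s1: int) -> int:
    """Recombine the two servers' shares into the plaintext value."""
    return (s0 + s1) % MOD


# Hypothetical per-position mismatch counts contributed by two parties.
x, y = 5, 12
x0, x1 = share(x)
y0, y1 = share(y)

# Server 0 holds (x0, y0), server 1 holds (x1, y1); neither sees x or y.
z0 = add_shares(x0, y0)
z1 = add_shares(x1, y1)

print(reconstruct(z0, z1))  # 17
```

Addition is the easy case because it needs no interaction between the servers; comparisons and minimum selection are the kind of step for which mixed protocols typically bring in garbled circuits or homomorphic encryption.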
Blockchain for Genomics: A Systematic Literature Review
Human genomic data carry unique information about an individual and offer
unprecedented opportunities for healthcare. The clinical interpretations
derived from large genomic datasets can greatly improve healthcare and pave the
way for personalized medicine. Sharing genomic datasets, however, poses major
challenges, as genomic data is different from traditional medical data,
indirectly revealing information about descendants and relatives of the data
owner and carrying valid information even after the owner passes away.
Therefore, stringent data ownership and control measures are required when
dealing with genomic data. In order to provide secure and accountable
infrastructure, blockchain technologies offer a promising alternative to
traditional distributed systems. Indeed, the research on blockchain-based
infrastructures tailored to genomics is on the rise. However, there is a lack
of a comprehensive literature review that summarizes the current
state-of-the-art methods in the applications of blockchain in genomics. In this
paper, we systematically look at the existing work, both commercial and
academic, and discuss the major opportunities and challenges. Our study is
driven by five research questions that we aim to answer in our review. We also
present our projections of future research directions, which we hope will
benefit researchers interested in the area.