Security and Privacy in Three States of Information
In a computational context, information can be in one of three states at any given time: in transit, in process, or in storage. When the security and privacy of information are of concern, each of these states should be addressed separately, i.e., network security, computer security, and database/cloud security, respectively. This chapter first introduces the three states of information and then addresses the security and privacy issues that relate to each state. It provides practical examples for each state discussed, introduces corresponding security and privacy algorithms to explain the concepts, and facilitates their implementation where needed. Moreover, the security and privacy techniques pertaining to the three states of information are combined to offer a more comprehensive and realistic view of everyday security practices.
Privacy as Product Safety
Online social media confound many of our familiar expectations about privacy. Contrary to popular myth, users of social software like Facebook do care about privacy, deserve it, and have trouble securing it for themselves. Moreover, traditional database-focused privacy regulations on the Fair Information Practices model, while often worthwhile, fail to engage with the distinctively social aspects of these online services.
Instead, online privacy law should take inspiration from a perhaps surprising quarter: product-safety law. A web site that directs users' personal information in ways they don't expect is a defectively designed product, and many concepts from products liability law could usefully be applied to the structurally similar problem of privacy in social software. After setting the scene with a discussion of how people use Facebook and why standard assumptions about privacy and privacy law fail, this essay examines the parallel between physically safe products and privacy-safe social software. It illustrates the value of the product-safety approach by considering another ripped-from-the-headlines example: Google Buzz.
Privacy-Preserving Tensor Factorization for Collaborative Health Data Analysis
Tensor factorization has been demonstrated as an efficient approach for computational phenotyping, where massive electronic health records (EHRs) are converted to concise and meaningful clinical concepts. While distributing the tensor factorization tasks to local sites can avoid direct data sharing, it still requires the exchange of intermediary results, which could reveal sensitive patient information. The challenge is therefore how to jointly decompose the tensor under rigorous and principled privacy constraints while still supporting the model's interpretability. We propose DPFact, a privacy-preserving collaborative tensor factorization method for computational phenotyping using EHRs. It embeds advanced privacy-preserving mechanisms within collaborative learning. Hospitals can keep their EHR databases private while collaboratively learning meaningful clinical concepts by sharing differentially private intermediary results. Moreover, DPFact handles heterogeneous patient populations using a structured sparsity term. In our framework, each hospital decomposes its local tensors and, every several iterations, sends the updated intermediary results with output perturbation to a semi-trusted server that generates the phenotypes. Evaluation on both real-world and synthetic datasets demonstrated that, under strict privacy constraints, our method is more accurate and communication-efficient than state-of-the-art baseline methods.
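The output perturbation step described in this abstract, where a site adds calibrated noise to a local factor matrix before sharing it, can be sketched with the standard Gaussian mechanism. This is an illustrative sketch only: the function name, parameters, and noise calibration below are generic differential-privacy conventions, not the specific mechanism or sensitivity analysis used by DPFact.

```python
import numpy as np

def perturb_factor(factor, l2_sensitivity, epsilon, delta, rng=None):
    """Add Gaussian noise to a local factor matrix before sharing it.

    Follows the textbook Gaussian mechanism for (epsilon, delta)-DP:
        sigma = sqrt(2 * ln(1.25 / delta)) * l2_sensitivity / epsilon
    The l2_sensitivity is the assumed bound on how much one patient's
    record can change the factor matrix (illustrative parameter).
    """
    rng = np.random.default_rng() if rng is None else rng
    sigma = np.sqrt(2.0 * np.log(1.25 / delta)) * l2_sensitivity / epsilon
    # Each site would call this on its intermediary result every few
    # iterations before sending it to the semi-trusted server.
    return factor + rng.normal(0.0, sigma, size=factor.shape)
```

Sending noisy intermediaries only every several iterations, as the abstract describes, also reduces both the privacy budget consumed and the communication cost.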
Characterizing the Sample Complexity of Private Learners
In 2008, Kasiviswanathan et al. defined private learning as a combination of PAC learning and differential privacy. Informally, a private learner is applied to a collection of labeled individual information and outputs a hypothesis while preserving the privacy of each individual. Kasiviswanathan et al. gave a generic construction of private learners for (finite) concept classes, with sample complexity logarithmic in the size of the concept class. This sample complexity is higher than what is needed for non-private learners, hence leaving open the possibility that the sample complexity of private learning may sometimes be significantly higher than that of non-private learning.

We give a combinatorial characterization of the sample size sufficient and necessary to privately learn a class of concepts. This characterization is analogous to the well-known characterization of the sample complexity of non-private learning in terms of the VC dimension of the concept class. We introduce the notion of probabilistic representation of a concept class, and our new complexity measure RepDim corresponds to the size of the smallest probabilistic representation of the concept class.

We show that any private learning algorithm for a concept class C with sample complexity m implies RepDim(C)=O(m), and that there exists a private learning algorithm with sample complexity m=O(RepDim(C)). We further demonstrate that a similar characterization holds for the database size needed for privately computing a large class of optimization problems and also for the well-studied problem of private data release.
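In symbols, the two bounds stated in the abstract combine into a tight characterization (this restates the abstract's claim in asymptotic notation; it is not a quotation from the paper):

```latex
\mathrm{RepDim}(C) = O(m)
\quad\text{and}\quad
m = O(\mathrm{RepDim}(C)),
```

so the sample complexity of privately learning a concept class $C$ is $\Theta(\mathrm{RepDim}(C))$, just as non-private sample complexity is $\Theta(\mathrm{VC}(C))$.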
Defining Privacy and Utility in Data Sets
Is it possible to release useful data while preserving the privacy of the individuals whose information is in the database? This question has been the subject of considerable controversy, particularly in the wake of well-publicized instances in which researchers showed how to re-identify individuals in supposedly anonymous data. Some have argued that privacy and utility are fundamentally incompatible, while others have suggested that simple steps can be taken to achieve both simultaneously. Both sides have looked to the computer science literature for support. What the existing debate has overlooked, however, is that the relationship between privacy and utility depends crucially on what one means by "privacy" and what one means by "utility." Apparently contradictory results in the computer science literature can be explained by the use of different definitions to formalize these concepts. Without sufficient attention to these definitional issues, it is all too easy to overgeneralize the technical results. More importantly, there are nuances to how definitions of "privacy" and "utility" can differ from each other, nuances that matter for why a definition that is appropriate in one context may not be appropriate in another. Analyzing these nuances exposes the policy choices inherent in the choice of one definition over another and thereby elucidates decisions about whether and how to regulate data privacy across varying social contexts.
PriCL: Creating a Precedent, a Framework for Reasoning about Privacy Case Law
We introduce PriCL: the first framework for expressing and automatically reasoning about privacy case law by means of precedent. PriCL is parametric in an underlying logic for expressing world properties, and provides support for court decisions, their justification, the circumstances in which the justification applies, as well as court hierarchies. Moreover, the framework offers a tight connection between privacy case law and the notion of norms that underlies existing rule-based privacy research. In terms of automation, we identify the major reasoning tasks for privacy cases, such as deducing legal permissions or extracting norms. For solving these tasks, we provide generic algorithms that have particularly efficient realizations within an expressive underlying logic. Finally, we derive a definition of deducibility based on legal concepts and subsequently propose an equivalent characterization in terms of logic satisfiability.