
    Influence in Classification via Cooperative Game Theory

    A dataset has been classified by some unknown classifier into two types of points. What were the most important factors in determining the classification outcome? In this work, we employ an axiomatic approach in order to uniquely characterize an influence measure: a function that, given a set of classified points, outputs a value for each feature corresponding to its influence in determining the classification outcome. We show that our influence measure takes on an intuitive form when the unknown classifier is linear. Finally, we employ our influence measure to analyze the effects of user profiling on Google's online display advertising. Comment: accepted to IJCAI 201
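
    The paper's axiomatically characterized measure is not reproduced here, but the sketch below shows the kind of per-feature score the abstract describes for the linear case: each feature of a known linear classifier is scored by its average absolute contribution to the decision value. This coefficient-weighted heuristic is an illustrative stand-in under that assumption, not the paper's measure.

```python
import numpy as np

def linear_feature_influence(X, w):
    """X: (m, d) matrix of classified points; w: weights of a linear
    classifier sign(X @ w + b).  Score each feature by its mean absolute
    contribution to the decision value, normalized to sum to 1."""
    contrib = np.abs(X * w).mean(axis=0)     # per-feature contribution
    return contrib / contrib.sum()

# Toy example: feature 0 dominates the classification outcome.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
w = np.array([2.0, 0.5, 0.0])
print(linear_feature_influence(X, w))
```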

    A Logical Method for Policy Enforcement over Evolving Audit Logs

    We present an iterative algorithm for enforcing policies represented in a first-order logic, which can, in particular, express all transmission-related clauses in the HIPAA Privacy Rule. The logic has three features that raise challenges for enforcement --- uninterpreted predicates (used to model subjective concepts in privacy policies), real-time temporal properties, and quantification over infinite domains (such as the set of messages containing personal information). The algorithm operates over audit logs that are inherently incomplete and evolve over time. In each iteration, the algorithm provably checks as much of the policy as possible over the current log and outputs a residual policy that can only be checked when the log is extended with additional information. We prove correctness and termination properties of the algorithm. While these results are developed in a general form, accounting for many different sources of incompleteness in audit logs, we also prove that for the special case of logs that maintain a complete record of all relevant actions, the algorithm effectively enforces all safety and co-safety properties. The algorithm can significantly help automate enforcement of policies derived from the HIPAA Privacy Rule. Comment: Carnegie Mellon University CyLab Technical Report. 51 pages
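
    A minimal sketch of the iterative idea, assuming a toy propositional policy language rather than the paper's first-order logic: atoms whose facts are absent from the incomplete log evaluate to "unknown" and are returned in a residual policy that is re-checked once the log is extended.

```python
from dataclasses import dataclass

TRUE, FALSE, UNKNOWN = "true", "false", "unknown"

@dataclass
class Atom:
    fact: tuple          # e.g. ("disclosed", "msg1")

@dataclass
class And:
    left: object
    right: object

@dataclass
class Log:
    known_true: set
    known_false: set

def check(policy, log):
    """Return (verdict, residual): verdict in {TRUE, FALSE, UNKNOWN};
    residual is the part of the policy still to be checked later."""
    if isinstance(policy, Atom):
        if policy.fact in log.known_true:
            return TRUE, None
        if policy.fact in log.known_false:
            return FALSE, None
        return UNKNOWN, policy                 # log is silent: defer
    if isinstance(policy, And):
        v1, r1 = check(policy.left, log)
        v2, r2 = check(policy.right, log)
        if FALSE in (v1, v2):
            return FALSE, None
        if v1 == TRUE and v2 == TRUE:
            return TRUE, None
        residual = r1 if r2 is None else r2 if r1 is None else And(r1, r2)
        return UNKNOWN, residual
    raise ValueError("unsupported policy form")

# Iteration 1: the log records a disclosure but not yet its purpose.
policy = And(Atom(("disclosed", "msg1")),
             Atom(("purpose_is_treatment", "msg1")))
log1 = Log(known_true={("disclosed", "msg1")}, known_false=set())
verdict, residual = check(policy, log1)
print(verdict, residual)                       # unknown, residual = purpose atom

# Iteration 2: the extended log resolves the residual policy.
log2 = Log(known_true={("disclosed", "msg1"),
                       ("purpose_is_treatment", "msg1")},
           known_false=set())
print(check(residual, log2))                   # ('true', None)
```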

    A Methodology for Information Flow Experiments

    Information flow analysis has largely ignored the setting where the analyst has neither control over nor a complete model of the analyzed system. We formalize such limited information flow analyses and study an instance of them: detecting the usage of data by websites. We prove that these problems are problems of causal inference. Leveraging this connection, we push beyond traditional information flow analysis to provide a systematic methodology based on experimental science and statistical analysis. Our methodology allows us to systematize prior works in the area, viewing them as instances of a general approach. Our systematic study leads to practical advice for improving work on detecting data usage, a previously unformalized area. We illustrate these concepts with a series of experiments collecting data on the use of information by websites, which we statistically analyze.
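
    A hedged sketch of the experimental style the abstract describes, using hypothetical data: randomly assign the input of interest to treatment and control browser instances, observe the website's outputs, and run a permutation test to decide whether the outputs depend on the input, i.e., whether information flowed.

```python
import numpy as np

def permutation_test(treatment, control, trials=10_000, seed=0):
    """Two-sample permutation test on the difference of means; returns a p-value."""
    rng = np.random.default_rng(seed)
    observed = abs(np.mean(treatment) - np.mean(control))
    pooled = np.concatenate([treatment, control])
    n = len(treatment)
    count = 0
    for _ in range(trials):
        rng.shuffle(pooled)
        if abs(pooled[:n].mean() - pooled[n:].mean()) >= observed:
            count += 1
    return (count + 1) / (trials + 1)

# Hypothetical measurements: number of weight-loss ads shown to browser
# instances that did (treatment) or did not (control) visit dieting sites.
treatment = np.array([7, 9, 6, 8, 10, 7, 9, 8])
control   = np.array([3, 4, 2, 5, 3, 4, 2, 3])
print("p =", permutation_test(treatment, control))
```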

    Formal Verification of Differential Privacy for Interactive Systems

    Differential privacy is a promising approach to privacy-preserving data analysis, with a well-developed theory for functions. Despite recent work on implementing systems that aim to provide differential privacy, the problem of formally verifying that these systems have differential privacy has not been adequately addressed. This paper presents the first results towards automated verification of source code for differentially private interactive systems. We develop a formal probabilistic automaton model of differential privacy for systems by adapting prior work on differential privacy for functions. The main technical result of the paper is a sound proof technique, based on a form of probabilistic bisimulation relation, for proving that a system modeled as a probabilistic automaton satisfies differential privacy. The novelty lies in the way we track quantitative privacy leakage bounds using a relation family instead of a single relation. We illustrate the proof technique on a representative automaton motivated by PINQ, an implemented system that is intended to provide differential privacy. To make our proof technique easier to apply to realistic systems, we prove a form of refinement theorem and apply it to show that a refinement of the abstract PINQ automaton also satisfies our differential privacy definition. Finally, we begin the process of automating our proof technique by providing an algorithm for mechanically checking a restricted class of relations from the proof technique. Comment: 65 pages with 1 figure
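
    The bisimulation-family proof technique itself does not compress into a few lines, so the sketch below instead checks the differential-privacy inequality directly, by brute force, on the output distribution of a toy randomized-response mechanism. It only illustrates the property being verified, not the paper's method or its PINQ case study.

```python
import math
from fractions import Fraction

def randomized_response_dist(bit, p=Fraction(3, 4)):
    """Output distribution of randomized response: report the true bit with
    probability p, the flipped bit otherwise."""
    return {bit: p, 1 - bit: 1 - p}

def check_dp(dist1, dist2, eps):
    """True iff Pr[o | D1] <= e^eps * Pr[o | D2] holds in both directions
    for every output o (the differential-privacy inequality)."""
    bound = math.exp(eps)
    outputs = set(dist1) | set(dist2)
    return all(
        float(dist1.get(o, 0)) <= bound * float(dist2.get(o, 0)) and
        float(dist2.get(o, 0)) <= bound * float(dist1.get(o, 0))
        for o in outputs
    )

d1, d2 = randomized_response_dist(0), randomized_response_dist(1)
print(check_dp(d1, d2, eps=math.log(3)))   # ln(3)-DP holds for p = 3/4
print(check_dp(d1, d2, eps=0.5))           # too small a budget: fails
```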

    Differentially Private Data Analysis of Social Networks via Restricted Sensitivity

    We introduce the notion of restricted sensitivity as an alternative to global and smooth sensitivity to improve accuracy in differentially private data analysis. The definition of restricted sensitivity is similar to that of global sensitivity, except that instead of quantifying over all possible datasets, we take advantage of any beliefs about the dataset that a querier may have to quantify over a restricted class of datasets. Specifically, given a query f and a hypothesis H about the structure of a dataset D, we show generically how to transform f into a new query f_H whose global sensitivity (over all datasets, including those that do not satisfy H) matches the restricted sensitivity of the query f. Moreover, if the belief of the querier is correct (i.e., D is in H) then f_H(D) = f(D). If the belief is incorrect, then f_H(D) may be inaccurate. We demonstrate the usefulness of this notion by considering the task of answering queries regarding social networks, which we model as a combination of a graph and a labeling of its vertices. In particular, while our generic procedure is computationally inefficient, for the specific definition of H as graphs of bounded degree, we exhibit efficient ways of constructing f_H using different projection-based techniques. We then analyze two important query classes: subgraph counting queries (e.g., the number of triangles) and local profile queries (e.g., the number of people who know a spy and a computer scientist who know each other). We demonstrate that the restricted sensitivity of such queries can be significantly lower than their smooth sensitivity. Thus, using restricted sensitivity we can maintain privacy whether or not D is in H, while providing more accurate results in the event that H holds true.
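
    A rough illustration of the mechanism shape, assuming the hypothesis H = "every vertex has degree at most D": project the graph onto H, count triangles, and add Laplace noise scaled to a degree-bounded sensitivity. The greedy projection and the sensitivity constant used here are simplified stand-ins, not the paper's projection operators or exact bounds.

```python
import random
from itertools import combinations

def project_to_bounded_degree(edges, D):
    """Greedy projection onto H: keep an edge only while both endpoints
    still have degree below D (a simplified stand-in for the paper's
    projection techniques)."""
    deg, kept = {}, []
    for u, v in edges:
        if deg.get(u, 0) < D and deg.get(v, 0) < D:
            kept.append((u, v))
            deg[u] = deg.get(u, 0) + 1
            deg[v] = deg.get(v, 0) + 1
    return kept

def count_triangles(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return sum(1 for a, b, c in combinations(sorted(adj), 3)
               if b in adj[a] and c in adj[a] and c in adj[b])

def private_triangle_count(edges, D, eps):
    projected = project_to_bounded_degree(edges, D)
    sensitivity = D   # one edge joins at most ~D triangles when degrees <= D
    # Laplace(sensitivity/eps) noise as a difference of two exponentials.
    noise = (random.expovariate(eps / sensitivity)
             - random.expovariate(eps / sensitivity))
    return count_triangles(projected) + noise

edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 0), (1, 3)]
print(private_triangle_count(edges, D=3, eps=1.0))
```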

    Towards Human Computable Passwords

    An interesting challenge for the cryptography community is to design authentication protocols that are so simple that a human can execute them without relying on a fully trusted computer. We propose several candidate authentication protocols for a setting in which the human user can only receive assistance from a semi-trusted computer --- a computer that stores information and performs computations correctly but does not provide confidentiality. Our schemes use a semi-trusted computer to store and display public challenges $C_i \in [n]^k$. The human user memorizes a random secret mapping $\sigma: [n] \rightarrow \mathbb{Z}_d$ and authenticates by computing responses $f(\sigma(C_i))$ to a sequence of public challenges, where $f: \mathbb{Z}_d^k \rightarrow \mathbb{Z}_d$ is a function that is easy for the human to evaluate. We prove that any statistical adversary needs to sample $m = \tilde{\Omega}(n^{s(f)})$ challenge-response pairs to recover $\sigma$, for a security parameter $s(f)$ that depends on two key properties of $f$. To obtain our results, we apply the general hypercontractivity theorem to lower bound the statistical dimension of the distribution over challenge-response pairs induced by $f$ and $\sigma$. Our lower bounds apply to arbitrary functions $f$ (not just to functions that are easy for a human to evaluate), and generalize recent results of Feldman et al. As an application, we propose a family of human computable password functions $f_{k_1,k_2}$ in which the user needs to perform $2k_1 + 2k_2 + 1$ primitive operations (e.g., adding two digits or remembering $\sigma(i)$), and we show that $s(f) = \min\{k_1+1, (k_2+1)/2\}$. For these schemes, we prove that forging passwords is equivalent to recovering the secret mapping. Thus, our human computable password schemes can maintain strong security guarantees even after an adversary has observed the user log in to many different accounts. Comment: Fixed a bug in the definition of $Q^{f,j}$ and modified proofs accordingly.
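
    A minimal sketch of the protocol flow, with a deliberately simple response function $f$ (sum of the selected secret digits mod $d$) standing in for the paper's $f_{k_1,k_2}$ family: it shows the roles of the memorized mapping $\sigma$, the public challenges, and the human-computed responses, not the scheme's security properties.

```python
import random

n, d, k = 26, 10, 4   # secret alphabet size, digit base, challenge length

def keygen(seed=None):
    """The human memorizes a random secret mapping sigma: [n] -> Z_d."""
    rng = random.Random(seed)
    return [rng.randrange(d) for _ in range(n)]

def respond(sigma, challenge):
    """Human-computable response f(sigma(C)): sum of the k secret digits mod d.
    (Illustrative stand-in for the paper's f_{k1,k2} family.)"""
    return sum(sigma[i] for i in challenge) % d

sigma = keygen(seed=42)                 # memorized by the user, known to the verifier
challenge = [3, 17, 8, 21]              # public challenge C in [n]^k shown on screen
response = respond(sigma, challenge)    # computed in the user's head
print("challenge:", challenge, "response:", response)
print("verifier accepts:", respond(sigma, challenge) == response)
```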