30 research outputs found
Differentially Private Regression for Discrete-Time Survival Analysis
In survival analysis, regression models are used to understand the effects of
explanatory variables (e.g., age, sex, and weight) on the survival
probability. However, for sensitive survival data such as medical data, there
are serious concerns about the privacy of individuals in the data set when
such data are used to fit the regression models. The closest work addressing
such privacy concerns is the work on Cox regression that linearly projects the
original data to a lower-dimensional space. The weakness of this
approach, however, is that the projection comes with no formal privacy guarantee. In
this work, we propose solutions for the regression problem in survival
analysis under the protection of differential privacy, the gold standard
of privacy protection in data-privacy research. To this end, we extend the
Output Perturbation and Objective Perturbation approaches, which were originally
proposed to guarantee differential privacy for Empirical Risk Minimization
(ERM) problems. In addition, we propose a novel sampling approach based on
the Markov Chain Monte Carlo (MCMC) method that practically guarantees
differential privacy with better accuracy. We show that our proposed approaches
achieve good accuracy compared to the non-private results while guaranteeing
differential privacy for individuals in the private data set.
Comment: 19 pages, CIKM1
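The abstract's Output Perturbation idea can be illustrated with a minimal sketch: release the non-private ERM minimizer plus calibrated noise. This assumes a 1-Lipschitz loss and an L2-regularization strength `lam`, which gives the classic sensitivity bound of 2/(n·lam); the constants for the paper's survival-regression setting may differ, and `theta_hat` here is just a hypothetical pre-computed solution.

```python
import numpy as np

def output_perturbation(theta_hat, epsilon, n, lam):
    """Release an ERM solution with eps-differential privacy by adding
    noise to the non-private minimizer (output-perturbation sketch).

    Assumes the per-example loss is 1-Lipschitz and the objective is
    lam-strongly convex, so the L2 sensitivity of theta_hat is
    2 / (n * lam); the exact constants in the paper may differ.
    """
    d = theta_hat.shape[0]
    sensitivity = 2.0 / (n * lam)
    # Sample noise with density proportional to exp(-eps * ||b|| / sensitivity):
    # uniform direction on the sphere, norm drawn from Gamma(d, sensitivity/eps).
    direction = np.random.normal(size=d)
    direction /= np.linalg.norm(direction)
    norm = np.random.gamma(shape=d, scale=sensitivity / epsilon)
    return theta_hat + norm * direction
```

With a large privacy budget the released vector stays close to the non-private solution; shrinking `epsilon` inflates the expected noise norm proportionally.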
Mining Frequent Graph Patterns with Differential Privacy
Discovering frequent graph patterns in a graph database offers valuable
information in a variety of applications. However, if the graph dataset
contains sensitive data of individuals, such as mobile phone-call graphs and
web-click graphs, releasing discovered frequent patterns may present a threat
to the privacy of individuals. Differential privacy has recently emerged
as the de facto standard for private data analysis due to its provable
privacy guarantee. In this paper, we propose the first differentially private
algorithm for mining frequent graph patterns.
We first show that previous techniques for differentially private discovery of
frequent itemsets cannot be applied to mining frequent graph patterns due to
the inherent complexity of handling structural information in graphs. We then
address this challenge by proposing a Markov Chain Monte Carlo (MCMC) sampling
based algorithm. Unlike previous work on frequent itemset mining, our
techniques do not rely on the output of a non-private mining algorithm.
Instead, we observe that both frequent graph pattern mining and the guarantee
of differential privacy can be unified into an MCMC sampling framework. In
addition, we establish the privacy and utility guarantees of our algorithm and
propose an efficient neighboring-pattern counting technique.
Experimental results show that the proposed algorithm is able to output
frequent patterns with good precision.
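The MCMC framework the abstract describes can be sketched in a toy form: a Metropolis-Hastings chain whose stationary distribution favors high-support patterns, as in the exponential mechanism. The candidate list, support counts, and uniform proposal below are illustrative assumptions; the paper's actual chain walks the space of graph patterns rather than a fixed list.

```python
import math
import random

def mcmc_sample_pattern(candidates, support, epsilon, sensitivity=1.0, steps=2000):
    """Sample one pattern with probability proportional to
    exp(epsilon * support(p) / (2 * sensitivity)) via Metropolis-Hastings.

    Toy sketch: `candidates` is a finite list of hypothetical patterns,
    `support` maps each to its count, and adding or removing one record
    changes any count by at most `sensitivity`.
    """
    current = random.choice(candidates)
    for _ in range(steps):
        proposal = random.choice(candidates)  # symmetric uniform proposal
        # Acceptance ratio of the exponential-mechanism target distribution.
        accept = math.exp(epsilon * (support[proposal] - support[current])
                          / (2.0 * sensitivity))
        if random.random() < min(1.0, accept):
            current = proposal
    return current
```

When one pattern's support dominates, the chain concentrates on it almost surely, which is the utility side of the privacy-utility trade-off the abstract mentions.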
Blowfish Privacy: Tuning Privacy-Utility Trade-offs using Policies
Privacy definitions provide ways of trading off the privacy of individuals
in a statistical database against the utility of downstream analysis of the data.
In this paper, we present Blowfish, a class of privacy definitions inspired by
the Pufferfish framework that provides a rich interface for this trade-off. In
particular, we allow data publishers to extend differential privacy using a
policy, which specifies (a) secrets, or information that must be kept secret,
and (b) constraints that may be known about the data. While the secret
specification allows increased utility by lessening protection for certain
individual properties, the constraint specification provides added protection
against an adversary who knows correlations in the data (arising from
constraints). We formalize policies and present novel algorithms that can
handle general specifications of sensitive information and certain count
constraints. We show that there are reasonable policies under which our privacy
mechanisms for k-means clustering, histograms, and range queries introduce
significantly less noise than their differentially private counterparts. We
quantify the privacy-utility trade-offs for various policies analytically and
empirically on real datasets.
Comment: Full version of the paper at SIGMOD'14, Snowbird, Utah, USA
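A concrete way to see the noise reduction the abstract claims is the Laplace mechanism on a histogram: under standard differential privacy a single record changes one bin count by 1, while a Blowfish-style policy that only protects narrower secrets can yield a smaller effective sensitivity and hence less noise. The `sensitivity` parameter below is a hypothetical stand-in for that policy-derived value, not the paper's actual construction.

```python
import numpy as np

def private_histogram(counts, epsilon, sensitivity=1.0):
    """Release histogram counts via the Laplace mechanism.

    Under standard differential privacy, adding or removing one record
    changes each bin by at most 1, so sensitivity = 1. Under a
    Blowfish-style policy the effective sensitivity (and thus the noise
    scale) can be smaller; here it is just a hypothetical parameter.
    """
    counts = np.asarray(counts, dtype=float)
    noise = np.random.laplace(loc=0.0, scale=sensitivity / epsilon,
                              size=counts.shape)
    return counts + noise
```

Halving the sensitivity halves the Laplace scale, which is exactly the kind of policy-driven utility gain the Blowfish abstract quantifies for histograms and range queries.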
May I Suggest? Comparing Three PLE Recommender Strategies
Personal learning environment (PLE) solutions aim at empowering learners to design (ICT- and web-based) environments for their learning activities, mashing up content, people, and apps for different learning contexts. Widely used in other application areas, recommender systems can be very useful for supporting learners in their PLE-based activities, helping them discover relevant content, peers sharing similar learning interests, or experts on a specific topic. In this paper we examine the use of recommender technology for PLEs. Confronted with a variety of educational contexts, we present three strategies for providing PLE recommendations to learners. We then compare these recommender strategies by discussing their general strengths and weaknesses.