Linear and Range Counting under Metric-based Local Differential Privacy
Local differential privacy (LDP) enables private data sharing and analytics
without the need for a trusted data collector. Error-optimal primitives (e.g.,
for estimating means and item frequencies) under LDP have been well studied.
For analytical tasks such as range queries, however, the best known error bound
is dependent on the domain size of private data, which is potentially
prohibitive. This deficiency is inherent as LDP protects the same level of
indistinguishability between any pair of private data values for each data
owner.
In this paper, we utilize an extension of ε-LDP called Metric-LDP or E-LDP,
where a metric E defines heterogeneous privacy guarantees for different pairs
of private data values and thus provides a more flexible knob than ε does to
relax LDP and tune utility-privacy trade-offs. We show that, under such
privacy relaxations, for analytical workloads such as linear counting,
multi-dimensional range counting queries, and quantile queries, we can achieve
significant gains in utility. In particular, for range queries under E-LDP
where the metric E is the L1-distance function scaled by ε, we design
mechanisms with errors independent of the domain sizes; instead, their errors
depend on the metric E, which specifies at what granularity the private data
is protected. We believe that the primitives we design for E-LDP will be
useful in developing mechanisms for other analytical tasks, and encourage the
adoption of LDP in practice.
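As an illustration of the kind of primitive involved, here is a minimal sketch (not the paper's construction; all names are hypothetical) of a geometric mechanism on integers whose output distribution satisfies P(y | x) ∝ exp(-ε·|x - y|), which is exactly the metric guarantee for the L1 distance scaled by ε:

```python
import math
import random

def two_sided_geometric(epsilon: float) -> int:
    """Sample Z with P(Z = k) proportional to exp(-epsilon * |k|),
    as the difference of two i.i.d. geometric variables."""
    alpha = math.exp(-epsilon)
    # floor(log(U) / log(alpha)) is geometric with P(G >= k) = alpha^k
    g1 = math.floor(math.log(1.0 - random.random()) / math.log(alpha))
    g2 = math.floor(math.log(1.0 - random.random()) / math.log(alpha))
    return g1 - g2

def perturb(value: int, epsilon: float) -> int:
    """Report value plus geometric noise. For any inputs x, x', the output
    distributions differ by a factor of at most exp(epsilon * |x - x'|),
    i.e. the scaled-L1 metric privacy guarantee."""
    return value + two_sided_geometric(epsilon)
```

Because the noise is unbiased, averaging many perturbed reports estimates the true mean with error governed by ε alone, not by the domain size.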
A Unifying Privacy Analysis Framework for Unknown Domain Algorithms in Differential Privacy
There are many existing differentially private algorithms for releasing
histograms, i.e. counts with corresponding labels, in various settings. Our
focus in this survey is to revisit some of the existing differentially private
algorithms for releasing histograms over unknown domains, i.e. the labels of
the counts that are to be released are not known beforehand. The main practical
advantage of releasing histograms over an unknown domain is that the algorithm
does not need to fill in missing labels, i.e., labels that are absent from the
original histogram but could appear in a hypothetical neighboring dataset.
However, the challenge in designing differentially private algorithms for
releasing histograms over an unknown domain is that some outcomes can reveal
exactly which input was used, clearly violating privacy. The goal then is to
show that these differentiating outcomes occur with very low probability. We
present a unified framework for the privacy analyses of several existing
algorithms. Furthermore, our analysis uses approximate concentrated
differential privacy from Bun and Steinke '16, which can yield better privacy
loss parameters than applying differential privacy directly, especially when
composing many of these algorithms together in an overall system.
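A classic instance of this style of algorithm is the stability-based histogram: add Laplace noise to the counts of labels that actually occur, and release only noisy counts above a threshold, so that a label unique to a neighboring dataset would be released with probability at most δ. A minimal sketch (the exact threshold constant varies across papers; the one below is an assumption for illustration, and the function names are hypothetical):

```python
import math
import random
from collections import Counter

def laplace(scale: float) -> float:
    """Sample Laplace noise with the given scale via inverse-CDF."""
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def stability_histogram(items, epsilon: float, delta: float) -> dict:
    """Release noisy counts only for labels present in the data, keeping
    a noisy count only if it clears a threshold chosen so that a label
    unique to a neighboring dataset survives with probability <= delta."""
    counts = Counter(items)
    threshold = 1.0 + math.log(1.0 / (2.0 * delta)) / epsilon
    released = {}
    for label, count in counts.items():
        noisy = count + laplace(1.0 / epsilon)
        if noisy > threshold:
            released[label] = noisy
    return released
```

Note that the loop only ever touches labels present in the input, which is precisely why no enumeration of the full (unknown) domain is needed.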
Graph-based Security and Privacy Analytics via Collective Classification with Joint Weight Learning and Propagation
Many security and privacy problems can be modeled as a graph classification
problem, where nodes in the graph are classified simultaneously via collective
classification. State-of-the-art collective classification methods for such
graph-based security and privacy analytics follow a common paradigm: assign
weights to the edges of the graph, iteratively propagate reputation scores of
nodes over the weighted graph, and use the final reputation scores to classify
nodes in the graph. The key challenge is to assign edge weights such
that an edge has a large weight if the two corresponding nodes have the same
label, and a small weight otherwise. Although collective classification has
been studied and applied for security and privacy problems for more than a
decade, how to address this challenge is still an open question. In this work,
we propose a novel collective classification framework to address this
long-standing challenge. We first formulate learning edge weights as an
optimization problem, which quantifies the goals about the final reputation
scores that we aim to achieve. However, it is computationally hard to solve the
optimization problem because the final reputation scores depend on the edge
weights in a very complex way. To address the computational challenge, we
propose to jointly learn the edge weights and propagate the reputation scores,
which is essentially an approximate solution to the optimization problem. We
compare our framework with state-of-the-art methods for graph-based security
and privacy analytics using four large-scale real-world datasets from various
application scenarios such as Sybil detection in social networks, fake review
detection in Yelp, and attribute inference attacks. Our results demonstrate
that our framework achieves higher accuracies than state-of-the-art methods
with an acceptable computational overhead.
Comment: Network and Distributed System Security Symposium (NDSS), 2019.
Dataset link: http://gonglab.pratt.duke.edu/code-dat
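The propagation step of the paradigm described above can be sketched as follows (an illustrative linear score-propagation loop, not the paper's specific algorithm; names and the [-1, 1] score convention are assumptions):

```python
from collections import defaultdict

def propagate(edges, priors, num_iters=10):
    """Iteratively spread reputation scores over a weighted, undirected
    graph: each node's score is its prior plus the weighted sum of its
    neighbors' scores, clamped to [-1, 1] to keep the iteration stable.
    edges: {(u, v): weight}; priors: {node: initial score}."""
    adjacency = defaultdict(list)
    for (u, v), w in edges.items():
        adjacency[u].append((v, w))
        adjacency[v].append((u, w))
    scores = dict(priors)
    for _ in range(num_iters):
        updated = {
            v: priors.get(v, 0.0) + sum(w * scores.get(u, 0.0) for u, w in nbrs)
            for v, nbrs in adjacency.items()
        }
        scores = {v: max(-1.0, min(1.0, s)) for v, s in updated.items()}
    return scores
```

A large positive edge weight pulls a node's score toward its neighbor's label, which is why learning those weights well (the paper's contribution) matters so much for the final classification.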
Game Theory Based Correlated Privacy Preserving Analysis in Big Data
Privacy preservation is one of the greatest concerns in big data. As one of the extensive applications in big data, privacy preserving data publication (PPDP) has become an important research field. One of the fundamental challenges in PPDP is the trade-off between privacy and utility of a single, independent data set. However, recent research has shown that an advanced privacy mechanism, i.e., differential privacy, is vulnerable when multiple data sets are correlated. In this case, the trade-off between privacy and utility evolves into a game problem, in which the payoff of each player depends on his own and his neighbors' privacy parameters. In this paper, we first present the definition of correlated differential privacy to evaluate the real privacy level of a single data set influenced by the other data sets. Then, we construct a game model of multiple players, in which each player publishes a data set sanitized by differential privacy. Next, we analyze the existence and uniqueness of the pure Nash Equilibrium. We refer to a notion, the price of anarchy, to evaluate the efficiency of the pure Nash Equilibrium. Finally, we show the correctness of our game analysis via simulation experiments.
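The search for a pure Nash Equilibrium in this kind of game can be sketched with best-response dynamics under a toy payoff model (the model below is an illustrative assumption, not the paper's: utility grows logarithmically with a looser budget, while the effective cost couples a player's budget to the mean of its neighbors' budgets):

```python
import math

def best_response_dynamics(neighbors, cost=0.4, coupling=0.5,
                           grid=None, max_iters=50):
    """Each player repeatedly picks the privacy budget eps from a grid
    that maximizes ln(1 + eps + coupling * nbr_mean) - cost * eps, given
    the neighbors' current budgets; a fixed point of the simultaneous
    update is a pure Nash Equilibrium on the grid.
    neighbors: {player: [neighboring players]}."""
    if grid is None:
        grid = [k / 10.0 for k in range(51)]  # candidate budgets 0.0 .. 5.0
    eps = {player: 0.0 for player in neighbors}
    for _ in range(max_iters):
        updated = {}
        for player, nbrs in neighbors.items():
            nbr_mean = sum(eps[n] for n in nbrs) / max(len(nbrs), 1)
            updated[player] = max(
                grid,
                key=lambda e: math.log(1.0 + e + coupling * nbr_mean) - cost * e,
            )
        if updated == eps:  # no player can improve: pure Nash Equilibrium
            break
        eps = updated
    return eps
```

Because each best response here is a contraction in the neighbors' mean budget, the simultaneous updates settle at a unique symmetric equilibrium; the paper's analysis establishes existence and uniqueness for its actual payoff structure.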
Privacy Preservation and Analytical Utility of E-Learning Data Mashups in the Web of Data
Virtual learning environments contain valuable data about students that can be correlated and analyzed to optimize learning. Modern learning environments based on data mashups that collect and integrate data from multiple sources are relevant for learning analytics systems because they provide insights into students' learning. However, data sets involved in mashups may contain personal information of a sensitive nature that raises legitimate privacy concerns. Conventional privacy preservation methods are based on preemptive approaches that limit the published data in a mashup through access control and authentication schemes. Such limitations may reduce the analytical utility of the data exposed to gain insights into students' learning. In order to reconcile utility and privacy preservation of published data, this research proposes a new data mashup protocol capable of merging and k-anonymizing data sets in cloud-based learning environments without jeopardizing the analytical utility of the information. The implementation of the protocol is based on linked data so that data sets involved in the mashups are semantically described, thereby enabling their combination with relevant educational data sources. The k-anonymized data sets returned by the protocol still retain essential information for supporting general data exploration and statistical analysis tasks. The analytical and empirical evaluation shows that the proposed protocol prevents individuals' sensitive information from being re-identified.
The Spanish National Research Agency (AEI) funded this research through the project CREPES (ref. PID2020-115844RB-I00) with ERDF funds.
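k-anonymity, as used here, requires each released record to be indistinguishable from at least k - 1 others on its quasi-identifiers. A minimal suppression-only sketch (the paper's protocol also handles generalization and linked-data semantics, which this omits; names are hypothetical):

```python
from collections import defaultdict

def k_anonymize(records, quasi_identifiers, k):
    """Suppress every record whose combination of quasi-identifier
    values is shared by fewer than k records, so each released record
    is indistinguishable from at least k - 1 others on those fields.
    records: list of dicts; quasi_identifiers: list of field names."""
    groups = defaultdict(list)
    for record in records:
        key = tuple(record[q] for q in quasi_identifiers)
        groups[key].append(record)
    return [record
            for group in groups.values() if len(group) >= k
            for record in group]
```

Non-quasi-identifier fields (such as grades in a learning-analytics mashup) pass through untouched, which is how the released data retains utility for exploration and statistics.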
A quest for efficacy in data protection: a legal and behavioural analysis
The article questions the role of consent to the processing of personal data,
especially in the technological context, as a tool of self-determination or
protection of the data subject. The work draws on some results from
behavioural studies and some Italian judgements regarding other private
actions in data protection law (liability for damages). Some suggestions
towards an alternative scenario are finally offered.