User's Privacy in Recommendation Systems Applying Online Social Network Data, A Survey and Taxonomy
Recommender systems have become an integral part of many social networks and extract knowledge from a user's personal and sensitive data, both explicitly, with the user's knowledge, and implicitly. This trend has created major privacy concerns, as users are mostly unaware of what data is being collected, how much of it is used, and how securely it is handled. In this context, several works have addressed privacy concerns in online social network data and in recommender systems. This paper surveys the main privacy concerns, measurements, and privacy-preserving techniques used in large-scale online social networks and recommender systems. It draws on prior work on security, privacy preservation, statistical modeling, and datasets to provide an overview of the technical difficulties and problems associated with privacy preservation in online social networks.
Comment: 26 pages, IET book chapter on big data recommender system
p-probabilistic k-anonymous microaggregation for the anonymization of surveys with uncertain participation
We develop a probabilistic variant of k-anonymous microaggregation, which we term p-probabilistic, resorting to a statistical model of respondent participation in order to aggregate quasi-identifiers in such a manner that k-anonymity is enforced with a parametric probabilistic guarantee. Succinctly, owing to the possibility that some respondents may not ultimately participate, sufficiently larger cells are created, striving to satisfy k-anonymity with probability at least p. The microaggregation function is designed before the respondents submit their confidential data. More precisely, a specification of the function is sent to them, which they may verify and apply to their quasi-identifying demographic variables prior to submitting the microaggregated data, along with the confidential attributes, to an authorized repository.
We propose a number of metrics to assess the performance of our probabilistic approach in terms of anonymity and distortion, which we investigate theoretically in depth and empirically with synthetic and standardized data. We stress that, in addition to constituting a functional extension of traditional microaggregation, thereby broadening its applicability to the anonymization of statistical databases in a wide variety of contexts, the relaxation of trust assumptions is arguably expected to have a considerable impact on user acceptance and, ultimately, on data utility through mere availability.
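The core cell-sizing idea behind such a probabilistic guarantee can be sketched as follows. Assuming each respondent participates independently with some probability q (a simplifying binomial model; the `min_cell_size` helper and the parameter names are illustrative assumptions, not the paper's actual construction), one can grow a cell until at least k participants remain with probability at least p:

```python
from math import comb

def min_cell_size(k: int, q: float, p: float) -> int:
    """Smallest cell size n such that, if each of n respondents
    participates independently with probability q, at least k of them
    participate with probability >= p (binomial tail >= p)."""
    n = k
    while True:
        # P(X >= k) for X ~ Binomial(n, q)
        prob = sum(comb(n, i) * q**i * (1 - q)**(n - i)
                   for i in range(k, n + 1))
        if prob >= p:
            return n
        n += 1

# Example: to guarantee 5-anonymity with probability 0.95 when
# respondents participate with probability 0.8, cells of size 9 suffice.
print(min_cell_size(5, 0.8, 0.95))  # → 9
```

Note how the required cell size grows beyond k to absorb expected no-shows; with certain participation (q = 1) it collapses back to exactly k.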
FACTS: A Framework for Anonymity towards Comparability, Transparency, and Sharing (Extended Version)
Preserving Privacy of High-Dimensional Data by l-Diverse Constrained Slicing
In the modern world of digitalization, data growth, aggregation, and sharing have escalated drastically. Users share huge amounts of data due to the widespread adoption of Internet-of-Things (IoT) and cloud-based smart devices. Such data can contain confidential attributes about various individuals. Therefore, privacy preservation has become an important concern. Many privacy-preserving data publication models have been proposed to ensure data sharing without privacy disclosures. However, publishing high-dimensional data with sufficient privacy is still a challenging task, and very little attention has been given to devising optimal privacy solutions for high-dimensional data. In this paper, we propose a novel privacy-preserving model to anonymize high-dimensional data (prone to various privacy attacks, including probabilistic, skewness, and gender-specific attacks). Our proposed model combines l-diversity with constrained slicing and vertical division. The proposed model can protect against the above-stated attacks with minimal information loss. Extensive experiments on real-world datasets show that our proposed model outperforms its counterparts.
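The l-diversity constraint at the heart of such a model can be illustrated with a minimal check (distinct l-diversity only; the `is_l_diverse` helper and the example buckets are illustrative assumptions, and the paper's constrained-slicing construction is considerably richer):

```python
def is_l_diverse(buckets, l):
    """Distinct l-diversity: every bucket of sensitive values must
    contain at least l distinct values, so an attacker who locates a
    record's bucket still cannot narrow its sensitive value below l
    candidates."""
    return all(len(set(bucket)) >= l for bucket in buckets)

# Hypothetical sliced buckets of a sensitive 'disease' attribute:
buckets = [["flu", "cancer", "flu"], ["hiv", "flu", "diabetes"]]
print(is_l_diverse(buckets, 2))  # → True (2 and 3 distinct values)
```

A slicing-based anonymizer would repeatedly re-partition records until every bucket passes a check of this kind, trading partition granularity (information loss) against the diversity guarantee.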
On the Complexity of t-Closeness Anonymization and Related Problems
An important issue in releasing individual data is to protect sensitive information from being leaked and maliciously utilized. Famous privacy-preserving principles that aim to ensure both data privacy and data integrity, such as k-anonymity and l-diversity, have been extensively studied both theoretically and empirically. Nonetheless, these widely adopted principles are still insufficient to prevent attribute disclosure if the attacker has partial knowledge about the overall sensitive data distribution. The t-closeness principle has been proposed to fix this, and it also has the benefit of supporting numerical sensitive attributes. However, in contrast to k-anonymity and l-diversity, the theoretical aspects of t-closeness have not been well investigated.
We initiate the first systematic theoretical study of the t-closeness principle under the commonly used attribute-suppression model. We prove that for every constant t such that 0 ≤ t < 1, it is NP-hard to find an optimal t-closeness generalization of a given table. The proof consists of several reductions, each of which works for a different range of t, which together cover the full range. To complement this negative result, we also provide exact and fixed-parameter algorithms. Finally, we answer some open questions regarding the complexity of k-anonymity and l-diversity left open in the literature.
Comment: An extended abstract to appear in DASFAA 201
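The t-closeness check itself is straightforward once a distance between distributions is fixed. A common choice for ordered (e.g., numerical) sensitive attributes is the Earth Mover's Distance in its cumulative-sum form; the sketch below assumes that formulation and uses made-up salary-bracket distributions for illustration:

```python
def emd_ordered(p, q):
    """Earth Mover's Distance between two distributions over the same
    ordered domain of m values: the sum of absolute cumulative
    differences, normalized by m - 1 so that the result lies in [0, 1]."""
    m = len(p)
    total, cum = 0.0, 0.0
    for i in range(m - 1):
        cum += p[i] - q[i]
        total += abs(cum)
    return total / (m - 1)

def satisfies_t_closeness(class_dists, overall, t):
    """A table satisfies t-closeness if every equivalence class's
    sensitive-value distribution is within distance t of the overall
    distribution."""
    return all(emd_ordered(d, overall) <= t for d in class_dists)

# Overall salary distribution over 3 ordered brackets vs. one class:
overall = [0.5, 0.3, 0.2]
cls = [0.8, 0.1, 0.1]
print(round(emd_ordered(cls, overall), 3))  # → 0.2
```

The hardness result above concerns finding the generalization that satisfies this check with minimal suppression, not the check itself, which is linear-time per equivalence class.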
Trajectory and Policy Aware Sender Anonymity in Location Based Services
We consider Location-Based Service (LBS) settings, where an LBS provider logs the requests sent by mobile device users over a period of time and later wants to publish or share these logs. Log sharing can be extremely valuable for advertising, data mining research, and network management, but it poses a serious threat to the privacy of LBS users. Sender anonymity solutions prevent a malicious attacker from inferring the interests of LBS users by associating them with their service requests after gaining access to the anonymized logs. With the fast-increasing adoption of smartphones and the concern that historic user trajectories are becoming more accessible, it becomes necessary for any sender anonymity solution to protect against attackers that are trajectory-aware (i.e., they have access to historic user trajectories) as well as policy-aware (i.e., they know the log anonymization policy). We call such attackers TP-aware.
This paper introduces a first privacy guarantee against TP-aware attackers, called TP-aware sender k-anonymity. It turns out that there are many possible TP-aware anonymizations for the same LBS log, each with a different utility to the consumer of the anonymized log. We investigate the problem of finding the optimal TP-aware anonymization. We show that trajectory-awareness renders the problem computationally harder than the trajectory-unaware variants found in the literature (NP-complete in the size of the log, versus PTIME). We describe a PTIME l-approximation algorithm for trajectories of length l and empirically show that it scales to large LBS logs (up to 2 million users).
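The baseline notion being strengthened here, plain sender k-anonymity, can be sketched as follows (this is not the paper's TP-aware construction; the cell-based grouping, the `anonymize_requests` helper, and the sample requests are illustrative assumptions):

```python
from collections import defaultdict

def anonymize_requests(requests, k):
    """Plain sender k-anonymity sketch: replace each sender id with its
    anonymity set, publishing only requests whose generalized location
    cell was shared by at least k distinct senders."""
    # requests: list of (user_id, cell, query) tuples
    senders_by_cell = defaultdict(set)
    for user, cell, _ in requests:
        senders_by_cell[cell].add(user)
    published = []
    for user, cell, query in requests:
        senders = senders_by_cell[cell]
        if len(senders) >= k:  # each published request has >= k candidate senders
            published.append((frozenset(senders), cell, query))
    return published

reqs = [("u1", "cellA", "cafe"), ("u2", "cellA", "gas"),
        ("u3", "cellB", "atm")]
print(len(anonymize_requests(reqs, 2)))  # → 2: only the cellA requests survive
```

A TP-aware attacker defeats this baseline by intersecting each anonymity set with the users whose historic trajectories pass through the cell at the request time, which is why the guarantee above must be strengthened as the paper proposes.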