A Decision Tree Approach for Assessing and Mitigating Background and Identity Disclosure Risks
The Facebook/Cambridge Analytica data scandal illustrates a type of privacy threat in which an adversary attacks a massive number of people without prior knowledge of their background information. Existing studies typically assume that the adversary knows the background information of the target individuals. This study examines the disclosure risk issue in privacy breaches without such an assumption. We define background disclosure risk and re-identification risk based on the notions of prior and conditional probability, respectively, and integrate the two risk measures into a composite measure using the Minimum Description Length principle. We then develop a decision-tree pruning algorithm to find an appropriate group size, considering the tradeoff between disclosure risk and data utility. Furthermore, we propose a novel tiered generalization method for anonymizing data at the group level. An experimental study demonstrates the effectiveness of our approach.
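The two risk measures described above can be illustrated with a minimal sketch. The function names, the toy records, and the choice of quasi-identifiers are assumptions for illustration, not the paper's actual definitions: background disclosure risk is modeled as a prior probability over a sensitive attribute, and re-identification risk as a worst-case conditional probability within an equivalence group.

```python
from collections import Counter

def background_disclosure_risk(records, attribute):
    """Prior probability of the most common value of a sensitive
    attribute across the whole dataset (the adversary has no
    background knowledge about any specific target)."""
    counts = Counter(r[attribute] for r in records)
    return max(counts.values()) / len(records)

def reidentification_risk(records, quasi_identifiers):
    """Worst-case conditional probability of re-identification:
    1 / size of the smallest equivalence group induced by the
    quasi-identifiers."""
    groups = Counter(tuple(r[q] for q in quasi_identifiers) for r in records)
    return 1.0 / min(groups.values())

records = [
    {"age": "30-39", "zip": "021**", "disease": "flu"},
    {"age": "30-39", "zip": "021**", "disease": "cold"},
    {"age": "40-49", "zip": "021**", "disease": "flu"},
    {"age": "40-49", "zip": "021**", "disease": "flu"},
]
print(background_disclosure_risk(records, "disease"))  # 0.75
print(reidentification_risk(records, ["age", "zip"]))  # 0.5
```

A pruning algorithm like the one in the abstract would grow the equivalence groups (lowering the second measure) until the loss of data utility outweighs the risk reduction.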
Utility-driven assessment of anonymized data via clustering
In this study, clustering is conceived as an auxiliary tool to identify groups of special interest. The approach was applied to a real dataset concerning an entire Portuguese cohort of higher-education Law students. Several anonymized clustering scenarios were compared against the original cluster solution. The clustering techniques were explored as data utility models in the context of data anonymization, using k-anonymity and (ε, δ)-differential privacy as privacy models. The purpose was to assess anonymized data utility by standard metrics, by the characteristics of the groups obtained, and by the relative risk (a relevant metric in social sciences research). For the sake of self-containment, we present an overview of anonymization and clustering methods. We used a partitional clustering algorithm and analyzed several clustering validity indices to understand to what extent the data structure is preserved, or not, after data anonymization. The results suggest that for low-dimensionality/cardinality datasets the anonymization procedure easily jeopardizes the clustering endeavor. In addition, there is evidence that relevant field-of-study estimates obtained from anonymized data are biased.
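The comparison of anonymized clustering scenarios against an original cluster solution can be sketched with a simple validity index. This is a generic illustration, not the paper's actual index set: the Rand index below measures the fraction of record pairs on which two clusterings agree, and the toy labels are invented.

```python
from itertools import combinations

def rand_index(labels_a, labels_b):
    """Fraction of record pairs on which two clusterings agree
    (both in the same cluster, or both in different clusters) --
    a basic external validity index for comparing the original
    cluster solution with the one obtained after anonymization."""
    n = len(labels_a)
    agree = sum(
        (labels_a[i] == labels_a[j]) == (labels_b[i] == labels_b[j])
        for i, j in combinations(range(n), 2)
    )
    return agree / (n * (n - 1) / 2)

original   = [0, 0, 1, 1, 2, 2]   # clusters found on the raw data
anonymized = [0, 0, 1, 2, 2, 2]   # clusters found after k-anonymization
print(rand_index(original, anonymized))  # 0.8
```

A value well below 1.0 on a low-dimensionality dataset would reflect the abstract's finding that anonymization easily jeopardizes the clustering endeavor.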
A vision for global privacy bridges: Technical and legal measures for international data markets
From the early days of the information economy, personal data has been its most valuable asset. Despite data protection laws and an acknowledged right to privacy, trading personal information has become a business equated with "trading oil". Most of this business is done without the knowledge and active informed consent of the people. But as data breaches and abuses are made public through the media, consumers react. They become irritated about companies' data handling practices, lose trust, exercise political pressure and start to protect their privacy with the help of technical tools. As a result, companies' Internet business models that are based on personal data are unsettled. An open conflict is arising between business demands for data and a desire for privacy. As of 2015, no true answer is in sight for how to resolve this conflict. Technologists, economists and regulators are struggling to develop technical solutions and policies that meet businesses' demand for more data while still maintaining privacy. Yet most of the proposed solutions fail to account for market complexity and provide no pathway to technological and legal implementation. They lack a bigger vision for data use and privacy. To break this vicious cycle, we propose and test such a vision of a personal information market with privacy. We accumulate technical and legal measures that have been proposed by technical and legal scholars over the past two decades. And out of this existing knowledge, we compose something new: a four-space market model for personal data.
Privacy-Preserving Design of Data Processing Systems in the Public Transport Context
The public transport network of a region inhabited by more than 4 million people is run by a complex interplay of public and private actors. Large amounts of data are generated by travellers buying and using various forms of tickets and passes. Analysing the data is of paramount importance for the governance and sustainability of the system. This manuscript reports the early results of the privacy analysis being undertaken as part of the analysis of the clearing process in the Emilia-Romagna region, in Italy, which will compute the compensations for tickets bought from one operator and used with another. The manuscript shows by means of examples that the clearing data may be used to violate various privacy aspects regarding users, as well as (technically equivalent) trade secrets regarding operators. The ensuing discussion has a twofold goal. First, after researching possible existing solutions, both by reviewing the literature on general privacy-preserving techniques and by analysing similar scenarios being discussed in various cities across the world, it shows that the former exhibit structural effectiveness deficiencies, while the latter are of limited applicability, typically involving less demanding requirements. Second, it traces a research path towards a more effective approach to privacy-preserving data management in the specific context of public transport, both by refinement of current sanitization techniques and by application of the privacy by design approach.
Available at: https://aisel.aisnet.org/pajais/vol7/iss4/4
PRUDEnce: A system for assessing privacy risk vs utility in data sharing ecosystems
Data describing human activities are an important source of knowledge useful for understanding individual and collective behavior and for developing a wide range of user services. Unfortunately, this kind of data is sensitive, because people's whereabouts may allow re-identification of individuals in a de-identified database. Therefore, Data Providers, before sharing those data, must apply some form of anonymization to lower the privacy risks, but they must also be aware of, and capable of controlling, the data quality, since these two factors are often a trade-off. In this paper we propose PRUDEnce (Privacy Risk versus Utility in Data sharing Ecosystems), a system enabling a privacy-aware ecosystem for sharing personal data. It is based on a methodology for assessing both the empirical (not theoretical) privacy risk associated with users represented in the data, and the data quality that can be guaranteed using only the users not at risk. Our proposal is able to support the Data Provider in the exploration of a repertoire of possible data transformations, with the aim of selecting one specific transformation that yields an adequate trade-off between data quality and privacy risk. We study the practical effectiveness of our proposal over three data formats underlying many services, defined on real mobility data, i.e., presence data, trajectory data and road segment data.
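The empirical risk-versus-utility exploration described above can be sketched as follows. The function names, the presence-data profiles, and the two generalization levels are hypothetical illustrations, not PRUDEnce's actual interface: risk is the probability of re-identifying a user from their (transformed) data profile, and utility is the share of users who remain once those above a risk threshold are excluded.

```python
from collections import Counter

def empirical_risk(profiles):
    """Per-profile re-identification probability: 1 / number of
    users sharing the same transformed data profile."""
    counts = Counter(profiles)
    return {p: 1.0 / c for p, c in counts.items()}

def risk_vs_utility(profiles, threshold):
    """Share of users whose empirical risk is at or below the
    threshold -- the data quality left after excluding users at risk."""
    risk = empirical_risk(profiles)
    safe = sum(1 for p in profiles if risk[p] <= threshold)
    return safe / len(profiles)

# Hypothetical presence data at two spatial/temporal generalization levels.
fine   = [("cellA", 9), ("cellB", 9), ("cellA", 18), ("cellC", 18)]
coarse = [("zone1", "am"), ("zone1", "am"), ("zone1", "pm"), ("zone1", "pm")]
print(risk_vs_utility(fine, 0.5))    # 0.0 -- every profile is unique
print(risk_vs_utility(coarse, 0.5))  # 1.0 -- each profile shared by two users
```

A Data Provider would sweep such transformations and pick the finest one whose retained share is acceptable, which is the trade-off exploration the abstract describes.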
Addressing the Failure of Anonymization: Guidance from the European Union’s General Data Protection Regulation
It is common practice for companies to “anonymize” the consumer data that they collect. In fact, U.S. data protection laws and Federal Trade Commission guidelines encourage the practice of anonymization by exempting anonymized data from the privacy and data security requirements they impose. Anonymization involves removing personally identifiable information (“PII”) from a dataset so that, in theory, the data cannot be traced back to its data subjects. In practice, however, anonymization fails to irrevocably protect consumer privacy due to the potential for deanonymization—the linking of anonymized data to auxiliary information to re-identify data subjects. Because U.S. data protection laws provide safe harbors for anonymized data, re-identified data subjects receive no statutory privacy protections at all—a fact that is particularly troublesome given consumers’ dependence on technology and today’s climate of ubiquitous data collection.
By adopting an all-or-nothing approach to anonymization, the United States has created no means of incentivizing the practice of anonymization while still providing data subjects statutory protections. This Note argues that the United States should look to the risk-based approach taken by the European Union under the General Data Protection Regulation. EU data protection law utilizes multiple tiers of anonymization, which vary in their potential for deanonymization. Under this approach, pseudonymized data—i.e., certain data that has had PII removed but can still be linked to auxiliary information to re-identify data subjects—falls within the scope of the governing law, but receives relaxed requirements designed to incentivize pseudonymization and thereby reduce the risk of data subject identification. This approach both strikes a balance between data privacy and data utility, and affords data subjects the benefit of anonymity in addition to statutory protections ranging from choice to transparency.
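The distinction the Note draws can be made concrete with a minimal pseudonymization sketch. The key, the sample record, and the token length are assumptions for illustration: a keyed hash replaces the PII with a consistent token, so the controller (who holds the key) can still link records, which is precisely why such data stays within the GDPR's scope rather than being treated as anonymous.

```python
import hmac
import hashlib

SECRET_KEY = b"replace-with-a-managed-secret"  # hypothetical, kept by the controller

def pseudonymize(pii_value: str) -> str:
    """Keyed one-way pseudonym: the same input always maps to the
    same token, but without the key the mapping cannot be recomputed
    from auxiliary data alone."""
    return hmac.new(SECRET_KEY, pii_value.encode(), hashlib.sha256).hexdigest()[:16]

record = {"email": "alice@example.com", "purchase": "book"}
record["email"] = pseudonymize(record["email"])  # PII removed, linkability kept
```

Because the key holder can re-identify data subjects, the output is pseudonymized rather than anonymized data, and under the GDPR's tiered approach it receives relaxed, but not zero, obligations.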
Catch, Clean, and Release: A Survey of Obstacles and Opportunities for Network Trace Sanitization
Network researchers benefit tremendously from access to traces of production networks, and several repositories of such network traces exist. By their very nature, these traces capture sensitive business and personal activity. Furthermore, network traces contain significant operational information about the target network, such as its structure, the identity of the network provider, or the addresses of important servers. To protect private or proprietary information, researchers must "sanitize" a trace before sharing it. In this chapter, we survey the growing body of research that addresses the risks, methods, and evaluation of network trace sanitization. Research on the risks of network trace sanitization attempts to extract information from published network traces, while research on sanitization methods investigates approaches that may protect against such attacks. Although researchers have recently proposed both quantitative and qualitative methods to evaluate the effectiveness of sanitization methods, such work has several shortcomings, some of which we highlight in a discussion of open problems. Sanitizing a network trace, however challenging, remains an important method for advancing network-based research.
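One common sanitization step the survey covers is address anonymization. The sketch below is a deliberately simple toy (not a real scheme such as prefix-preserving Crypto-PAn, and the salt and function name are assumptions): it keeps the leading octets so subnet-level structure survives for research, and replaces the host part with a salted hash so the same host maps consistently within one trace release.

```python
import hashlib

SALT = b"per-release-salt"  # hypothetical; would be rotated per published trace

def sanitize_ip(ip: str, keep_octets: int = 2) -> str:
    """Keep the first `keep_octets` octets of an IPv4 address and
    replace the host part with bytes of a salted hash, preserving
    coarse network structure while hiding individual hosts."""
    octets = ip.split(".")
    host = ".".join(octets[keep_octets:])
    digest = hashlib.sha256(SALT + host.encode()).digest()
    masked = [str(b) for b in digest[: 4 - keep_octets]]
    return ".".join(octets[:keep_octets] + masked)

print(sanitize_ip("192.168.10.7"))  # prefix preserved, host part hashed
```

As the chapter's discussion of risks makes clear, consistent mappings like this still leak information (e.g. traffic fingerprinting can re-identify hosts), which is exactly why evaluating sanitization effectiveness remains an open problem.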