143 research outputs found

Privacy-preserving Transactions on the Web

Author: Behl Sahil
Lilien Leszek T.
Publication venue: ScholarWorks at WMU
Publication date: 22/09/2011

The number of applications that use sensitive and personal information on the World Wide Web is growing rapidly. This growth creates an urgent need to maintain the anonymity of the participants in many web transactions and to preserve the privacy of their sensitive data during data dissemination over the web. First, maintaining the anonymity of users on the World Wide Web is essential for a number of web applications. Anonymity cannot be assured by a single interested individual or organization; it requires participation from other web nodes owned by other entities. Second, preserving the privacy of sensitive data is another very important issue in web transactions. Today, exchanging and sharing personal data between various participants in web transactions endangers privacy. In this article, we discuss various research directions and challenges that need to be addressed to maintain the anonymity of participants and preserve the privacy of sensitive data in web transactions. To maintain the anonymity of participants in a web transaction, we propose a method based on a modified form of the club mechanism with economic incentives, a solution that rests upon the Prisoner’s Dilemma approach. We compare our approach to other well-known data-sharing approaches such as Crowds, Tor, Tarzan and LPWA. To maintain the privacy of sensitive data, we propose a solution based on privacy-preserving data dissemination (P2D2). We also present a way to implement our approach using Semantic Web Rule Languages and Jena, a Java-based inference engine.

ScholarWorks at WMU

Survey of Privacy-Preserving Data Publishing Methods and Speedy: a multi-threaded algorithm preserving k-anonymity

Author: Χατζόπουλος Σεραφείμ
Publication date: 01/01/2015

Nowadays, many organizations, enterprises, and public services collect and manage vast amounts of personal information. Typical examples of such datasets include clinical tests conducted in hospitals, query logs held by search engines, social data produced by social networks, and financial data from public sector information systems. These datasets often need to be published for research or statistical studies without revealing sensitive information about the individuals they describe. The anonymization process is more complicated than simply hiding attributes that can directly identify an individual (name, SSN, etc.) in the published dataset. Even without these attributes, an adversary can cause privacy leakage by cross-linking with other publicly available datasets or by using some sort of background knowledge. Therefore, privacy preservation in data publishing has gained considerable attention in recent years, with several privacy models proposed in the literature. In this thesis, we discuss the most common attacks that can be made on published datasets and present state-of-the-art privacy guarantees and anonymization algorithms to counter these attacks. Furthermore, we propose a novel multi-threaded anonymization algorithm that exploits the capabilities of modern CPUs to speed up the anonymization process while achieving k-anonymity in the anonymized dataset.
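
As a minimal illustration of the k-anonymity guarantee mentioned in this abstract (not the thesis's multi-threaded algorithm), the sketch below checks whether a table is k-anonymous with respect to a chosen set of quasi-identifier columns. The record layout, the column names, the toy data, and the helper name `is_k_anonymous` are all assumptions made for this example.

```python
from collections import Counter

def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every combination of quasi-identifier values
    appears in at least k records (hypothetical helper for illustration)."""
    groups = Counter(
        tuple(record[col] for col in quasi_identifiers) for record in records
    )
    return all(count >= k for count in groups.values())

# Made-up data: ages and ZIP codes are already generalized into ranges/prefixes.
records = [
    {"age": "30-39", "zip": "151**", "diagnosis": "flu"},
    {"age": "30-39", "zip": "151**", "diagnosis": "asthma"},
    {"age": "40-49", "zip": "152**", "diagnosis": "flu"},
    {"age": "40-49", "zip": "152**", "diagnosis": "diabetes"},
]

print(is_k_anonymous(records, ["age", "zip"], k=2))  # True
print(is_k_anonymous(records, ["age", "zip"], k=3))  # False
```

Each quasi-identifier group here contains exactly two records, so the table satisfies 2-anonymity but not 3-anonymity.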

Pergamos : Unified Institutional Repository / Digital Library Platform of the National and Kapodistrian University of Athens

Privacy Preservation in High-dimensional Trajectory Data for Passenger Flow Analysis

Author: Ghasemzadeh Moein
Publication date: 01/09/2013

The increasing use of location-aware devices provides many opportunities for analyzing and mining human mobility. The trajectory of a person can be represented as a sequence of visited locations with different timestamps. Storing, sharing, and analyzing personal trajectories may pose new privacy threats. Previous studies have shown that employing traditional privacy models and anonymization methods often leads to low information quality in the resulting data. In this thesis, we propose a method for achieving anonymity in a trajectory database while preserving the information needed to support effective passenger flow analysis. Specifically, we first extract the passenger flowgraph, a commonly employed representation for modeling uncertain moving objects, from the raw trajectory data. We then anonymize the data with the goal of minimizing the impact on the flowgraph. Extensive experimental results on both synthetic and real-life data sets suggest that the framework is effective in overcoming the special challenges of trajectory data anonymization, namely high dimensionality, sparseness, and sequentiality.
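
To make the trajectory representation concrete, the sketch below stores each trajectory as a list of (location, timestamp) pairs and counts direct transitions between consecutive locations. This is a deliberately simplified stand-in for a flowgraph, not the thesis's construction; the function name `transition_counts` and the toy trajectories are assumptions for illustration.

```python
from collections import Counter

def transition_counts(trajectories):
    """Count how often a passenger moves directly from one location to the next.
    Each trajectory is a list of (location, timestamp) pairs."""
    counts = Counter()
    for trajectory in trajectories:
        # Order visits by timestamp so consecutive pairs are real transitions.
        ordered = sorted(trajectory, key=lambda visit: visit[1])
        for (src, _), (dst, _) in zip(ordered, ordered[1:]):
            counts[(src, dst)] += 1
    return counts

# Made-up trajectories over three stations A, B, C.
trajectories = [
    [("A", 1), ("B", 2), ("C", 3)],
    [("A", 1), ("B", 3)],
    [("B", 2), ("C", 5)],
]

flows = transition_counts(trajectories)
print(flows[("A", "B")])  # 2
print(flows[("B", "C")])  # 2
```

An anonymization step that aims to preserve flow analysis would try to keep counts like these close to their original values.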

Concordia University Research Repository

ρ-uncertainty Anonymization by Partial Suppression

Author: Chao Pan
Eric Lo
Kenny Q. Zhu
Xiao Jia
Xinhui Xu
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2014

We present a novel framework for set-valued data anonymization by partial suppression. The framework works regardless of the amount of background knowledge the attacker possesses and can be adapted to both space-time and quality-time trade-offs in a “pay-as-you-go” approach. While minimizing the number of item deletions, the framework attempts to either preserve the original data distribution or retain useful, mineable association rules, targeting statistical analysis and association mining, two major data mining applications on set-valued data.

CiteSeerX

The Hong Kong Polytechnic University Pao Yue-kong Library

Crossref

Privacy-Preserving Design of Data Processing Systems in the Public Transport Context

Author: Callegati Franco
Campi Aldo
Melis Andrea
Prandini Marco
Zevenbergen Bendert
Publication venue: AIS Electronic Library (AISeL)
Publication date: 01/01/2015

The public transport network of a region inhabited by more than 4 million people is run by a complex interplay of public and private actors. Large amounts of data are generated by travellers buying and using various forms of tickets and passes. Analysing the data is of paramount importance for the governance and sustainability of the system. This manuscript reports the early results of the privacy analysis being undertaken as part of the analysis of the clearing process in the Emilia-Romagna region of Italy, which will compute the compensations for tickets bought from one operator and used with another. The manuscript shows by means of examples that the clearing data may be used to violate various privacy aspects regarding users, as well as (technically equivalent) trade secrets regarding operators. The ensuing discussion has a twofold goal. First, it examines possible existing solutions, both by reviewing the literature on general privacy-preserving techniques and by analysing similar scenarios that are being discussed in various cities across the world; the former are found to exhibit structural effectiveness deficiencies, while the latter are found to be of limited applicability, typically involving less demanding requirements. Second, it traces a research path towards a more effective approach to privacy-preserving data management in the specific context of public transport, both by refinement of current sanitization techniques and by application of the privacy by design approach. Available at: https://aisel.aisnet.org/pajais/vol7/iss4/4

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

AIS Electronic Library (AISeL)

Publishing data from electronic health records while preserving privacy: a survey of algorithms

Author: Aris Gkoulalas-Divanis
Grigorios Loukides
Jimeng Sun
Publication venue: 'Elsevier BV'
Publication date: 01/08/2014

The dissemination of Electronic Health Records (EHRs) can be highly beneficial for a range of medical studies, from clinical trials to epidemic control studies, but it must be performed in a way that preserves patients’ privacy. This is not straightforward, because the disseminated data need to be protected against several privacy threats while remaining useful for subsequent analysis tasks. In this work, we present a survey of algorithms that have been proposed for publishing structured patient data in a privacy-preserving way. We review more than 45 algorithms, derive insights on their operation, and highlight their advantages and disadvantages. We also discuss some promising directions for future research in this area.

Elsevier - Publisher Connector

Crossref

Online Research @ Cardiff

What the Surprising Failure of Data Anonymization Means for Law and Policy

Author: Ohm Paul
Publication venue: Santa Clara Law Digital Commons
Publication date: 08/04/2010

Paul Ohm is an Associate Professor of Law at the University of Colorado Law School. He writes in the areas of information privacy, computer crime law, intellectual property, and criminal procedure. Through his scholarship and outreach, Professor Ohm is leading efforts to build new interdisciplinary bridges between law and computer science. Before becoming a law professor, Professor Ohm served as a federal prosecutor for the U.S. Department of Justice in the computer crimes unit. Before law school, he worked as a computer programmer and network systems administrator

bepress Legal Repository

Santa Clara University School of Law

Anonymizing large transaction data using MapReduce

Author: Memon Neelam

Publishing transaction data is important to applications such as marketing research and biomedical studies. Privacy is a concern when publishing such data since they often contain person-specific sensitive information. To address this problem, different data anonymization methods have been proposed. These methods have focused on protecting the associated individuals from different types of privacy leaks as well as preserving the utility of the original data. However, all these methods are sequential and designed to process data on a single machine, and hence do not scale to large datasets. Recently, MapReduce has emerged as a highly scalable platform for data-intensive applications. In this work, we consider how MapReduce may be used to provide scalability in large transaction data anonymization. More specifically, we consider how set-based generalization methods such as RBAT (Rule-Based Anonymization of Transaction data) may be parallelized using MapReduce. Set-based generalization methods have some desirable features for transaction anonymization, but their highly iterative nature makes parallelization challenging. RBAT is a good representative of such methods. We propose a method for transaction data partitioning and representation. We also present two MapReduce-based parallelizations of RBAT. Our methods ensure scalability when the number of transaction records and the domain of items are large. Our preliminary results show that a direct parallelization of RBAT by partitioning data alone can result in significant overhead, which can offset the gains from parallel processing. We propose MR-RBAT, which generalizes our direct parallel method and allows the parallelization overhead to be controlled. Our experimental results show that MR-RBAT can scale linearly to large datasets and to the available resources while retaining good data utility.
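
The map-then-reduce pattern over partitioned transaction data can be sketched in plain Python as below. This is not RBAT or MR-RBAT; it only shows one building block that such methods commonly need, namely counting item support across partitions. The function names `map_partition` and `reduce_counts` and the toy partitions are assumptions for illustration, and no actual MapReduce framework is used.

```python
from collections import Counter
from functools import reduce

def map_partition(transactions):
    """Map step: within one partition, count how many transactions contain each item."""
    counts = Counter()
    for transaction in transactions:
        counts.update(set(transaction))  # set() so an item counts once per transaction
    return counts

def reduce_counts(left, right):
    """Reduce step: merge two partial support counts."""
    return left + right

# Made-up transaction data, split into two partitions.
partitions = [
    [["bread", "milk"], ["milk"]],
    [["bread", "eggs"], ["milk", "eggs"]],
]

support = reduce(reduce_counts, map(map_partition, partitions), Counter())
print(support["milk"])   # 3
print(support["bread"])  # 2
```

Because each partition is processed independently in the map step, the partial counts could be computed on separate machines; only the small per-partition summaries need to be merged.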

Online Research @ Cardiff