
448 research outputs found

Privacy Preservation by Disassociation

Author: Liagouris John
Mamoulis Nikos
Skiadopoulos Spiros
Terrovitis Manolis
Publication date: 01/01/2012

In this work, we focus on protection against identity disclosure in the publication of sparse multidimensional data. Existing multidimensional anonymization techniques (a) protect the privacy of users either by altering the set of quasi-identifiers of the original data (e.g., by generalization or suppression) or by adding noise (e.g., using differential privacy) and/or (b) assume a clear distinction between sensitive and non-sensitive information and sever the possible linkage. In many real-world applications the above techniques are not applicable. For instance, consider web search query logs. Anonymization methods based on suppression or generalization would remove the most valuable information in the dataset: the original query terms. Additionally, web search query logs contain millions of query terms which cannot be categorized as sensitive or non-sensitive, since a term may be sensitive for one user and non-sensitive for another. Motivated by this observation, we propose an anonymization technique termed disassociation that preserves the original terms but hides the fact that two or more different terms appear in the same record. We protect the users' privacy by disassociating record terms that participate in identifying combinations. This way the adversary cannot associate with high probability a record with a rare combination of terms. To the best of our knowledge, our proposal is the first to employ such a technique to provide protection against identity disclosure. We propose an anonymization algorithm based on our approach and evaluate its performance on real and synthetic datasets, comparing it against other state-of-the-art methods based on generalization and differential privacy. Comment: VLDB201
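The notion of an "identifying combination" in the abstract above can be illustrated with a minimal sketch. It is not the authors' algorithm: the function name `rare_combinations`, the support threshold `k`, the maximum combination size `m`, and the toy query terms are all assumptions made for illustration. The sketch only counts how many records contain each small combination of terms and reports those seen in fewer than `k` records.

```python
from collections import Counter
from itertools import combinations


def rare_combinations(records, k, m=2):
    """Return term combinations of size 1..m found in fewer than k records.

    `k` and `m` are illustrative parameters, not values taken from the paper.
    """
    support = Counter()
    for record in records:
        terms = sorted(set(record))  # canonical order so tuples compare equal
        for size in range(1, m + 1):
            for combo in combinations(terms, size):
                support[combo] += 1
    return {combo: count for combo, count in support.items() if count < k}


# Toy query log: each record is the set of terms issued by one user.
records = [
    {"flu", "hiking"},
    {"flu", "hiking"},
    {"flu", "recipes"},
    {"hiking", "recipes"},
]

rare = rare_combinations(records, k=2)
for combo, count in sorted(rare.items()):
    print(combo, count)
```

In this toy log every single term appears at least twice, yet the pairs ("flu", "recipes") and ("hiking", "recipes") each occur in only one record, so an adversary who knows such a pair could single out a user. These are the kinds of combinations a disassociation-style method would need to break apart.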

arXiv.org e-Print Archive

CiteSeerX

HKU Scholars Hub

Privacy in data publishing for tailored recommendation scenarios

Author: Aguiar R. L.
Gomes Diogo Nuno
Gonçalves J. M.
Publication venue: IIIA-CSIC
Publication date: 01/01/2015

Personal information is increasingly gathered and used for providing services tailored to user preferences, but the datasets used to provide such functionality can represent serious privacy threats if not appropriately protected. Work in privacy-preserving data publishing has targeted privacy guarantees that protect against record re-identification, by making records indistinguishable, or against sensitive attribute value disclosure, by introducing diversity or noise in the sensitive values. However, most approaches fail in the high-dimensional case, and those that do not fail introduce a utility cost incompatible with tailored recommendation scenarios. This paper aims at a sensible trade-off between privacy and the benefits of tailored recommendations, in the context of privacy-preserving data publishing. We empirically demonstrate that significant privacy improvements can be achieved at a utility cost compatible with tailored recommendation scenarios, using a simple partition-based sanitization method.

Repositório Institucional da Universidade de Aveiro

Directory of Open Access Journals

ρ-uncertainty Anonymization by Partial Suppression

Author: Chao Pan
Eric Lo
Kenny Q. Zhu
Xiao Jia
Xinhui Xu
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2014

We present a novel framework for set-valued data anonymization by partial suppression that works regardless of the amount of background knowledge the attacker possesses and can be adapted to both space-time and quality-time trade-offs in a “pay-as-you-go” approach. While minimizing the number of item deletions, the framework attempts to either preserve the original data distribution or retain mineable useful association rules, targeting statistical analysis and association mining, two major data mining applications on set-valued data.

CiteSeerX

The Hong Kong Polytechnic University Pao Yue-kong Library

Crossref

Semantic attack on anonymised transaction data

Author: Alshuhail Asma

Publishing data about individuals is a double-edged sword: it can provide a significant benefit for a range of organisations, helping them understand issues concerning individuals and improve the services they offer. However, it can also represent a serious threat to individuals’ privacy. To overcome these threats, researchers have worked on developing anonymisation methods. However, these anonymisation methods do not take into consideration the semantic relationships and meaning of data, which can be exploited by attackers to expose protected data. In our work, we study a specific anonymisation method called disassociation and investigate whether it provides adequate protection for transaction data. The disassociation method hides sensitive links between a transaction’s items by dividing them into chunks. We propose a de-anonymisation approach for attacking transaction data anonymised by disassociation. The approach exploits the semantic relationships between transaction items to reassociate them. Our findings reveal that the disassociation method may not effectively protect transaction data. Our de-anonymisation approach can recombine approximately 60% of the disassociated items and can break the privacy of nearly 70% of the protected itemsets in disassociated transactions.
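The chunking idea mentioned in the abstract above can be sketched as follows. This is a simplified illustration under stated assumptions, not the published disassociation method, which involves more than this: the function name `split_into_chunks`, the hand-picked `chunk_terms` grouping, and the toy transactions are hypothetical. Each chunk keeps only the items from one group and is shuffled independently, so a reader of the output can no longer tell which sub-record in one chunk came from the same transaction as a sub-record in another.

```python
import random


def split_into_chunks(records, chunk_terms, seed=0):
    """Project records onto each term group and shuffle each projection.

    `chunk_terms` is a hypothetical, manually chosen grouping of items.
    """
    rng = random.Random(seed)
    chunks = []
    for terms in chunk_terms:
        # Keep only the items of this group from every transaction.
        column = [sorted(set(record) & set(terms)) for record in records]
        # Independent shuffling hides the row alignment between chunks.
        rng.shuffle(column)
        chunks.append(column)
    return chunks


transactions = [
    {"bread", "milk", "insulin"},
    {"bread", "bandage"},
    {"milk", "insulin"},
    {"bread", "milk"},
]
groups = [{"bread", "milk"}, {"insulin", "bandage"}]

chunks = split_into_chunks(transactions, groups)
for chunk in chunks:
    print(chunk)
```

The item counts inside each chunk are preserved, which is what keeps the data useful, while the cross-chunk links are dropped. The semantic attack described in the abstract targets exactly those dropped links by guessing which items plausibly belong together.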

Online Research @ Cardiff

Privacy preservation in e-health cloud: Taxonomy, privacy requirements, feasibility analysis, and opportunities

Author: Anjum Adeel
Kanwal Tehsin
Khan Abid
Publication date: 01/03/2021

Crossref

Aberystwyth Research Portal

Recommended from our members

Human Mobility Monitoring using WiFi: Analysis, Modeling, and Applications

Author: Trivedi Amee
Publication venue: ScholarWorks@UMass Amherst
Publication date: 20/10/2021

Understanding and modeling human and device mobility is of fundamental importance in mobile computing, with implications ranging from network design and location-aware technologies to urban infrastructure planning. Today's users carry a plethora of devices such as smartphones, laptops, tablets, and smartwatches, with each device offering a different set of services and therefore showing different usage and mobility, which raises the research question of understanding and modeling multiple device trajectories per user. Additionally, prior research on mobility focuses on outdoor mobility even though users are known to spend 80% of their time indoors, leaving wide gaps in knowledge about the indoor mobility of users and devices. Here, I try to fill these gaps by understanding and modeling indoor-outdoor human mobility as well as multi-device mobility. In this thesis, I propose the characterization and modeling of human and device mobility. Further, I design and deploy mobility-aware applications for contact tracing of infectious diseases and energy-aware Heating, Ventilation, and Air Conditioning (HVAC) scheduling. I try to answer a sequence of four primary inter-related questions: (1) how do indoor and outdoor user mobility differ, (2) are multiple device trajectories belonging to a single user correlated, (3) how to model the indoor mobility of users, and (4) how to design effective mobility-aware applications that are easily deployable, align with long-term goals of sustainability, and have a positive societal impact. The insights gained from each question serve as a base on which to build the next question in the series. I present answers to these questions across three main parts of my thesis. The first part comprises the characterization and analysis of human and device mobility. In this part I design and develop a tool to extract device trajectories from WiFi system logs (syslog) and map devices to users.
These extracted trajectories and the device-to-user mapping are used to characterize and empirically analyze the mobility of users at varying spatial granularity (indoor, outdoor) and to extract mobility correlations between the multiple devices of a user; this forms the first part of my thesis. In the second part, based on the insights gained from the multi-granular and multi-device mobility characterization stated above, I argue that mobility is inherently hierarchical in nature and propose a novel indoor human mobility modeling approach. Third, I leverage the passively observed mobility to design mobility-aware applications that either look back or look ahead in time. WiFiTrace is a look-back, or backtracking, application: a network-centric contact tracing tool to aid healthcare workers in manual contact tracing of infectious diseases. iSchedule is a look-ahead, machine-learning-based, mobility-aware energy-saving application that predicts Heating, Ventilation, and Air Conditioning (HVAC) schedules for higher energy savings while increasing user comfort.

ScholarWorks@UMass Amherst

Stochastic Modeling and Inference of Large-scale Gene Regulatory Networks

Author: Kim Haseong
Publication venue: Electrical and Electronic Engineering, Imperial College London
Publication date: 01/08/2012

Gene regulatory networks (GRNs) consist of thousands of genes and proteins which dynamically interact with each other. Researchers have investigated how to uncover these unknown interactions by observing expressions of biological molecules with various statistical and mathematical methods. Once these regulatory structures are revealed, it is necessary to understand their dynamical behaviors, since pathway activities can change with the given conditions. Therefore, both the regulatory structure estimation and the dynamics modeling of GRNs are essential for biological research. GRN dynamics are usually investigated via stochastic models, since molecular interactions are basically discrete and stochastic processes. However, this stochastic nature requires heavy simulation time to find the steady-state solution of GRNs in which thousands of genes are involved. This large number of genes also causes difficulties, such as the dimensionality problem, in estimating their regulatory structure. This thesis mainly focuses on developing methodologies for large-scale GRN analyses. It includes applications of a stochastic process theory called G-networks and a reverse engineering technique for large-scale GRNs. Additionally, a series of bioinformatics techniques was applied to brain tumor data to detect disease candidate genes along with their large-scale GRNs. The proposed techniques, stochastic modeling (bottom-up) and reverse engineering (top-down), could provide a systematic view of a complex system and an efficient guideline for identifying candidate genes or pathways that trigger a specific phenotype of a cell. As further work, the combined use of the modeling and reverse engineering approaches would be helpful in obtaining a reliable mathematical model and even in developing a synthetic biological system.

Spiral - Imperial College Digital Repository

Large-scale Wireless Local-area Network Measurement and Privacy Analysis

Author: Tan Keren
Publication venue: Dartmouth Digital Commons
Publication date: 01/08/2011

The edge of the Internet is increasingly becoming wireless. Understanding the wireless edge is therefore important for understanding the performance and security aspects of the Internet experience. This understanding is especially necessary for enterprise-wide wireless local-area networks (WLANs), as organizations increasingly depend on WLANs for mission-critical tasks. Studying a live production WLAN, especially a large-scale network, is a difficult undertaking. Two fundamental difficulties are (1) building a scalable network measurement infrastructure to collect traces from a large-scale production WLAN, and (2) preserving user privacy while sharing these collected traces with the network research community. In this dissertation, we present our experience in designing and implementing one of the largest distributed WLAN measurement systems in the United States, the Dartmouth Internet Security Testbed (DIST), with a particular focus on our solutions to the challenges of efficiency, scalability, and security. We also present an extensive evaluation of the DIST system. To understand the severity of some potential trace-sharing risks for an enterprise-wide large-scale wireless network, we conduct a privacy analysis on one kind of wireless network trace, a user-association log, collected from a large-scale WLAN. We introduce a machine-learning-based approach that can extract and quantify sensitive information from a user-association log, even after it has been sanitized. Finally, we present a case study that evaluates the tradeoff between utility and privacy in WLAN trace sanitization.

Dartmouth Digital Commons (Dartmouth College)

Privacidade em comunicações de dados para ambientes contextualizados

Author: Gonçalves João Miguel Ribeiro
Publication venue: Universidade de Aveiro
Publication date: 01/01/2015

Doctorate in Informatics. Internet users consume online targeted advertising based on information collected about them and voluntarily share personal information in social networks. Sensor information and data from smartphones is collected and used by applications, sometimes in unclear ways. As happens today with smartphones, in the near future sensors will be shipped in all types of connected devices, enabling ubiquitous information gathering from the physical environment and the vision of Ambient Intelligence. The value of gathered data, if not obvious, can be harnessed through data mining techniques and put to use by enabling personalized and tailored services as well as business intelligence practices, fueling the digital economy. However, the ever-expanding gathering and use of information undermines the privacy conceptions of the past. Natural social practices of managing privacy in daily relations are overridden by socially awkward communication tools, service providers struggle with security issues resulting in harmful data leaks, governments use mass surveillance techniques, the incentives of the digital economy threaten consumer privacy, and the advancement of consumer-grade data-gathering technology enables new inter-personal abuses. A wide range of fields attempts to address technology-related privacy problems; however, they vary immensely in terms of assumptions, scope and approach. Privacy of future use cases is typically handled vertically, instead of building upon previous work that can be re-contextualized, while current privacy problems are typically addressed per type in a more focused way. Because significant effort was required to make sense of the relations and structure of privacy-related work, this thesis attempts to transmit a structured view of it. It is multi-disciplinary - from cryptography to economics, including distributed systems and information theory - and addresses privacy issues of different natures. 
As existing work is framed and discussed, the contributions to the state of the art made in the scope of this thesis are presented. The contributions add to five distinct areas: 1) identity in distributed systems; 2) future context-aware services; 3) event-based context management; 4) low-latency information flow control; 5) high-dimensional dataset anonymity. Finally, having laid out this landscape of privacy-preserving work, the current and future privacy challenges are discussed, considering not only technical but also socio-economic perspectives.
Internet users see advertising targeted on the basis of their browsing habits, and probably share personal information voluntarily on social networks. The information available on new mobile phones is widely accessed and used by mobile applications, sometimes without clear reasons. As happens today with mobile phones, in the future many types of electronic devices will include sensors that capture data from the environment, enabling the emergence of intelligent environments. The value of the captured data, if not obvious, can be derived through data analysis techniques and used to provide personalized services and define business strategies, fostering the digital economy. However, these information-gathering practices raise new privacy questions. The natural practices of interpersonal relations are hindered by new means of communication that do not accommodate them, information security problems keep occurring, states surveil their citizens, the digital economy leads to the monitoring of consumers, and the capture and recording capabilities of new electronic devices can be abused by users themselves against other people. A large number of scientific fields address technology-related privacy problems; however, they do so in different ways and from distinct starting points. The privacy of new scenarios is typically handled vertically, instead of re-contextualizing existing work, while current problems are handled in a more focused way. Because of this fragmentation in existing work, structuring it was a highly relevant exercise within this thesis. The work identified is multi-disciplinary - from cryptography to economics, including distributed systems and information theory - and deals with privacy problems of different natures. As the existing work is presented, the contributions made by this thesis are discussed. These fall into five distinct areas: 1) identity in distributed systems; 2) context-aware services; 3) event-driven management of context information; 4) low-latency information flow control; 5) anonymous recommendation databases. Having described the existing work on privacy, current and future privacy challenges are discussed, also considering socio-economic perspectives.

Repositório Institucional da Universidade de Aveiro