Privacy Preservation by Disassociation
In this work, we focus on protection against identity disclosure in the
publication of sparse multidimensional data. Existing multidimensional
anonymization techniques (a) protect the privacy of users either by altering the
set of quasi-identifiers of the original data (e.g., by generalization or
suppression) or by adding noise (e.g., using differential privacy) and/or (b)
assume a clear distinction between sensitive and non-sensitive information and
sever the possible linkage. In many real-world applications the above
techniques are not applicable. For instance, consider web search query logs.
Anonymization methods based on suppression or generalization would remove the most
valuable information in the dataset: the original query terms. Additionally,
web search query logs contain millions of query terms which cannot be
categorized as sensitive or non-sensitive, since a term may be sensitive for one
user and non-sensitive for another. Motivated by this observation, we propose
an anonymization technique termed disassociation that preserves the original
terms but hides the fact that two or more different terms appear in the same
record. We protect the users' privacy by disassociating record terms that
participate in identifying combinations. This way, the adversary cannot, with
high probability, associate a record with a rare combination of terms. To
the best of our knowledge, our proposal is the first to employ such a technique
to provide protection against identity disclosure. We propose an anonymization
algorithm based on our approach and evaluate its performance on real and
synthetic datasets, comparing it against other state-of-the-art methods based
on generalization and differential privacy.
Comment: VLDB201
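The notion of an "identifying combination" in the abstract above can be illustrated with a minimal sketch. This is not the paper's anonymization algorithm: it only finds term combinations that occur in fewer than k records, which are the combinations a disassociation step would need to break apart. The function name `rare_combinations`, the threshold `k`, the maximum combination size `m`, and the toy query log are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def rare_combinations(records, k, m):
    """Return term combinations of size 1..m that occur in fewer than k records.

    Hypothetical helper: k and m are illustrative parameters, not values
    taken from the abstract.
    """
    support = Counter()
    for record in records:
        terms = sorted(set(record))  # canonical order so each combination has one key
        for size in range(1, m + 1):
            for combo in combinations(terms, size):
                support[combo] += 1
    return {combo: count for combo, count in support.items() if count < k}


# Made-up query log: each record is the set of terms issued by one user.
query_log = [
    {"madonna", "flu", "rash"},
    {"madonna", "flu"},
    {"madonna", "ikea"},
    {"flu", "ikea"},
    {"madonna", "flu", "ikea"},
]

rare = rare_combinations(query_log, k=2, m=2)
for combo, count in sorted(rare.items()):
    print(combo, count)
```

In this toy log, every combination involving "rash" is held by a single record, so an adversary who knows a user searched for both "flu" and "rash" could single out that record. Publishing the terms without revealing that they co-occur in one record removes that link while keeping the original terms.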
Privacy in the Genomic Era
Genome sequencing technology has advanced at a rapid pace and it is now
possible to generate highly-detailed genotypes inexpensively. The collection
and analysis of such data has the potential to support various applications,
including personalized medical services. While the benefits of the genomics
revolution are trumpeted by the biomedical community, the increased
availability of such data has major implications for personal privacy, notably
because the genome has certain essential features, which include (but are not
limited to) (i) an association with traits and certain diseases, (ii)
identification capability (e.g., forensics), and (iii) revelation of family
relationships. Moreover, direct-to-consumer DNA testing increases the
likelihood that genome data will be made available in less regulated
environments, such as the Internet and for-profit companies. The problem of
genome data privacy thus resides at the crossroads of computer science,
medicine, and public policy. While computer scientists have addressed data
privacy for various data types, there has been less attention dedicated to
genomic data. Thus, the goal of this paper is to provide a systematization of
knowledge for the computer science community. In doing so, we address some of
the (sometimes erroneous) beliefs of this field and we report on a survey we
conducted about genome data privacy with biomedical specialists. Then, after
characterizing the genome privacy problem, we review the state of the art
regarding privacy attacks on genomic data and strategies for mitigating such
attacks, and we contextualize these attacks from the perspective of
medicine and public policy. This paper concludes with an enumeration of the
challenges for genome data privacy and presents a framework to systematize the
analysis of threats and the design of countermeasures as the field moves
forward.
Ensuring privacy in the exploration of distributed databases (Garantia de privacidade na exploração de bases de dados distribuídas)
Anonymisation is currently one of the biggest challenges when sharing sensitive
personal information. Its importance depends largely on the application
domain, and it becomes a more serious issue when dealing with health
information. A simple approach to avoid disclosure is to ensure that all
data that can be associated directly with an individual is removed from the
original dataset. However, some studies have shown that simple anonymisation
procedures can sometimes be reversed using specific patients’ characteristics,
namely when the anonymisation is based on hidden key attributes.
In this work, we propose a secure architecture to share information from distributed
databases without compromising the subjects’ privacy. The work
initially focused on identifying techniques that link information between
multiple data sources in order to reverse anonymisation procedures. In
a second phase, we developed a methodology to perform queries over
distributed databases. The architecture was validated using
a standard data schema that is widely adopted in observational research
studies.
Guaranteeing data anonymisation is currently one of the biggest challenges
when sensitive personal information needs to be shared. Although the problem
cuts across many application domains, it becomes more critical when the
anonymisation involves clinical data. In these cases, the most common approach
to avoid disclosing data that can be associated directly with an individual is
to remove identifying attributes. However, according to the literature, this
approach does not fully guarantee anonymity, which can be broken through
specific attacks that allow subjects to be re-identified.
In this work, we propose an architecture that allows data stored in
distributed repositories to be shared securely and without compromising
privacy. In the first phase of this work, we analysed techniques that can
reverse anonymisation procedures. In the following phase, we proposed a
methodology for querying distributed databases without breaking anonymity.
This architecture was validated on a relational database schema that is widely
used in observational clinical studies.
Mestrado em Ciberseguranç
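The re-identification risk described in this abstract can be sketched in a few lines. The following is an illustrative toy, not the architecture or the linkage techniques proposed in the thesis: it removes direct identifiers from a clinical table and then shows how an external source sharing a few attributes can still single out individuals. All field names, the choice of quasi-identifiers, and the records themselves are assumptions invented for the example.

```python
# Hypothetical attribute sets; a real schema would differ.
DIRECT_IDENTIFIERS = {"patient_id", "name"}
QUASI_IDENTIFIERS = ("birth_year", "sex", "postcode")


def strip_direct_identifiers(rows):
    """Naive anonymisation: drop the attributes that directly name a person."""
    return [
        {key: value for key, value in row.items() if key not in DIRECT_IDENTIFIERS}
        for row in rows
    ]


def link(released, external):
    """Re-identify released rows whose quasi-identifiers match exactly one person."""
    index = {}
    for person in external:
        key = tuple(person[q] for q in QUASI_IDENTIFIERS)
        index.setdefault(key, []).append(person["name"])
    matches = []
    for row in released:
        names = index.get(tuple(row[q] for q in QUASI_IDENTIFIERS), [])
        if len(names) == 1:  # a unique match means the row is re-identified
            matches.append((names[0], row["diagnosis"]))
    return matches


# Made-up clinical data and a made-up external register.
clinical = [
    {"patient_id": 1, "name": "Ana", "birth_year": 1980, "sex": "F",
     "postcode": "3810", "diagnosis": "asthma"},
    {"patient_id": 2, "name": "Rui", "birth_year": 1975, "sex": "M",
     "postcode": "3800", "diagnosis": "diabetes"},
    {"patient_id": 3, "name": "Eva", "birth_year": 1975, "sex": "F",
     "postcode": "3800", "diagnosis": "migraine"},
]
register = [
    {"name": "Ana", "birth_year": 1980, "sex": "F", "postcode": "3810"},
    {"name": "Rui", "birth_year": 1975, "sex": "M", "postcode": "3800"},
    {"name": "Eva", "birth_year": 1975, "sex": "F", "postcode": "3800"},
    {"name": "Ines", "birth_year": 1975, "sex": "F", "postcode": "3800"},
]

released = strip_direct_identifiers(clinical)
reidentified = link(released, register)
print(reidentified)
```

Two of the three released rows are re-identified even though names and patient identifiers were removed. The third row stays ambiguous only because two people in the register share the same birth year, sex, and postcode, which is why removing identifying attributes alone does not guarantee anonymity.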
- …