43 research outputs found

    Quantifying and mitigating privacy risks in biomedical data

    Get PDF
    The decreasing costs of molecular profiling have fueled the biomedical research community with a plethora of new types of biomedical data, allowing for a breakthrough towards a more precise and personalized medicine. However, the release of these intrinsically highly sensitive, interdependent data poses a severe new privacy threat. So far, the security community has mostly focused on privacy risks arising from genomic data, while the manifold privacy risks stemming from other types of biomedical data, and epigenetic data in particular, have been largely overlooked. In this thesis, we provide means to quantify and protect the privacy of individuals' biomedical data. Besides the genome, we specifically focus on two of the most important epigenetic elements influencing human health: microRNAs and DNA methylation. We quantify the privacy risks for multiple realistic attack scenarios, namely (1) linkability attacks along the temporal dimension, between different types of data, and between related individuals, (2) membership attacks, and (3) inference attacks. Our results underline that the privacy risks inherent to biomedical data have to be taken seriously. Moreover, we present and evaluate solutions to preserve the privacy of individuals. Our mitigation techniques range from the differentially private release of epigenetic data, taking its utility into account, to cryptographic constructions for securely and privately evaluating a random forest on a patient's data.
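    The abstract mentions the differentially private release of epigenetic data. As a rough illustration of that idea, and not the thesis's actual mechanism, the sketch below releases the population mean of methylation beta values under the Laplace mechanism; the function name and the choice of a mean query are assumptions.

```python
import numpy as np

def dp_release_mean_methylation(beta_values, epsilon, rng=None):
    """Release the mean of per-site methylation beta values (each in [0, 1])
    under epsilon-differential privacy via the Laplace mechanism.

    The sensitivity of the mean over n individuals is 1/n, since changing one
    individual's value shifts the mean by at most 1/n.
    (Illustrative sketch only, not the mechanism used in the thesis.)
    """
    rng = rng or np.random.default_rng()
    beta_values = np.asarray(beta_values, dtype=float)
    n = len(beta_values)
    sensitivity = 1.0 / n
    noise = rng.laplace(loc=0.0, scale=sensitivity / epsilon)
    # Clip back into the valid beta-value range after adding noise.
    return float(np.clip(beta_values.mean() + noise, 0.0, 1.0))

# Example: noisy population mean for one CpG site, epsilon = 0.5
noisy_mean = dp_release_mean_methylation(np.random.rand(1000), epsilon=0.5)
```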

    Albatross: An optimistic consensus algorithm

    Full text link
    The area of distributed ledgers is a vast and quickly developing landscape. At the heart of most distributed ledgers is their consensus protocol. The consensus protocol describes the way participants in a distributed network interact with each other to obtain and agree on a shared state. While classical Byzantine fault tolerant (BFT) consensus algorithms are designed to work only in closed, size-limited networks, modern distributed ledgers -- and blockchains in particular -- often target open, permissionless networks. In this paper, we present a novel blockchain consensus algorithm, called Albatross, inspired by speculative BFT algorithms. Transactions in Albatross benefit from strong probabilistic finality. We describe the technical specification of Albatross in detail and analyse its security and performance. We conclude that the protocol is secure under regular PBFT security assumptions and has a performance close to the theoretical maximum for single-chain Proof-of-Stake consensus algorithms.
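    As a loose, hypothetical illustration of the optimistic pattern the abstract describes (fast optimistic block production with a heavier fallback), and not Albatross's actual specification, a skeleton of one epoch might look like this; all names and the fault trigger are assumptions.

```python
import random

def run_epoch(validators, stakes, num_slots, produce_block, detect_fault, view_change):
    """Generic optimistic-consensus skeleton: an elected producer extends the
    chain optimistically, and validators only fall back to a heavier
    PBFT-style agreement step when the producer misbehaves or stalls.
    The callables are placeholders supplied by the caller.
    """
    chain = []
    for slot in range(num_slots):
        # Stake-weighted pseudo-random slot leader (simplified election).
        leader = random.choices(validators, weights=stakes, k=1)[0]
        block = produce_block(leader, slot, chain)
        if detect_fault(block, chain):
            # Pessimistic path: validators agree to skip/punish the leader.
            view_change(leader, slot)
            continue
        # Optimistic path: the block is accepted with probabilistic finality,
        # becoming effectively final as further blocks build on top of it.
        chain.append(block)
    return chain
```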

    Measuring Conditional Anonymity - A Global Study

    Get PDF
    The realm of digital health is experiencing a global surge, with mobile applications extending their reach into various facets of daily life. From tracking daily eating habits and vital functions to monitoring sleep patterns and even the menstrual cycle, these apps have become ubiquitous in their pursuit of comprehensive health insights. Many of these apps collect sensitive data and promise to protect users' privacy, often through pseudonymization. We analyze the real anonymity that users can expect from this approach and report on our findings. More concretely: 1. We introduce the notion of conditional anonymity sets derived from statistical properties of the population. 2. We measure anonymity sets for two real-world applications and present overarching findings from 39 countries. 3. We develop a graphical tool for people to explore their own anonymity set. One of our case studies is a popular app for tracking the menstrual cycle. Our findings for this app show that, despite its promise to protect privacy, the collected data can be used to narrow users down to groups of 5 people in 97% of all US counties, allowing the de-anonymization of individuals. Given that the US Supreme Court recently overturned abortion rights, the possibility of identifying individuals is a calamity.
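    To make the notion of a conditional anonymity set concrete, here is a minimal sketch that estimates how many people in a population are expected to share a set of disclosed attribute values, assuming attribute independence; the paper's derivation from real population statistics is more refined, and all names and numbers below are hypothetical.

```python
from math import prod

def conditional_anonymity_set(population, attribute_probs):
    """Rough expected anonymity-set size: the number of people in `population`
    expected to share all disclosed attribute values, assuming the attributes
    are statistically independent (a simplifying assumption for illustration).

    attribute_probs: mapping attribute -> P(attribute value | population).
    """
    return population * prod(attribute_probs.values())

# Hypothetical example: a county of 50,000 people and a user disclosing an
# age bracket, a chronic condition, and an average cycle-length bucket.
size = conditional_anonymity_set(
    50_000,
    {"age_25_29": 0.07, "condition_pcos": 0.08, "cycle_31_35_days": 0.12},
)
print(f"Expected anonymity set: {size:.0f} people")
```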

    A framework for constructing Single Secret Leader Election from MPC

    Get PDF
    The emergence of distributed digital currencies has raised the need for a reliable consensus mechanism. In proof-of-stake cryptocurrencies, the participants periodically choose a closed set of validators, who can vote and append transactions to the blockchain. Each validator can become a leader with probability proportional to its stake. Keeping the leader private yet unique until it publishes a new block can significantly reduce the attack surface of an adversary and improve the throughput of the network. The problem of Single Secret Leader Election (SSLE) was first formally defined by Boneh et al. in 2020. In this work, we propose a novel framework for constructing SSLE protocols, which relies on secure multi-party computation (MPC) and satisfies the desired security properties. Our framework does not use any shuffle or sort operations and, for N parties, has a computational cost as low as O(N) basic MPC operations per party. We improve the state-of-the-art for SSLE protocols that do not assume a trusted setup. Moreover, our SSLE scheme efficiently handles weighted elections. That is, for a total weight S of N parties, the associated costs are only increased by a factor of log S. When the MPC layer is instantiated with techniques based on Shamir's secret-sharing, our SSLE has a communication cost of O(N^2) which is spread over O(log N) rounds, can tolerate up to t < N/2 faulty nodes without restarting the protocol, and its security relies on DDH in the random oracle model. When the MPC layer is instantiated with more efficient techniques based on garbled circuits, our SSLE requires all parties to participate, up to N-1 of which can be malicious, and its security is based on the random oracle model.
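    As a point of reference for the weighted elections mentioned above, the sketch below shows the public (non-secret) election functionality an SSLE protocol would realize, i.e., choosing party i with probability proportional to its stake; the paper's contribution is computing this inside MPC so that only the winner learns the outcome, which this sketch does not attempt.

```python
import secrets
from bisect import bisect_right
from itertools import accumulate

def weighted_leader(stakes):
    """Public version of the election target: party i wins with probability
    stakes[i] / sum(stakes). Illustration only; the SSLE protocol computes
    this secretly via MPC.
    """
    cumulative = list(accumulate(stakes))   # prefix sums of stakes
    total = cumulative[-1]
    r = secrets.randbelow(total)            # uniform in [0, total)
    return bisect_right(cumulative, r)      # index of the chosen party

# Example: four validators with stakes 10, 30, 40, 20
leader_index = weighted_leader([10, 30, 40, 20])
```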

    Quantifying Privacy Risks of Prompts in Visual Prompt Learning

    Full text link
    Large-scale pre-trained models are increasingly adapted to downstream tasks through a new paradigm called prompt learning. In contrast to fine-tuning, prompt learning does not update the pre-trained model's parameters. Instead, it only learns an input perturbation, namely a prompt, to be added to the downstream task data for predictions. Given the fast development of prompt learning, a well-generalized prompt inevitably becomes a valuable asset, as significant effort and proprietary data are used to create it. This naturally raises the question of whether a prompt may leak the proprietary information of its training data. In this paper, we perform the first comprehensive privacy assessment of prompts learned by visual prompt learning through the lens of property inference and membership inference attacks. Our empirical evaluation shows that the prompts are vulnerable to both attacks. We also demonstrate that the adversary can mount a successful property inference attack with limited cost. Moreover, we show that membership inference attacks against prompts can be successful with relaxed adversarial assumptions. We further make some initial investigations into defenses and observe that our method can mitigate the membership inference attacks with a decent utility-defense trade-off but fails to defend against property inference attacks. We hope our results can shed light on the privacy risks of the popular prompt learning paradigm. To facilitate the research in this direction, we will share our code and models with the community. To appear in the 33rd USENIX Security Symposium, August 14-16, 2024.
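    To illustrate the general shape of a membership inference attack against a learned visual prompt, and not the paper's specific attack, a generic loss-threshold test might look as follows; the additive-prompt interface and the threshold are assumptions.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def loss_threshold_membership(model, prompt, images, labels, threshold):
    """Generic loss-threshold membership inference, shown only to make the
    attack surface concrete; the paper's attacks are more elaborate.
    `model(images + prompt)` stands in for a frozen pre-trained model whose
    input is perturbed by a learned visual prompt (assumed interface).
    Returns a boolean tensor: True = predicted member of the prompt's
    training data.
    """
    logits = model(images + prompt)                        # prompt as additive perturbation
    losses = F.cross_entropy(logits, labels, reduction="none")
    return losses < threshold                              # low loss -> likely member
```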

    Data Poisoning Attacks Against Multimodal Encoders

    Get PDF
    Traditional machine learning (ML) models usually rely on large-scale labeled datasets to achieve strong performance. However, such labeled datasets are often challenging and expensive to obtain. Also, the predefined categories limit the model's ability to generalize to other visual concepts, as additional labeled data is required. In contrast, newly emerged multimodal models, which combine visual and linguistic modalities, learn the concepts in images from raw text. This is a promising way to address the above problems, as such models can use easy-to-collect image-text pairs to construct the training dataset, and the raw texts cover an almost unlimited range of categories through their semantics. However, learning from a large-scale unlabeled dataset also exposes the model to the risk of potential poisoning attacks, whereby the adversary aims to perturb the model's training dataset to trigger malicious behaviors in it. Previous work mainly focuses on the visual modality. In this paper, we instead focus on answering two questions: (1) Is the linguistic modality also vulnerable to poisoning attacks? and (2) Which modality is most vulnerable? To answer the two questions, we conduct three types of poisoning attacks against CLIP, the most representative multimodal contrastive learning framework. Extensive evaluations on different datasets and model architectures show that all three attacks can perform well on the linguistic modality with only a relatively low poisoning rate and limited epochs. Also, we observe that the poisoning effect differs between modalities, i.e., with lower MinRank in the visual modality and with higher Hit@K when K is small in the linguistic modality. To mitigate the attacks, we propose both pre-training and post-training defenses. We empirically show that both defenses can significantly reduce the attack performance while preserving the model's utility.
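    The abstract measures the poisoning effect with MinRank and Hit@K; under assumed, standard retrieval-style definitions, these could be computed from a query-candidate similarity matrix roughly as follows.

```python
import numpy as np

def retrieval_metrics(similarity, target_indices, k=10):
    """Toy versions of the two metrics, under assumed definitions: for each
    query row in `similarity` (queries x candidates), MinRank is the best
    (lowest) rank of any target candidate, and Hit@K is the fraction of
    queries whose MinRank falls within the top K.
    """
    # Rank candidates by descending similarity for every query.
    order = np.argsort(-similarity, axis=1)
    ranks = np.argsort(order, axis=1)                 # rank position of each candidate
    min_ranks = ranks[:, target_indices].min(axis=1)  # best rank of any target, per query
    hit_at_k = float(np.mean(min_ranks < k))
    return min_ranks, hit_at_k

# Example: 5 queries, 100 candidates, candidates 3 and 7 are the targets
sim = np.random.rand(5, 100)
min_ranks, hit10 = retrieval_metrics(sim, target_indices=[3, 7], k=10)
```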

    On how zero-knowledge proof blockchain mixers improve, and worsen user privacy

    Get PDF
    Among the most prominent and widely used blockchain privacy solutions are zero-knowledge proof (ZKP) mixers operating on top of smart contract-enabled blockchains. ZKP mixers typically advertise their level of privacy through a so-called anonymity set size, similar to k-anonymity, where a user hides among a set of k other users. In reality, however, these anonymity set claims are mostly inaccurate, as we find through empirical measurements of the currently most active ZKP mixers. We propose five heuristics that, in combination, can increase the probability that an adversary links a withdrawer to the correct depositor on average by 51.94% (108.63%) on the most popular Ethereum (ETH) and Binance Smart Chain (BSC) mixer, respectively. Our empirical evidence is hence also the first to suggest a differing privacy-predilection of users on ETH and BSC. We further identify 105 Decentralized Finance (DeFi) attackers leveraging ZKP mixers to source their initial funds and to deposit attack revenue (e.g., from phishing scams, hacking centralized exchanges, and blockchain project attacks). State-of-the-art mixers are moreover tightly intertwined with the growing DeFi ecosystem by offering "anonymity mining" (AM) incentives, i.e., mixer users receive monetary rewards for mixing coins. However, contrary to the claims of related work, we find that AM does not always improve the quality of a mixer's anonymity set, because AM tends to attract privacy-ignorant users who naively reuse addresses.
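    One of the linking heuristics the abstract alludes to, naive address reuse, can be sketched as follows; the record schema is an assumption made for illustration, not the paper's dataset format.

```python
from collections import defaultdict

def address_match_links(deposits, withdrawals):
    """Simplest linking heuristic in the spirit of the paper: a user who
    reuses the same address both to deposit into and to withdraw from the
    mixer gains no anonymity. Records are assumed to be dicts with at least
    'address' and 'tx' keys (illustrative schema).
    Returns (deposit_tx, withdrawal_tx) pairs linked by shared addresses.
    """
    by_depositor = defaultdict(list)
    for d in deposits:
        by_depositor[d["address"]].append(d["tx"])

    links = []
    for w in withdrawals:
        for deposit_tx in by_depositor.get(w["address"], []):
            links.append((deposit_tx, w["tx"]))   # linked deposit/withdrawal pair
    return links
```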