14 research outputs found

    When the signal is in the noise: Exploiting Diffix's Sticky Noise

    Anonymized data is highly valuable to both businesses and researchers. A large body of research has however shown the strong limits of the de-identification release-and-forget model, where data is anonymized and shared. This has led to the development of privacy-preserving query-based systems. Based on the idea of "sticky noise", Diffix has been recently proposed as a novel query-based mechanism satisfying alone the EU Article 29 Working Party's definition of anonymization. According to its authors, Diffix adds less noise to answers than solutions based on differential privacy while allowing for an unlimited number of queries. This paper presents a new class of noise-exploitation attacks, exploiting the noise added by the system to infer private information about individuals in the dataset. Our first differential attack uses samples extracted from Diffix in a likelihood ratio test to discriminate between two probability distributions. We show that using this attack against a synthetic best-case dataset allows us to infer private information with 89.4% accuracy using only 5 attributes. Our second cloning attack uses dummy conditions that conditionally strongly affect the output of the query depending on the value of the private attribute. Using this attack on four real-world datasets, we show that we can infer private attributes of at least 93% of the users in the dataset with accuracy between 93.3% and 97.1%, issuing a median of 304 queries per user. We show how to optimize this attack, targeting 55.4% of the users and achieving 91.7% accuracy, using a maximum of only 32 queries per user. Our attacks demonstrate that adding data-dependent noise, as done by Diffix, is not sufficient to prevent inference of private attributes. We furthermore argue that Diffix alone fails to satisfy Art. 29 WP's definition of anonymization. [...

    Quantifying Surveillance in the Networked Age: Node-based Intrusions and Group Privacy

    From the "right to be left alone" to the "right to selective disclosure", privacy has long been thought of as the control individuals have over the information they share and reveal about themselves. However, in a world that is more connected than ever, the choices of the people we interact with increasingly affect our privacy. This forces us to rethink our definition of privacy. We here formalize and study, as local and global node- and edge-observability, Bloustein's concept of group privacy. We prove edge-observability to be independent of the graph structure, while node-observability depends only on the degree distribution of the graph. We show on synthetic datasets that, for attacks spanning several hops such as those implemented by social networks and current US laws, the presence of hubs increases node-observability while a high clustering coefficient decreases it, at fixed density. We then study the edge-observability of a large real-world mobile phone dataset over a month and show that, even under the restricted two-hops rule, compromising as little as 1% of the nodes leads to observing up to 46% of all communications in the network. More worrisome, we also show that on average 36% of each person's communications would be locally edge-observable under the same rule. Finally, we use real sensing data to show how people living in cities are vulnerable to distributed node-observability attacks. Using a smartphone app to compromise 1% of the population, an attacker could monitor the location of more than half of London's population. Taken together, our results show that the current individual-centric approach to privacy and data protection does not encompass the realities of modern life. This makes us, as a society, vulnerable to large-scale surveillance attacks which we need to develop protections against.
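    The abstract above does not define node-observability formally. As a rough illustration only, the sketch below assumes one plausible reading: a node counts as observed if it lies within a fixed number of hops of any compromised node. The function name `observed_nodes`, the adjacency-dictionary format, and this reading of the hop rule are all assumptions made for the example, not the paper's definitions.

```python
from collections import deque


def observed_nodes(adjacency, compromised, hops):
    """Return the set of nodes within `hops` steps of any compromised node.

    adjacency: dict mapping each node to an iterable of its neighbours
               (hypothetical input format chosen for this sketch).
    compromised: iterable of compromised nodes (counted as observed).
    hops: maximum distance covered by the assumed hop rule.
    """
    seen = set(compromised)
    # Multi-source breadth-first search: every compromised node starts at distance 0.
    frontier = deque((node, 0) for node in seen)
    while frontier:
        node, dist = frontier.popleft()
        if dist == hops:
            continue  # do not expand beyond the allowed number of hops
        for neighbour in adjacency.get(node, ()):
            if neighbour not in seen:
                seen.add(neighbour)
                frontier.append((neighbour, dist + 1))
    return seen


# Toy path graph 0 - 1 - 2 - 3 - 4 with node 0 compromised.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
fraction = len(observed_nodes(path, [0], hops=2)) / len(path)
```

    On this toy path graph, compromising one node out of five under a two-hop rule exposes three of the five nodes, which gives a feel for how a small compromised fraction can cover a much larger share of a network.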

    M²M: A general method to perform various data analysis tasks from a differentially private sketch

    Differential privacy is the standard privacy definition for performing analyses over sensitive data. Yet, its privacy budget bounds the number of tasks an analyst can perform with reasonable accuracy, which makes it challenging to deploy in practice. This can be alleviated by private sketching, where the dataset is compressed into a single noisy sketch vector which can be shared with the analysts and used to perform arbitrarily many analyses. However, the algorithms to perform specific tasks from sketches must be developed on a case-by-case basis, which is a major impediment to their use. In this paper, we introduce the generic moment-to-moment (M²M) method to perform a wide range of data exploration tasks from a single private sketch. Among other things, this method can be used to estimate empirical moments of attributes, the covariance matrix, counting queries (including histograms), and regression models. Our method treats the sketching mechanism as a black-box operation, and can thus be applied to a wide variety of sketches from the literature, widening their ranges of applications without further engineering or privacy loss, and removing some of the technical barriers to the wider adoption of sketches for data exploration under differential privacy. We validate our method with data exploration tasks on artificial and real-world data, and show that it can be used to reliably estimate statistics and train classification models from private sketches. Comment: Published at the 18th International Workshop on Security and Trust Management (STM 2022).

    Compressive Learning with Privacy Guarantees

    This work addresses the problem of learning from large collections of data with privacy guarantees. The compressive learning framework proposes to deal with the large scale of datasets by compressing them into a single vector of generalized random moments, from which the learning task is then performed. We show that a simple perturbation of this mechanism with additive noise is sufficient to satisfy differential privacy, a well established formalism for defining and quantifying the privacy of a random mechanism. We combine this with a feature subsampling mechanism, which reduces the computational cost without damaging privacy. The framework is applied to the tasks of Gaussian modeling, k-means clustering and principal component analysis (PCA), for which sharp privacy bounds are derived. Empirically, the quality (for subsequent learning) of the compressed representation produced by our mechanism is strongly related to the induced noise level, for which we give analytical expressions.

    Chronic Hepatitis C Virus infection associated hepatic fibrogenesis

    The mechanisms of hepatic fibrogenesis associated with chronic hepatitis C virus (HCV) infection remain poorly understood. Activation of fibrogenesis appears to be strongly associated with the local inflammatory reaction. Nevertheless, the direct role of HCV in the fibrogenic process has not been studied. Our hypothesis is that fibrosis can be induced, at least in part, directly by HCV, independently of the host immune response. One aspect inherent to this postulate is that a direct relationship could exist between HCV particles and the activation of the main cellular actors of hepatic fibrogenesis, the hepatic stellate cells (HSCs). The objectives of this project were therefore to study in vitro the capacity of HCV to activate human HSCs, but first to explore the relationship between this activation and a possible infection of these cells by HCV. To analyze the permissiveness of HSCs to HCV infection, we combined several original HCV models, such as the JFH1 infectious clone, retroviruses pseudotyped with the HCV envelope proteins, and the subgenomic replicon, with two relevant cellular models of HSCs (human primary cultures and the immortalized LX2 cell line). In conclusion, we demonstrated that human HSCs are refractory to both HCV entry and replication. These results do not, however, rule out the hypothesis of a direct interaction between HCV particles and the surface of HSCs in the fibrogenic activation of these cells. The role of the viral envelope proteins in their native conformation in HSC activation was studied by incubating HCVpp with HSCs.
    Despite encouraging preliminary results, whether HSCs in culture are activated after contact with HCV viral particles, independently of viral entry and/or replication, could not be confirmed and remains unanswered. A second aspect is that in vivo expression of the full set of HCV proteins in hepatocytes could play a role in the onset and progression of portal fibrosis. We demonstrated for the first time that in vivo hepatocyte expression of HCV proteins in transgenic mice (FL-N/35) subjected to a fibrogenic treatment (chronic CCl4 injection) was associated with increased fibrosis, independently of local inflammation. This increased fibrosis in FL-N/35 mice was accompanied by increased production of intrahepatocyte reactive oxygen species, a ductular reaction (DR) characterized notably by an expansion of hepatic progenitor cells (HPCs), and inhibition of hepatocyte proliferation. This portal fibrosis also correlated with the expansion of portal HPCs, an implicit corollary of the inhibition of hepatocyte proliferation. These observations, also made in HCV-infected patients, suggest that the DR associated with impaired hepatocyte proliferation may play a role in portal fibrosis. Because the mouse model used exhibits intrahepatocyte expression of HCV proteins, our results implicitly suggest a disruption of hepatocyte homeostasis as the starting point of the alterations observed in this study. To characterize in vivo the alterations of hepatocyte cell-cycle progression and identify the underlying mechanism in these mice, a model of liver regeneration induced by injection of a high dose of the hepatotoxin CCl4 was used.
    Our results revealed an inhibition of the G1/S transition associated with activation of the ATM pathway of response to DNA damage caused by exacerbated oxidative stress in the hepatocytes of mice expressing HCV proteins.

    Web Privacy: A Formal Adversarial Model for Query Obfuscation

    The queries we perform, the searches we make, and the websites we visit: this sensitive data is collected at scale by companies as part of the services they provide. Query obfuscation, intertwining the user queries with artificial queries, has been proposed as a solution to protect the privacy of individuals on the web. We here present a formal model and formulate through attack models three privacy requirements for obfuscators: 1) indistinguishability, that the user query should be hard to identify; 2) coverage, that its topic should be hard to identify; and 3) imprecision, that the query should still be hard to identify for an attacker with additional auxiliary information. The latter is needed to make the former two guarantees "future-proof". Using our framework, we derive two important results for obfuscators. First, we show that indistinguishability imposes strong bounds on the coverage and imprecision achievable by an obfuscator. Second, we prove an important tradeoff between coverage and imprecision, which inherently limits the strength and robustness of the privacy guarantees that an obfuscator can provide. We then introduce a family of obfuscators with provable indistinguishability guarantees, which we call k-ball obfuscators, and show, for a range of parameter values, the achievable coverage and imprecision. We show empirically that our theoretical tradeoff holds, and that its bound is not tight in practice: even in a simple idealized setting, there is a significant gap between practical coverage and imprecision guarantees, and the optimal bounds. While obfuscators have proven popular with the general public, all obfuscators currently available provide ad-hoc guarantees, and have been shown to be vulnerable to attacks, putting the data of users at risk. We hope this work will be a first step towards a robust evaluation of the properties of query obfuscators and the development of principled obfuscators.

    When the Signal is in the Noise: Exploiting Diffix's Sticky Noise

    Anonymized data is highly valuable to both businesses and researchers. A large body of research has however shown the strong limits of the de-identification release-and-forget model, where data is anonymized and shared. This has led to the development of privacy-preserving query-based systems. Based on the idea of "sticky noise", Diffix has been recently proposed as a novel query-based mechanism satisfying alone the EU Article 29 Working Party's definition of anonymization. According to its authors, Diffix adds less noise to answers than solutions based on differential privacy while allowing for an unlimited number of queries. This paper presents a new class of noise-exploitation attacks, exploiting the noise added by the system to infer private information about individuals in the dataset. Our first differential attack uses samples extracted from Diffix in a likelihood ratio test to discriminate between two probability distributions. We show that using this attack against a synthetic best-case dataset allows us to infer private information with 89.4% accuracy using only 5 attributes. Our second cloning attack uses dummy conditions that conditionally strongly affect the output of the query depending on the value of the private attribute. Using this attack on four real-world datasets, we show that we can infer private attributes of at least 93% of the users in the dataset with accuracy between 93.3% and 97.1%, issuing a median of 304 queries per user. We show how to optimize this attack, targeting 55.4% of the users and achieving 91.7% accuracy, using a maximum of only 32 queries per user. Our attacks demonstrate that adding data-dependent noise, as done by Diffix, is not sufficient to prevent inference of private attributes. We furthermore argue that Diffix alone fails to satisfy Art. 29 WP's definition of anonymization. 
We conclude by discussing how non-provable privacy-preserving systems can be combined with fundamental security principles such as defense-in-depth and auditability to build practically useful anonymization systems without relying on differential privacy
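    The abstract mentions a likelihood ratio test that discriminates between two probability distributions. The sketch below shows only the generic statistical test, under the assumption that both hypotheses are Gaussians with known mean and standard deviation; the function names and the Gaussian choice are illustrative assumptions and do not reproduce the paper's attack or any Diffix interface.

```python
import math


def gaussian_log_likelihood(samples, mean, std):
    """Log-likelihood of independent samples under a Gaussian N(mean, std^2)."""
    return sum(
        -0.5 * math.log(2.0 * math.pi * std ** 2) - (x - mean) ** 2 / (2.0 * std ** 2)
        for x in samples
    )


def likelihood_ratio_test(samples, h0, h1):
    """Return 1 if the samples are more likely under h1, else 0.

    h0 and h1 are (mean, std) pairs; both are assumptions of this sketch.
    """
    log_ratio = gaussian_log_likelihood(samples, *h1) - gaussian_log_likelihood(samples, *h0)
    return 1 if log_ratio > 0 else 0


# Samples clustered around 1.0 favour the hypothesis with mean 1.
decision = likelihood_ratio_test([0.9, 1.1, 1.0], h0=(0.0, 1.0), h1=(1.0, 1.0))
```

    Working with log-likelihoods rather than raw likelihoods avoids numerical underflow when many samples are combined.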

    Human hepatic stellate cells are not permissive for hepatitis C virus entry and replication

    BACKGROUND: Chronic HCV infection is associated with the development of hepatic fibrosis. The direct role of HCV in the fibrogenic process is unknown. Specifically, whether HCV is able to infect hepatic stellate cells (HSCs) is debated. OBJECTIVE: To assess whether human HSCs are susceptible to HCV infection. DESIGN: We combined a set of original HCV models, including the infectious genotype 2a JFH1 model (HCVcc), retroviral pseudoparticles expressing the folded HCV genotype 1b envelope glycoproteins (HCVpp) and a subgenomic genotype 1b HCV replicon, and two relevant cellular models, primary human HSCs from different patients and the LX-2 cell line, to assess whether HCV can infect/replicate in HSCs. RESULTS: In contrast with the hepatocyte cell line Huh-7, neither infectious HCVcc nor HCVpp infected primary human HSCs or LX-2 cells. The cellular expression of host cellular factors required for HCV entry was high in Huh-7 cells but low in HSCs and LX-2 cells, with the exception of CD81. Finally, replication of a genotype 2a full-length RNA genome and a genotype 1b subgenomic replicon was impaired in primary human HSCs and LX-2 cells, which expressed low levels of cellular factors known to play a key role in the HCV life-cycle, suggesting that human HSCs are not permissive for HCV replication. CONCLUSIONS: Human HSCs are refractory to HCV infection. Both HCV entry and replication are deficient in these cells, regardless of the HCV genotype and origin of the cells. Thus, HCV infection of HSCs does not play a role in liver fibrosis. These results do not rule out a direct role of HCV infection of hepatocytes in the fibrogenic process.

    Hepatitis C virus (HCV) protein expression enhances hepatic fibrosis in HCV transgenic mice exposed to a fibrogenic agent.

    BACKGROUND & AIMS: During chronic HCV infection, activation of fibrogenesis appears to be principally related to local inflammation. However, the direct role of hepatic HCV protein expression in fibrogenesis remains unknown. METHODS: We used transgenic mice expressing the full length HCV open reading frame exposed to a 'second hit' of the fibrogenic agent carbon tetrachloride (CCl4). Both acute and chronic liver injuries were induced in these mice by CCl4 injections. Liver injury, expression of matrix re-modeling genes, reactive oxygen species (ROS), inflammation, hepatocyte proliferation, ductular reaction and hepatic progenitor cells (HPC) expansion were examined. RESULTS: After CCl4 treatment, HCV transgenic mice exhibited enhanced liver fibrosis, significant changes in matrix re-modeling genes and increased ROS production compared to wild type littermates despite no differences in the degree of local inflammation. This increase was accompanied by a decrease in hepatocyte proliferation, which appeared to be due to delayed hepatocyte entry into the S phase. A prominent ductular reaction and hepatic progenitor cell compartment expansion were observed in transgenic animals. These observations closely mirror those previously made in HCV-infected individuals. CONCLUSIONS: Together, these results demonstrate that expression of the HCV proteins in hepatocytes contributes to the development of hepatic fibrosis in the presence of other fibrogenic agents. In the presence of CCl4, HCV transgenic mice display an intra-hepatic re-organization of several key cellular actors in the fibrogenic process.

    Differentially Private Compressive K-means

    This work addresses the problem of learning from large collections of data with privacy guarantees. The sketched learning framework proposes to deal with the large scale of datasets by compressing them into a single vector of generalized random moments, from which the learning task is then performed. We modify the standard sketching mechanism to provide differential privacy, using addition of Laplace noise combined with a subsampling mechanism (each moment is computed from a subset of the dataset). The data can be divided between several sensors, each applying the privacy-preserving mechanism locally, yielding a differentially-private sketch of the whole dataset when reunited. We apply this framework to the k-means clustering problem, for which a measure of utility of the mechanism in terms of a signal-to-noise ratio is provided, and discuss the obtained privacy-utility tradeoff
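    The mechanism described above (a vector of generalized random moments perturbed with Laplace noise) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses random Fourier moments as the generalized moments, omits the subsampling step, treats the dataset size `n` as public, and calibrates the noise to an add-or-remove-one-record neighbouring relation. The function name `private_sketch` and all parameter names are hypothetical.

```python
import numpy as np


def private_sketch(X, W, epsilon, rng):
    """Laplace-noised sketch of random Fourier moments (illustrative only).

    X: (n, d) data matrix; W: (d, m) random frequency matrix.
    Returns a vector of length 2m: noisy real parts, then noisy imaginary parts.
    """
    n, m = X.shape[0], W.shape[1]
    proj = X @ W  # (n, m) projections w_j^T x_i
    # Real and imaginary parts of sum_i exp(i * w_j^T x_i), stacked.
    sums = np.concatenate([np.cos(proj).sum(axis=0), np.sin(proj).sum(axis=0)])
    # Adding or removing one record changes each frequency's (cos, sin) pair
    # by at most |cos| + |sin| <= sqrt(2) in L1 norm, hence sqrt(2) * m overall.
    sensitivity = np.sqrt(2.0) * m
    noise = rng.laplace(loc=0.0, scale=sensitivity / epsilon, size=2 * m)
    # Assumption: n is public, so dividing by it costs no extra privacy.
    return (sums + noise) / n


rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))  # toy dataset
W = rng.normal(size=(3, 4))   # toy random frequencies
sketch = private_sketch(X, W, epsilon=1.0, rng=rng)
```

    Because the noise scale is fixed by the number of moments and epsilon while the signal is averaged over `n` records, larger datasets give a better signal-to-noise ratio at the same privacy level, which is the kind of privacy-utility tradeoff the abstract refers to.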