Locality-Sensitive Hashing Does Not Guarantee Privacy! Attacks on Google's FLoC and the MinHash Hierarchy System
Recently proposed systems aim at achieving privacy using locality-sensitive
hashing. We show how these approaches fail by presenting attacks against two
such systems: Google's FLoC proposal for privacy-preserving targeted
advertising and the MinHash Hierarchy, a system for processing mobile users'
traffic behavior in a privacy-preserving way. Our attacks refute the pre-image
resistance, anonymity, and privacy guarantees claimed for these systems.
In the case of FLoC, we show how to deanonymize users using Sybil attacks and
how to reconstruct 10% or more of the browsing history for 30% of its users
using Generative Adversarial Networks, analyzing only the hashes used by FLoC.
For MinHash, we precisely identify the movement of a subset of individuals and,
on average, we can narrow users' movements down to just 10% of the possible
geographic area, again using just the hashes. In addition, we refute their
differential privacy claims.
Comment: 14 pages, 9 figures, submitted to PETS 202
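The core leakage the paper exploits is that locality-sensitive hashes are designed to collide for similar inputs, so the hashes themselves reveal similarity between the underlying sets. The following minimal MinHash sketch in Python illustrates this property (it is an illustration of the general technique, not the paper's attack code; the example domains are made up):

```python
import hashlib

def minhash(items, num_hashes=64):
    """Compute a MinHash signature: for each of num_hashes seeded hash
    functions, keep the minimum hash value over the input set."""
    sig = []
    for seed in range(num_hashes):
        sig.append(min(
            int(hashlib.sha256(f"{seed}:{it}".encode()).hexdigest(), 16)
            for it in items
        ))
    return sig

def estimate_jaccard(sig_a, sig_b):
    """The fraction of matching signature positions estimates the
    Jaccard similarity of the original sets."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)

# Two browsing histories that share most of their domains produce
# signatures that agree on most positions -- similarity leaks even
# when an observer sees only the hashes.
hist_a = {"news.example", "shop.example", "mail.example", "blog.example"}
hist_b = {"news.example", "shop.example", "mail.example", "video.example"}
sim = estimate_jaccard(minhash(hist_a), minhash(hist_b))
```

An observer holding only signatures can thus cluster users by behavioral similarity, which is exactly the kind of information a pre-image-resistant or anonymity-preserving scheme would have to hide.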
S-GBDT: Frugal Differentially Private Gradient Boosting Decision Trees
Privacy-preserving learning of gradient boosting decision trees (GBDT) has
the potential for strong utility-privacy tradeoffs on tabular data, such as
census data or medical metadata: classical GBDT learners can extract
non-linear patterns from small-sized datasets. The state-of-the-art notion of
provable privacy is differential privacy, which requires that the impact of
single data points be limited and deniable. We introduce a novel
differentially private GBDT learner and utilize four main techniques to improve
the utility-privacy tradeoff. (1) We use an improved noise scaling approach
with tighter accounting of the privacy leakage of a decision tree leaf than
prior work, resulting in noise that in expectation scales with , for
data points. (2) We integrate individual Rényi filters into our method to
learn from data points that have been underutilized during iterative
training, which -- potentially of independent interest -- yields a natural yet
effective approach to learning on streams of non-i.i.d. data. (3) We
incorporate random decision tree splits to concentrate the privacy budget on
learning leaves. (4) We deploy subsampling for privacy amplification.
Our evaluation shows, for the Abalone dataset ( training data points), a
-score of for , which the closest prior work only
achieved for . On the Adult dataset ( training data
points) we achieve a test error of for , which the
closest prior work only achieved for . For the Abalone dataset
at we achieve an -score of , which is very close to
the -score of for the nonprivate version of GBDT. For the Adult
dataset at we achieve a test error which is very
close to the test error of the nonprivate version of GBDT.
Comment: The first two authors contributed equally to this work
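The basic mechanism behind noised leaf values can be sketched as follows: clip each example's gradient to bound its influence (the sensitivity), then perturb the leaf's gradient sum with Laplace noise calibrated to sensitivity/ε. This is the standard Laplace mechanism as a minimal illustration; S-GBDT's actual noise scaling, Rényi accounting, and filters are tighter than this sketch.

```python
import random

def laplace_noise(scale):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return random.expovariate(1.0 / scale) - random.expovariate(1.0 / scale)

def dp_leaf_value(gradients, epsilon, clip=1.0):
    """Differentially private leaf value: clip per-example gradients so
    that adding or removing one example changes the sum by at most
    `clip` (the sensitivity), then add Laplace noise with scale
    sensitivity/epsilon before averaging."""
    clipped = [max(-clip, min(clip, g)) for g in gradients]
    noisy_sum = sum(clipped) + laplace_noise(clip / epsilon)
    return noisy_sum / max(len(clipped), 1)

random.seed(0)
grads = [0.4, -0.2, 0.9, 1.7, -0.5]   # 1.7 is clipped to 1.0
value = dp_leaf_value(grads, epsilon=1.0)
```

As ε grows, the noise scale shrinks and the noisy leaf value converges to the true clipped mean (here 0.32), which is the utility-privacy tradeoff the paper optimizes.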
Efficient and Extensible Policy Mining for Relationship-Based Access Control
Relationship-based access control (ReBAC) is a flexible and expressive
framework that allows policies to be expressed in terms of chains of
relationships between entities, as well as attributes of entities. ReBAC policy
mining algorithms have the potential to significantly reduce the cost of
migration from legacy access control systems to ReBAC by partially automating
the development of a ReBAC policy. Existing ReBAC policy mining algorithms
support a policy language with a limited set of operators, which limits their
applicability. This paper presents a ReBAC policy mining algorithm designed to
be both (1) easily extensible (to support additional policy language features)
and (2) scalable. The algorithm is based on Bui et al.'s evolutionary algorithm
for ReBAC policy mining. First, we simplify their algorithm to make it easier
to extend, and we provide a methodology for extending it to handle new policy
language features. However, extending the policy language increases the search
space of candidate policies explored by the evolutionary algorithm, causing
longer running times and/or worse results. To address this problem, we enhance
the algorithm with a feature selection phase that uses a neural network to
identify useful features, and we use the result of feature selection to reduce
the evolutionary algorithm's search space. The new algorithm is easy to extend
and, as shown by our experiments, is more efficient and produces better
policies.
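The overall shape of evolutionary policy mining can be sketched in a few lines: candidate policies are mutated, scored against an access log, and kept when they explain the log better. The toy below is a deliberately simplified illustration (attribute-based conditions instead of relationship chains, frequency-based feature enumeration instead of the paper's neural-network feature selection; all data is invented):

```python
import random

# Toy access log: each entry is (request attributes, granted decision).
# A candidate rule is a frozenset of (attribute, value) conditions; it
# permits a request when all of its conditions hold.
LOG = [
    ({"role": "doctor", "dept": "cardio",  "rel": "treating"}, True),
    ({"role": "doctor", "dept": "cardio",  "rel": "none"},     False),
    ({"role": "nurse",  "dept": "cardio",  "rel": "treating"}, True),
    ({"role": "clerk",  "dept": "billing", "rel": "none"},     False),
]

def fitness(rule):
    """Number of correct decisions the rule makes over the log."""
    score = 0
    for attrs, granted in LOG:
        permits = all(attrs.get(a) == v for a, v in rule)
        score += (permits == granted)
    return score

def mutate(rule, features):
    """Toggle one condition, drawn only from the selected feature set --
    restricting this set is how feature selection shrinks the search space."""
    rule = set(rule)
    cond = random.choice(features)
    rule.discard(cond) if cond in rule else rule.add(cond)
    return frozenset(rule)

def mine(features, generations=200, seed=0):
    random.seed(seed)
    best = frozenset()
    for _ in range(generations):
        cand = mutate(best, features)
        if fitness(cand) >= fitness(best):
            best = cand
    return best

# Stand-in for the paper's neural-network feature selection phase:
# simply enumerate every (attribute, value) pair seen in the log.
features = sorted({(a, v) for attrs, _ in LOG for a, v in attrs.items()})
policy = mine(features)
```

The key design point the paper makes is visible here: every extra feature enlarges the space `mutate` can explore, so pruning features before the evolutionary search directly reduces running time and improves the policies found.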
Transitive primal infon logic: The propositional case, Microsoft Research
Abstract: Primal (propositional) logic PL is the {∧, →} fragment of intuitionistic logic, and primal (propositional) infon logic PIL is a conservative extension of PL with the quotation construct "p said". Logic PIL was introduced by Gurevich and Neeman in 2009 in connection with the DKAL project. The derivation problem for PIL (and therefore for PL) is solvable in linear time, and yet PIL allows one to express many common access control scenarios. The most obvious limitations on the expressivity of logics PL and PIL are the failures of the transitivity rules
(trans0) from x → y and y → z infer x → z,
(trans) from pref (x → y) and pref (y → z) infer pref (x → z),
where pref ranges over quotation prefixes "p said q said . . .". Here we investigate the extension T of PL with the axiom x → x and the inference rule (trans0), as well as the extension qT of PIL with the axiom pref (x → x) and the inference rule (trans).
• [Subformula property] T has the subformula property: if Γ ⊢ y then there is a derivation of y from Γ comprising only subformulas of Γ ∪ {y}. qT has a similar locality property.
• [Complexity] The derivation problems for T and qT are solvable in quadratic time.
• [Soundness and completeness] We define Kripke models for qT (resp. T) and show that the semantics is sound and complete.
• [Small models] T has the one-element-model property: if Γ ⊬ y then there is a one-element counterexample. Similarly small (though not one-element) counterexamples exist for qT.
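One way to see why adding reflexivity and transitivity keeps derivability tractable: for the implicational part, deciding whether x → z follows from a set of implications reduces to reachability in the implication graph. The sketch below illustrates that intuition in Python; it handles only atomic implications, not conjunction or quotation prefixes, so it is a didactic fragment rather than the paper's full quadratic-time algorithm for T or qT.

```python
from collections import defaultdict, deque

def derives(implications, query):
    """Decide whether the implication x -> z follows from the given
    implications using the axiom x -> x (reflexivity) and the rule
    (trans0) (transitivity). This is graph reachability: an edge per
    implication, breadth-first search from x."""
    graph = defaultdict(set)
    for x, z in implications:
        graph[x].add(z)
    start, goal = query
    if start == goal:          # axiom x -> x
        return True
    seen, frontier = {start}, deque([start])
    while frontier:
        node = frontier.popleft()
        for nxt in graph[node]:
            if nxt == goal:
                return True
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return False

rules = [("p", "q"), ("q", "r"), ("r", "s")]
```

For qT the same idea applies within each quotation prefix, which is consistent with the locality property stated above.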
Automating Cookie Consent and GDPR Violation Detection
The European Union’s General Data Protection Regulation (GDPR) requires websites to inform users about personal data collection and request consent for cookies. Yet the majority of websites do not give users any choices, and others attempt to deceive them into accepting all cookies. We document the severity of this situation through an analysis of potential GDPR violations in cookie banners in almost 30k websites. We identify six novel violation types, such as incorrect category assignments and misleading expiration times, and we find at least one potential violation in a surprising 94.7% of the analyzed websites.
We address this issue by giving users the power to protect their privacy. We develop a browser extension, called CookieBlock, that uses machine learning to enforce GDPR cookie consent at the client. It automatically categorizes cookies by usage purpose using only the information provided in the cookie itself. At a mean validation accuracy of 84.4%, our model attains prediction quality competitive with expert knowledge in the field. Additionally, our approach differs from prior work by not relying on the cooperation of websites themselves. We empirically evaluate CookieBlock on a set of 100 randomly sampled websites, on which it filters roughly 90% of the privacy-invasive cookies without significantly impairing website functionality.
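Client-side enforcement of this kind can be sketched in miniature: infer a purpose category from the cookie itself, then block cookies whose category the user did not consent to. The toy below uses hand-picked keyword scoring over cookie names purely for illustration; CookieBlock's actual model is a trained classifier, and the categories and keywords here are assumptions, not its rules.

```python
# Illustrative purpose categories and name keywords (invented examples).
KEYWORDS = {
    "necessary":   ["session", "csrf", "auth"],
    "functional":  ["lang", "theme", "pref"],
    "analytics":   ["_ga", "_gid", "stat"],
    "advertising": ["ad", "track", "doubleclick"],
}

def categorize(cookie_name):
    """Pick the category whose keywords best match the cookie name,
    falling back to 'unknown' when nothing matches."""
    name = cookie_name.lower()
    scores = {cat: sum(kw in name for kw in kws)
              for cat, kws in KEYWORDS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unknown"

def should_block(cookie_name, consent=frozenset({"necessary", "functional"})):
    """Block a cookie when its inferred purpose lies outside the
    user's consent choices."""
    cat = categorize(cookie_name)
    return cat not in consent and cat != "unknown"
```

Note that everything here runs on information available at the client, which mirrors the paper's point that enforcement need not rely on the website's cooperation.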