Privacy via the Johnson-Lindenstrauss Transform
Suppose that party A collects private information about its users, where each
user's data is represented as a bit vector. Suppose that party B has a
proprietary data mining algorithm, such as clustering or nearest neighbors,
that requires estimating the distance between users. We ask whether it is possible for
party A to publish some information about each user so that B can estimate the
distance between users without being able to infer any private bit of a user.
Our method involves projecting each user's representation into a random,
lower-dimensional space via a sparse Johnson-Lindenstrauss transform and then
adding Gaussian noise to each entry of the lower-dimensional representation. We
show that the method preserves differential privacy---where the more privacy is
desired, the larger the variance of the Gaussian noise. Further, we show how to
approximate the true distances between users via only the lower-dimensional,
perturbed data. Finally, we consider other perturbation methods such as
randomized response and draw comparisons to sketch-based methods. While the
goal of releasing user-specific data to third parties is broader than
preserving distances, this work shows that distance computation with privacy
is an achievable goal.
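As a rough illustration of the mechanism described above (a sparse random
projection followed by per-coordinate Gaussian noise, with distances estimated
from the perturbed vectors), here is a minimal Python sketch. The projection
scheme, the noise calibration, and all function names are illustrative
assumptions, not the paper's exact construction or its privacy analysis.

```python
import numpy as np

def sparse_jl_matrix(d, k, density=1.0 / 3, rng=None):
    """A k x d sparse random projection with entries in {-1, 0, +1} (illustrative)."""
    rng = np.random.default_rng() if rng is None else rng
    signs = rng.choice([-1.0, 0.0, 1.0], size=(k, d),
                       p=[density / 2, 1 - density, density / 2])
    # Scale so that E[||P z||^2] = ||z||^2 for any fixed vector z.
    return signs / np.sqrt(k * density)

def publish(x, P, sigma, rng):
    """Release one user's noisy, low-dimensional representation."""
    return P @ x + rng.normal(0.0, sigma, size=P.shape[0])

def estimate_sq_distance(y_a, y_b, sigma):
    """Estimate ||x_a - x_b||^2 from two perturbed projections."""
    k = y_a.shape[0]
    # The independent noise in y_a and y_b adds 2*k*sigma^2 in expectation.
    return float(np.sum((y_a - y_b) ** 2) - 2 * k * sigma ** 2)

# Toy usage: two users with 1000 private bits, projected down to 50 dimensions.
rng = np.random.default_rng(42)
x_a = rng.integers(0, 2, 1000).astype(float)
x_b = rng.integers(0, 2, 1000).astype(float)
P = sparse_jl_matrix(d=1000, k=50, rng=rng)
sigma = 2.0  # more privacy -> larger sigma -> noisier distance estimates
y_a, y_b = publish(x_a, P, sigma, rng), publish(x_b, P, sigma, rng)
print(estimate_sq_distance(y_a, y_b, sigma), float(np.sum((x_a - x_b) ** 2)))
```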
Preference-Informed Fairness
We study notions of fairness in decision-making systems when individuals have
diverse preferences over the possible outcomes of the decisions. Our starting
point is the seminal work of Dwork et al., which introduced a notion of
individual fairness (IF): given a task-specific similarity metric, every pair
of individuals who are similarly qualified according to the metric should
receive similar outcomes. We show that when individuals have diverse
preferences over outcomes, requiring IF may unintentionally lead to
less-preferred outcomes for the very individuals that IF aims to protect. A
natural alternative to IF is the classic notion of fair division, envy-freeness
(EF): no individual should prefer another individual's outcome over their own.
Although EF allows for solutions in which all individuals receive a
highly preferred outcome, EF may also be overly restrictive. For instance, if
many individuals agree on the best outcome, then if any individual receives
this outcome, they all must receive it, regardless of each individual's
underlying qualifications for the outcome.
We introduce and study a new notion of preference-informed individual
fairness (PIIF) that is a relaxation of both individual fairness and
envy-freeness. At a high level, PIIF requires that outcomes satisfy IF-style
constraints, but allows for deviations provided they are in line with
individuals' preferences. We show that PIIF can permit outcomes that are more
favorable to individuals than any IF solution, while providing considerably
more flexibility to the decision-maker than EF. In addition, we show how to
efficiently optimize any convex objective over the outcomes subject to PIIF for
a rich class of individual preferences. Finally, we demonstrate the broad
applicability of the PIIF framework by extending our definitions and algorithms
to the multiple-task targeted advertising setting introduced by Dwork and
Ilvento.
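One way to see how PIIF relaxes both notions is to write the three constraints
side by side. The following is a reader's sketch based on the informal
description above, for individuals i, j with outcome distributions p_i, p_j, a
task-specific similarity metric d, a statistical distance D, and i's preference
relation; it paraphrases the prose and is not the paper's formal statement.

```latex
% Reader's sketch; symbols D, d, and \succeq_i are assumptions from the prose.
\begin{align*}
\text{(IF)}\quad   & D(p_i, p_j) \le d(i, j) \quad \text{for every pair } i, j \\
\text{(EF)}\quad   & p_i \succeq_i p_j \quad \text{for every pair } i, j \\
\text{(PIIF)}\quad & \text{for every pair } i, j \ \exists\, p'_i :\;
                     D(p'_i, p_j) \le d(i, j) \ \text{and}\ p_i \succeq_i p'_i
\end{align*}
```

Read this way, PIIF asks only that each individual weakly prefers their actual
outcome to some outcome that would have satisfied the IF constraint, so every
IF solution and every EF solution trivially satisfies it.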
Having your Privacy Cake and Eating it Too: Platform-supported Auditing of Social Media Algorithms for Public Interest
Social media platforms curate access to information and opportunities, and so
play a critical role in shaping public discourse today. The opaque nature of
the algorithms these platforms use to curate content raises societal questions.
Prior studies have used black-box methods to show that these algorithms can
lead to biased or discriminatory outcomes. However, existing auditing methods
face fundamental limitations because they operate independently of the
platforms. Concerns about potential harm have prompted proposed legislation in
both the U.S. and the E.U. to mandate a new form of auditing in which vetted
external researchers get privileged access to social media platforms.
Unfortunately, to date there have been no concrete technical proposals to
provide such auditing, because auditing at scale risks disclosure of users'
private data and platforms' proprietary algorithms. We propose a new method for
platform-supported auditing that can meet the goals of the proposed
legislation. Our first contribution is to enumerate the challenges that
existing auditing methods face in implementing these policies at scale.
Second, we suggest that
limited, privileged access to relevance estimators is the key to enabling
generalizable platform-supported auditing by external researchers. Third, we
show that platform-supported auditing need not risk user privacy or disclosure
of platforms' business interests by proposing an auditing framework that
protects against these risks. For a particular fairness metric, we show that ensuring
privacy imposes only a small constant factor increase (6.34x as an upper bound,
and 4x for typical parameters) in the number of samples required for accurate
auditing. Our technical contributions, combined with ongoing legal and policy
efforts, can enable public oversight into how social media platforms affect
individuals and society by moving past the privacy-vs-transparency hurdle.
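To illustrate the flavor of the privacy-versus-sample-size trade-off mentioned
above, here is a hypothetical Python sketch in which an auditor estimates a
simple disparity metric through a randomized-response privacy layer. The
fairness metric, the privacy mechanism, and the 6.34x / 4x bounds in the paper
differ from this toy example; every name below is an assumption.

```python
import math
import random

def private_relevance_bit(true_relevant: bool, eps: float) -> bool:
    """Randomized response: report the true bit with probability e^eps / (1 + e^eps)."""
    p_truth = math.exp(eps) / (1.0 + math.exp(eps))
    return true_relevant if random.random() < p_truth else not true_relevant

def debiased_rate(noisy_bits, eps):
    """Undo the randomized-response bias to recover the underlying rate of 1s."""
    p = math.exp(eps) / (1.0 + math.exp(eps))
    observed = sum(noisy_bits) / len(noisy_bits)
    return (observed - (1.0 - p)) / (2.0 * p - 1.0)

def audit_disparity(relevant_for_a, relevant_for_b, eps, n_samples):
    """Estimate the gap in relevance rates between two groups of audit queries."""
    bits_a = [private_relevance_bit(relevant_for_a(), eps) for _ in range(n_samples)]
    bits_b = [private_relevance_bit(relevant_for_b(), eps) for _ in range(n_samples)]
    return debiased_rate(bits_a, eps) - debiased_rate(bits_b, eps)

# Toy usage with stand-in "relevance estimator" calls (true rates 0.7 vs. 0.5):
print(audit_disparity(lambda: random.random() < 0.7,
                      lambda: random.random() < 0.5,
                      eps=1.0, n_samples=20000))
```

In this toy setting the debiasing step inflates the estimator's variance by a
factor that depends only on eps, so the auditor needs a constant factor more
samples for the same accuracy, which is the flavor of the overhead quoted above.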