Models and Mechanisms for Fairness in Location Data Processing
Location data use has become pervasive in the last decade due to the advent
of mobile apps, as well as novel areas such as smart health, smart cities, etc.
At the same time, significant concerns have surfaced with respect to fairness
in data processing. Individuals from certain population segments may be
unfairly treated when being considered for loan or job applications, access to
public resources, or other types of services. In the case of location data,
fairness is an important concern, given that an individual's whereabouts are
often correlated with sensitive attributes, e.g., race, income, education.
While fairness has received significant attention recently, e.g., in the case
of machine learning, there is little focus on the challenges of achieving
fairness when dealing with location data. Due to their characteristics and
specific type of processing algorithms, location data pose important fairness
challenges that must be addressed in a comprehensive and effective manner. In
this paper, we adapt existing fairness models to suit the specific properties
of location data and spatial processing. We focus on individual fairness, which
is more difficult to achieve, and more relevant for most location data
processing scenarios. First, we devise a novel building block to achieve
fairness in the form of fair polynomials. Then, we propose two mechanisms based
on fair polynomials that achieve individual fairness, corresponding to two
common interaction types based on location data. Extensive experimental results
on real data show that the proposed mechanisms achieve individual location
fairness without sacrificing utility.
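The abstract does not spell out its fairness model, so the following is only a minimal sketch of the standard notion of individual fairness (similar individuals receive similar outcomes), applied to locations. It assumes that similarity is plain Euclidean distance between coordinates and that fairness is a Lipschitz condition on a score. The function names, the constant, and the sample data are all hypothetical, and the sketch does not implement the paper's fair polynomials.

```python
import math
from itertools import combinations


def euclidean(p, q):
    """Euclidean distance between two 2-D points (assumed similarity metric)."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def lipschitz_violations(points, scores, lipschitz_constant=1.0):
    """Return index pairs (i, j) whose score gap exceeds L * distance(i, j).

    An empty result means the scores satisfy the assumed individual-fairness
    condition |score_i - score_j| <= L * d(point_i, point_j) on this data.
    """
    violations = []
    for i, j in combinations(range(len(points)), 2):
        gap = abs(scores[i] - scores[j])
        allowed = lipschitz_constant * euclidean(points[i], points[j])
        if gap > allowed + 1e-12:  # small tolerance for float rounding
            violations.append((i, j))
    return violations


# Hypothetical data: two nearby individuals and one far away.
points = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0)]
smooth_scores = [0.50, 0.55, 0.90]  # varies gradually with location
step_scores = [0.0, 1.0, 1.0]       # hard boundary between two neighbours

print(lipschitz_violations(points, smooth_scores))
print(lipschitz_violations(points, step_scores))
```

A score that changes abruptly across a boundary (for example, a hard zone cutoff) shows up as a violating pair of nearby points, whereas a smoothly varying score does not.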
Finding Skewed Subcubes Under a Distribution
Say that we are given samples from a distribution D over an n-dimensional space. We expect or desire D to behave like a product distribution (or a k-wise independent distribution over its marginals for small k). We propose the problem of enumerating/list-decoding all large subcubes where the distribution D deviates markedly from what we expect; we refer to such subcubes as skewed subcubes. Skewed subcubes are certificates of dependencies between small subsets of variables in D. We motivate this problem by showing that it arises naturally in the context of algorithmic fairness and anomaly detection.
In this work we focus on the special but important case where the space is the Boolean hypercube and the expected marginals are uniform. We show that the obvious definition of skewed subcubes can lead to intractable list sizes, and propose a better definition, that of minimal skewed subcubes: subcubes whose skew cannot be attributed to a larger subcube that contains them. Our main technical contribution is a list-size bound for this definition and an algorithm to efficiently find all such subcubes. Both the bound and the algorithm rely on Fourier-analytic techniques, especially the powerful hypercontractive inequality.
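To make the Boolean setting concrete, here is a brute-force sketch that scores every subcube obtained by fixing at most `max_codim` coordinates. It is not the Fourier-analytic algorithm described above and it does not implement the minimality condition. The skew measure used here, the relative deviation of the empirical mass from the uniform mass 2^(-k), is an assumption made for illustration, as are all function names.

```python
from itertools import combinations, product


def subcube_skews(samples, max_codim=2):
    """Map each subcube (coords, values) to its relative skew.

    A subcube fixes the coordinates in `coords` to the bits in `values`.
    Under the uniform distribution it has mass 2**-k, where k = len(coords).
    Skew is (empirical_mass - expected_mass) / expected_mass (assumed measure).
    """
    n = len(samples[0])
    total = len(samples)
    skews = {}
    for k in range(1, max_codim + 1):
        expected = 2.0 ** -k
        for coords in combinations(range(n), k):
            # Count how often each assignment to `coords` occurs in the samples.
            counts = {}
            for x in samples:
                key = tuple(x[i] for i in coords)
                counts[key] = counts.get(key, 0) + 1
            for values in product((0, 1), repeat=k):
                empirical = counts.get(values, 0) / total
                skews[(coords, values)] = (empirical - expected) / expected
    return skews


def skewed_subcubes(samples, threshold, max_codim=2):
    """Keep only the subcubes whose absolute skew is at least `threshold`."""
    all_skews = subcube_skews(samples, max_codim)
    return {cube: s for cube, s in all_skews.items() if abs(s) >= threshold}


# Hypothetical samples: bits 0 and 1 are always equal, bit 2 is independent.
samples = [(a, a, c) for a in (0, 1) for c in (0, 1)]
found = skewed_subcubes(samples, threshold=0.5)
for (coords, values), skew in sorted(found.items()):
    print(coords, values, skew)
```

In this toy data every single-coordinate marginal is uniform, so no subcube of codimension 1 is flagged; the dependence between bits 0 and 1 only appears in the subcubes that fix both of them. The enumeration visits on the order of n^k * 2^k subcubes, so it is only practical for very small k.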
On the lower-bound side, we show that finding skewed subcubes is as hard as the sparse noisy parity problem, and hence our algorithms cannot be improved on substantially without a breakthrough on this problem, which is believed to be intractable. Motivated by this, we study alternate models allowing query access to D, where finding skewed subcubes might be easier.
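For readers unfamiliar with the sparse noisy parity problem, the sketch below generates instances under its usual formulation: each sample is a uniform bit vector labelled with the parity of a small hidden set of coordinates, and the label is flipped with a fixed probability. The function name, parameters, and seed handling are hypothetical and are not taken from the paper.

```python
import random


def noisy_parity_samples(n, secret, noise_rate, count, seed=0):
    """Draw `count` labelled samples from a sparse noisy parity instance.

    n          -- number of bits per sample
    secret     -- hidden coordinates whose XOR defines the clean label
    noise_rate -- probability that a label is flipped
    """
    rng = random.Random(seed)  # seeded for reproducibility
    samples = []
    for _ in range(count):
        x = [rng.randint(0, 1) for _ in range(n)]
        label = sum(x[i] for i in secret) % 2  # parity of the secret bits
        if rng.random() < noise_rate:
            label ^= 1  # flip the label to model noise
        samples.append((x, label))
    return samples


# Hypothetical instance: 8 bits, secret set {1, 4}, 10% label noise.
data = noisy_parity_samples(n=8, secret=(1, 4), noise_rate=0.1, count=5)
for x, label in data:
    print(x, label)
```

The learner's task is to recover `secret` from such samples. The sparsity refers to the small size of the secret set relative to `n`.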