52 research outputs found
Group Meritocratic Fairness in Linear Contextual Bandits
We study the linear contextual bandit problem where an agent has to select one candidate from a pool and each candidate belongs to a sensitive group. In this setting, candidates' rewards may not be directly comparable between groups, for example when the agent is an employer hiring candidates from different ethnic groups and some groups have a lower reward due to discriminatory bias and/or social injustice. We propose a notion of fairness that states that the agent's policy is fair when it selects a candidate with highest relative rank, which measures how good the reward is when compared to candidates from the same group. This is a very strong notion of fairness, since the relative rank is not directly observed by the agent and depends on the underlying reward model and on the distribution of rewards. Thus we study the problem of learning a policy which approximates a fair policy under the condition that the contexts are independent between groups and the distribution of rewards of each group is absolutely continuous. In particular, we design a greedy policy which at each round constructs a ridge regression estimate from the observed context-reward pairs, and then computes an estimate of the relative rank of each candidate using the empirical cumulative distribution function. We prove that, despite its simplicity and the lack of an initial exploration phase, the greedy policy achieves, up to log factors and with high probability, a fair pseudo-regret of order √(dT) after T rounds, where d is the dimension of the context vectors. The policy also satisfies demographic parity at each round when averaged over all possible information available before the selection. Finally, we use simulated settings and experiments on the US census data to show that our policy achieves sub-linear fair pseudo-regret also in practice.
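The greedy step described in this abstract (a ridge regression estimate, followed by an empirical-CDF estimate of each candidate's relative rank within its own group) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the function names, the `history` dictionary of past contexts per group, the regularization value `lam`, and the fallback rank of 0.5 for a group with no history are all assumptions made for the example.

```python
import numpy as np


def ridge_estimate(X, y, lam=1.0):
    """Ridge regression estimate of the reward parameter from context-reward pairs."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)


def empirical_relative_rank(score, reference_scores):
    """Empirical CDF of the reference scores, evaluated at `score`."""
    reference_scores = np.asarray(reference_scores, dtype=float)
    if reference_scores.size == 0:
        return 0.5  # assumption: neutral rank when a group has no history yet
    return float(np.mean(reference_scores <= score))


def greedy_fair_select(contexts, groups, theta_hat, history):
    """Select the candidate with the highest estimated relative rank.

    contexts : list of context vectors, one per candidate
    groups   : group label of each candidate
    history  : dict mapping a group label to an array of past contexts (hypothetical)
    """
    ranks = []
    for x, g in zip(contexts, groups):
        past = history.get(g)
        if past is None or len(past) == 0:
            ref = np.array([])
        else:
            ref = np.asarray(past) @ theta_hat  # estimated rewards of past same-group candidates
        ranks.append(empirical_relative_rank(float(np.dot(x, theta_hat)), ref))
    return int(np.argmax(ranks)), ranks
```

In this sketch a candidate with a lower estimated reward can still be selected when that reward is high relative to the candidate's own group, which is the behavior the relative-rank notion is meant to capture.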
Fair Adaptive Experiments
Randomized experiments have been the gold standard for assessing the
effectiveness of a treatment or policy. The classical complete randomization
approach assigns treatments based on a prespecified probability and may lead to
inefficient use of data. Adaptive experiments improve upon complete
randomization by sequentially learning and updating treatment assignment
probabilities. However, their application can also raise fairness and equity
concerns, as assignment probabilities may vary drastically across groups of
participants. Furthermore, when treatment is expected to be extremely
beneficial to certain groups of participants, it is more appropriate to expose
many of these participants to favorable treatment. In response to these
challenges, we propose a fair adaptive experiment strategy that simultaneously
enhances data use efficiency, achieves an envy-free treatment assignment
guarantee, and improves the overall welfare of participants. An important
feature of our proposed strategy is that we do not impose parametric modeling
assumptions on the outcome variables, making it more versatile and applicable
to a wider array of applications. Through our theoretical investigation, we
characterize the convergence rate of the estimated treatment effects and the
associated standard deviations at the group level and further prove that our
adaptive treatment assignment algorithm, despite not having a closed-form
expression, approaches the optimal allocation rule asymptotically. Our proof
strategy takes into account the fact that the allocation decisions in our
design depend on sequentially accumulated data, which poses a significant
challenge in characterizing the properties and conducting statistical inference
of our method. We further provide simulation evidence to showcase the
performance of our fair adaptive experiment strategy.
Equal Opportunity in Online Classification with Partial Feedback
We study an online classification problem with partial feedback in which
individuals arrive one at a time from a fixed but unknown distribution, and
must be classified as positive or negative. Our algorithm only observes the
true label of an individual if they are given a positive classification. This
setting captures many classification problems for which fairness is a concern:
for example, in criminal recidivism prediction, recidivism is only observed if
the inmate is released; in lending applications, loan repayment is only
observed if the loan is granted. We require that our algorithms satisfy common
statistical fairness constraints (such as equalizing false positive or negative
rates -- introduced as "equal opportunity" in Hardt et al. (2016)) at every
round, with respect to the underlying distribution. We give upper and lower
bounds characterizing the cost of this constraint in terms of the regret rate
(and show that it is mild), and give an oracle efficient algorithm that
achieves the upper bound. Comment: The conference version of this paper appears in the Proceedings of
NeurIPS 2019. 29 pages.
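To make the fairness constraint in this abstract concrete, the sketch below computes per-group false positive rates and their gap, and shows the partial-feedback rule under which a label is revealed only after a positive classification. It is an illustrative calculation on fully labeled toy data, not the paper's oracle-efficient algorithm. The function names, the integer group labels, and the use of `-1` to mark an unobserved label are assumptions made for this example.

```python
import numpy as np


def false_positive_rate(y_true, y_pred):
    """Fraction of true negatives that were classified as positive."""
    negatives = y_true == 0
    if negatives.sum() == 0:
        return float("nan")  # undefined when the group has no negatives
    return float(np.mean(y_pred[negatives] == 1))


def fpr_gap(y_true, y_pred, group):
    """Per-group false positive rates and the largest difference between them."""
    rates = {
        int(g): false_positive_rate(y_true[group == g], y_pred[group == g])
        for g in np.unique(group)
    }
    values = list(rates.values())
    return max(values) - min(values), rates


def observed_labels(y_true, y_pred):
    """Partial feedback: the true label is seen only for positive classifications."""
    return np.where(y_pred == 1, y_true, -1)  # -1 marks "not observed" (assumption)
```

A gap of zero corresponds to equalized false positive rates on the sample. The constraint in the abstract is stated with respect to the underlying distribution, so a sample estimate like this one only approximates it.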
Individual Fairness in Hindsight
Since many critical decisions impacting human lives are increasingly being
made by algorithms, it is important to ensure that the treatment of individuals
under such algorithms is demonstrably fair under reasonable notions of
fairness. One compelling notion proposed in the literature is that of
individual fairness (IF), which advocates that similar individuals should be
treated similarly (Dwork et al. 2012). Originally proposed for offline
decisions, this notion does not, however, account for temporal considerations
relevant for online decision-making. In this paper, we extend the notion of IF
to account for the time at which a decision is made, in settings where there
exists a notion of conduciveness of decisions as perceived by the affected
individuals. We introduce two definitions: (i) fairness-across-time (FT) and
(ii) fairness-in-hindsight (FH). FT is the simplest temporal extension of IF
where treatment of individuals is required to be individually fair relative to
the past as well as future, while in FH, we require a one-sided notion of
individual fairness that is defined relative to only the past decisions. We
show that these two definitions can have drastically different implications in
the setting where the principal needs to learn the utility model. Linear regret
relative to optimal individually fair decisions is inevitable under FT for
non-trivial examples. On the other hand, we design a new algorithm: Cautious
Fair Exploration (CaFE), which satisfies FH and achieves sub-linear regret
guarantees for a broad range of settings. We characterize lower bounds showing
that these guarantees are order-optimal in the worst case. FH can thus be
embedded as a primary safeguard against unfair discrimination in algorithmic
deployments, without hindering the ability to take good decisions in the
long run.
Achieving Causal Fairness in Recommendation
Recommender systems provide personalized services for users seeking information and play an increasingly important role in online applications. While most research papers focus on inventing machine learning algorithms to fit user behavior data and maximize predictive performance in recommendation, it is also very important to develop fairness-aware machine learning algorithms such that the decisions made by them are not only accurate but also meet desired fairness requirements. In personalized recommendation, although many works focus on fairness and discrimination, how to achieve user-side fairness in bandit recommendation from a causal perspective remains a challenging task. Moreover, deployed systems use user-item interaction data to train models and then generate new data through online recommendation. This feedback loop in recommendation often results in various biases in observational data. The goal of this dissertation is to address challenging issues in achieving causal fairness in recommender systems: achieving user-side fairness and counterfactual fairness in bandit-based recommendation, mitigating confounding and sample selection bias simultaneously in recommendation, and robustly improving the bandit learning process with biased offline data. In this dissertation, we developed the following algorithms and frameworks for research problems related to causal fairness in recommendation.
• We developed a contextual bandit algorithm to achieve group-level user-side fairness and two UCB-based causal bandit algorithms to achieve counterfactual individual fairness for personalized recommendation;
• We derived sufficient and necessary graphical conditions for identifying and estimating three causal quantities in the presence of confounding and sample selection biases, and proposed a framework for leveraging the causal bound derived from the confounded and selection-biased offline data to robustly improve the online bandit learning process;
• We developed a framework for discrimination analysis with the benefit of multiple causes of the outcome variable to deal with hidden confounding;
• We proposed a new causal-based fairness notion and developed algorithms for determining whether an individual or a group of individuals is discriminated against in terms of equality of effort.