52 research outputs found

    Group Meritocratic Fairness in Linear Contextual Bandits

    Get PDF
    We study the linear contextual bandit problem where an agent has to select one candidate from a pool and each candidate belongs to a sensitive group. In this setting, candidates⧠rewards may not be directly comparable between groups, for example when the agent is an employer hiring candidates from different ethnic groups and some groups have a lower reward due to discriminatory bias and/or social injustice. We propose a notion of fairness that states that the agent* policy is fair when it selects a candidate with highest relative rank, which measures how good the reward is when compared to candidates from the same group. This is a very strong notion of fairness, since the relative rank is not directly observed by the agent and depends on the underlying reward model and on the distribution of rewards. Thus we study the problem of learning a policy which approximates a fair policy under the condition that the contexts are independent between groups and the distribution of rewards of each group is absolutely continuous. In particular, we design a greedy policy which at each round constructs a ridge regression estimate from the observed context-reward pairs, and then computes an estimate of the relative rank of each candidate using the empirical cumulative distribution function. We prove that, despite its simplicity and the lack of an initial exploration phase, the greedy policy achieves, up to log factors and with high probability, a fair pseudo-regret of order √dT after T rounds, where d is the dimension of the context vectors. The policy also satisfies demographic parity at each round when averaged over all possible information available before the selection. Finally, we use simulated settings and experiments on the US census data to show that our policy achieves sub-linear fair pseudo-regret also in practice

    Fair Adaptive Experiments

    Full text link
    Randomized experiments have been the gold standard for assessing the effectiveness of a treatment or policy. The classical complete randomization approach assigns treatments based on a prespecified probability and may lead to inefficient use of data. Adaptive experiments improve upon complete randomization by sequentially learning and updating treatment assignment probabilities. However, their application can also raise fairness and equity concerns, as assignment probabilities may vary drastically across groups of participants. Furthermore, when treatment is expected to be extremely beneficial to certain groups of participants, it is more appropriate to expose many of these participants to favorable treatment. In response to these challenges, we propose a fair adaptive experiment strategy that simultaneously enhances data use efficiency, achieves an envy-free treatment assignment guarantee, and improves the overall welfare of participants. An important feature of our proposed strategy is that we do not impose parametric modeling assumptions on the outcome variables, making it more versatile and applicable to a wider array of applications. Through our theoretical investigation, we characterize the convergence rate of the estimated treatment effects and the associated standard deviations at the group level and further prove that our adaptive treatment assignment algorithm, despite not having a closed-form expression, approaches the optimal allocation rule asymptotically. Our proof strategy takes into account the fact that the allocation decisions in our design depend on sequentially accumulated data, which poses a significant challenge in characterizing the properties and conducting statistical inference of our method. We further provide simulation evidence to showcase the performance of our fair adaptive experiment strategy

    Equal Opportunity in Online Classification with Partial Feedback

    Get PDF
    We study an online classification problem with partial feedback in which individuals arrive one at a time from a fixed but unknown distribution, and must be classified as positive or negative. Our algorithm only observes the true label of an individual if they are given a positive classification. This setting captures many classification problems for which fairness is a concern: for example, in criminal recidivism prediction, recidivism is only observed if the inmate is released; in lending applications, loan repayment is only observed if the loan is granted. We require that our algorithms satisfy common statistical fairness constraints (such as equalizing false positive or negative rates -- introduced as "equal opportunity" in Hardt et al. (2016)) at every round, with respect to the underlying distribution. We give upper and lower bounds characterizing the cost of this constraint in terms of the regret rate (and show that it is mild), and give an oracle efficient algorithm that achieves the upper bound.Comment: The Conference version of this paper appears in the Proceedings of NeurIPS 2019. 29 page

    Individual Fairness in Hindsight

    Full text link
    Since many critical decisions impacting human lives are increasingly being made by algorithms, it is important to ensure that the treatment of individuals under such algorithms is demonstrably fair under reasonable notions of fairness. One compelling notion proposed in the literature is that of individual fairness (IF), which advocates that similar individuals should be treated similarly (Dwork et al. 2012). Originally proposed for offline decisions, this notion does not, however, account for temporal considerations relevant for online decision-making. In this paper, we extend the notion of IF to account for the time at which a decision is made, in settings where there exists a notion of conduciveness of decisions as perceived by the affected individuals. We introduce two definitions: (i) fairness-across-time (FT) and (ii) fairness-in-hindsight (FH). FT is the simplest temporal extension of IF where treatment of individuals is required to be individually fair relative to the past as well as future, while in FH, we require a one-sided notion of individual fairness that is defined relative to only the past decisions. We show that these two definitions can have drastically different implications in the setting where the principal needs to learn the utility model. Linear regret relative to optimal individually fair decisions is inevitable under FT for non-trivial examples. On the other hand, we design a new algorithm: Cautious Fair Exploration (CaFE), which satisfies FH and achieves sub-linear regret guarantees for a broad range of settings. We characterize lower bounds showing that these guarantees are order-optimal in the worst case. FH can thus be embedded as a primary safeguard against unfair discrimination in algorithmic deployments, without hindering the ability to take good decisions in the long-run

    Achieving Causal Fairness in Recommendation

    Get PDF
    Recommender systems provide personalized services for users seeking information and play an increasingly important role in online applications. While most research papers focus on inventing machine learning algorithms to fit user behavior data and maximizing predictive performance in recommendation, it is also very important to develop fairness-aware machine learning algorithms such that the decisions made by them are not only accurate but also meet desired fairness requirements. In personalized recommendation, although there are many works focusing on fairness and discrimination, how to achieve user-side fairness in bandit recommendation from a causal perspective still remains a challenging task. Besides, the deployed systems utilize user-item interaction data to train models and then generate new data by online recommendation. This feedback loop in recommendation often results in various biases in observational data. The goal of this dissertation is to address challenging issues in achieving causal fairness in recommender systems: achieving user-side fairness and counterfactual fairness in bandit-based recommendation, mitigating confounding and sample selection bias simultaneously in recommendation and robustly improving bandit learning process with biased offline data. In this dissertation, we developed the following algorithms and frameworks for research problems related to causal fairness in recommendation. • We developed a contextual bandit algorithm to achieve group level user-side fairness and two UCB-based causal bandit algorithms to achieve counterfactual individual fairness for personalized recommendation; • We derived sufficient and necessary graphical conditions for identifying and estimating three causal quantities under the presence of confounding and sample selection biases and proposed a framework for leveraging the causal bound derived from the confounded and selection biased offline data to robustly improve online bandit learning process; • We developed a framework for discrimination analysis with the benefit of multiple causes of the outcome variable to deal with hidden confounding; • We proposed a new causal-based fairness notion and developed algorithms for determining whether an individual or a group of individuals is discriminated in terms of equality of effort
    • …
    corecore