
    Beyond Personalization: Research Directions in Multistakeholder Recommendation

    Recommender systems are personalized information access applications; they are ubiquitous in today's online environment, and effective at finding items that meet user needs and tastes. As the reach of recommender systems has extended, it has become apparent that the single-minded focus on the user common to academic research has obscured other important aspects of recommendation outcomes. Properties such as fairness, balance, profitability, and reciprocity are not captured by typical metrics for recommender system evaluation. The concept of multistakeholder recommendation has emerged as a unifying framework for describing and understanding recommendation settings where the end user is not the sole focus. This article describes the origins of multistakeholder recommendation and the landscape of system designs. It provides illustrative examples of current research, as well as outlining open questions and research directions for the field.
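    The gap the article points to, that standard accuracy-style metrics say nothing about other stakeholders, can be made concrete with a list-level measure computed alongside user relevance. Below is a minimal sketch of one such measure, a hypothetical provider-exposure share over top-k recommendation slots; the function and the exposure definition are illustrative assumptions, not a formulation from the article.

```python
from collections import defaultdict

def provider_exposure(recommendations, item_provider, k=10):
    """Share of top-k recommendation slots each provider receives.

    recommendations: dict user_id -> ranked list of item_ids
    item_provider:   dict item_id -> provider_id
    Returns dict provider_id -> fraction of all top-k slots.
    """
    counts = defaultdict(int)
    total = 0
    for ranked in recommendations.values():
        for item in ranked[:k]:
            counts[item_provider[item]] += 1
            total += 1
    return {p: c / total for p, c in counts.items()}

# Example: two users, three providers; provider "c" never appears in a top-2 slot.
recs = {"u1": ["i1", "i2", "i3"], "u2": ["i1", "i4", "i5"]}
providers = {"i1": "a", "i2": "b", "i3": "c", "i4": "a", "i5": "b"}
print(provider_exposure(recs, providers, k=2))  # {'a': 0.75, 'b': 0.25}
```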

    Consumer-side Fairness in Recommender Systems: A Systematic Survey of Methods and Evaluation

    In the current landscape of ever-increasing levels of digitalization, we are facing major challenges pertaining to scalability. Recommender systems have become irreplaceable both for helping users navigate the increasing amounts of data and, conversely, for aiding providers in marketing products to interested users. The growing awareness of discrimination in machine learning methods has recently motivated both academia and industry to research how fairness can be ensured in recommender systems. For recommender systems, such issues are well exemplified by occupation recommendation, where biases in historical data may lead to recommender systems relating one gender to lower wages or to the propagation of stereotypes. In particular, consumer-side fairness, which focuses on mitigating discrimination experienced by users of recommender systems, has seen a vast number of diverse approaches for addressing different types of discrimination. The nature of said discrimination depends on the setting and the applied fairness interpretation, of which there are many variations. This survey serves as a systematic overview and discussion of the current research on consumer-side fairness in recommender systems. To that end, a novel taxonomy based on high-level fairness interpretations is proposed and used to categorize the research and the proposed fairness evaluation metrics. Finally, we highlight some suggestions for the future direction of the field.
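    Many of the surveyed evaluation metrics reduce to comparing recommendation quality across user groups. As a rough illustration of that idea (not any specific metric from the survey), the sketch below computes the gap in mean per-user quality, e.g. NDCG@k, between two user groups; the names and inputs are assumptions.

```python
import numpy as np

def group_quality_gap(per_user_quality, user_group):
    """Absolute gap in mean recommendation quality between two user groups.

    per_user_quality: dict user_id -> quality score (e.g. NDCG@k)
    user_group:       dict user_id -> group label (exactly two labels expected)
    """
    groups = {}
    for u, q in per_user_quality.items():
        groups.setdefault(user_group[u], []).append(q)
    if len(groups) != 2:
        raise ValueError("expected exactly two user groups")
    a, b = (np.mean(v) for v in groups.values())
    return abs(a - b)

quality = {"u1": 0.82, "u2": 0.61, "u3": 0.77, "u4": 0.58}
group = {"u1": "g1", "u2": "g2", "u3": "g1", "u4": "g2"}
print(group_quality_gap(quality, group))  # ~0.20
```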

    A benchmark study on methods to ensure fair algorithmic decisions for credit scoring

    The utility of machine learning in evaluating the creditworthiness of loan applicants has been demonstrated for decades. However, automatic decisions may lead to different treatment of groups or individuals, potentially causing discrimination. This paper benchmarks 12 leading bias mitigation methods, discussing their performance in terms of 5 different fairness metrics, the accuracy achieved, and the potential profits for financial institutions. Our findings show the difficulty of achieving fairness while preserving accuracy and profits. Additionally, we highlight some of the best and worst performers and help bridge the gap between experimental machine learning and its industrial application.
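    For concreteness, two fairness metrics commonly used in such benchmarks are the demographic parity difference and the equal opportunity difference; the sketch below computes both for binary credit decisions. The paper's exact set of 5 metrics is not reproduced here, so treat this as an illustrative assumption rather than the paper's evaluation code.

```python
import numpy as np

def demographic_parity_diff(y_pred, group):
    """Difference in approval rates between the two groups."""
    y_pred, group = np.asarray(y_pred), np.asarray(group)
    rates = [y_pred[group == g].mean() for g in np.unique(group)]
    return abs(rates[0] - rates[1])

def equal_opportunity_diff(y_true, y_pred, group):
    """Difference in true-positive rates (recall on creditworthy applicants)."""
    y_true, y_pred, group = map(np.asarray, (y_true, y_pred, group))
    tprs = [
        y_pred[(group == g) & (y_true == 1)].mean() for g in np.unique(group)
    ]
    return abs(tprs[0] - tprs[1])

y_true = [1, 1, 0, 1, 0, 1]
y_pred = [1, 0, 0, 1, 1, 1]
group  = ["A", "A", "A", "B", "B", "B"]
print(demographic_parity_diff(y_pred, group))        # |1/3 - 3/3| ~ 0.67
print(equal_opportunity_diff(y_true, y_pred, group)) # |1/2 - 2/2| = 0.5
```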

    Essays on Structural Econometric Modeling and Machine Learning

    This dissertation is composed of three independent chapters relating the theory and empirical methodology of economics to machine learning and to important topics in the information age. The first chapter raises an important problem in structural estimation and provides a solution by incorporating a practice common in machine learning. The second chapter investigates a problem of statistical discrimination in the big-data era. The third chapter studies the implications of information uncertainty in the security software market. Structural estimation is a widely used methodology in empirical economics, and a large class of structural econometric models are estimated through the generalized method of moments (GMM). Traditionally, the model to be estimated is chosen by researchers based on their intuition, and the structural estimation itself does not directly test that choice against the data. In other words, insufficient attention has been paid to devising a principled method to verify such intuition. In the first chapter, we propose a model selection procedure for GMM based on cross-validation, which is widely used in the machine learning and statistics communities. We prove the consistency of the cross-validation procedure. The empirical properties of the proposed model selection are compared with existing model selection methods through Monte Carlo simulations of a linear instrumental variable regression and an oligopoly pricing model. In addition, we propose a way to apply our method within the Mathematical Programming with Equilibrium Constraints (MPEC) approach. Finally, we apply our method to online-retail sales data to compare a dynamic model with a static model. In the second chapter, we study a fair machine learning algorithm that avoids statistical discrimination when making decisions. Algorithmic decision-making processes now affect many aspects of our lives. Standard tools for machine learning, such as classification and regression, are subject to bias in the data, and thus direct application of such off-the-shelf tools can lead to a specific group being statistically discriminated against. Removing sensitive variables such as race or gender from the data does not solve this problem, because a disparate impact can arise when non-sensitive and sensitive variables are correlated. As this problem grows more severe with the use of ever larger datasets, it is of particular importance to devise an algorithmic solution. Inspired by the two-stage least squares method widely used in economics, we propose a two-stage algorithm that removes bias from the training data. The proposed algorithm is conceptually simple. Unlike most existing fair algorithms, which are designed for classification tasks, the proposed method is able to (i) deal with regression tasks, (ii) combine explanatory variables to remove reverse discrimination, and (iii) deal with numerical sensitive variables. The performance and fairness of the proposed algorithm are evaluated in simulations with synthetic and real-world datasets. The third chapter examines the issue of information uncertainty in the context of information security. Many users lack the ability to correctly estimate the true quality of the security software they purchase, as evidenced by anecdotes and some academic research. Yet most of the analytical research assumes otherwise. Hence, we were motivated to incorporate this “false sense of security” behavior into a game-theoretic model and study the implications for welfare parameters. Our model features two segments of consumers, well- and ill-informed, and a monopolistic software vendor. Well-informed consumers observe the true quality of the security software, while ill-informed ones overestimate it. While the proportions of the two segments are known to the software vendor, consumers are uncertain about which segment they belong to. We find that the level of uncertainty is, in fact, not necessarily harmful to society. Furthermore, there exist extreme circumstances under which society and consumers would be better off if the security software did not exist. Interestingly, we also find that the case where consumers know the information structure and weight their expectations accordingly does not always lead to optimal social welfare. These results contrast with conventional wisdom and are crucially important for developing appropriate policies in this context.
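    The two-stage idea in the second chapter can be sketched compactly: a first stage projects the sensitive variables out of the explanatory variables, and a second stage fits the outcome on the residuals. The sketch below is a minimal interpretation of that description using ordinary least squares; variable names and the choice of plain linear models are assumptions, not the dissertation's exact algorithm.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def two_stage_fair_regression(X, s, y):
    """Stage 1: project the sensitive variable(s) out of each feature.
    Stage 2: fit the outcome on the residualized features only.

    X: (n, p) non-sensitive explanatory variables
    s: (n, q) sensitive variables (numerical or one-hot encoded)
    y: (n,)   outcome
    """
    stage1 = LinearRegression().fit(s, X)
    X_resid = X - stage1.predict(s)           # part of X uncorrelated with s
    stage2 = LinearRegression().fit(X_resid, y)
    return stage1, stage2

def predict_fair(stage1, stage2, X_new, s_new):
    return stage2.predict(X_new - stage1.predict(s_new))

# Synthetic example: the outcome depends on skill, but skill is correlated with s.
rng = np.random.default_rng(0)
s = rng.integers(0, 2, size=(500, 1)).astype(float)
skill = 2.0 * s[:, 0] + rng.normal(size=500)
X = skill.reshape(-1, 1)
y = 3.0 * skill + rng.normal(size=500)
m1, m2 = two_stage_fair_regression(X, s, y)
print(predict_fair(m1, m2, X[:3], s[:3]))
```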

    Counterfactual Fairness with Partially Known Causal Graph

    Fair machine learning aims to avoid treating individuals or sub-populations unfavourably based on sensitive attributes, such as gender and race. Those methods in fair machine learning that are built on causal inference ascertain discrimination and bias through causal effects. Though causality-based fair learning is attracting increasing attention, current methods assume the true causal graph is fully known. This paper proposes a general method to achieve the notion of counterfactual fairness when the true causal graph is unknown. To be able to select features that lead to counterfactual fairness, we derive the conditions and algorithms to identify ancestral relations between variables on a Partially Directed Acyclic Graph (PDAG), specifically, a class of causal DAGs that can be learned from observational data combined with domain knowledge. Interestingly, we find that counterfactual fairness can be achieved as if the true causal graph were fully known, when specific background knowledge is provided: the sensitive attributes do not have ancestors in the causal graph. Results on both simulated and real-world datasets demonstrate the effectiveness of our method.
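    The identification algorithms for ancestral relations on a PDAG are the paper's core contribution and are not reproduced here, but the downstream use is easy to sketch: once each feature's relation to the sensitive attribute has been classified, only features that are definitely not descendants of it are kept for training. The relation labels and helper below are hypothetical, intended only to illustrate that selection step.

```python
def counterfactually_fair_features(relations):
    """Keep features whose ancestral relation to the sensitive attribute,
    as identified on the PDAG, rules out their being its descendant.

    relations: dict feature_name -> one of
               {"definite_non_descendant", "definite_descendant", "possible_descendant"}
    """
    return [f for f, r in relations.items() if r == "definite_non_descendant"]

# Hypothetical output of the ancestral-relation identification step.
relations = {
    "zip_code": "definite_descendant",      # causally downstream of the sensitive attribute
    "age":      "definite_non_descendant",
    "income":   "possible_descendant",      # unresolved (undirected) edge on the PDAG
    "tenure":   "definite_non_descendant",
}
safe = counterfactually_fair_features(relations)
print(safe)  # ['age', 'tenure']

# A predictor would then be trained on df[safe] only (df being a hypothetical
# DataFrame of features and a target), e.g. with any off-the-shelf classifier.
```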

    Advancing Personalized Federated Learning: Group Privacy, Fairness, and Beyond

    Federated learning (FL) is a framework for training machine learning models in a distributed and collaborative manner. During training, a set of participating clients process their data stored locally, sharing only the model updates obtained by minimizing a cost function over their local inputs. FL was proposed as a stepping-stone towards privacy-preserving machine learning, but it has been shown vulnerable to issues such as leakage of private information, lack of personalization of the model, and the possibility of having a trained model that is fairer to some groups than to others. In this paper, we address the triadic interaction among personalization, privacy guarantees, and fairness attained by models trained within the FL framework. Differential privacy and its variants have been studied and applied as cutting-edge standards for providing formal privacy guarantees. However, clients in FL often hold very diverse datasets representing heterogeneous communities, making it important to protect their sensitive information while still ensuring that the trained model upholds the aspect of fairness for the users. To attain this objective, a method is put forth that introduces group privacy assurances through the utilization of d-privacy (aka metric privacy). d-privacy represents a localized form of differential privacy that relies on a metric-oriented obfuscation approach to maintain the original data's topological distribution. This method, besides enabling personalized model training in a federated approach and providing formal privacy guarantees, yields significantly better group fairness, measured under a variety of standard metrics, than a global model trained within a classical FL template. Theoretical justifications for its applicability are provided, as well as experimental validation on real-world datasets to illustrate the working of the proposed method.
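    The obfuscation step can be illustrated with a per-coordinate Laplace mechanism, a common instantiation of metric (d-) privacy: each client perturbs its local vector with noise calibrated to the privacy parameter epsilon before sharing it. This is a minimal sketch under that assumption; the paper's exact mechanism and its integration into FL training may differ.

```python
import numpy as np

def d_private_obfuscation(x, epsilon, rng=None):
    """Return a noisy copy of x, giving epsilon * d(x, x')-style indistinguishability
    under the L1 metric via independent Laplace noise with scale 1/epsilon."""
    rng = rng or np.random.default_rng()
    return x + rng.laplace(loc=0.0, scale=1.0 / epsilon, size=x.shape)

# Each client obfuscates its local update before sending it to the server.
rng = np.random.default_rng(42)
local_update = np.array([0.12, -0.07, 0.31])
noisy_update = d_private_obfuscation(local_update, epsilon=2.0, rng=rng)
print(noisy_update)
```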