6,072 research outputs found
Beyond Personalization: Research Directions in Multistakeholder Recommendation
Recommender systems are personalized information access applications; they
are ubiquitous in today's online environment, and effective at finding items
that meet user needs and tastes. As the reach of recommender systems has
extended, it has become apparent that the single-minded focus on the user
common to academic research has obscured other important aspects of
recommendation outcomes. Properties such as fairness, balance, profitability,
and reciprocity are not captured by typical metrics for recommender system
evaluation. The concept of multistakeholder recommendation has emerged as a
unifying framework for describing and understanding recommendation settings
where the end user is not the sole focus. This article describes the origins of
multistakeholder recommendation, and the landscape of system designs. It
provides illustrative examples of current research, as well as outlining open
questions and research directions for the field.
Comment: 64 pages
Consumer-side Fairness in Recommender Systems: A Systematic Survey of Methods and Evaluation
In the current landscape of ever-increasing levels of digitalization, we are
facing major challenges pertaining to scalability. Recommender systems have
become irreplaceable both for helping users navigate the increasing amounts of
data and, conversely, aiding providers in marketing products to interested
users. The growing awareness of discrimination in machine learning methods has
recently motivated both academia and industry to research how fairness can be
ensured in recommender systems. For recommender systems, such issues are well
exemplified by occupation recommendation, where biases in historical data may
lead to recommender systems relating one gender to lower wages or to the
propagation of stereotypes. In particular, consumer-side fairness, which
focuses on mitigating discrimination experienced by users of recommender
systems, has seen a vast number of diverse approaches for addressing different
types of discrimination. The nature of said discrimination depends on the
setting and the applied fairness interpretation, of which there are many
variations. This survey serves as a systematic overview and discussion of the
current research on consumer-side fairness in recommender systems. To that end,
a novel taxonomy based on high-level fairness interpretation is proposed and
used to categorize the research and their proposed fairness evaluation metrics.
Finally, we highlight some suggestions for the future direction of the field.
Comment: Draft submitted to Springer (November 2022)
A benchmark study on methods to ensure fair algorithmic decisions for credit scoring
The utility of machine learning in evaluating the creditworthiness of loan
applicants has been proven over decades. However, automatic decisions may
lead to different treatment of groups or individuals, potentially causing
discrimination. This paper benchmarks 12 leading bias mitigation methods,
discussing their performance in terms of 5 different fairness metrics, the
accuracy achieved, and potential profits for financial institutions. Our
findings show the difficulty of achieving fairness while preserving accuracy
and profits. Additionally, the study highlights some of the best and worst
performers and helps bridge the gap between experimental machine learning and
its industrial application.
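As an aside, one of the standard group-fairness metrics such benchmarks report, statistical parity difference, can be sketched in a few lines (the function name and toy data below are illustrative, not from the paper):

```python
import numpy as np

def statistical_parity_difference(y_pred, group):
    """P(yhat = 1 | group = 1) - P(yhat = 1 | group = 0): one of the
    standard group-fairness metrics reported in bias-mitigation benchmarks."""
    y_pred, group = np.asarray(y_pred), np.asarray(group)
    return y_pred[group == 1].mean() - y_pred[group == 0].mean()

# Toy predictions: the protected group (group == 1) is approved less often.
print(statistical_parity_difference([1, 0, 1, 1], [1, 1, 0, 0]))  # -0.5
```

A value of 0 indicates parity between the two groups; the mitigation methods benchmarked in the paper aim to push such metrics toward 0 without sacrificing accuracy or profit.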
Essays on Structural Econometric Modeling and Machine Learning
This dissertation is composed of three independent chapters relating theory and empirical methodology in economics to machine learning and important topics in the information age. The first chapter raises an important problem in structural estimation and provides a solution to it by incorporating a practice common in machine learning. The second chapter investigates a problem of statistical discrimination in the big data era. The third chapter studies the implications of information uncertainty in the security software market.
Structural estimation is a widely used methodology in empirical economics, and a large class of structural econometric models is estimated through the generalized method of moments (GMM). Traditionally, the model to be estimated is chosen by researchers based on their intuition about the model, and the structural estimation itself does not directly test this choice against the data. In other words, insufficient attention has been paid to devising a principled method for verifying such intuition. In the first chapter, we propose a model selection procedure for GMM based on cross-validation, which is widely used in the machine learning and statistics communities. We prove the consistency of the cross-validation. The empirical properties of the proposed model selection procedure are compared with those of existing model selection methods in Monte Carlo simulations of a linear instrumental variable regression and an oligopoly pricing model. In addition, we propose a way to apply our method to the Mathematical Programming with Equilibrium Constraints (MPEC) approach. Finally, we apply our method to online-retail sales data to compare a dynamic model with a static model.
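The cross-validation idea behind this kind of model selection can be illustrated in a much simpler non-GMM setting: k-fold selection among polynomial regression models. Everything below (data, fold count, candidate degrees) is an illustrative assumption, not the chapter's actual procedure:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 300
x = rng.normal(size=n)
y = 1.5 * x + rng.normal(size=n)  # the true data-generating model is linear

def cv_error(degree, k=5):
    """k-fold cross-validation: fit on k-1 folds, score on the held-out fold."""
    folds = np.array_split(np.arange(n), k)
    errs = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        coef = np.polyfit(x[train], y[train], degree)
        pred = np.polyval(coef, x[test])
        errs.append(np.mean((y[test] - pred) ** 2))
    return float(np.mean(errs))

# Select the candidate model with the lowest out-of-sample error.
candidates = [1, 3, 5]
errors = {d: cv_error(d) for d in candidates}
best = min(errors, key=errors.get)
print(errors, "-> chosen degree:", best)
```

The chapter's contribution is to adapt this held-out-fit logic to moment conditions estimated by GMM, and to prove that the resulting selection is consistent.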
In the second chapter, we study a fair machine learning algorithm that avoids statistical discrimination when making decisions. Algorithmic decision making now affects many aspects of our lives. Standard tools for machine learning, such as classification and regression, are subject to bias in the data, and thus direct application of such off-the-shelf tools can lead to a specific group being statistically discriminated against. Removing sensitive variables such as race or gender from the data does not solve this problem, because a disparate impact can arise when non-sensitive variables are correlated with sensitive ones. As this problem becomes more severe with the growing use of big data, an algorithmic solution is of particular importance. Inspired by the two-stage least squares method widely used in economics, we propose a two-stage algorithm that removes bias from the training data. The proposed algorithm is conceptually simple. Unlike most existing fair algorithms, which are designed for classification tasks, the proposed method is able to (i) deal with regression tasks, (ii) combine explanatory variables to remove reverse discrimination, and (iii) deal with numerical sensitive variables. The performance and fairness of the proposed algorithm are evaluated in simulations with synthetic and real-world datasets.
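A minimal sketch of the two-stage residualization idea the abstract describes (inspired by two-stage least squares) follows; the variable names and synthetic data are illustrative assumptions, not the dissertation's actual algorithm:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
s = rng.normal(size=(n, 1))                # numerical sensitive variable
x = 0.8 * s + rng.normal(size=(n, 1))      # non-sensitive feature correlated with s
y = 2.0 * x[:, 0] + rng.normal(size=n)     # regression outcome

# Stage 1: regress the feature on the sensitive variable (with intercept)
# and keep only the residual, which is numerically uncorrelated with s.
S = np.hstack([np.ones((n, 1)), s])
beta, *_ = np.linalg.lstsq(S, x, rcond=None)
x_resid = x - S @ beta

# Stage 2: fit the prediction model on the debiased feature only.
X = np.hstack([np.ones((n, 1)), x_resid])
theta, *_ = np.linalg.lstsq(X, y, rcond=None)

corr = np.corrcoef(x_resid[:, 0], s[:, 0])[0, 1]
print(f"residual-sensitive correlation: {corr:.2e}")
```

Because OLS residuals are orthogonal to the regressors, the stage-2 model cannot pick up the component of the feature that is linearly explained by the sensitive variable; this works for regression tasks and numerical sensitive variables, as the abstract emphasizes.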
The third chapter examines the issue of information uncertainty in the context of information security. Many users lack the ability to correctly estimate the true quality of the security software they purchase, as evidenced by anecdotes and even some academic research. Yet most analytical research assumes otherwise. We were therefore motivated to incorporate this "false sense of security" behavior into a game-theoretic model and study its implications for welfare parameters. Our model features two segments of consumers, well- and ill-informed, and a monopolistic software vendor. Well-informed consumers observe the true quality of the security software, while ill-informed ones overestimate it. While the proportions of the two segments are known to the software vendor, consumers are uncertain about which segment they belong to. We find that, in fact, the level of uncertainty is not necessarily harmful to society. Furthermore, there exist extreme circumstances under which society and consumers would be better off if the security software did not exist. Interestingly, we also find that the case where consumers know the information structure and weigh their expectations accordingly does not always lead to optimal social welfare. These results contrast with conventional wisdom and are crucially important for developing appropriate policies in this context.
Counterfactual Fairness with Partially Known Causal Graph
Fair machine learning aims to avoid treating individuals or sub-populations
unfavourably based on \textit{sensitive attributes}, such as gender and race.
Those methods in fair machine learning that are built on causal inference
ascertain discrimination and bias through causal effects. Though
causality-based fair learning is attracting increasing attention, current
methods assume the true causal graph is fully known. This paper proposes a
general method to achieve the notion of counterfactual fairness when the true
causal graph is unknown. To be able to select features that lead to
counterfactual fairness, we derive the conditions and algorithms to identify
ancestral relations between variables on a \textit{Partially Directed Acyclic
Graph (PDAG)}, specifically, a class of causal DAGs that can be learned from
observational data combined with domain knowledge. Interestingly, we find that
counterfactual fairness can be achieved as if the true causal graph were fully
known, when specific background knowledge is provided: the sensitive attributes
do not have ancestors in the causal graph. Results on both simulated and
real-world datasets demonstrate the effectiveness of our method.
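Restricted to the simpler case of a fully known causal DAG, the feature-selection idea can be sketched as excluding every descendant of the sensitive attribute (a toy graph with made-up node names; the paper's actual contribution, handling partially directed graphs, is not captured by this sketch):

```python
from collections import deque

def descendants(graph, node):
    """BFS over directed edges to collect all descendants of `node`."""
    seen, queue = set(), deque([node])
    while queue:
        u = queue.popleft()
        for v in graph.get(u, []):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return seen

# Toy causal DAG: sensitive attribute S causes X1; X2 is a non-descendant of S.
dag = {"S": ["X1"], "X1": ["Y"], "X2": ["Y"]}
fair_features = {"X1", "X2"} - descendants(dag, "S")
print(fair_features)  # {'X2'}
```

Predictions built only on non-descendants of the sensitive attribute are counterfactually fair on a known graph; the paper's conditions identify when the same ancestral relations can be read off a PDAG learned from data plus background knowledge.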
Advancing Personalized Federated Learning: Group Privacy, Fairness, and Beyond
Federated learning (FL) is a framework for training machine learning models
in a distributed and collaborative manner. During training, a set of
participating clients process their data stored locally, sharing only the model
updates obtained by minimizing a cost function over their local inputs. FL was
proposed as a stepping-stone towards privacy-preserving machine learning, but
it has been shown vulnerable to issues such as leakage of private information,
lack of personalization of the model, and the possibility of having a trained
model that is fairer to some groups than to others. In this paper, we address
the triadic interaction among personalization, privacy guarantees, and fairness
attained by models trained within the FL framework. Differential privacy and
its variants have been studied and applied as cutting-edge standards for
providing formal privacy guarantees. However, clients in FL often hold very
diverse datasets representing heterogeneous communities, making it important to
protect their sensitive information while still ensuring that the trained model
upholds the aspect of fairness for the users. To attain this objective, a
method is put forth that introduces group privacy assurances through the
utilization of d-privacy (aka metric privacy). d-privacy represents a
localized form of differential privacy that relies on a metric-oriented
obfuscation approach to maintain the original data's topological distribution.
This method, besides enabling personalized model training in a federated
approach and providing formal privacy guarantees, possesses significantly
better group fairness measured under a variety of standard metrics than a
global model trained within a classical FL template. Theoretical justifications
for the applicability are provided, as well as experimental validation on
real-world datasets to illustrate the working of the proposed method.
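A minimal sketch of a metric-privacy-style obfuscation step on a client's local update, assuming a one-dimensional Laplace mechanism (illustrative only; the function name, data, and noise distribution are assumptions, not the paper's exact mechanism):

```python
import numpy as np

def d_private_obfuscate(x, epsilon, rng):
    """Add Laplace noise whose density is proportional to
    exp(-epsilon * |x - y|): a localized, metric-based obfuscation
    in the spirit of d-privacy (metric privacy)."""
    return np.asarray(x) + rng.laplace(scale=1.0 / epsilon, size=np.shape(x))

rng = np.random.default_rng(42)
local_update = np.array([0.2, -0.5, 1.0])   # toy stand-in for a model update
noisy_update = d_private_obfuscate(local_update, epsilon=2.0, rng=rng)
print(noisy_update.shape)  # (3,)
```

Because the noise scale depends on the distance metric rather than a global sensitivity bound, nearby values remain close after obfuscation, which is what lets the method preserve the topological distribution of the original data while still giving formal group-privacy guarantees.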