9 research outputs found
Rankers, Rankees, & Rankings: Peeking into the Pandora's Box from a Socio-Technical Perspective
Algorithmic rankers have a profound impact on our increasingly data-driven
society. From leisure activities, such as the movies we watch and the
restaurants we patronize, to highly consequential decisions, such as
educational and occupational choices or getting hired by companies, these are
all driven by sophisticated yet mostly inaccessible rankers. A small change to
how these algorithms process the rankees (i.e., the data items that are ranked)
can have profound consequences. For example, a change in rankings can erode
a university's prestige or have drastic consequences for a
job candidate who narrowly missed an organization's preferred top-k
list. This paper is a call to action to the human-centered data science
research community to develop principled methods, measures, and metrics for
studying the interactions among the socio-technical context of use,
technological innovations, and the resulting consequences of algorithmic
rankings on multiple stakeholders. Given the spate of new legislation on
algorithmic accountability, it is imperative that researchers from social
science, human-computer interaction, and data science work in unison to
demystify how rankings are produced, who has agency to change them, and what
metrics of socio-technical impact one must use to inform the context of use.

Comment: Accepted for Interrogating Human-Centered Data Science workshop at
CHI'2
Within-layer Diversity Reduces Generalization Gap
Neural networks are composed of multiple layers arranged in a hierarchical
structure and trained jointly with gradient-based optimization, where errors
are back-propagated from the last layer to the first. At each
optimization step, neurons at a given layer receive feedback from neurons
belonging to higher layers of the hierarchy. In this paper, we propose to
complement this traditional 'between-layer' feedback with additional
'within-layer' feedback to encourage diversity of the activations within the
same layer. To this end, we measure the pairwise similarity between the outputs
of the neurons and use it to model the layer's overall diversity. By penalizing
similarities and promoting diversity, we encourage each neuron to learn a
distinctive representation and, thus, to enrich the data representation learned
within the layer and to increase the total capacity of the model. We
theoretically study how the within-layer activation diversity affects the
generalization performance of a neural network and prove that increasing the
diversity of hidden activations reduces the estimation error. In addition to
the theoretical guarantees, we present an empirical study on three datasets
confirming that the proposed approach enhances the performance of
state-of-the-art neural network models and decreases the generalization gap.

Comment: 18 pages, 1 figure, 3 tables
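The within-layer feedback described above can be illustrated with a small sketch: compute pairwise cosine similarities between neuron outputs over a batch and penalize the off-diagonal entries. This is a minimal NumPy illustration of the idea; the function name and the choice of a squared-similarity penalty are assumptions, not the paper's exact formulation.

```python
import numpy as np

def within_layer_diversity_penalty(activations):
    """Mean squared pairwise cosine similarity between distinct neurons.

    activations: (batch, n_neurons) array; each column holds one neuron's
    outputs over the batch. A lower value means more diverse activations.
    """
    # Unit-normalize each neuron's activation vector over the batch.
    norms = np.linalg.norm(activations, axis=0, keepdims=True) + 1e-8
    unit = activations / norms
    sim = unit.T @ unit           # (n_neurons, n_neurons) cosine similarities
    n = sim.shape[0]
    off_diag = sim - np.eye(n)    # drop each neuron's self-similarity
    # Average the squared similarity over all distinct neuron pairs.
    return np.sum(off_diag ** 2) / (n * (n - 1))
```

Adding a term like `lam * within_layer_diversity_penalty(h)` to the training loss would then penalize redundant neurons, complementing the usual between-layer back-propagated feedback.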
Qualité, équité, transparence, vérification et explicabilité des décisions algorithmiques (Quality, fairness, transparency, verification, and explainability of algorithmic decisions)
We consider aspects, especially technical ones, of the quality, fairness, transparency, and explainability of algorithmic decisions.
Impact Remediation: Optimal Interventions to Reduce Inequality
A significant body of research in the data sciences considers unfair
discrimination against social categories such as race or gender that could
occur or be amplified as a result of algorithmic decisions. Simultaneously,
real-world disparities continue to exist, even before algorithmic decisions are
made. In this work, we draw on insights from the social sciences and humanistic
studies, bring them into the realm of causal modeling and constrained
optimization, and develop a novel algorithmic framework for tackling pre-existing real-world
disparities. The purpose of our framework, which we call the "impact
remediation framework," is to measure real-world disparities and discover the
optimal intervention policies that could help improve equity or access to
opportunity for those who are underserved with respect to an outcome of
interest. We develop a disaggregated approach to tackling pre-existing
disparities that relaxes the typical set of assumptions required for the use of
social categories in structural causal models. Our approach flexibly
incorporates counterfactuals and is compatible with various ontological
assumptions about the nature of social categories. We demonstrate impact
remediation with a real-world case study and compare our disaggregated approach
to an existing state-of-the-art approach, comparing its structure and resulting
policy recommendations. In contrast to most work on optimal policy learning, we
explore disparity reduction itself as an objective, explicitly focusing the
power of algorithms on reducing inequality.
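As a toy illustration of budget-constrained intervention for disparity reduction, the sketch below greedily funds an intervention for the currently worst-off group until the budget runs out. All inputs, the function name, and the greedy rule are hypothetical simplifications for illustration; they do not reflect the paper's causal, disaggregated framework.

```python
def greedy_remediation(outcomes, effects, costs, budget):
    """Toy greedy allocation of interventions under a budget.

    outcomes: per-group access rates (hypothetical measurements);
    effects: estimated lift one intervention gives each group;
    costs: cost of one intervention per group.
    Returns the post-intervention outcomes and the list of funded groups.
    """
    outcomes = list(outcomes)
    spent, plan = 0.0, []
    while True:
        # Recompute the worst-off group after every funded intervention.
        worst = min(range(len(outcomes)), key=lambda g: outcomes[g])
        if spent + costs[worst] > budget:
            break
        outcomes[worst] += effects[worst]
        spent += costs[worst]
        plan.append(worst)
    return outcomes, plan
```

Unlike this greedy heuristic, the framework in the paper discovers optimal intervention policies via constrained optimization over a causal model; the sketch only conveys the disparity-reduction objective.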
Dealing with Intransitivity, Non-Convexity, and Algorithmic Bias in Preference Learning
Rankings are ubiquitous since they are a natural way to present information to people who are making decisions. There are seemingly countless scenarios where rankings arise, such as deciding whom to hire at a company, determining what movies to watch, purchasing products, understanding human perception, judging science fair projects, voting for political candidates, and so on. In many of these scenarios, the number of items in consideration is prohibitively large, such that asking someone to rank all of the choices is essentially impossible. On the other hand, collecting preference data on a small subset of the items is feasible, e.g., collecting answers to "Do you prefer item A or item B?" or "Is item A closer to item B or item C?". Therefore, an important machine learning task is to learn a ranking of the items based on this preference data. This thesis theoretically and empirically addresses three key challenges of preference learning: intransitivity in preference data, non-convex optimization, and algorithmic bias.

Chapter 2 addresses the challenge of learning a ranking given pairwise comparison data that violates rational choice axioms such as transitivity. Our key observation is that two items compared in isolation from other items may be compared based on only a salient subset of features. Formalizing this framework, we propose the salient feature preference model and prove a sample complexity result for learning the parameters of our model and the underlying ranking with maximum likelihood estimation.

Chapter 3 addresses the non-convexity of an optimization problem inspired by ordinal embedding, which is a preference learning task. We aim to understand the landscape, that is, the local and global minimizers, of the non-convex objective, which corresponds to the hinge loss arising from quadratic constraints.
Under certain assumptions, we give necessary conditions for non-global local minimizers of our objective and additionally show that in two dimensions, every local minimizer is a global minimizer.

Chapters 4 and 5 address the challenge of algorithmic bias. We consider training machine learning models that are fair in the sense that their performance is invariant under certain sensitive perturbations to the inputs. For example, the performance of a resume screening system should be invariant under changes to the gender and ethnicity of the applicant. We formalize this notion of algorithmic fairness as a variant of individual fairness. In Chapter 4, we consider classification and develop a distributionally robust optimization approach, SenSR, that enforces this notion of individual fairness during training and provably learns individually fair classifiers. Chapter 5 builds upon Chapter 4. We develop a related algorithm, SenSTIR, to train provably individually fair learning-to-rank (LTR) models. The proposed approach ensures items from minority groups appear alongside similar items from majority groups. This notion of fair ranking is based on the individual fairness definition considered in Chapter 4 for the supervised learning context and is more nuanced than prior fair LTR approaches that simply provide underrepresented items with a basic level of exposure. The crux of our method is an optimal transport-based regularizer that enforces individual fairness and an efficient algorithm for optimizing the regularizer.

PhD thesis, Applied and Interdisciplinary Mathematics, University of Michigan, Horace H. Rackham School of Graduate Studies. http://deepblue.lib.umich.edu/bitstream/2027.42/166120/1/amandarg_1.pd
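The baseline setup behind this kind of preference learning can be sketched with the classical Bradley-Terry model, fit by maximum likelihood on "Do you prefer A or B?" answers. Note this is the standard model, not the thesis's salient feature preference model; the function name and the gradient-ascent settings are assumptions for illustration.

```python
import numpy as np

def bradley_terry_mle(n_items, comparisons, lr=0.1, n_iters=500):
    """Fit Bradley-Terry item scores from pairwise comparisons.

    comparisons: list of (winner, loser) index pairs. Under the model,
    P(i beats j) = sigmoid(theta[i] - theta[j]); we maximize the
    log-likelihood by gradient ascent. Sorting the returned scores in
    descending order gives the learned ranking.
    """
    theta = np.zeros(n_items)
    for _ in range(n_iters):
        grad = np.zeros(n_items)
        for w, l in comparisons:
            p_win = 1.0 / (1.0 + np.exp(theta[l] - theta[w]))
            grad[w] += 1.0 - p_win   # push the winner's score up
            grad[l] -= 1.0 - p_win   # push the loser's score down
        theta += lr * grad / len(comparisons)
        theta -= theta.mean()        # fix the translation invariance
    return theta
```

Intransitive comparison data is exactly where this maximum-likelihood baseline breaks down, which motivates the salient feature preference model studied in Chapter 2.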