'Person' == Light-skinned, Western Man, and Sexualization of Women of Color: Stereotypes in Stable Diffusion
We study stereotypes embedded within one of the most popular text-to-image
generators: Stable Diffusion. We examine what stereotypes of gender and
nationality/continental identity Stable Diffusion displays in the absence
of such information, i.e., what gender and nationality/continental identity is
assigned to `a person', or to `a person from Asia'. Using the cosine similarity
of the vision-language model CLIP to compare images generated by CLIP-based
Stable Diffusion v2.1, verified by manual examination, we chronicle results
from 136 prompts (50 results/prompt) of front-facing images of persons from 6 different
continents, 27 nationalities and 3 genders. We observe how Stable Diffusion
outputs of `a person' without any additional gender/nationality information
correspond most closely to images of men and least closely to persons of
nonbinary gender, and to persons from Europe/North America over Africa/Asia,
suggesting that Stable Diffusion's default representation of personhood is,
concerningly, a European/North American man. We also show continental
stereotypes and the resultant harms, e.g., a person from Oceania is deemed to be Australian/New Zealander rather than
Papua New Guinean, pointing to the erasure of Indigenous Oceanic peoples, who
form a majority over descendants of colonizers both in Papua New Guinea and in
Oceania overall. Finally, we unexpectedly observe a pattern of
oversexualization of women, specifically Latin American, Mexican, Indian and
Egyptian women relative to other nationalities, measured through an NSFW
detector. This demonstrates how Stable Diffusion perpetuates Western
fetishization of women of color through objectification in media, which, if
left unchecked, will amplify this stereotypical representation. Image datasets
are made publicly available.
Comment: Upcoming publication, Findings of EMNLP 202
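The CLIP cosine-similarity comparison described in this abstract can be illustrated with a short sketch. This is not the authors' released code; the CLIP checkpoint, the image file name, and the candidate text prompts are assumptions chosen for illustration.

```python
# Minimal sketch (not the authors' code): score a generated image's similarity
# to reference prompts with CLIP, e.g. to check whether an image of "a person"
# aligns more with "a man", "a woman", or "a nonbinary person".
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("generated_person.png")  # hypothetical Stable Diffusion output
texts = ["a photo of a man", "a photo of a woman", "a photo of a nonbinary person"]

with torch.no_grad():
    img_inputs = processor(images=image, return_tensors="pt")
    txt_inputs = processor(text=texts, return_tensors="pt", padding=True)
    img_emb = model.get_image_features(**img_inputs)
    txt_emb = model.get_text_features(**txt_inputs)

# Cosine similarity between the image embedding and each text embedding.
img_emb = img_emb / img_emb.norm(dim=-1, keepdim=True)
txt_emb = txt_emb / txt_emb.norm(dim=-1, keepdim=True)
similarities = (img_emb @ txt_emb.T).squeeze(0)
for text, score in zip(texts, similarities.tolist()):
    print(f"{text}: {score:.3f}")
```

In the study, such scores are aggregated over many generations per prompt and verified by manual examination rather than relying on a single comparison.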
Is the U.S. Legal System Ready for AI's Challenges to Human Values?
Our interdisciplinary study investigates how effectively U.S. laws confront
the challenges posed by Generative AI to human values. Through an analysis of
diverse hypothetical scenarios crafted during an expert workshop, we have
identified notable gaps and uncertainties within the existing legal framework
regarding the protection of fundamental values, such as privacy, autonomy,
dignity, diversity, equity, and physical/mental well-being. Constitutional and
civil rights, it appears, may not provide sufficient protection against
AI-generated discriminatory outputs. Furthermore, even if we exclude the
liability shield provided by Section 230, proving causation for defamation and
product liability claims is a challenging endeavor due to the intricate and
opaque nature of AI systems. To address the unique and unforeseeable threats
posed by Generative AI, we advocate for legal frameworks that evolve to
recognize new threats and provide proactive, auditable guidelines to industry
stakeholders. Addressing these issues requires deep interdisciplinary
collaborations to identify harms, values, and mitigation strategies.
Comment: 25 pages, 7 figures
Bias Against 93 Stigmatized Groups in Masked Language Models and Downstream Sentiment Classification Tasks
The rapid deployment of artificial intelligence (AI) models demands a
thorough investigation of biases and risks inherent in these models to
understand their impact on individuals and society. This study extends the
focus of bias evaluation in extant work by examining bias against social
stigmas on a large scale. It focuses on 93 stigmatized groups in the United
States, including a wide range of conditions related to disease, disability,
drug use, mental illness, religion, sexuality, socioeconomic status, and other
relevant factors. We investigate bias against these groups in English
pre-trained Masked Language Models (MLMs) and their downstream sentiment
classification tasks. To evaluate the presence of bias against 93 stigmatized
conditions, we identify 29 non-stigmatized conditions to conduct a comparative
analysis. Building upon a psychology scale of social rejection, the Social
Distance Scale, we prompt six MLMs: RoBERTa-base, RoBERTa-large, XLNet-large,
BERTweet-base, BERTweet-large, and DistilBERT. We use human annotations to
analyze the predicted words from these models, with which we measure the extent
of bias against stigmatized groups. When prompts include stigmatized
conditions, the probability of MLMs predicting negative words is approximately
20 percent higher than when prompts have non-stigmatized conditions. In the
sentiment classification tasks, when sentences include stigmatized conditions
related to diseases, disability, education, and mental illness, they are more
likely to be classified as negative. We also observe a strong correlation
between bias in MLMs and their downstream sentiment classifiers (r = 0.79). The
evidence indicates that MLMs and their downstream sentiment classification
tasks exhibit biases against socially stigmatized groups.
Comment: 20 pages, 12 figures, 2 tables; ACM FAccT 202
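The masked-language-model probing described in this abstract can be sketched with a fill-mask pipeline. The template wording and condition strings below are illustrative assumptions, not the study's Social Distance Scale items, and the human annotation of predicted words is only noted in a comment.

```python
# Minimal sketch, not the study's exact protocol: probe a masked language model
# with a social-distance-style template and inspect its top predicted words.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="roberta-base")
mask = fill_mask.tokenizer.mask_token  # "<mask>" for RoBERTa

conditions = ["depression", "a hearing impairment", "diabetes"]  # hypothetical examples
template = "Having a close friend who has {condition} would make me feel {mask}."

for condition in conditions:
    prompt = template.format(condition=condition, mask=mask)
    predictions = fill_mask(prompt, top_k=5)
    top_words = [p["token_str"].strip() for p in predictions]
    print(condition, "->", top_words)

# In the paper, human annotators label the predicted words (e.g. as negative or
# not) to quantify bias; that labeling step is omitted from this sketch.
```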
Evaluating Biased Attitude Associations of Language Models in an Intersectional Context
Language models are trained on large-scale corpora that embed implicit biases
documented in psychology. Valence associations (pleasantness/unpleasantness) of
social groups determine the biased attitudes towards groups and concepts in
social cognition. Building on this established literature, we quantify how
social groups are valenced in English language models using a sentence template
that provides an intersectional context. We study biases related to age,
education, gender, height, intelligence, literacy, race, religion, sex, sexual
orientation, social class, and weight. We present a concept projection approach
to capture the valence subspace through contextualized word embeddings of
language models. Adapting the projection-based approach to embedding
association tests that quantify bias, we find that language models exhibit the
most biased attitudes against gender identity, social class, and sexual
orientation signals in language. We find that the largest and best-performing
model we study is also more biased, as it effectively captures the bias
embedded in sociocultural data. We validate the bias evaluation method by
demonstrating strong performance on an intrinsic valence evaluation task. The approach enables us
to measure complex intersectional biases as they are known to manifest in the
outputs and applications of language models that perpetuate historical biases.
Moreover, our approach contributes to design justice as it studies the
associations of groups underrepresented in language, such as transgender and
homosexual individuals.
Comment: to be published in AIES 202
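A projection-based valence measurement of the kind described in this abstract can be sketched as follows. The model, sentence template, and pleasant/unpleasant word lists are assumptions for illustration rather than the paper's materials.

```python
# Minimal sketch of projecting contextualized embeddings onto a valence axis.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

def embed(sentences):
    """Mean-pooled last-layer embeddings for a list of sentences."""
    inputs = tokenizer(sentences, return_tensors="pt", padding=True, truncation=True)
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state      # (batch, seq, dim)
    mask = inputs["attention_mask"].unsqueeze(-1)        # (batch, seq, 1)
    return (hidden * mask).sum(1) / mask.sum(1)          # (batch, dim)

pleasant = ["This is wonderful.", "This is joyful.", "This is lovely."]
unpleasant = ["This is terrible.", "This is painful.", "This is awful."]

# Valence direction: difference of the mean pleasant and mean unpleasant embeddings.
valence_axis = embed(pleasant).mean(0) - embed(unpleasant).mean(0)
valence_axis = valence_axis / valence_axis.norm()

# Project sentences naming social groups onto the valence axis; higher scores
# mean the embedding lies closer to the pleasant end of the subspace.
groups = ["This is a transgender person.", "This is a wealthy person."]
scores = embed(groups) @ valence_axis
for sentence, score in zip(groups, scores.tolist()):
    print(f"{sentence} valence projection = {score:.3f}")
```

The intersectional templates in the paper combine several group signals in one sentence; this sketch only shows the single-signal case.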
Pre-trained Speech Processing Models Contain Human-Like Biases that Propagate to Speech Emotion Recognition
Previous work has established that a person's demographics and speech style
affect how well speech processing models perform for them. But where does this
bias come from? In this work, we present the Speech Embedding Association Test
(SpEAT), a method for detecting bias in one type of model used for many speech
tasks: pre-trained models. The SpEAT is inspired by word embedding association
tests in natural language processing, which quantify intrinsic bias in a
model's representations of different concepts, such as race or valence
(something's pleasantness or unpleasantness) and capture the extent to which a
model trained on large-scale socio-cultural data has learned human-like biases.
Using the SpEAT, we test for six types of bias in 16 English speech models
(including 4 models also trained on multilingual data), which come from the
wav2vec 2.0, HuBERT, WavLM, and Whisper model families. We find that 14 or more
models reveal positive valence (pleasantness) associations with abled people
over disabled people, with European-Americans over African-Americans, with
females over males, with U.S. accented speakers over non-U.S. accented
speakers, and with younger people over older people. Beyond establishing that
pre-trained speech models contain these biases, we also show that they can have
real world effects. We compare biases found in pre-trained models to biases in
downstream models adapted to the task of Speech Emotion Recognition (SER) and
find that in 66 of the 96 tests performed (69%), the group that is more
associated with positive valence as indicated by the SpEAT also tends to be
predicted as speaking with higher valence by the downstream model. Our work
provides evidence that, like text- and image-based models, pre-trained
speech-based models frequently learn human-like biases. Our work also shows that bias
found in pre-trained models can propagate to the downstream task of SER
- …
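Because the SpEAT adapts word embedding association tests to pooled speech embeddings, its core computation resembles the standard WEAT effect size. The sketch below shows that effect size over placeholder arrays; the grouping comments are assumptions for illustration, and the random inputs stand in for real speech-model features.

```python
# Minimal sketch of a WEAT-style effect size over precomputed embeddings.
import numpy as np

def cosine(a, b):
    """Cosine similarity between a single vector a and each row of matrix b."""
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b, axis=1, keepdims=True)
    return b @ a

def association(w, A, B):
    """Mean similarity of w to attribute set A minus its mean similarity to B."""
    return cosine(w, A).mean() - cosine(w, B).mean()

def effect_size(X, Y, A, B):
    """Standardized difference of associations for target sets X and Y."""
    x_assoc = np.array([association(x, A, B) for x in X])
    y_assoc = np.array([association(y, A, B) for y in Y])
    pooled = np.concatenate([x_assoc, y_assoc])
    return (x_assoc.mean() - y_assoc.mean()) / pooled.std(ddof=1)

rng = np.random.default_rng(0)
dim = 768
X = rng.normal(size=(20, dim))  # e.g. pooled embeddings of one speaker group
Y = rng.normal(size=(20, dim))  # e.g. pooled embeddings of another speaker group
A = rng.normal(size=(25, dim))  # e.g. embeddings of pleasant stimuli
B = rng.normal(size=(25, dim))  # e.g. embeddings of unpleasant stimuli
print(f"effect size d = {effect_size(X, Y, A, B):.3f}")
```

A positive effect size indicates that the first target group is more associated with the first attribute set (e.g. pleasantness) than the second target group is.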