For Women, Life, Freedom: A Participatory AI-Based Social Web Analysis of a Watershed Moment in Iran's Gender Struggles
In this paper, we present a computational analysis of the Persian language
Twitter discourse with the aim to estimate the shift in stance toward gender
equality following the death of Mahsa Amini in police custody. We present an
ensemble active learning pipeline to train a stance classifier. Our novelty
lies in the involvement of Iranian women in an active role as annotators in
building this AI system. Our annotators not only provide labels, but they also
suggest valuable keywords for more meaningful corpus creation as well as
provide short example documents for a guided sampling step. Our analyses
indicate that Mahsa Amini's death triggered polarized Persian language
discourse where both fractions of negative and positive tweets toward gender
equality increased. The increase in positive tweets was slightly greater than
the increase in negative tweets. We also observe that, in terms of account
creation time, pro-protest Twitter accounts resemble baseline Persian Twitter
activity more closely than state-aligned Twitter accounts do.
Comment: Accepted at IJCAI 2023 (AI for good track).
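The abstract above does not say how the ensemble active learning pipeline picks tweets for annotation. As a minimal sketch of one common option, query-by-committee with vote entropy, the snippet below ranks unlabeled items by how much a set of classifiers disagrees on them. The function names, the three stance labels, and the example votes are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def vote_entropy(votes):
    """Entropy (in nats) of the labels an ensemble assigns to one item."""
    counts = Counter(votes)
    total = len(votes)
    return sum(-(c / total) * math.log(c / total) for c in counts.values())


def select_for_annotation(committee_votes, budget):
    """Return indices of the `budget` items with the highest disagreement."""
    ranked = sorted(
        range(len(committee_votes)),
        key=lambda i: vote_entropy(committee_votes[i]),
        reverse=True,
    )
    return ranked[:budget]


# Hypothetical votes from three classifiers on four unlabeled tweets.
votes = [
    ["pro", "pro", "pro"],          # full agreement
    ["pro", "against", "neutral"],  # maximal disagreement
    ["against", "against", "pro"],  # partial disagreement
    ["neutral", "neutral", "neutral"],
]
picked = select_for_annotation(votes, budget=2)
```

Under this assumed strategy, the items the ensemble splits on are the ones sent to annotators first, since a label there is expected to be most informative.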
Down the Toxicity Rabbit Hole: Investigating PaLM 2 Guardrails
This paper conducts a robustness audit of the safety feedback of PaLM 2
through a novel toxicity rabbit hole framework introduced here. Starting with a
stereotype, the framework instructs PaLM 2 to generate more toxic content than
the stereotype. In every subsequent iteration, it continues instructing PaLM 2 to
generate more toxic content than the previous iteration until PaLM 2 safety
guardrails throw a safety violation. Our experiments uncover highly disturbing
antisemitic, Islamophobic, racist, homophobic, and misogynistic (to list a few)
generated content that PaLM 2 safety guardrails do not evaluate as highly
unsafe.
Disentangling Societal Inequality from Model Biases: Gender Inequality in Divorce Court Proceedings
Divorce is the legal dissolution of a marriage by a court. Since this is
usually an unpleasant outcome of a marital union, each party may have reasons
for deciding to end the marriage, which are generally documented in detail in the
court proceedings. Via a substantial corpus of 17,306 court proceedings, this
paper investigates gender inequality through the lens of divorce court
proceedings. While emerging data sources (e.g., public court records) on
sensitive societal issues hold promise in aiding social science research,
biases present in cutting-edge natural language processing (NLP) methods may
interfere with or affect such studies. We thus require a thorough analysis of
potential gaps and limitations present in extant NLP resources. In this paper,
on the methodological side, we demonstrate that existing NLP resources require
several non-trivial modifications to quantify societal inequalities. On the
substantive side, we find that while a large number of court cases perhaps
suggest changing norms in India where women are increasingly challenging
patriarchy, AI-powered analyses of these court proceedings indicate striking
gender inequality, with women often subjected to domestic violence.
Comment: This paper is accepted at IJCAI 2023 (AI for good track).
Subjective Crowd Disagreements for Subjective Data: Uncovering Meaningful CrowdOpinion with Population-level Learning
Human-annotated data plays a critical role in the fairness of AI systems,
including those that deal with life-altering decisions or moderating
human-created web/social media content. Conventionally, annotator disagreements
are resolved before any learning takes place. However, researchers are
increasingly identifying annotator disagreement as pervasive and meaningful.
They also question the performance of a system when annotators disagree,
particularly when minority views are disregarded, especially among groups that
may already be underrepresented in the annotator population. In this paper, we
introduce CrowdOpinion (accepted for publication at ACL 2023),
an unsupervised learning based approach that uses language features and label
distributions to pool similar items into larger samples of label distributions.
We experiment with four generative and one density-based clustering method,
applied to five linear combinations of label distributions and features. We use
five publicly available benchmark datasets (with varying levels of annotator
disagreements) from social media (Twitter, Gab, and Reddit). We also experiment
in the wild using a dataset from Facebook, where annotations come from the
platform itself by users reacting to posts. We evaluate CrowdOpinion as
a label distribution prediction task using KL-divergence and a single-label
problem using accuracy measures.
Comment: Accepted for publication at ACL 2023.
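The two evaluation views named in the abstract above can be illustrated with a small, self-contained sketch. It scores predicted label distributions against empirical annotator distributions with KL-divergence, then reduces both to their most probable label to compute accuracy. The direction of the divergence (empirical relative to predicted), the smoothing constant `eps`, and the example counts are assumptions made for illustration. The abstract does not specify them.

```python
import math


def to_distribution(label_counts):
    """Normalize raw annotator counts into a probability distribution."""
    total = sum(label_counts)
    return [c / total for c in label_counts]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) in nats; q is floored at eps to avoid division by zero."""
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)


def argmax(dist):
    """Index of the most probable label."""
    return max(range(len(dist)), key=dist.__getitem__)


def accuracy(true_dists, pred_dists):
    """Single-label accuracy: compare the top label on each side."""
    hits = sum(argmax(t) == argmax(p) for t, p in zip(true_dists, pred_dists))
    return hits / len(true_dists)


# Hypothetical annotator counts for two items over three labels.
empirical = [to_distribution([6, 3, 1]), to_distribution([2, 2, 6])]
# Hypothetical model predictions for the same two items.
predicted = [[0.5, 0.4, 0.1], [0.5, 0.2, 0.3]]

mean_kl = sum(kl_divergence(t, p) for t, p in zip(empirical, predicted)) / len(empirical)
acc = accuracy(empirical, predicted)
```

The distribution-level score penalizes a prediction for misplacing probability mass even when its top label is correct, which is why the two measures can diverge on items with high annotator disagreement.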