
    Dialog Structure Through the Lens of Gender, Gender Environment, and Power

    Understanding how the social context of an interaction affects our dialog behavior is of great interest to social scientists who study human behavior, as well as to computer scientists who build automatic methods to infer those social contexts. In this paper, we study the interaction of power, gender, and dialog behavior in organizational interactions. In order to perform this study, we first construct the Gender Identified Enron Corpus of emails, in which we semi-automatically assign the gender of around 23,000 individuals who authored around 97,000 email messages in the Enron corpus. This corpus, which is made freely available, is orders of magnitude larger than previously existing gender-identified corpora in the email domain. Next, we use this corpus to perform a large-scale, data-oriented study of the interplay of gender and manifestations of power. We argue that, in addition to one’s own gender, the “gender environment” of an interaction, i.e., the gender makeup of one’s interlocutors, also affects the way power is manifested in dialog. We focus especially on manifestations of power in the dialog structure, both in a shallow sense that disregards the textual content of messages (e.g., how often the participants contribute and how often they get replies) and in the structure expressed within the textual content (e.g., who issues requests, how those requests are phrased, and whose requests get responses). We find that both gender and gender environment affect the ways power is manifested in dialog, resulting in patterns that reveal the underlying factors. Finally, we show the utility of gender information in the problem of automatically predicting the direction of power between pairs of participants in email interactions.

    Disentangling Perceptions of Offensiveness: Cultural and Moral Correlates

    Perception of offensiveness is inherently subjective, shaped by the lived experiences and socio-cultural values of the perceivers. Recent years have seen substantial efforts to build AI-based tools that can detect offensive language at scale, as a means to moderate social media platforms and to ensure the safety of conversational AI technologies such as ChatGPT and Bard. However, existing approaches treat this task as a technical endeavor, built on top of data annotated for offensiveness by a global crowd workforce without any attention to the crowd workers' provenance or the values their perceptions reflect. We argue that cultural and psychological factors play a vital role in the cognitive processing of offensiveness, which is critical to consider in this context. We re-frame the task of determining offensiveness as essentially a matter of moral judgment: deciding the boundaries of ethically wrong vs. right language within an implied set of socio-cultural norms. Through a large-scale cross-cultural study based on 4,309 participants from 21 countries across 8 cultural regions, we demonstrate substantial cross-cultural differences in perceptions of offensiveness. More importantly, we find that individual moral values play a crucial role in shaping these variations: moral concerns about Care and Purity are significant mediating factors driving cross-cultural differences. These insights are of crucial importance as we build AI models for the pluralistic world, where the values they espouse should aim to respect and account for moral values in diverse geo-cultural contexts.

    Whose Ground Truth? Accounting for Individual and Collective Identities Underlying Dataset Annotation

    Human annotations play a crucial role in machine learning (ML) research and development. However, the ethical considerations around the processes and decisions that go into building ML datasets have not received nearly enough attention. In this paper, we survey an array of literature that provides insights into ethical considerations around crowdsourced dataset annotation. We synthesize these insights and lay out the challenges in this space along two layers: (1) who the annotator is, and how the annotators' lived experiences can impact their annotations, and (2) the relationship between the annotators and the crowdsourcing platforms, and what that relationship affords them. Finally, we put forth a concrete set of recommendations and considerations for dataset developers at various stages of the ML data pipeline: task formulation, selection of annotators, platform and infrastructure choices, dataset analysis and evaluation, and dataset documentation and release.

    SeeGULL Multilingual: a Dataset of Geo-Culturally Situated Stereotypes

    While generative multilingual models are rapidly being deployed, their safety and fairness evaluations are largely limited to resources collected in English. This is especially problematic for evaluations targeting inherently socio-cultural phenomena such as stereotyping, where it is important to build multilingual resources that reflect the stereotypes prevalent in the respective language communities. However, gathering these resources, at scale, in varied languages and regions poses a significant challenge, as it requires broad socio-cultural knowledge and can be prohibitively expensive. To overcome this critical gap, we employ a recently introduced approach that couples LLM generations for scale with culturally situated validations for reliability, and build SeeGULL Multilingual, a global-scale multilingual dataset of social stereotypes containing over 25K stereotypes, spanning 20 languages, with human annotations across 23 regions, and demonstrate its utility in identifying gaps in model evaluations. Content warning: Stereotypes shared in this paper can be offensive.