37 research outputs found
Women are warmer but no less assertive than men: gender and language on Facebook
Using a large social media dataset and open-vocabulary methods from computational linguistics, we explored differences in language use across gender, affiliation, and assertiveness. In Study 1, we analyzed topics (groups of semantically similar words) across 10 million messages from over 52,000 Facebook users. Most language differed little across gender. However, topics most associated with self-identified female participants included friends, family, and social life, whereas topics most associated with self-identified male participants included swearing, anger, discussion of objects instead of people, and the use of argumentative language. In Study 2, we plotted male- and female-linked language topics along two interpersonal dimensions prevalent in gender research: affiliation and assertiveness. In a sample of over 15,000 Facebook users, we found substantial gender differences in the use of affiliative language and slight differences in assertive language. Language used more by self-identified females was interpersonally warmer, more compassionate, more polite, and, contrary to previous findings, slightly more assertive, whereas language used more by self-identified males was colder, more hostile, and more impersonal. Computational linguistic analysis combined with methods to automatically label topics offers a means of testing psychological theories unobtrusively at large scale. This work was supported by the Templeton Religion Trust.
Living in the past, present, and future: measuring temporal orientation with language
OBJECTIVE: Temporal orientation refers to individual differences in the relative emphasis one places on the past, present, or future, and is related to academic, financial, and health outcomes. We propose and evaluate a method for automatically measuring temporal orientation through language expressed on social media. METHOD: Judges rated the temporal orientation of 4,302 social media messages. We trained a classifier based on these ratings, which could accurately predict the temporal orientation of new messages in a separate validation set (accuracy/mean sensitivity = .72; mean specificity = .77). We used the classifier to automatically classify 1.3 million messages written by 5,372 participants (50% female, aged 13-48). Finally, we tested whether individual differences in past, present, and future orientation differentially related to gender, age, Big Five personality, satisfaction with life, and depressive symptoms. RESULTS: Temporal orientations exhibit several expected correlations with age, gender, and Big Five personality. More future-oriented people were older, more likely to be female, more conscientious, less impulsive, less depressed, and more satisfied with life; present orientation showed the opposite pattern. CONCLUSION: Language-based assessments can complement and extend existing measures of temporal orientation, providing an alternative approach and additional insights into language and personality relationships. This article is protected by copyright. All rights reserved. Support for this article was provided by grant #63597 from the Robert Wood Johnson Foundation (M. E. P. Seligman, PI) and by a grant from the Templeton Religion Trust (M. E. P. Seligman, H. A. Schwartz, L. H. Ungar, co-PIs).
Personality, gender, and age in the language of social media: the open-vocabulary approach
We analyzed 700 million words, phrases, and topic instances collected from the Facebook messages of 75,000 volunteers, who also took standard personality tests, and found striking variations in language with personality, gender, and age. In our open-vocabulary technique, the data itself drives a comprehensive exploration of language that distinguishes people, finding connections that are not captured with traditional closed-vocabulary word-category analyses. Our analyses shed new light on psychosocial processes, yielding results that are face valid (e.g., subjects living in high elevations talk about the mountains), tie in with other research (e.g., neurotic people disproportionately use the phrase ‘sick of’ and the word ‘depressed’), suggest new hypotheses (e.g., an active life implies emotional stability), and give detailed insights (males use the possessive ‘my’ when mentioning their ‘wife’ or ‘girlfriend’ more often than females use ‘my’ with ‘husband’ or ‘boyfriend’). To date, this represents the largest study, by an order of magnitude, of language and personalit
The language of religious affiliation: social, emotional, and cognitive differences
Religious affiliation is an important identifying characteristic for many individuals and relates to numerous life outcomes including health, well-being, policy positions, and cognitive style. Using methods from computational linguistics, we examined language from 12,815 Facebook users in the United States and United Kingdom who indicated their religious affiliation. Religious individuals used more positive emotion words (β = .278, p < .0001) and social themes such as family (β = .242, p < .0001), while nonreligious people expressed more negative emotions like anger (β = −.427, p < .0001) and categories related to cognitive processes, like tentativeness (β = −.153, p < .0001). Nonreligious individuals also used more themes related to the body (β = −.265, p < .0001) and death (β = −.247, p < .0001). The findings offer directions for future research on religious affiliation, specifically in terms of social, emotional, and cognitive differences
Methodological developments in violence research
For decades, violence research has relied on interviews with victims and perpetrators, on participant observation, and on survey methods, and most studies have used either qualitative or quantitative analytic strategies. Since the turn of the millennium, researchers have been able to draw on a range of new approaches: there are increasing amounts of video data of violent incidents, triangulation and mixed methods approaches are becoming ever more sophisticated, and computational social sciences are introducing big data analysis to more and more research fields. These three developments hold great potential for quantitative and qualitative violence research. This paper discusses video data analysis, mixed methods, and big data in the context of current and future violence research. It focuses specifically on (1) the potential and challenges of new video data for studying violence; (2) the role of triangulation and mixed methods in enabling more comprehensive violence research from multiple theoretical perspectives; and (3) what potential uses of big data and computational social science in violence research may look like
Estimating geographic subjective well-being from Twitter: A comparison of dictionary and data-driven language methods
Researchers and policy makers worldwide are interested in measuring the subjective well-being of populations. When users post on social media, they leave behind digital traces that reflect their thoughts and feelings. Aggregation of such digital traces may make it possible to monitor well-being at large scale. However, social media-based methods need to be robust to regional effects if they are to produce reliable estimates. Using a sample of 1.53 billion geotagged English tweets, we provide a systematic evaluation of word-level and data-driven methods for text analysis for generating well-being estimates for 1,208 US counties. We compared Twitter-based county-level estimates with well-being measurements provided by the Gallup-Sharecare Well-Being Index survey through 1.73 million phone surveys. We find that word-level methods (e.g., Linguistic Inquiry and Word Count [LIWC] 2015 and Language Assessment by Mechanical Turk [LabMT]) yielded inconsistent county-level well-being measurements due to regional, cultural, and socioeconomic differences in language use. However, removing as few as three of the most frequent words led to notable improvements in well-being prediction. Data-driven methods provided robust estimates, approximating the Gallup data at up to r = 0.64. We show that the findings generalized to county socioeconomic and health outcomes and were robust when poststratifying the samples to be more representative of the general US population. Regional well-being estimation from social media data seems to be robust when supervised data-driven methods are used
(Un)Happiness and voting in U.S. Presidential elections
A rapidly growing literature has attempted to explain Donald Trump's success in the 2016 U.S. presidential election as a result of a wide variety of differences in individual characteristics, attitudes, and social processes. We propose that the economic and psychological processes previously established have in common that they generated or electorally capitalized on unhappiness in the electorate, which emerges as a powerful high-level predictor of the 2016 electoral outcome. Drawing on a large data set covering over 2 million individual surveys, which we aggregated to the county level, we find that low levels of evaluative, experienced, and eudaemonic subjective well-being (SWB) are strongly predictive of Trump's victory, accounting for an exhaustive list of demographic, ideological, and socioeconomic covariates and robustness checks. County-level future life evaluation alone correlates with the Trump vote share over Republican baselines at r = -.78 in the raw data, a magnitude rarely seen in the social sciences. We show similar findings when examining the association between individual-level life satisfaction and Trump voting. Low levels of SWB also predict anti-incumbent voting at the 2012 election, both at the county and individual level. The findings suggest that SWB is a powerful high-level marker of (dis)content and that SWB should be routinely considered alongside economic explanations of electoral choice
The Internet and Participation Inequality: A Multilevel Examination of 108 Countries
This study investigates the role of the Internet in civic participation inequality across 108 countries. Merging individual-level survey data from the 2016 Gallup World Poll with country-level indices, we conduct multilevel analyses to answer three broader sets of questions: (1) Does access to the Internet increase the likelihood of civic participation? (2) Does Internet access amplify or lessen socioeconomic stratification in civic participation? (3) Do press freedom and government intervention as contextual factors shape the role of the Internet in civic participation inequality? The findings suggest that Internet access increases the likelihood of civic participation while it also deepens socioeconomic stratification in participation. Cross-level interactions unveil that the intervening role of the Internet remains unaffected by press freedom, but government intervention through the promotion of ICT use can help control the growing inequality. We discuss the theoretical implications of these findings for political inequality research and the applied global significance