37 research outputs found

    Women are warmer but no less assertive than men: gender and language on Facebook

    Using a large social media dataset and open-vocabulary methods from computational linguistics, we explored differences in language use across gender, affiliation, and assertiveness. In Study 1, we analyzed topics (groups of semantically similar words) across 10 million messages from over 52,000 Facebook users. Most language differed little across gender. However, topics most associated with self-identified female participants included friends, family, and social life, whereas topics most associated with self-identified male participants included swearing, anger, discussion of objects instead of people, and the use of argumentative language. In Study 2, we plotted male- and female-linked language topics along two interpersonal dimensions prevalent in gender research: affiliation and assertiveness. In a sample of over 15,000 Facebook users, we found substantial gender differences in the use of affiliative language and slight differences in assertive language. Language used more by self-identified females was interpersonally warmer, more compassionate, more polite, and, contrary to previous findings, slightly more assertive, whereas language used more by self-identified males was colder, more hostile, and more impersonal. Computational linguistic analysis, combined with methods to automatically label topics, offers a means of testing psychological theories unobtrusively at large scale. This work was supported by the Templeton Religion Trust.

    Personality, gender, and age in the language of social media: the open-vocabulary approach

    We analyzed 700 million words, phrases, and topic instances collected from the Facebook messages of 75,000 volunteers, who also took standard personality tests, and found striking variations in language with personality, gender, and age. In our open-vocabulary technique, the data itself drives a comprehensive exploration of language that distinguishes people, finding connections that are not captured with traditional closed-vocabulary word-category analyses. Our analyses shed new light on psychosocial processes, yielding results that are face valid (e.g., subjects living in high elevations talk about the mountains), tie in with other research (e.g., neurotic people disproportionately use the phrase ‘sick of’ and the word ‘depressed’), suggest new hypotheses (e.g., an active life implies emotional stability), and give detailed insights (males use the possessive ‘my’ when mentioning their ‘wife’ or ‘girlfriend’ more often than females use ‘my’ with ‘husband’ or ‘boyfriend’). To date, this represents the largest study, by an order of magnitude, of language and personality.

    Methodological developments in violence research

    For decades, violence research has relied on interviews with victims and perpetrators, on participant observation, and on survey methods, and most studies have focused on either qualitative or quantitative analytic strategies. Since the turn of the millennium, researchers have been able to draw on a range of new approaches: video data of violent incidents are increasingly available, triangulation and mixed methods approaches are becoming ever more sophisticated, and the computational social sciences are introducing big data analysis to more and more research fields. These three developments hold great potential for quantitative and qualitative violence research. This paper discusses video data analysis, mixed methods, and big data in the context of current and future violence research. It focuses on (1) the potential and challenges of new video data for studying violence; (2) the role of triangulation and mixed methods in enabling more comprehensive violence research from multiple theoretical perspectives; and (3) what potential uses of big data and computational social science in violence research may look like.

    Estimating geographic subjective well-being from Twitter: A comparison of dictionary and data-driven language methods

    Researchers and policy makers worldwide are interested in measuring the subjective well-being of populations. When users post on social media, they leave behind digital traces that reflect their thoughts and feelings. Aggregation of such digital traces may make it possible to monitor well-being at large scale. However, social media-based methods need to be robust to regional effects if they are to produce reliable estimates. Using a sample of 1.53 billion geotagged English tweets, we provide a systematic evaluation of word-level and data-driven methods for text analysis for generating well-being estimates for 1,208 US counties. We compared Twitter-based county-level estimates with well-being measurements provided by the Gallup-Sharecare Well-Being Index survey through 1.73 million phone surveys. We find that word-level methods (e.g., Linguistic Inquiry and Word Count [LIWC] 2015 and Language Assessment by Mechanical Turk [LabMT]) yielded inconsistent county-level well-being measurements due to regional, cultural, and socioeconomic differences in language use. However, removing as few as three of the most frequent words led to notable improvements in well-being prediction. Data-driven methods provided robust estimates, approximating the Gallup data at up to r = 0.64. We show that the findings generalized to county socioeconomic and health outcomes and were robust when poststratifying the samples to be more representative of the general US population. Regional well-being estimation from social media data seems to be robust when supervised data-driven methods are used.

    (Un)Happiness and voting in U.S. Presidential elections

    A rapidly growing literature has attempted to explain Donald Trump's success in the 2016 U.S. presidential election as a result of a wide variety of differences in individual characteristics, attitudes, and social processes. We propose that the economic and psychological processes previously established have in common that they generated or electorally capitalized on unhappiness in the electorate, which emerges as a powerful high-level predictor of the 2016 electoral outcome. Drawing on a large data set covering over 2 million individual surveys, which we aggregated to the county level, we find that low levels of evaluative, experienced, and eudaemonic subjective well-being (SWB) are strongly predictive of Trump's victory, accounting for an exhaustive list of demographic, ideological, and socioeconomic covariates and robustness checks. County-level future life evaluation alone correlates with the Trump vote share over Republican baselines at r = -.78 in the raw data, a magnitude rarely seen in the social sciences. We show similar findings when examining the association between individual-level life satisfaction and Trump voting. Low levels of SWB also predict anti-incumbent voting in the 2012 election, at both the county and individual level. The findings suggest that SWB is a powerful high-level marker of (dis)content and that SWB should be routinely considered alongside economic explanations of electoral choice.