102 research outputs found

    Concept-Guided Chain-of-Thought Prompting for Pairwise Comparison Scaling of Texts with Large Language Models

    Existing text scaling methods often require a large corpus, struggle with short texts, or require labeled data. We develop a text scaling method that leverages the pattern recognition capabilities of generative large language models (LLMs). Specifically, we propose concept-guided chain-of-thought (CGCoT), which uses prompts designed to summarize ideas and identify target parties in texts to generate concept-specific breakdowns, in many ways similar to guidance for human coder content analysis. CGCoT effectively shifts pairwise text comparisons from a reasoning problem to a pattern recognition problem. We then pairwise compare concept-specific breakdowns using an LLM. We use the results of these pairwise comparisons to estimate a scale using the Bradley-Terry model. We use this approach to scale affective speech on Twitter. Our measures correlate more strongly with human judgments than alternative approaches like Wordfish. Besides a small set of pilot data to develop the CGCoT prompts, our measures require no additional labeled data and produce binary predictions comparable to a RoBERTa-Large model fine-tuned on thousands of human-labeled tweets. We demonstrate how combining substantive knowledge with LLMs can create state-of-the-art measures of abstract concepts. Comment: 26 pages, 2 figures
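
    The Bradley-Terry step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the pairwise comparisons are available as (winner, loser) index pairs and fits the strengths with a standard minorization-maximization update; the function name `fit_bradley_terry`, its parameters, and the toy comparisons are hypothetical and are not taken from the paper.

```python
def fit_bradley_terry(n_items, comparisons, n_iter=200, tol=1e-10):
    """Estimate Bradley-Terry strengths from (winner, loser) index pairs.

    Assumption: every item wins at least once, otherwise its
    maximum-likelihood strength collapses to zero.
    """
    wins = [0.0] * n_items
    pair_counts = {}  # (i, j) with i < j -> number of comparisons between i and j
    for winner, loser in comparisons:
        wins[winner] += 1
        key = (min(winner, loser), max(winner, loser))
        pair_counts[key] = pair_counts.get(key, 0) + 1

    strengths = [1.0] * n_items
    for _ in range(n_iter):
        denom = [0.0] * n_items
        for (i, j), n in pair_counts.items():
            shared = n / (strengths[i] + strengths[j])
            denom[i] += shared
            denom[j] += shared
        # MM update: wins divided by the expected-exposure term.
        updated = [
            wins[i] / denom[i] if denom[i] > 0 else strengths[i]
            for i in range(n_items)
        ]
        # Strengths are only identified up to scale, so fix their mean at 1.
        total = sum(updated)
        updated = [s * n_items / total for s in updated]
        delta = max(abs(a - b) for a, b in zip(updated, strengths))
        strengths = updated
        if delta < tol:
            break
    return strengths


# Made-up comparisons among three texts (indices 0, 1, 2).
toy_comparisons = [
    (0, 1), (0, 1), (1, 0),
    (1, 2), (1, 2), (2, 1),
    (0, 2), (0, 2), (0, 2), (2, 0),
]
toy_strengths = fit_bradley_terry(3, toy_comparisons)
```

    A larger strength means the text tends to win its comparisons, so sorting texts by strength gives the estimated scale. Under the model, text i beats text j with probability strength_i / (strength_i + strength_j).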

    Dictionary-Assisted Supervised Contrastive Learning

    Text analysis in the social sciences often involves using specialized dictionaries to reason with abstract concepts, such as perceptions about the economy or abuse on social media. These dictionaries allow researchers to impart domain knowledge and note subtle usages of words relating to a concept(s) of interest. We introduce the dictionary-assisted supervised contrastive learning (DASCL) objective, allowing researchers to leverage specialized dictionaries when fine-tuning pretrained language models. The text is first keyword simplified: a common, fixed token replaces any word in the corpus that appears in the dictionary(ies) relevant to the concept of interest. During fine-tuning, a supervised contrastive objective draws closer the embeddings of the original and keyword-simplified texts of the same class while pushing further apart the embeddings of different classes. The keyword-simplified texts of the same class are more textually similar than their original text counterparts, which additionally draws the embeddings of the same class closer together. Combining DASCL and cross-entropy improves classification performance metrics in few-shot learning settings and social science applications compared to using cross-entropy alone and alternative contrastive and data augmentation methods. Comment: 6 pages, 5 figures, EMNLP 202
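
    The keyword-simplification step described in the abstract above can be sketched as follows. This is only an illustration of replacing dictionary words with one fixed token; the function name `keyword_simplify`, the `<economy>` token, the whole-word and case-insensitive matching, and the two-word dictionary are assumptions, not the authors' implementation.

```python
import re


def keyword_simplify(text, dictionary, token="<concept>"):
    """Replace every whole-word dictionary match in `text` with `token`."""
    if not dictionary:
        return text
    # Longest entries first so multi-word entries win over their prefixes.
    alternatives = sorted((re.escape(w) for w in dictionary), key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(alternatives) + r")\b", flags=re.IGNORECASE
    )
    return pattern.sub(token, text)


# Hypothetical two-word economy dictionary, for illustration only.
economy_words = {"unemployment", "inflation"}
simplified = keyword_simplify(
    "Unemployment and inflation worry voters.", economy_words, token="<economy>"
)
```

    In the setup the abstract describes, each original text and its simplified counterpart would then be paired during fine-tuning; the contrastive objective itself is not shown here.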

    Don’t Republicans tweet too? Using Twitter to assess the consequences of political endorsements by celebrities

    Michael Jordan supposedly justified his decision to stay out of politics by noting that Republicans buy sneakers too. In the social media era, the name of the game for celebrities is engagement with fans. So why then do celebrities risk talking about politics on social media, which is likely to antagonize a portion of their fan base? With this question in mind, we analyze approximately 220,000 tweets from 83 celebrities who chose to endorse a presidential candidate in the 2016 U.S. presidential election campaign to assess whether there is a cost, defined in terms of engagement on Twitter, for celebrities who discuss presidential candidates. We also examine whether celebrities behave similarly to other campaign surrogates in being more likely to take on the “attack dog” role by going negative more often than going positive. More specifically, we document how often celebrities of distinct political preferences tweet about Donald Trump, Bernie Sanders, and Hillary Clinton, and we show that followers of opinionated celebrities do not withhold engagement when entertainers become politically mobilized, and that these celebrities do indeed often go negative. Interestingly, in some cases political content from celebrities actually turns out to be more popular than typical lifestyle tweets.

    Of Echo Chambers and Contrarian Clubs: Exposure to Political Disagreement Among German and Italian Users of Twitter

    Scholars have debated whether social media platforms, by allowing users to select the information to which they are exposed, may lead people to isolate themselves from viewpoints with which they disagree, thereby serving as political “echo chambers.” We investigate hypotheses concerning the circumstances under which Twitter users who communicate about elections would engage with (a) supportive, (b) oppositional, and (c) mixed political networks. Based on online surveys of representative samples of Italian and German individuals who posted at least one Twitter message about elections in 2013, we find substantial differences in the extent to which social media facilitates exposure to similar versus dissimilar political views. Our results suggest that exposure to supportive, oppositional, or mixed political networks on social media can be explained by broader patterns of political conversation (i.e., structure of offline networks) and specific habits in the political use of social media (i.e., the intensity of political discussion). These findings suggest that disagreement persists on social media even when ideological homophily is the modal outcome, and that scholars should pay more attention to specific situational and dispositional factors when evaluating the implications of social media for political communication.

    The critical periphery in the growth of social protests

    Social media have provided instrumental means of communication in many recent political protests. The efficiency of online networks in disseminating timely information has been praised by many commentators; at the same time, users are often derided as “slacktivists” because of the shallow commitment involved in clicking a forwarding button. Here we consider the role of these peripheral online participants, the immense majority of users who surround the small epicenter of protests, representing layers of diminishing online activity around the committed minority. We analyze three datasets tracking protest communication in different languages and political contexts through the social media platform Twitter and employ a network decomposition technique to examine their hierarchical structure. We provide consistent evidence that peripheral participants are critical in increasing the reach of protest messages and generating online content at levels that are comparable to core participants. Although committed minorities may constitute the heart of protest movements, our results suggest that their success in maximizing the number of online citizens exposed to protest messages depends, at least in part, on activating the critical periphery. Peripheral users are less active on a per capita basis, but their power lies in their numbers: their aggregate contribution to the spread of protest messages is comparable in magnitude to that of core participants. An analysis of two other datasets unrelated to mass protests strengthens our interpretation that core-periphery dynamics are characteristically important in the context of collective action events. Theoretical models of diffusion in social networks would benefit from increased attention to the role of peripheral nodes in the propagation of information and behavior.

    Direct Observation of Ultrafast Exciton Formation in a Monolayer of WSe2

    Many of the fundamental optical and electronic properties of atomically thin transition metal dichalcogenides are dominated by strong Coulomb interactions between electrons and holes, forming tightly bound atom-like states called excitons. Here, we directly trace the ultrafast formation of excitons by monitoring the absolute densities of bound and unbound electron-hole pairs in single monolayers of WSe2 on a diamond substrate following femtosecond nonresonant optical excitation. To this end, phase-locked mid-infrared probe pulses and field-sensitive electro-optic sampling are used to map out the full complex-valued optical conductivity of the nonequilibrium system and to discern the hallmark low-energy responses of bound and unbound pairs. While the spectral shape of the infrared response immediately after above-bandgap injection is dominated by free charge carriers, up to 60% of the electron-hole pairs are bound into excitons already on a subpicosecond time scale, evidencing extremely fast and efficient exciton formation. During the subsequent recombination phase, we still find a large density of free carriers in addition to excitons, indicating a nonequilibrium state of the photoexcited electron-hole system.

    Exploiting the ANN Potential in Estimating Snow Depth and Snow Water Equivalent From the Airborne SnowSAR Data at X- and Ku-Bands

    Within the framework of European Space Agency (ESA) activities, several campaigns were carried out in the last decade with the purpose of exploiting the capabilities of multifrequency synthetic aperture radar (SAR) data to retrieve snow information. This article presents the results obtained from the ESA SnowSAR airborne campaigns, carried out between 2011 and 2013 on boreal forest, tundra and alpine environments, selected as representative of different snow regimes. The aim of this study was to assess the capability of X- and Ku-band SAR in retrieving the snow parameters, namely snow depth (SD) and snow water equivalent (SWE). The retrieval was based on machine learning (ML) techniques and, in particular, on artificial neural networks (ANNs). ANNs were selected among other ML approaches because they offer a good compromise between retrieval accuracy and computational cost. Two approaches were evaluated, the first based on the experimental data (data-driven) and the second based on data simulated by the dense medium radiative transfer (DMRT) model. The data-driven algorithm was trained on half of the SnowSAR dataset and validated on the remaining half. The validation resulted in a correlation coefficient R ≃ 0.77 between estimated and target SD, a root-mean-square error (RMSE) ≃ 13 cm, and bias = 0.03 cm. ANN algorithms specific to each test site were also implemented, obtaining more accurate results, and the robustness of the data-driven approach was evaluated over time and space. The algorithm trained with DMRT simulations and tested on the experimental dataset was able to estimate the target parameter (SWE in this case) with R = 0.74, RMSE = 34.8 mm, and bias = 1.8 mm. The model-driven approach had the twofold advantage of reducing the amount of in situ data required for training the algorithm and of extending the algorithm exportability to other test sites.
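
    The three validation scores quoted in the abstract above (correlation coefficient R, RMSE, and bias) can be computed as in the sketch below. It assumes bias is defined as the mean of estimated minus target values, which the abstract does not state; the function name `retrieval_metrics` and the sample snow-depth values are made up for illustration and are not SnowSAR data.

```python
import math


def retrieval_metrics(estimated, target):
    """Return (R, RMSE, bias) for paired estimated and target values.

    Assumption: bias = mean(estimated) - mean(target).
    """
    n = len(target)
    mean_e = sum(estimated) / n
    mean_t = sum(target) / n
    cov = sum((e - mean_e) * (t - mean_t) for e, t in zip(estimated, target))
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    var_t = sum((t - mean_t) ** 2 for t in target)
    r = cov / math.sqrt(var_e * var_t)  # Pearson correlation coefficient
    rmse = math.sqrt(sum((e - t) ** 2 for e, t in zip(estimated, target)) / n)
    bias = mean_e - mean_t
    return r, rmse, bias


# Made-up snow depths in cm: every estimate is 2 cm too high.
target_sd = [10.0, 20.0, 30.0, 40.0]
estimated_sd = [12.0, 22.0, 32.0, 42.0]
r, rmse, bias = retrieval_metrics(estimated_sd, target_sd)
```

    In this toy case a constant offset leaves R at 1 while RMSE and bias both equal the offset, which shows why the three scores are reported together.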
