203 research outputs found

    How Polarized Have We Become? A Multimodal Classification of Trump Followers and Clinton Followers

    Full text link
    Polarization in American politics has been extensively documented and analyzed for decades, and the phenomenon became all the more apparent during the 2016 presidential election, where Trump and Clinton depicted two radically different pictures of America. Inspired by this gaping polarization and the extensive utilization of Twitter during the 2016 presidential campaign, in this paper we take the first step in measuring polarization in social media and we attempt to predict individuals' Twitter following behavior through analyzing ones' everyday tweets, profile images and posted pictures. As such, we treat polarization as a classification problem and study to what extent Trump followers and Clinton followers on Twitter can be distinguished, which in turn serves as a metric of polarization in general. We apply LSTM to processing tweet features and we extract visual features using the VGG neural network. Integrating these two sets of features boosts the overall performance. We are able to achieve an accuracy of 69%, suggesting that the high degree of polarization recorded in the literature has started to manifest itself in social media as well.Comment: 16 pages, SocInfo 2017, 9th International Conference on Social Informatic

    Measuring, Predicting and Visualizing Short-Term Change in Word Representation and Usage in VKontakte Social Network

    Full text link
    Language in social media is extremely dynamic: new words emerge, trend and disappear, while the meaning of existing words can fluctuate over time. Such dynamics are especially notable during a period of crisis. This work addresses several important tasks of measuring, visualizing and predicting short term text representation shift, i.e. the change in a word's contextual semantics, and contrasting such shift with surface level word dynamics, or concept drift, observed in social media streams. Unlike previous approaches on learning word representations from text, we study the relationship between short-term concept drift and representation shift on a large social media corpus - VKontakte posts in Russian collected during the Russia-Ukraine crisis in 2014-2015. Our novel contributions include quantitative and qualitative approaches to (1) measure short-term representation shift and contrast it with surface level concept drift; (2) build predictive models to forecast short-term shifts in meaning from previous meaning as well as from concept drift; and (3) visualize short-term representation shift for example keywords to demonstrate the practical use of our approach to discover and track meaning of newly emerging terms in social media. We show that short-term representation shift can be accurately predicted up to several weeks in advance. Our unique approach to modeling and visualizing word representation shifts in social media can be used to explore and characterize specific aspects of the streaming corpus during crisis events and potentially improve other downstream classification tasks including real-time event detection

    Identifying Degree-of-Concern on COVID-19 topics with text classification of Twitters

    Get PDF
    The COVID-19 pandemic has various impacts on changing people’s behavior socially and individually. This study identifies the Degree-of-Concern topic of COVID-19 through citizen conversations on Twitter. It aims to help related parties make policies for developing appropriate emergency response strategies in dealing with changes in people’s behavior due to the pandemic. The object of research is 12,000 data from verified Twitter accounts in Surabaya. The varied nature of Twitter needs to be classified to address specific COVID-19 topics. The first stage of classification is to separate Twitter data into COVID-19 and non-COVID-19. The second stage is to classify the COVID-19 data into seven classes: warnings and suggestions, notification of information, donations, emotional support, seeking help, criticism, and hoaxes. Classification is carried out using a combination of word embedding (Word2Vec and fastText) and deep learning methods (CNN, RNN, and LSTM). The trial was carried out with three scenarios with different numbers of train data for each scenario. The classification results show the highest accuracy is 97.3% and 99.4% for the first and second stage classification obtained from the combination of fastText and LSTM. The results show that the classification of the COVID-19 topic can be used to identify Degree-of-Concern properly. The results of the Degree-of-Concern identification based on the classification can be used as a basis for related parties in making policies to formulate appropriate emergency response strategies in dealing with changes in public behavior due to a pandemic
    • …
    corecore