3 research outputs found

    Analyzing Polarization on Social Media: A Case Study of the 2022 Brazil Presidential Election

    Project Work presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Data Science. Social media has become a big part of our society and now plays a significant role in relationships between and within communities. Twitter is now an important communication platform for political campaigns: in recent years, politicians, campaigners, and general users have been using Twitter extensively to promote campaigns and engage in political discussions. Some studies argue that social media can create filter bubbles by limiting the flow of online information, thereby creating communities where exposure to political diversity is rare. This selective exposure can build echo chambers where individuals interact only with those who share their opinions, and in doing so they build a polarized community. Identifying, understanding, and mitigating polarization is very important for the democratic process. People should be exposed to different ideas and opinions so they can choose their representatives without being influenced by only a portion of the information. This project analyzed political polarization on social media using data from Twitter. Brazil’s presidential election in 2022 was used as a case study. Tweets from the two main candidates were extracted. A Topic Modeling algorithm was used to cluster tweets into topics. An Engagement Graph was built based on the interactions between users, candidates, and topics and was used to compute the Topic Centrality measures. A pre-trained Sentiment Analysis model was used to measure the sentiment polarity of each tweet. In the end, the project analyzed the extracted features and identified which topics were more central to each candidate and how users interact with them.
The major conclusion of this work is that polarization in Brazil is more affective than ideological, since users’ sentiments towards topics are not as relevant as their sentiments towards the candidates.
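The engagement-graph step in the abstract above can be illustrated with a minimal sketch. The abstract does not say which centrality measures were used, so degree centrality is swapped in here as an assumption. The interaction records and the names `build_graph` and `degree_centrality` are hypothetical and not taken from the project.

```python
from collections import defaultdict

# Hypothetical (user, candidate, topic) interactions, invented for illustration.
interactions = [
    ("user_a", "candidate_1", "economy"),
    ("user_b", "candidate_1", "economy"),
    ("user_b", "candidate_1", "health"),
    ("user_c", "candidate_2", "economy"),
    ("user_c", "candidate_2", "security"),
]


def build_graph(rows):
    """Build an undirected adjacency map linking users, candidates, and topics."""
    adj = defaultdict(set)
    for user, candidate, topic in rows:
        for a, b in ((user, topic), (candidate, topic), (user, candidate)):
            adj[a].add(b)
            adj[b].add(a)
    return adj


def degree_centrality(adj, node):
    """Fraction of the other nodes that `node` is directly connected to."""
    return len(adj[node]) / (len(adj) - 1)


graph = build_graph(interactions)
topics = {topic for _, _, topic in interactions}
# Assumption: degree centrality stands in for the unspecified topic centrality measures.
topic_centrality = {t: degree_centrality(graph, t) for t in topics}
```

In this toy graph the topic connected to both candidates and all three users ranks as the most central; a real analysis would likely weight edges by interaction counts.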

    Expressions of psychological stress on Twitter: detection and characterisation

    A thesis submitted in partial fulfilment of the requirements of the University of Wolverhampton for the degree of Doctor of Philosophy. Long-term psychological stress is a significant predictive factor for individual mental health and short-term stress is a useful indicator of an immediate problem. Traditional psychology studies have relied on surveys to understand reasons for stress in general and in specific contexts. The popularity and ubiquity of social media make it a potential data source for identifying and characterising aspects of stress. Previous studies of stress in social media have focused on users responding to stressful personal life events. However, prior social media research has not explored expressions of stress in other important domains, including travel and politics. This thesis detects and analyses expressions of psychological stress in social media. So far, TensiStrength is the only existing lexicon for stress and relaxation scores in social media. Using a word-vector based word sense disambiguation method, the TensiStrength lexicon was modified to include the stress scores of the different senses of the same word. On a dataset of 1000 tweets containing ambiguous stress-related words, the accuracy of the modified TensiStrength increased by 4.3%. This thesis also finds and reports characteristics of a multiple-domain stress dataset of 12000 tweets, 3000 each for airlines, personal events, UK politics, and London traffic. A two-step method for identifying stressors in tweets was implemented. The first step used LDA topic modelling and k-means clustering to find a set of types of stressors (e.g., delay, accident). In the second step, three word-vector based methods (maximum-word similarity, context-vector similarity, and cluster-vector similarity) were used to detect the stressors in each tweet.
The cluster-vector similarity method was found to identify the stressors in tweets in all four domains better than machine learning classifiers, based on the performance metrics of accuracy, precision, recall, and f-measure. Swearing and sarcasm were also analysed in high-stress and no-stress datasets from the four domains using a Convolutional Neural Network and Multilayer Perceptron, respectively. The presence of swearing and sarcasm was higher in the high-stress tweets compared to no-stress tweets in all the domains. The stressors in each domain with higher percentages of swearing or sarcasm were identified. Furthermore, the distribution of the temporal classes (past, present, future, and atemporal) in high-stress tweets was found using an ensemble classifier. The distribution depended on the domain and the stressors. This study contributes a modified and improved lexicon for the identification of stress scores in social media texts. The two-step method to identify stressors follows a general framework that can be used for domains other than those which were studied. The presence of swearing, sarcasm, and the temporal classes of high-stress tweets belonging to different domains are found and compared to the findings from traditional psychology, for the first time. The algorithms and knowledge may be useful for travel, political, and personal life systems that need to identify stressful events in order to take appropriate action. Funded by the European Union's Horizon 2020 research and innovation programme under grant agreement No 636160-2, the Optimum project (www.optimumproject.eu).
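As a rough illustration of the cluster-vector similarity idea named in the abstract above, the sketch below assigns a tweet to the stressor type whose cluster centroid is closest in cosine similarity to the tweet's mean word vector. This is one plausible reading, not the thesis's confirmed definition. The two-dimensional vectors, the cluster contents, and the function names are hypothetical toy values.

```python
import math

# Hypothetical 2-D word vectors; real work would use trained embeddings.
vectors = {
    "delay": (1.0, 0.1),
    "late": (0.9, 0.2),
    "crash": (0.1, 1.0),
    "accident": (0.2, 0.9),
    "train": (0.5, 0.5),
}

# Hypothetical stressor types and the words clustered under each.
clusters = {"delay": ["delay", "late"], "accident": ["crash", "accident"]}


def mean_vector(words, vectors):
    """Average the vectors of the known words; None if no word is known."""
    known = [vectors[w] for w in words if w in vectors]
    if not known:
        return None
    dims = len(known[0])
    return tuple(sum(v[i] for v in known) / len(known) for i in range(dims))


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def detect_stressor(tweet_words, clusters, vectors):
    """Return the stressor type whose centroid is most similar to the tweet."""
    tweet_vec = mean_vector(tweet_words, vectors)
    if tweet_vec is None:
        return None
    # Assumes every cluster contains at least one word with a known vector.
    scores = {
        name: cosine(tweet_vec, mean_vector(words, vectors))
        for name, words in clusters.items()
    }
    return max(scores, key=scores.get)
```

Calling `detect_stressor(["train", "late"], clusters, vectors)` picks the delay cluster for these toy vectors, while a tweet with no known words yields `None`.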