392 research outputs found

    A Sentiment Analysis Model of Spanish Tweets. Case Study: Colombia 2014 Presidential Election

    Abstract. What people say on social media has become a rich source of information for understanding social behavior. Sentiment analysis of Twitter data has been widely used to capture trends in public opinion regarding important events such as political elections. However, current research on social media analysis in political domains faces two major problems: the sentiment analysis methods implemented are often too simple, and most studies have assumed that all users and their tweets are trustworthy. This thesis aims to address these problems in order to achieve more reliable public opinion measurements. The Colombian 2014 presidential election was proposed as the case study. First, research on social spammer detection on Twitter was carried out, following machine learning approaches to distinguish spammer accounts from non-spammer ones. Because of the brevity of tweets and the widespread use of mobile devices, Twitter is also a rich source of noisy data containing many non-standard word forms. Since sentiment analysis exploits the large amount of user-generated text, its performance may drop significantly if several lexical variation phenomena are not dealt with. For that reason, a lexical normalization system for Spanish tweets was developed to improve the quality of natural language analysis, using finite-state transducers and statistical language modeling. Lastly, a sentiment analysis system for Spanish tweets was developed by implementing a supervised classification approach. The system was applied to the Colombian election to infer voting intention. Experimental results highlight the importance of denoising Twitter data to achieve more reliable public opinion measurements. In addition, the results show the potential of social media analysis to infer vote share, obtaining the lowest mean absolute error and correctly ranking the highest-polling candidates in the first-round election. However, such a method cannot be put forward as a substitute for traditional polling.
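    The abstract above evaluates vote-share inference by mean absolute error and by whether the leading candidates are ranked correctly. A minimal sketch of those two checks follows. The candidate names and percentages are hypothetical placeholders, not figures from the thesis, and the helper names are assumptions made for illustration.

```python
def mean_absolute_error(predicted, actual):
    """Average absolute gap, in percentage points, across all candidates."""
    if set(predicted) != set(actual):
        raise ValueError("predicted and actual must cover the same candidates")
    return sum(abs(predicted[c] - actual[c]) for c in actual) / len(actual)


def ranking(shares):
    """Candidates ordered from highest to lowest share."""
    return sorted(shares, key=shares.get, reverse=True)


# Hypothetical vote shares (percent) for three made-up candidates.
# These are NOT results from the thesis.
predicted = {"A": 30.0, "B": 25.0, "C": 45.0}
actual = {"A": 31.0, "B": 26.0, "C": 43.0}

mae = mean_absolute_error(predicted, actual)
same_top_two = ranking(predicted)[:2] == ranking(actual)[:2]
```

    A lower `mae` means the inferred shares sit closer to the official count; `same_top_two` only checks the order of the two leading candidates, which is the kind of ranking claim the abstract makes.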

    Sentiment Analysis of Political Tweets From the 2019 Spanish Elections

    The use of sentiment analysis methods has increased in recent years across a wide range of disciplines. Despite the potential impact of the development of opinions during political elections, few studies have focused on the analysis of sentiment dynamics and their characterization from statistical and mathematical perspectives. In this paper, we apply a set of basic methods to analyze the statistical and temporal dynamics of sentiment analysis on political campaigns and assess their scope and limitations. To this end, we gathered thousands of Twitter messages mentioning political parties and their leaders posted several weeks before and after the 2019 Spanish presidential election. We then followed a twofold analysis strategy: (1) statistical characterization using indices derived from well-known temporal and information metrics and methods –including entropy, mutual information, and the Compounded Aggregated Positivity Index– allowing the estimation of changes in the density function of sentiment data; and (2) feature extraction from nonlinear intrinsic patterns in terms of manifold learning using autoencoders and stochastic embeddings. The results show that both the indices and the manifold features provide an informative characterization of the sentiment dynamics throughout the election period. We found measurable variations in sentiment behavior and polarity across the political parties and their leaders and observed different dynamics depending on the parties’ positions on the political spectrum, their presence at the regional or national levels, and their nationalist or globalist aspirations
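    The abstract above mentions entropy as one of the indices used to characterize sentiment over time. As a rough illustration only, the sketch below computes plain Shannon entropy over discrete sentiment labels in two time windows. The paper's exact index definitions are not given here, so the label set, the windows, and the function name are assumptions.

```python
import math
from collections import Counter


def shannon_entropy(labels):
    """Shannon entropy (in bits) of the empirical label distribution."""
    counts = Counter(labels)
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical sentiment labels for tweets in two time windows.
week_before = ["pos", "neg", "pos", "neg"]  # evenly split opinions
week_after = ["pos", "pos", "pos", "pos"]   # unanimous opinions

entropy_before = shannon_entropy(week_before)
entropy_after = shannon_entropy(week_after)
```

    In this toy case the evenly split window has higher entropy than the unanimous one, which is how a drop in entropy can signal that sentiment toward a party is becoming more homogeneous.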

    Quantifying the impact of Twitter activity in political battlegrounds

    It may be challenging to determine the reach of the information, how well it corresponds with the domain design, and how to utilize it as a communication medium when utilizing social media platforms, notably Twitter, to engage the public in advocating a parliament act, or during a global health emergency. Chapter 3 offers a broad overview of how candidates running in the 2020 US Elections used Twitter as a communication tool to interact with voters. More precisely, it seeks to identify components related to internal collaboration and public participation (in terms of content and stance similarity among the candidates from the same political front and to the official Twitter accounts of their political parties). The 2020 US Presidential and Vice Presidential candidates from the two main political parties, the Republicans and Democrats, are our main subjects. Along with the content similarity, their tweets were assessed for social reach and stance similarity on 22 topics. This study complements previous research on efficiently using social media platforms for election campaigns. Chapter 4 empirically examines the online social associations of the top-10 COVID-19 resilient nations’ leaders and healthcare institutions based on the Bloomberg COVID-19 Resilience Ranking. In order to measure the strength of the online social association in terms of public engagement, sentiment strength, inclusivity and diversity, we used the attributes provided by Twitter Academic Research API, coupled with the tweets of leaders and healthcare organizations from these nations. Understanding how leaders and healthcare organizations may utilize Twitter to establish digital connections with the public during health emergencies is made more accessible by this study. The thesis has proposed methods for efficiently using Twitter in various domains, utilizing the implementations of various Language Models and several data mining and analytics techniques

    Sentiment Analysis for Fake News Detection

    [Abstract] In recent years, we have witnessed a rise in fake news, i.e., provably false pieces of information created with the intention of deception. The dissemination of this type of news poses a serious threat to cohesion and social well-being, since it fosters political polarization and people's distrust of their leaders. The huge amount of news disseminated through social media makes manual verification unfeasible, which has promoted the design and implementation of automatic systems for fake news detection. The creators of fake news use various stylistic tricks to promote the success of their creations, one of which is to excite the sentiments of the recipients. This has led sentiment analysis, the part of text analytics in charge of determining the polarity and strength of sentiments expressed in a text, to be used in fake news detection approaches, either as the basis of the system or as a complementary element. In this article, we study the different uses of sentiment analysis in the detection of fake news, with a discussion of the most relevant elements and shortcomings, and the requirements that should be met in the near future, such as multilingualism, explainability, mitigation of biases, or treatment of multimedia elements.

    Funders: Xunta de Galicia; ED431G 2019/01. Xunta de Galicia; ED431C 2020/11.

    This work has been funded by FEDER/Ministerio de Ciencia, Innovación y Universidades — Agencia Estatal de Investigación through the ANSWERASAP project (TIN2017-85160-C2-1-R); and by Xunta de Galicia through a Competitive Reference Group grant (ED431C 2020/11). CITIC, as Research Center of the Galician University System, is funded by the Consellería de Educación, Universidade e Formación Profesional of the Xunta de Galicia through the European Regional Development Fund (ERDF/FEDER) with 80%, the Galicia ERDF 2014-20 Operational Programme, and the remaining 20% from the Secretaría Xeral de Universidades (ref. ED431G 2019/01). David Vilares is also supported by a 2020 Leonardo Grant for Researchers and Cultural Creators from the BBVA Foundation. Carlos Gómez-Rodríguez has also received funding from the European Research Council (ERC), under the European Union’s Horizon 2020 research and innovation programme (FASTPARSE, grant No. 714150

    Can Machines Learn to Detect Fake News? A Survey Focused on Social Media

    Using a systematic literature review method, in this work we searched classical electronic libraries to find the most recent papers related to fake news detection on social media. Our aim is to map the state of the art of fake news detection, define fake news, and find the most useful machine learning technique for detecting it. We concluded that the most used method for automatic fake news detection is not a single classical machine learning technique, but rather an amalgamation of classic techniques coordinated by a neural network. We also identified a need for a domain ontology that would unify the different terminology and definitions of the fake news domain. This lack of consensual information may mislead opinions and conclusions

    The Potential of Social Media Intelligence to Improve People's Lives: Social Media Data for Good

    In this report, developed with support from Facebook, we focus on an approach to extract public value from social media data that we believe holds the greatest potential: data collaboratives. Data collaboratives are an emerging form of public-private partnership in which actors from different sectors exchange information to create new public value. Such collaborative arrangements, for example between social media companies and humanitarian organizations or civil society actors, can be seen as possible templates for leveraging privately held data towards the attainment of public goals

    Twitter and social bots : an analysis of the 2021 Canadian election

    Social media have now become essential communication tools, including within the context of electoral campaigns. However, the prevalence of online communication platforms has raised concerns in Western democracies about the risks of voter manipulation, particularly through social bot accounts. Social bots are automated accounts which can be used to produce or amplify online content while posing as authentic users. Some studies, mostly focused on the case of the United States, have analyzed the propagation of disinformation content by social bots during electoral periods, while others have also examined the role of partisanship in social bots’ behaviors and tactics. However, the question of whether social bots’ partisan leaning affects the amount of political disinformation content they propagate remains unanswered. Therefore, the main goal of this study is to determine whether partisan differences can be observed in (i) the number of active social bots during the 2021 Canadian election campaign, (ii) their interactions with real accounts, and (iii) the amount of disinformation content they propagated. To reach this research objective, this master’s thesis relies on an original Twitter dataset of more than 11.3 million English tweets from roughly 1.1 million distinct users, as well as diverse models to distinguish between social bot and human accounts, determine the partisan leaning of users, and detect political disinformation content. Based on these distinct methods, the results indicate limited differences in the behavior of social bots in the 2021 federal election. It was nevertheless possible to observe that conservative-leaning social bots were more numerous than their liberal-leaning counterparts, but that liberal-leaning social bots were those that interacted the most with authentic accounts through retweets and direct replies, and that propagated the most disinformation content

    Revista Mediterránea de Comunicación. Vol. 11, n. 2 (2020)


    Society, History and Education: dialogues from a disciplinary perspective

    We are pleased to present to the entire academic community and the general public the following work, which contains recent research results of a group of teachers from the Faculty of Educational Sciences in areas such as pedagogy, communication, technology and history. This publication is the result of the work coordinated by the Vice Rector's Office for Research, Innovation and Extension of the UTP, with the support of the Faculty of Education Sciences, through the realization of the First Conference on Social Appropriation of Knowledge held in 2022, in order to reach a wider field of dissemination of local research. In the first chapter "Sentiment analysis on Twitter about mobile learning" by professors Rosa María Guilleumas García and Hernán Gil Ramírez, a study of tweets about mobile learning is presented. To do so, the authors combined several techniques of social network analysis, text mining and sentiment analysis, using NodeXL software, specialized in network examination and visualization. Among their results, they highlight the great predominance of positive tweets over negative ones in this field of study, and at the same time point out that, in the analyzed tweets, the most used words were learning, mobile, app, machine, mlearning and education. In second place, there is the chapter "The institutional educational project. An opportunity for reflection and transformation of the Colombian university" by teachers Martha Cecilia Gutiérrez Giraldo and Carolina Franco Ossa, which arises from the reflection on university autonomy and its internal exercise in the construction of its Institutional Educational Projects (PEI). 
Thus, the main purpose of this work is to identify the relevant facts that have marked the academic life of the UTP since its creation in 1958 until 2015, in which the different strata of the university community (teachers, students, administrators, managers, graduates and the social sector) participated through a participatory action research process. The results show that since its creation, the University has updated its academic and management policies in accordance with the regulations in force in each period, and that, at the same time, as mentioned by the authors, the institutional processes of self-reflection and projection of the academic life of the UTP should be strengthened through the culture of academic and democratic participation, supported by the collaborative work of the university community.

    CONTENT
    Presentation
    CHAPTER ONE. Twitter sentiment analysis on mobile learning. Rosa María Guilleumas García and Hernán Gil Ramírez
    CHAPTER TWO. The institutional educational project. An opportunity for reflection and transformation of the Colombian university. Martha Cecilia Gutiérrez Giraldo and Carolina Franco Ossa
    CHAPTER THREE. The Ridway's photographies: typologies of portraiture in Pereira, Colombia. Johana Guarín Medina
    CHAPTER FOUR. Information society. Political disputes and disciplinary openings. Andrés Camilo Agudelo Vergara
    CHAPTER FIVE. State and internal borders in the 19th century: the Quindío mountain in central western Colombia

    Automatic information search for countering covid-19 misinformation through semantic similarity

    Master's thesis in Bioinformatics and Computational Biology. Information quality in social media is an increasingly important issue, and the misinformation problem has become even more critical in the current COVID-19 pandemic, leaving people exposed to false and potentially harmful claims and rumours. Civil society organizations, such as the World Health Organization, have issued a global call for action to promote access to health information and mitigate harm from health misinformation. Consequently, this project seeks to counter the spread of the COVID-19 infodemic and its potential health hazards. In this work, we give an overview of the models and methods that have been employed in the NLP field, from its foundations to the latest state-of-the-art approaches. Focusing on deep learning methods, we propose applying multilingual Transformer models based on siamese networks, also called bi-encoders, combined with ensemble and PCA dimensionality reduction techniques. The goal is to counter COVID-19 misinformation by analyzing the semantic similarity between a claim and tweets from a collection gathered from official fact-checkers verified by the International Fact-Checking Network of the Poynter Institute. The number of Internet users increases every year, and the language a person speaks determines their access to information online. For this reason, we make a special effort to apply multilingual models to tackle misinformation across the globe. Regarding semantic similarity, we first evaluate these multilingual ensemble models and improve the result on the STS-Benchmark compared to monolingual and single models. Second, we enhance the interpretability of the models’ performance through the SentEval toolkit. Lastly, we compare these models’ performance against biomedical models in TREC-COVID task round 1, using the BM25 Okapi ranking method as the baseline. Moreover, we are interested in understanding the ins and outs of misinformation. 
For that purpose, we extend interpretability using machine learning and deep learning approaches for sentiment analysis and topic modelling. Finally, we developed a dashboard to ease visualization of the results. In our view, the results obtained in this project constitute an excellent initial step toward incorporating multilingualism and will assist researchers and people in countering COVID-19 misinformation
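    The abstract above describes matching a claim against fact-checker tweets by semantic similarity between bi-encoder embeddings. The sketch below shows only the ranking step, assuming cosine similarity as the score (the abstract does not name the measure). The vectors are tiny hand-made placeholders standing in for model embeddings, and the function names are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_by_similarity(claim_vec, tweet_vecs):
    """Return (tweet_id, score) pairs sorted from most to least similar to the claim."""
    scored = [(tweet_id, cosine_similarity(claim_vec, vec))
              for tweet_id, vec in tweet_vecs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Placeholder 3-dimensional "embeddings"; a real bi-encoder would produce
# vectors with hundreds of dimensions from the claim and tweet texts.
claim = [1.0, 0.0, 1.0]
tweets = {
    "t1": [1.0, 0.0, 1.0],
    "t2": [0.0, 1.0, 0.0],
    "t3": [1.0, 1.0, 0.0],
}

ranked = rank_by_similarity(claim, tweets)
```

    Because a bi-encoder embeds the claim and each tweet independently, the tweet vectors can be computed once and reused for every new claim, leaving only this inexpensive scoring and sorting step at query time.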