    Beyond Left and Right: Real-World Political Polarization in Twitter Discussions on Inter-Ethnic Conflicts

    Studies of political polarization in social media demonstrate mixed evidence for whether discussions necessarily evolve into left and right ideological echo chambers. Recent research shows that, for political and issue-based discussions, patterns of user clusterization may differ significantly, but cross-cultural evidence of the polarization of users on particular issues is close to non-existent. Furthermore, most studies have relied on network proxies to detect user groupings, rarely taking into account the content of the Tweets themselves. Our contribution to this scholarly discussion is founded upon the detection of polarization based on attitudes towards political actors expressed by users in Germany, the USA and Russia within discussions on inter-ethnic conflicts. For this exploratory study, we develop a mixed-method approach to detecting user grouping that includes: crawling for data collection; expert coding of Tweets; user clusterization based on user attitudes; construction of word frequency vocabularies; and graph visualization. Our results show that, in all three cases, the groups detected are far from being conventionally left or right; rather, their views combine anti-institutionalism, nationalism, and pro- and anti-minority views in varying degrees. In addition, more than two threads of political debate may co-exist in the same discussion. Thus, we show that framing Twitter as either a platform of 'echo chambering' or an 'opinion crossroads' may be misleading. In our opinion, the role of local political context in shaping (and explaining) user clusterization should not be underestimated.
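
    The clusterization step lends itself to a compact illustration. The sketch below is a minimal, hypothetical version of attitude-based user grouping: each user is represented by expert-coded attitude scores towards political actors and then clustered with k-means. The feature names (government, police, ethnic_minority, protesters) and the data are invented for illustration; the study's actual coding scheme and clustering procedure are richer.

```python
# Minimal sketch of attitude-based user clustering, loosely following the
# pipeline described above. Features and data are hypothetical; the study's
# expert coding scheme is more elaborate.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Each row is one user; columns are mean expert-coded attitude scores
# (-1 = negative, 0 = neutral, +1 = positive) toward political actors.
# Feature order: [government, police, ethnic_minority, protesters]
attitudes = np.array([
    [-0.8, -0.6,  0.7,  0.9],   # anti-institutional, pro-minority
    [-0.7, -0.5,  0.6,  0.8],
    [ 0.2,  0.6, -0.8, -0.7],   # pro-institutional, anti-minority
    [ 0.3,  0.7, -0.9, -0.6],
    [-0.9, -0.8, -0.5, -0.4],   # anti-institutional AND anti-minority
    [-0.8, -0.7, -0.6, -0.3],
])

X = StandardScaler().fit_transform(attitudes)

# The number of clusters is deliberately not fixed at two, since more than
# two threads of debate may co-exist in the same discussion.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
for label, row in zip(kmeans.labels_, attitudes):
    print(label, row)
```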

    SentiBench - a benchmark comparison of state-of-the-practice sentiment analysis methods

    In the last few years, thousands of scientific papers have investigated sentiment analysis, several startups that measure opinions on real data have emerged, and a number of innovative products related to this theme have been developed. There are multiple methods for measuring sentiment, including lexicon-based and supervised machine learning methods. Despite the vast interest in the theme and the wide popularity of some methods, it is unclear which one is better for identifying the polarity (i.e., positive or negative) of a message. Accordingly, there is a strong need for a thorough apples-to-apples comparison of sentiment analysis methods, as they are used in practice, across multiple datasets originating from different data sources. Such a comparison is key for understanding the potential limitations, advantages, and disadvantages of popular methods. This article aims to fill this gap by presenting a benchmark comparison of twenty-four popular sentiment analysis methods (which we call the state-of-the-practice methods). Our evaluation is based on a benchmark of eighteen labeled datasets, covering messages posted on social networks, movie and product reviews, as well as opinions and comments in news articles. Our results show that the prediction performance of these methods varies considerably across datasets. To foster development of this research area, we release the methods' code and the datasets used in this article, deploying them in a benchmark system that provides an open API for accessing and comparing sentence-level sentiment analysis methods.
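
    A reduced version of such a benchmark fits in a few lines. The sketch below compares two widely used off-the-shelf lexical methods, VADER and TextBlob, on a tiny labeled sample using macro-F1; the sample is invented, and this is an illustration of the apples-to-apples protocol the article argues for, not the SentiBench system itself.

```python
# Toy apples-to-apples comparison of two off-the-shelf sentiment methods.
# Illustrative sketch only; the labeled sample here is hypothetical.
# pip install vaderSentiment textblob scikit-learn
from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer
from textblob import TextBlob
from sklearn.metrics import f1_score

labeled = [
    ("I love this phone, the battery lasts forever", "positive"),
    ("Worst customer service I have ever experienced", "negative"),
    ("The plot was dull and the acting was worse", "negative"),
    ("What a delightful, well-paced little film", "positive"),
]

vader = SentimentIntensityAnalyzer()

def vader_polarity(text):
    # VADER's compound score lies in [-1, 1]; threshold at 0.
    return "positive" if vader.polarity_scores(text)["compound"] >= 0 else "negative"

def textblob_polarity(text):
    return "positive" if TextBlob(text).sentiment.polarity >= 0 else "negative"

gold = [label for _, label in labeled]
for name, method in [("VADER", vader_polarity), ("TextBlob", textblob_polarity)]:
    pred = [method(text) for text, _ in labeled]
    print(name, f1_score(gold, pred, average="macro"))
```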

    Data science methods for the analysis of controversial social media discussions

    Social media communities like Reddit and Twitter allow users to express their views on topics of interest, and to engage with other users who may share or oppose these views. This can lead to productive discussions towards a consensus, or to contentious debates, where disagreements frequently arise. Prior work on such settings has primarily focused on identifying notable instances of antisocial behavior such as hate speech and "trolling", which represent possible threats to the health of a community. These, however, are exceptionally severe phenomena, and do not encompass controversies stemming from user debates, differences of opinion, and off-topic content, all of which can naturally come up in a discussion without going so far as to compromise its development. This dissertation proposes a framework for the systematic analysis of social media discussions that take place in the presence of controversial themes, disagreements, and mixed opinions from participating users. For this, we develop a feature-based model to describe key elements of a discussion, such as its salient topics, the level of activity from users, the sentiments it expresses, and the user feedback it receives. Initially, we build our feature model to characterize adversarial discussions surrounding political campaigns on Twitter, with a focus on the factual and sentimental nature of their topics and the roles played by the different users involved. We then extend our approach to Reddit discussions, leveraging community feedback signals to define a new notion of controversy and to highlight conversational archetypes that arise from frequent and interesting interaction patterns. We use our feature model to build logistic regression classifiers that can predict future instances of controversy in Reddit communities centered on politics, world news, sports, and personal relationships. Finally, our model also provides the basis for a comparison of different communities in the health domain, where topics and activity vary considerably despite their shared overall focus. In each of these cases, our framework provides insight into how user behavior can shape a community's individual definition of controversy and its overall identity.
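
    The prediction step maps naturally onto code. Below is a minimal sketch of a controversy classifier in the spirit of the logistic regression models described above; the feature names and training data are hypothetical, and the dissertation's feature model is considerably richer.

```python
# Minimal sketch of a controversy classifier over discussion-level features.
# Feature names and data are hypothetical illustrations.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# One row per discussion:
# [n_comments, mean_sentiment (-1..1), share_of_downvotes, n_distinct_users]
X = np.array([
    [250, -0.4, 0.45, 120],   # heated, mixed feedback
    [300, -0.5, 0.50, 150],
    [ 40,  0.6, 0.05,  25],   # calm, positive feedback
    [ 55,  0.5, 0.08,  30],
    [180, -0.2, 0.40,  90],
    [ 35,  0.7, 0.04,  20],
])
y = np.array([1, 1, 0, 0, 1, 0])  # 1 = controversial (community-defined)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=0, stratify=y)

clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("held-out accuracy:", clf.score(X_test, y_test))
# Coefficients hint at which features drive a community's notion of
# controversy (e.g., downvote share vs. raw activity).
print("coefficients:", clf.coef_)
```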

    Information consumption on social media: efficiency, divisiveness, and trust

    Over the last decade, the advent of social media has profoundly changed the way people produce and consume information online. On these platforms, users themselves play a role in selecting the sources from which they consume information, overthrowing traditional journalistic gatekeeping. Moreover, advertisers can target users with news stories using users' personal data. This new model has many advantages: the propagation of news is faster, the number of news sources is large, and the topics covered are diverse. However, in this new model, users are often overloaded with redundant information, and they can get trapped in filter bubbles by consuming divisive and potentially false information. To tackle these concerns, in my thesis I address the following important questions: (i) How efficient are users at selecting their information sources? We define three intuitive notions of users' efficiency in social media: link, in-flow, and delay efficiency. We use these three measures to assess how good users are at selecting who to follow within the social media system in order to most efficiently acquire information. (ii) How can we break the filter bubbles that users get trapped in? Users on social media sites such as Twitter often get trapped in filter bubbles by being exposed to radical, highly partisan, or divisive information. To prevent this, we propose an approach to inject diversity into users' information consumption by identifying non-divisive, yet informative, information. (iii) How can we design an efficient framework for fact-checking? The proliferation of false information is a major problem in social media. To counter it, social media platforms typically rely on expert fact-checkers to detect false news. However, human fact-checkers can realistically only cover a tiny fraction of all stories. It is therefore important to automatically prioritize and select a small number of stories for humans to fact-check. The goals for such prioritization, however, are unclear. We identify three desirable objectives for prioritizing news for fact-checking, based on users' perception of the truthfulness of stories. Our key finding is that these three objectives are incompatible in practice.
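
    One way to make the prioritization problem concrete is a toy scoring rule. The sketch below ranks stories for manual fact-checking by the disagreement in users' truthfulness ratings, on the intuition that contested stories deserve an expert's scarce attention first; this proxy, the data, and all names are hypothetical illustrations, not the thesis's actual objectives.

```python
# Toy story-prioritization sketch for manual fact-checking. The scoring rule
# (rank by disagreement in users' truthfulness ratings) is a hypothetical
# proxy, not one of the thesis's actual objective functions.
import statistics

# story_id -> users' perceived-truthfulness ratings on a 1-5 scale
perceptions = {
    "story_a": [5, 5, 4, 5, 5],        # broadly believed
    "story_b": [1, 5, 2, 5, 1, 4],     # contested
    "story_c": [1, 1, 2, 1, 2],        # broadly disbelieved
}

def disagreement(ratings):
    # Standard deviation as a simple measure of how split users are.
    return statistics.stdev(ratings)

budget = 1  # human fact-checkers can only cover a tiny fraction of stories
ranked = sorted(perceptions, key=lambda s: disagreement(perceptions[s]),
                reverse=True)
print("send to fact-checkers:", ranked[:budget])
```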

    Stylistic variation on the Donald Trump Twitter account: a linguistic analysis of tweets posted between 2009 and 2018

    Twitter was an integral part of Donald Trump's communication platform during his 2016 campaign. Although its topical content has been examined by researchers and the media, we know relatively little about the style of the language used on the account or how this style changed over time. In this study, we present the first detailed description of stylistic variation on the Trump Twitter account, based on a multivariate analysis of grammatical co-occurrence patterns in tweets posted between 2009 and 2018. We identify four general patterns of stylistic variation, which we interpret as representing degrees of conversational, campaigning, engaged, and advisory discourse. We then track how the use of these four styles changed over time, focusing on the period around the campaign, and show that the style of tweets shifts systematically depending on the communicative goals of Trump and his team. Based on these results, we propose a series of hypotheses about how the Trump campaign used social media during the 2016 elections.
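
    The multivariate step can be sketched briefly. Below, a hypothetical tweet-by-feature matrix of normalized grammatical counts is reduced with factor analysis, the usual technique in multidimensional register studies; the feature set, data, and resulting dimensions are illustrative, not the study's actual ones.

```python
# Sketch of the multivariate step: factor analysis over grammatical
# co-occurrence counts, as in multidimensional register analysis.
# The feature set and data are hypothetical.
import numpy as np
from sklearn.decomposition import FactorAnalysis

# Rows: tweets; columns: normalized counts of grammatical features,
# e.g. [1st_person_pronouns, 2nd_person_pronouns, past_tense, imperatives]
X = np.array([
    [0.9, 0.7, 0.1, 0.1],   # conversational-looking
    [0.8, 0.8, 0.2, 0.0],
    [0.1, 0.6, 0.1, 0.9],   # campaigning-looking (imperatives, address)
    [0.2, 0.7, 0.0, 0.8],
    [0.1, 0.1, 0.9, 0.1],   # reporting past events
    [0.0, 0.2, 0.8, 0.2],
])

fa = FactorAnalysis(n_components=2, random_state=0)
scores = fa.fit_transform(X)        # each tweet's position on each dimension
print("factor loadings:\n", fa.components_)
print("tweet scores:\n", scores)
# Averaging scores per month would show how styles shift over time.
```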

    What you say and how you say it: joint modeling of topics and discourse in microblog conversations

    This paper presents an unsupervised framework for jointly modeling topic content and discourse behavior in microblog conversations. Concretely, we propose a neural model to discover word clusters indicating what a conversation concerns (i.e., topics) and those reflecting how participants voice their opinions (i.e., discourse). Extensive experiments show that our model can yield both coherent topics and meaningful discourse behavior. Further study shows that our topic and discourse representations can benefit the classification of microblog messages, especially when they are jointly trained with the classifier.
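
    A crude, non-neural analogue conveys the core idea of separating "what" from "how". The sketch below factorizes content words and discourse-marker words with two NMF models; this is a deliberately swapped-in technique for illustration, not the paper's neural framework, and the posts and marker list are invented.

```python
# Crude non-neural analogue of joint topic/discourse separation: factorize
# content words and discourse-marker words with two NMF models. This is a
# swapped-in illustration, NOT the paper's neural model.
from sklearn.decomposition import NMF
from sklearn.feature_extraction.text import CountVectorizer

posts = [
    "totally agree the new tax plan will help small businesses",
    "i doubt that the tax plan only helps large corporations",
    "agree the match last night was incredible what a goal",
    "no way that goal was offside are you serious",
]
discourse_markers = ["agree", "doubt", "no", "way", "serious", "totally"]

def top_words(model, vocab, k=3):
    return [[vocab[i] for i in comp.argsort()[-k:]] for comp in model.components_]

# "What": content words, with discourse markers stopped out.
content_vec = CountVectorizer(stop_words=discourse_markers)
Xc = content_vec.fit_transform(posts)
topics = NMF(n_components=2, init="nndsvda", random_state=0).fit(Xc)
print("topic clusters:", top_words(topics, content_vec.get_feature_names_out()))

# "How": restrict the vocabulary to discourse markers.
disc_vec = CountVectorizer(vocabulary=discourse_markers)
Xd = disc_vec.fit_transform(posts)
disc = NMF(n_components=2, init="nndsvda", random_state=0).fit(Xd)
print("discourse clusters:", top_words(disc, disc_vec.get_feature_names_out()))
```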
    • 

    corecore