Modelling of trends in Twitter using retweet graph dynamics
In this paper we model user behaviour in Twitter to capture the emergence of
trending topics. For this purpose, we first extensively analyse tweet datasets
of several different events. In particular, for these datasets, we construct
and investigate the retweet graphs. We find that the retweet graph for a
trending topic has a relatively dense largest connected component (LCC). Next,
based on the insights obtained from the analyses of the datasets, we design a
mathematical model that describes the evolution of a retweet graph by three
main parameters. We then quantify, analytically and by simulation, the
influence of the model parameters on the basic characteristics of the retweet
graph, such as the density of edges and the size and density of the LCC.
Finally, we put the model into practice, estimate its parameters and compare the
resulting behaviour of the model to our datasets.
Comment: 16 pages, 5 figures, presented at WAW 201
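The retweet-graph quantities named in the abstract above (size and density of the LCC) can be illustrated with a minimal sketch. The edge list, the function names, and the density definition (fraction of possible directed edges) are assumptions made for illustration; the paper may define density differently.

```python
from collections import defaultdict


def largest_connected_component(edges):
    """Return the node set of the largest weakly connected component."""
    adj = defaultdict(set)
    for u, v in edges:
        # Ignore edge direction when deciding connectivity.
        adj[u].add(v)
        adj[v].add(u)
    seen, best = set(), set()
    for start in adj:
        if start in seen:
            continue
        comp, stack = {start}, [start]
        while stack:  # iterative depth-first search
            node = stack.pop()
            for nb in adj[node]:
                if nb not in comp:
                    comp.add(nb)
                    stack.append(nb)
        seen |= comp
        if len(comp) > len(best):
            best = comp
    return best


def directed_density(nodes, edges):
    """Distinct edges among `nodes` divided by the n*(n-1) possible directed edges."""
    n = len(nodes)
    if n < 2:
        return 0.0
    inside = {(u, v) for u, v in edges if u in nodes and v in nodes and u != v}
    return len(inside) / (n * (n - 1))


# Hypothetical data: an edge (u, v) means "u retweeted v".
retweets = [("bob", "alice"), ("carol", "alice"), ("carol", "bob"), ("dave", "erin")]
lcc = largest_connected_component(retweets)
lcc_density = directed_density(lcc, retweets)
```

Here the LCC contains alice, bob and carol, and three of its six possible directed edges are present.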
Top Comment or Flop Comment? Predicting and Explaining User Engagement in Online News Discussions
Comment sections below online news articles enjoy growing popularity among
readers. However, the overwhelming number of comments makes it infeasible for
the average news consumer to read all of them and hinders engaging discussions.
Most platforms display comments in chronological order, which neglects that
some of them are more relevant to users and are better conversation starters.
In this paper, we systematically analyze user engagement in the form of the
upvotes and replies that a comment receives. Based on comment texts, we train a
model to distinguish comments that have either a high or low chance of
receiving many upvotes and replies. Our evaluation on user comments from
TheGuardian.com compares recurrent and convolutional neural network models, and
a traditional feature-based classifier. Further, we investigate what makes some
comments more engaging than others. To this end, we identify engagement
triggers and arrange them in a taxonomy. Explanation methods for neural
networks reveal which input words have the strongest influence on our model's
predictions. In addition, we evaluate on a dataset of product reviews, which
exhibit properties similar to those of user comments, such as featuring upvotes for
helpfulness.
Comment: Accepted at the International Conference on Web and Social Media
(ICWSM 2020); 11 pages; code and data are available at
https://hpi.de/naumann/projects/repeatability/text-mining.htm
Data science methods for the analysis of controversial social media discussions
Social media communities like Reddit and Twitter allow users to express their views on topics that interest them, and to engage with other users who may share or oppose these views. This can lead to productive discussions towards a consensus, or to contentious debates, where disagreements frequently arise. Prior work on such settings has primarily focused on identifying notable instances of antisocial behavior such as hate speech and “trolling”, which represent possible threats to the health of a community. These, however, are exceptionally severe phenomena, and do not encompass controversies stemming from user debates, differences of opinion, and off-topic content, all of which can naturally come up in a discussion without going so far as to compromise its development. This dissertation proposes a framework for the systematic analysis of social media discussions that take place in the presence of controversial themes, disagreements, and mixed opinions from participating users. For this, we develop a feature-based model to describe key elements of a discussion, such as its salient topics, the level of activity from users, the sentiments it expresses, and the user feedback it receives. Initially, we build our feature model to characterize adversarial discussions surrounding political campaigns on Twitter, with a focus on the factual and sentimental nature of their topics and the role played by different users involved. We then extend our approach to Reddit discussions, leveraging community feedback signals to define a new notion of controversy and to highlight conversational archetypes that arise from frequent and interesting interaction patterns. We use our feature model to build logistic regression classifiers that can predict future instances of controversy in Reddit communities centered on politics, world news, sports, and personal relationships.
Finally, our model also provides the basis for a comparison of different communities in the health domain, where topics and activity vary considerably despite their shared overall focus. In each of these cases, our framework provides insight into how user behavior can shape a community’s individual definition of controversy and its overall identity.
Topology Reconstruction of Dynamical Networks via Constrained Lyapunov Equations
The network structure (or topology) of a dynamical network is often
unavailable or uncertain. Hence, we consider the problem of network
reconstruction. Network reconstruction aims at inferring the topology of a
dynamical network using measurements obtained from the network. In this
technical note we define the notion of solvability of the network
reconstruction problem. Subsequently, we provide necessary and sufficient
conditions under which the network reconstruction problem is solvable. Finally,
using constrained Lyapunov equations, we establish novel network reconstruction
algorithms, applicable to general dynamical networks. We also provide
specialized algorithms for specific network dynamics, such as the well-known
consensus and adjacency dynamics.
Comment: 8 page
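For orientation, one standard way a Lyapunov-type equation arises from measured trajectories is sketched below. It assumes linear dynamics with an unknown state matrix X and a state trajectory observed on [0, T]; the symbols X, P and T are illustrative assumptions and need not match the paper's formulation.

```latex
\dot{x}(t) = X x(t), \qquad P := \int_0^T x(t)\, x(t)^\top \, dt

\frac{d}{dt}\bigl( x x^\top \bigr) = X x x^\top + x x^\top X^\top
\;\Longrightarrow\;
X P + P X^\top = x(T)\, x(T)^\top - x(0)\, x(0)^\top
```

The resulting equation is linear in X. Prior knowledge about the network dynamics (for example, that X is the negative of a graph Laplacian under consensus dynamics, or an adjacency matrix under adjacency dynamics) would then restrict the admissible solutions, which is presumably the sense in which the equations are constrained.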
A likelihood-based framework for the analysis of discussion threads
Online discussion threads are conversational cascades of posted messages, commonly found in social systems that support many-to-many interaction, such as blogs, news aggregators or bulletin board systems. We propose a framework based on generative models of growing trees to analyse the structure and evolution of discussion threads. We consider the growth of a discussion to be determined by an interplay between popularity, novelty and a trend (or bias) to reply to the thread originator. The relevance of these features is estimated using a full likelihood approach, which allows us to characterise the habits and communication patterns of a given platform and/or community. We apply the proposed framework to four popular websites: Slashdot, Barrapunto (a Spanish version of Slashdot), Meneame (a Spanish Digg clone) and the article discussion pages of the English Wikipedia. Our results provide significant insight into how discussion cascades grow and have potential applications in broader contexts such as community management or the design of communication platforms.
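The interplay of popularity, novelty and root bias described above can be sketched as a growing-tree model with a matching log-likelihood. The additive weight form, the parameter names `alpha`, `beta` and `tau`, and the function names are assumptions made for illustration, not necessarily the parameterisation used in the paper.

```python
import math
import random


def _weights(replies, t, alpha, beta, tau):
    """Unnormalised attachment weights of nodes 0..t-1 when comment t arrives.

    Assumed form: alpha * replies received (popularity)
                  + beta if the node is the root (originator bias)
                  + tau ** age (novelty, decaying for 0 < tau < 1).
    """
    return [
        alpha * replies[k] + (beta if k == 0 else 0.0) + tau ** (t - k)
        for k in range(t)
    ]


def grow_thread(n_comments, alpha, beta, tau, seed=0):
    """Simulate a thread; parents[i] is the parent of node i, node 0 is the root."""
    rng = random.Random(seed)
    parents, replies = [None], [0]
    for t in range(1, n_comments + 1):
        w = _weights(replies, t, alpha, beta, tau)
        parent = rng.choices(range(t), weights=w, k=1)[0]
        parents.append(parent)
        replies[parent] += 1
        replies.append(0)
    return parents


def log_likelihood(parents, alpha, beta, tau):
    """Log-probability of the observed parent choices under the assumed model."""
    replies = [0] * len(parents)
    total = 0.0
    for t in range(1, len(parents)):
        w = _weights(replies, t, alpha, beta, tau)
        total += math.log(w[parents[t]] / sum(w))
        replies[parents[t]] += 1
    return total


# Hypothetical parameter values, chosen only to exercise the code.
thread = grow_thread(50, alpha=1.0, beta=2.0, tau=0.8)
ll = log_likelihood(thread, alpha=1.0, beta=2.0, tau=0.8)
```

Maximising `log_likelihood` over the three parameters for the threads of a given site would yield per-platform estimates, which is the kind of comparison the abstract describes.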