13 research outputs found

The Complete Picture of the Twitter Social Graph

Author: Gabielkov Maksym
Legout Arnaud
Publication venue: HAL CCSD
Publication date: 10/12/2012

In this work, we collected the entire Twitter social graph, which consists of 537 million Twitter accounts connected by 23.95 billion links, and performed a preliminary analysis of the collected data. To collect the social graph, we implemented a distributed crawler on the PlanetLab infrastructure that collected all information in 4 months. Our preliminary analysis already revealed some interesting properties. Although there are 537 million Twitter accounts, only 268 million have ever sent at least one tweet and no more than 54 million have been recently active. In addition, 40% of the accounts are not followed by anybody and 25% do not follow anybody. Finally, we found that Twitter policies, as well as social conventions (like the follow-back convention), have a huge impact on the structure of the Twitter social graph.

INRIA a CCSD electronic archive server

HAL Descartes

Hal-Diderot

LiveRank: How to Refresh Old Datasets

Author: Huynh The Dang
Mathieu Fabien
Viennot Laurent
Publication venue: 'Informa UK Limited'
Publication date: 01/01/2015

This paper considers the problem of refreshing a dataset. More precisely, given a collection of nodes gathered at some time (Web pages, users from an online social network) along with some structure (hyperlinks, social relationships), we want to identify a significant fraction of the nodes that still exist at the present time. The liveness of an old node can be tested through an online query at the present time. We call LiveRank a ranking of the old pages such that active nodes are more likely to appear first. The quality of a LiveRank is measured by the number of queries necessary to identify a given fraction of the active nodes when using the LiveRank order. We study different scenarios, from a static setting where the LiveRank is computed before any query is made to dynamic settings where the LiveRank can be updated as queries are processed. Our results show that building on PageRank can lead to efficient LiveRanks, for Web graphs as well as for online social networks.
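The quality measure described in this abstract (the number of queries needed to find a given fraction of the active nodes when following a ranking) can be illustrated with a minimal sketch. The function name `queries_needed`, the `is_alive` callback, and the assumption that the total number of active nodes is known for evaluation purposes are all hypothetical and not taken from the paper.

```python
import math
from typing import Callable, Hashable, Optional, Sequence


def queries_needed(
    ranking: Sequence[Hashable],
    is_alive: Callable[[Hashable], bool],
    total_alive: int,
    fraction: float,
) -> Optional[int]:
    """Count liveness queries made, in ranking order, until the requested
    fraction of the active nodes has been found.

    `total_alive` is assumed to be known (an evaluation-only assumption).
    Returns None if the ranking is exhausted before reaching the target.
    """
    # Small tolerance so that e.g. 0.1 * 30 is not rounded up to 4.
    target = math.ceil(fraction * total_alive - 1e-9)
    if target <= 0:
        return 0
    found = 0
    for n_queries, node in enumerate(ranking, start=1):
        if is_alive(node):  # one online query per tested node
            found += 1
            if found >= target:
                return n_queries
    return None


# Toy data: four of six old nodes are still active.
alive_nodes = {"a", "c", "d", "f"}
good_order = ["a", "c", "b", "d", "e", "f"]
poor_order = ["b", "e", "a", "c", "d", "f"]

cost_good = queries_needed(good_order, alive_nodes.__contains__, 4, 0.5)
cost_poor = queries_needed(poor_order, alive_nodes.__contains__, 4, 0.5)
```

Under this measure a lower count indicates a better ordering, since fewer queries are spent on nodes that no longer exist.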

arXiv.org e-Print Archive

Crossref

INRIA a CCSD electronic archive server

Hal-Diderot

A Random Growth Model with any Real or Theoretical Degree Distribution

Author: A Clauset
A Sallaberry
AD Broido
AT Stephen
B Bollobás
G Ghoshal
G Lima-Mendez
MEJ Newman
N Pržulj
P Erdős
R Albert
Publication date: 01/12/2020

The degree distributions of complex networks are usually considered to be power laws. However, this is not the case for a large number of them. We thus propose a new model able to build random growing networks with (almost) any wanted degree distribution. The degree distribution can either be theoretical or extracted from a real-world network. The main idea is to invert the recurrence equation commonly used to compute the degree distribution in order to find a convenient attachment function for node connections, which is commonly chosen as linear. We compute this attachment function for some classical distributions, such as the power-law, broken power-law, geometric and Poisson distributions. We also use the model on an undirected version of the Twitter network, for which the degree distribution has an unusual shape. We finally show that the divergence of the chosen attachment functions is closely linked to the heavy-tailed property of the obtained degree distributions. Comment: 23 pages, 3 figures

arXiv.org e-Print Archive

Crossref

INRIA a CCSD electronic archive server

Trollslayer: Crowdsourcing and Characterization of Abusive Birds in Twitter

Author: Garcia-Recuero Alvaro
Morawin Aneta
Tyson Gareth
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 14/12/2018

As of today, abuse is a pressing issue for participants and administrators of Online Social Networks (OSN). Abuse in Twitter can stem from arguments generated to influence the outcome of a political election, the use of bots to automatically spread misinformation, and, generally speaking, activities that deny, disrupt, degrade or deceive other participants and/or the network. Given the difficulty of finding and accessing a large enough sample of abuse ground truth from the Twitter platform, we built and deployed a custom crawler that we use to judiciously collect a new dataset from the Twitter platform, with the aim of characterizing the nature of abusive users, a.k.a. abusive birds, in the wild. We provide a comprehensive set of features based on users' attributes, as well as social-graph metadata. The former includes metadata about the account itself, while the latter is computed from the social graph among the sender and the receiver of each message. Attribute-based features are useful to characterize user accounts in OSN, while graph-based features can reveal the dynamics of information dissemination across the network. In particular, we derive the Jaccard index as a key feature to reveal the benign or malicious nature of directed messages in Twitter. To the best of our knowledge, we are the first to propose such a similarity metric to characterize abuse in Twitter. Comment: SNAMS 201
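As a reminder of what the Jaccard index measures, here is a minimal sketch of the standard definition, the size of the intersection of two sets divided by the size of their union. Which sets the paper actually compares is not stated in the abstract; applying it to the follower sets of a message's sender and receiver is an assumption made here for illustration, and the account names are invented.

```python
from typing import Hashable, Iterable


def jaccard_index(a: Iterable[Hashable], b: Iterable[Hashable]) -> float:
    """Standard Jaccard index |A ∩ B| / |A ∪ B| of two collections.

    Returns 0.0 when both collections are empty (a convention chosen
    here to avoid dividing by zero).
    """
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 0.0
    return len(set_a & set_b) / len(union)


# Hypothetical follower sets of a sender and a receiver.
sender_followers = {"alice", "bob", "carol"}
receiver_followers = {"bob", "carol", "dave"}

overlap = jaccard_index(sender_followers, receiver_followers)
```

A value near 1 means the two accounts share most of their neighbourhood, while a value near 0 means they have almost nothing in common in the graph.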

arXiv.org e-Print Archive

Crossref

Un modèle de graphes aléatoires croissants pour n'importe quelle distribution des degrés [A growing random graph model for any degree distribution]

Author: Giroire Frédéric
Pérennes Stéphane
Trolliet Thibaud
Publication venue: HAL CCSD
Publication date: 01/06/2021

The degree distributions of complex networks are usually considered to be power laws. However, this is not the case for a large number of them. We thus propose a new model able to build random growing networks with almost any wanted degree distribution. The degree distribution can either be theoretical or extracted from a real-world network. The main idea is to invert the recurrence equation commonly used to compute the degree distribution in order to find a convenient attachment function for node connections, which is commonly chosen as linear. We compute this attachment function for some classical distributions, such as the power-law, broken power-law, and geometric distributions. We also use the model on an undirected version of the Twitter network, for which the degree distribution has an unusual shape.

INRIA a CCSD electronic archive server

Oskar Bordeaux

Unlocking the power of Twitter communities for startups

Author: Almeida Ana de
António Nuno
Batista Fernando
Cardoso Elsa
Peixoto Ana Rita
Ribeiro Ricardo
Publication date: 01/01/2023

Peixoto, A. R., Almeida, A. D., António, N., Batista, F., Ribeiro, R., & Cardoso, E. (2023). Unlocking the power of Twitter communities for startups. Applied Network Science, 8, 1-21. [66]. https://doi.org/10.21203/rs.3.rs-3062630/v1, https://doi.org/10.1007/s41109-023-00593-0 --- This work was partially supported by Fundação para a Ciência e a Tecnologia, I.P. (FCT), namely by UIDB/04466/2020 and UIDP/04466/2020 (ISTAR_Iscte); UIDB/04152/2020 (MagIC/NOVA IMS); UIDB/50021/2020 (INESC-ID); and UIDB/03126/2020 (CIES_Iscte).

Social media platforms offer cost-effective digital marketing opportunities to monitor the market, create user communities, and spread positive opinions. They allow companies with smaller budgets, like startups, to achieve their goals and grow. In fact, studies found that startups with active engagement on those platforms have a higher chance of succeeding and receiving funding from venture capitalists. Our study explores how startups utilize social media platforms to foster social communities. We also aim to characterize the individuals within these communities. The findings from this study underscore the importance of social media for startups. We used network analysis and visualization techniques to investigate the communities of Portuguese IT startups through their Twitter data. For that, a social digraph has been created, and its visualization shows that each startup created a community with a degree of intersecting followers and following users. We characterized those users using user node-level measures. The results indicate that users who are followed by or follow Portuguese IT startups are of these types: “Person”, “Company,” “Blog,” “Venture Capital/Investor,” “IT Event,” “Incubators/Accelerators,” “Startup,” and “University.” Furthermore, startups follow users who post high volumes of tweets and have high popularity levels, while those who follow them have low activity and are unpopular. The attained results reveal the power of Twitter communities and offer essential insights for startups to consider when building their social media strategies. Lastly, this study proposes a methodological process for social media community analysis on platforms like Twitter.

Repositório Institucional do ISCTE-IUL

Repositório da Universidade Nova de Lisboa

Interest Clustering Coefficient: a New Metric for Directed Networks like Twitter

Author: Cohen Nathann
Giroire Frédéric
Hogie Luc
Pérennes Stéphane
Trolliet Thibaud
Publication date: 02/08/2020

We study here the clustering of directed social graphs. The clustering coefficient has been introduced to capture the social phenomenon that a friend of a friend tends to be my friend. This metric has been widely studied and has been shown to be of great interest for describing the characteristics of a social graph. In fact, the clustering coefficient is adapted to a graph in which the links are undirected, such as friendship links (Facebook) or professional links (LinkedIn). For a graph in which links are directed from a source of information to a consumer of information, it is no longer adequate. We show that former studies have missed much of the information contained in the directed part of such graphs. We thus introduce a new metric to measure the clustering of a directed social graph with interest links, namely the interest clustering coefficient. We compute it (exactly and using sampling methods) on a very large social graph, a Twitter snapshot with 505 million users and 23 billion links. We additionally provide the values of the formerly introduced directed and undirected metrics, a first on such a large snapshot. We show that the interest clustering coefficient is larger than the classic directed clustering coefficients introduced in the literature. This shows the relevance of the metric for capturing the informational aspects of directed graphs. Comment: 15 pages, 9 figures
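For context, the classic undirected clustering coefficient that this abstract contrasts with can be sketched as follows. This is the well-known global (transitivity) form, the share of connected triples that are closed into triangles. It is not the interest clustering coefficient proposed in the paper, whose definition is not given in the abstract. The function name and the toy edge list are illustrative assumptions.

```python
from itertools import combinations
from typing import Dict, Hashable, Iterable, Set, Tuple


def global_clustering(edges: Iterable[Tuple[Hashable, Hashable]]) -> float:
    """Classic global clustering coefficient of an undirected graph:
    closed connected triples divided by all connected triples."""
    adjacency: Dict[Hashable, Set[Hashable]] = {}
    for u, v in edges:
        if u == v:
            continue  # ignore self-loops
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)

    closed = 0
    triples = 0
    # Each pair of neighbours of a node forms a triple centred on that node.
    for neighbours in adjacency.values():
        for a, b in combinations(neighbours, 2):
            triples += 1
            if b in adjacency[a]:  # the two "friends" are friends too
                closed += 1
    return closed / triples if triples else 0.0


# Toy graph: a triangle 1-2-3 with one extra node attached to node 3.
toy_edges = [(1, 2), (2, 3), (1, 3), (3, 4)]
coefficient = global_clustering(toy_edges)
```

Because every link is treated as symmetric here, the direction of a follow relationship is discarded, which is the limitation the abstract points out for graphs such as Twitter.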

arXiv.org e-Print Archive

Crossref

INRIA a CCSD electronic archive server

Discovery, retrieval, and analysis of the 'Star wars' botnet in twitter

Author: Echeverria J
Zhou S
Publication venue: 'American College of Medical Physics (ACMP)'
Publication date: 31/07/2017

It is known that many Twitter users are bots, which are accounts controlled and sometimes created by computers. Twitter bots can send spam tweets, manipulate public opinion and be used for online fraud. Here we report the discovery, retrieval, and analysis of the ‘Star Wars’ botnet in Twitter, which consists of more than 350,000 bots tweeting random quotations exclusively from Star Wars novels. The botnet contains a single type of bot, showing exactly the same properties throughout the botnet. It is unusually large, many times larger than other available datasets. It provides a valuable source of ground truth for research on Twitter bots. We analysed and revealed rich details on how the botnet was designed and created. As of this writing, the Star Wars bots are still alive in Twitter. They have survived since their creation in 2013, despite the increasing efforts in recent years to detect and remove Twitter bots. We also reflect on the ‘unconventional’ way in which we discovered the Star Wars bots, and discuss the current problems and future challenges of Twitter bot detection.

UCL Discovery

Interest clustering coefficient: a new metric for directed networks like Twitter

Author: Cohen Nathann
Giroire Frédéric
Hogie Luc
Pérennes Stéphane
Trolliet Thibaud
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2021

The clustering coefficient has been introduced to capture the social phenomenon that a friend of a friend tends to be my friend. This metric has been widely studied and has been shown to be of great interest for describing the characteristics of a social graph. However, the clustering coefficient is originally defined for a graph in which the links are undirected, such as friendship links (Facebook) or professional links (LinkedIn). For a graph in which links are directed from a source of information to a consumer of information, it is no longer adequate. We show that former studies have missed much of the information contained in the directed part of such graphs. In this article, we introduce a new metric to measure the clustering of directed social graphs with interest links, namely the interest clustering coefficient. We compute it (exactly and using sampling methods) on a very large social graph, a Twitter snapshot with 505 million users and 23 billion links, as well as on various other datasets. We additionally provide the values of the formerly introduced directed and undirected metrics, a first on such a large snapshot. We observe a higher value of the interest clustering coefficient than of the classic directed clustering coefficients, showing the importance of this metric. By studying the bidirectional edges of the Twitter graph, we also show that the interest clustering coefficient is more adequate to capture the interest part of the graph, while the classic ones are more adequate to capture the social part. We also introduce a new model able to build random networks with a high value of the interest clustering coefficient. We finally discuss the usefulness of this new metric for link recommendation.

INRIA a CCSD electronic archive server

Message Propagation and Social Influence in Twitter

Author: Narayana Vishali
Publication date: 01/07/2017

Twitter data has potentially unlimited value and numerous applications, and Twitter is known for its increase in users over time. Twitter facilitates information diffusion at an exponential rate, as well as the creation of networks of users with a common interest. People reacting to the spread of an epidemic or a natural disaster are greatly influenced by the information diffusion in social media. Twitter, being a popular micro-blogging network, provides an effective way to measure diffusion in terms of speed and strength. Our research is based on previous work on models related to topic diffusion and user influence. A topic is defined by a set of keywords. This research concentrates on the implementation of algorithms for computing the diffusion of a topic in Twitter. The degree of influence of the users who tweet on the topic is also addressed. We present two different approaches to compute user influence based on topic potential. We compare two diffusion models proposed in the literature, namely potentials and connections. For testing and empirical analyses we use tweets related to “flu”, “food poisoning”, and “politics”.

SHAREOK repository