23 research outputs found

A Data-driven Study of Influences in Twitter Communities

Author: Nguyen Huy
Zheng Rong
Publication date: 16/07/2013

This paper presents a quantitative study of Twitter, one of the most popular micro-blogging services, from the perspective of user influence. We crawl several datasets from the most active communities on Twitter and obtain 20.5 million user profiles, along with 420.2 million directed relations and 105 million tweets among the users. User influence scores are obtained from influence measurement services, Klout and PeerIndex. Our analysis reveals interesting findings, including non-power-law influence distribution, strong reciprocity among users in a community, the existence of homophily and hierarchical relationships in social influences. Most importantly, we observe that whether a user retweets a message is strongly influenced by the first of his followees who posted that message. To capture such an effect, we propose the first influencer (FI) information diffusion model and show through extensive evaluation that compared to the widely adopted independent cascade model, the FI model is more stable and more accurate in predicting influence spreads in Twitter communities.

Comment: 11 page

arXiv.org e-Print Archive

Crossref

QDEE: Question Difficulty and Expertise Estimation in Community Question Answering Sites

Author: Moosavi Sobhan
Parthasarathy Srinivasan
Ramnath Rajiv
Sun Jiankai
Publication date: 20/04/2018

In this paper, we present a framework for Question Difficulty and Expertise Estimation (QDEE) in Community Question Answering sites (CQAs) such as Yahoo! Answers and Stack Overflow, which tackles a fundamental challenge in crowdsourcing: how to appropriately route and assign questions to users with suitable expertise. This problem domain has been the subject of much research and includes both language-agnostic and language-conscious solutions. We bring to bear a key language-agnostic insight: that users gain expertise and therefore tend to ask as well as answer more difficult questions over time. We use this insight within the popular competition (directed) graph model to estimate question difficulty and user expertise by identifying key hierarchical structure within said model. An important and novel contribution here is the application of "social agony" to this problem domain. Difficulty levels of newly posted questions (the cold-start problem) are estimated by using our QDEE framework and additional textual features. We also propose a model to route newly posted questions to appropriate users based on the difficulty level of the question and the expertise of the user. Extensive experiments on data from real-world CQAs such as Yahoo! Answers and Stack Overflow demonstrate the improved efficacy of our approach over contemporary state-of-the-art models. The QDEE framework also allows us to characterize user expertise in novel ways by identifying interesting patterns and roles played by different users in such CQAs.

Comment: Accepted in the Proceedings of the 12th International AAAI Conference on Web and Social Media (ICWSM 2018). June 2018. Stanford, CA, US

arXiv.org e-Print Archive

Association for the Advancement of Artificial Intelligence: AAAI Publications

Organizational Chart Inference

Author: Adamic L.
Aoki M.
Clauset A.
DiMicco J.
Hyndman R.
Krebs V.
Kroch A.
Merton R.
Warwick-Booth L.
Zhang J.
Zhang J.
Publication date: 24/07/2015

To facilitate communication and cooperation among employees, many companies have adopted a new family of online social networks called "enterprise social networks" (ESNs). ESNs can provide employees with various professional services to help them deal with daily work issues. Meanwhile, employees in companies are usually organized into different hierarchies according to the relative ranks of their positions. A company's internal management structure can be outlined visually with an organizational chart, which is normally kept confidential from the public because of privacy and security concerns. In this paper, we study the IOC (Inference of Organizational Chart) problem: identifying a company's internal organizational chart from the heterogeneous online ESN launched in it. IOC is very challenging to address because, to guarantee smooth operations, the internal organizational charts of companies need to meet certain structural requirements (on their depth and width). To solve the IOC problem, a novel unsupervised method, Create (ChArT REcovEr), is proposed in this paper. It consists of 3 steps: (1) social stratification of ESN users into different social classes, (2) supervision link inference from managers to subordinates, and (3) consecutive social classes matching to prune the redundant supervision links. Extensive experiments conducted on a real-world online ESN dataset demonstrate that Create can perform very well in addressing the IOC problem.

Comment: 10 pages, 9 figures, 1 table. The paper is accepted by KDD 201

arXiv.org e-Print Archive

Crossref

A Distributed Framework for Social Network Analysis and Visualization

Author: Arroyo Daniel Ortiz
Davidsen Søren Atmakuri
Larsen Henrik Legind
Publication date: 01/09/2013

VBN

Soccer Team Vectors

Author: A Constantinou
AE Elo
C Leitner
F Pedregosa
K Pelechrinis
LM Hvattum
RA Bradley
S Neumann
Y Bengio
ZS Harris
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 31/03/2020

In this work we present STEVE - Soccer TEam VEctors, a principled approach for learning real-valued vectors for soccer teams where similar teams are close to each other in the resulting vector space. STEVE only relies on freely available information about the matches teams played in the past. These vectors can serve as input to various machine learning tasks. Evaluating on the task of team market value estimation, STEVE outperforms all its competitors. Moreover, we use STEVE for similarity search and to rank soccer teams.

Comment: 11 pages, 1 figure; This paper was presented at the 6th Workshop on Machine Learning and Data Mining for Sports Analytics at ECML/PKDD 2019, Würzburg, Germany, 201

arXiv.org e-Print Archive

Crossref

Resolution of ranking hierarchies in directed networks

Author: Barucca Paolo
Letizia Elisa
Lillo Fabrizio
Publication date: 04/07/2017

Identifying hierarchies and rankings of nodes in directed graphs is fundamental in many applications such as social network analysis, biology, economics, and finance. A recently proposed method identifies the hierarchy by finding the ordered partition of nodes which minimises a score function, termed agony. This function penalises the links violating the hierarchy in a way depending on the strength of the violation. To investigate the resolution of ranking hierarchies we introduce an ensemble of random graphs, the Ranked Stochastic Block Model. We find that agony may fail to identify hierarchies when the structure is not strong enough and the size of the classes is small with respect to the whole network. We analytically characterise the resolution threshold and we show that an iterated version of agony can partly overcome this resolution limit.

Comment: 27 pages, 9 figure

arXiv.org e-Print Archive

Crossref

Directory of Open Access Journals

Archivio istituzionale della Ricerca - Scuola Normale Superiore

UCL Discovery

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

FigShare

Corporate payments networks and credit risk rating

Author: Letizia Elisa
Lillo Fabrizio
Publication date: 22/09/2018

Aggregate and systemic risk in complex systems are emergent phenomena depending on two properties: the idiosyncratic risks of the elements and the topology of the network of interactions among them. While significant attention has been given to aggregate risk assessment and risk propagation once the above two properties are given, less is known about how the risk is distributed in the network and how it relates to the topology. We study this problem by investigating a large proprietary dataset of payments among 2.4M Italian firms, whose credit risk rating is known. We document significant correlations between local topological properties of a node (firm) and its risk. Moreover, we show the existence of a homophily of risk, i.e. the tendency of firms with similar risk profiles to be statistically more connected among themselves. This effect is observed when considering both pairs of firms and communities or hierarchies identified in the network. We leverage this knowledge to show the predictability of the missing rating of a firm using only the network properties of the associated node.

arXiv.org e-Print Archive

Archivio istituzionale della Ricerca - Scuola Normale Superiore

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna