23 research outputs found

    A Data-driven Study of Influences in Twitter Communities

    Full text link
    This paper presents a quantitative study of Twitter, one of the most popular micro-blogging services, from the perspective of user influence. We crawl several datasets from the most active communities on Twitter and obtain 20.5 million user profiles, along with 420.2 million directed relations and 105 million tweets among the users. User influence scores are obtained from influence measurement services, Klout and PeerIndex. Our analysis reveals interesting findings, including non-power-law influence distribution, strong reciprocity among users in a community, the existence of homophily and hierarchical relationships in social influences. Most importantly, we observe that whether a user retweets a message is strongly influenced by the first of his followees who posted that message. To capture such an effect, we propose the first influencer (FI) information diffusion model and show through extensive evaluation that compared to the widely adopted independent cascade model, the FI model is more stable and more accurate in predicting influence spreads in Twitter communities.Comment: 11 page

    QDEE: Question Difficulty and Expertise Estimation in Community Question Answering Sites

    Full text link
    In this paper, we present a framework for Question Difficulty and Expertise Estimation (QDEE) in Community Question Answering sites (CQAs) such as Yahoo! Answers and Stack Overflow, which tackles a fundamental challenge in crowdsourcing: how to appropriately route and assign questions to users with the suitable expertise. This problem domain has been the subject of much research and includes both language-agnostic as well as language conscious solutions. We bring to bear a key language-agnostic insight: that users gain expertise and therefore tend to ask as well as answer more difficult questions over time. We use this insight within the popular competition (directed) graph model to estimate question difficulty and user expertise by identifying key hierarchical structure within said model. An important and novel contribution here is the application of "social agony" to this problem domain. Difficulty levels of newly posted questions (the cold-start problem) are estimated by using our QDEE framework and additional textual features. We also propose a model to route newly posted questions to appropriate users based on the difficulty level of the question and the expertise of the user. Extensive experiments on real world CQAs such as Yahoo! Answers and Stack Overflow data demonstrate the improved efficacy of our approach over contemporary state-of-the-art models. The QDEE framework also allows us to characterize user expertise in novel ways by identifying interesting patterns and roles played by different users in such CQAs.Comment: Accepted in the Proceedings of the 12th International AAAI Conference on Web and Social Media (ICWSM 2018). June 2018. Stanford, CA, US

    Organizational Chart Inference

    Full text link
    Nowadays, to facilitate the communication and cooperation among employees, a new family of online social networks has been adopted in many companies, which are called the "enterprise social networks" (ESNs). ESNs can provide employees with various professional services to help them deal with daily work issues. Meanwhile, employees in companies are usually organized into different hierarchies according to the relative ranks of their positions. The company internal management structure can be outlined with the organizational chart visually, which is normally confidential to the public out of the privacy and security concerns. In this paper, we want to study the IOC (Inference of Organizational Chart) problem to identify company internal organizational chart based on the heterogeneous online ESN launched in it. IOC is very challenging to address as, to guarantee smooth operations, the internal organizational charts of companies need to meet certain structural requirements (about its depth and width). To solve the IOC problem, a novel unsupervised method Create (ChArT REcovEr) is proposed in this paper, which consists of 3 steps: (1) social stratification of ESN users into different social classes, (2) supervision link inference from managers to subordinates, and (3) consecutive social classes matching to prune the redundant supervision links. Extensive experiments conducted on real-world online ESN dataset demonstrate that Create can perform very well in addressing the IOC problem.Comment: 10 pages, 9 figures, 1 table. The paper is accepted by KDD 201

    Soccer Team Vectors

    Full text link
    In this work we present STEVE - Soccer TEam VEctors, a principled approach for learning real valued vectors for soccer teams where similar teams are close to each other in the resulting vector space. STEVE only relies on freely available information about the matches teams played in the past. These vectors can serve as input to various machine learning tasks. Evaluating on the task of team market value estimation, STEVE outperforms all its competitors. Moreover, we use STEVE for similarity search and to rank soccer teams.Comment: 11 pages, 1 figure; This paper was presented at the 6th Workshop on Machine Learning and Data Mining for Sports Analytics at ECML/PKDD 2019, W\"urzburg, Germany, 201

    Resolution of ranking hierarchies in directed networks

    Get PDF
    Identifying hierarchies and rankings of nodes in directed graphs is fundamental in many applications such as social network analysis, biology, economics, and finance. A recently proposed method identifies the hierarchy by finding the ordered partition of nodes which minimises a score function, termed agony. This function penalises the links violating the hierarchy in a way depending on the strength of the violation. To investigate the resolution of ranking hierarchies we introduce an ensemble of random graphs, the Ranked Stochastic Block Model. We find that agony may fail to identify hierarchies when the structure is not strong enough and the size of the classes is small with respect to the whole network. We analytically characterise the resolution threshold and we show that an iterated version of agony can partly overcome this resolution limit.Comment: 27 pages, 9 figure

    Corporate payments networks and credit risk rating

    Get PDF
    Aggregate and systemic risk in complex systems are emergent phenomena depending on two properties: the idiosyncratic risks of the elements and the topology of the network of interactions among them. While a significant attention has been given to aggregate risk assessment and risk propagation once the above two properties are given, less is known about how the risk is distributed in the network and its relations with the topology. We study this problem by investigating a large proprietary dataset of payments among 2.4M Italian firms, whose credit risk rating is known. We document significant correlations between local topological properties of a node (firm) and its risk. Moreover we show the existence of an homophily of risk, i.e. the tendency of firms with similar risk profile to be statistically more connected among themselves. This effect is observed when considering both pairs of firms and communities or hierarchies identified in the network. We leverage this knowledge to show the predictability of the missing rating of a firm using only the network properties of the associated node
    corecore