LEARNING WITH SPARSITY FOR DETECTING INFLUENTIAL NODES IN IMPLICIT INFORMATION DIFFUSION NETWORKS
The diffusion of information and the spread of influence are ubiquitous in social networks. How to model diffusion networks and extract useful information from them, especially in the social media domain, is still an open research area that requires significant attention. Many real applications pose new challenges in modeling the information diffusion process. In particular, the first challenge comes from the fact that the underlying network structure over which the propagation spreads is unknown or unobserved. Often one can observe only when each node was infected and by which contagion, without knowing who infected whom. The second challenge comes from the simultaneous transmission of multiple correlated contagions through an implicit network. The third comes from the strong temporal effect in the diffusion process, which needs to be carefully modeled.
In this thesis, we address two fundamental tasks, forecasting and influential-node detection, in an implicit diffusion network with a unified approach. In particular, we first proposed a sparse linear influence model (SLIM), which takes the form of a convex optimization problem. We then extended SLIM to a multi-task sparse linear influence model (MSLIM), which can model diffusion networks with multiple correlated contagions. MSLIM, as a richer model than SLIM, not only improves prediction accuracy but also allows influential nodes to be selected at a finer granularity, i.e., different sets of influential nodes for different contagions. For SLIM and MSLIM, we developed both deterministic and stochastic optimization algorithms for solving the corresponding problems and established fast theoretical convergence guarantees.
Another contribution of the thesis is the development of a general-purpose system, called the Slow Intelligent System (SIS), which is able to continuously learn and improve its performance over time. We proposed the component-based SIS and developed the software, with an application to a face recognition task. Furthermore, we used the idea of the SIS to systematize information diffusion modeling and influential-node detection, and proposed SIS-based SLIM/MSLIM approaches, which further improve the flexibility and scalability of learning from implicit diffusion networks. We demonstrated the superiority of the proposed approaches on several real datasets from social media domains.
Emergence of influential spreaders in modified rumor models
The burst in the use of online social networks over the last decade has
provided evidence that current rumor spreading models miss some fundamental
ingredients in order to reproduce how information is disseminated. In
particular, recent literature has revealed that these models fail to reproduce
the fact that some nodes in a network have an influential role when it comes to
spreading a piece of information. In this work, we introduce two mechanisms with
the aim of filling the gap between theoretical and experimental results. The
first model introduces the assumption that spreaders are not always active
whereas the second model considers the possibility that an ignorant is not
interested in spreading the rumor. In both cases, results from numerical
simulations show closer agreement with real data than classical rumor spreading
models. Our results shed some light on the mechanisms underlying the spreading
of information and ideas in large social systems and pave the way for more
realistic diffusion models.
Comment: 14 pages, 6 figures, accepted for publication in Journal of
Statistical Physics
Searching for superspreaders of information in real-world social media
A number of predictors have been suggested to detect the most influential
spreaders of information in online social media across various domains such as
Twitter or Facebook. In particular, degree, PageRank, k-core and other
centralities have been adopted to rank the spreading capability of users in
information dissemination media. So far, validation of the proposed predictors
has been done by simulating the spreading dynamics rather than following real
information flow in social networks. Consequently, only model-dependent
contradictory results have been achieved so far for the best predictor. Here,
we address this issue directly. We search for influential spreaders by
following the real spreading dynamics in a wide range of networks. We find that
the widely-used degree and PageRank fail in ranking users' influence. We find
that the best spreaders are consistently located in the k-core across
dissimilar social platforms such as Twitter, Facebook, LiveJournal and
scientific publishing in the American Physical Society. Furthermore, when the
complete global network structure is unavailable, we find that the sum of the
nearest neighbors' degree is a reliable local proxy for a user's influence. Our
analysis provides practical instructions for optimal design of strategies for
"viral" information dissemination in relevant applications.
Comment: 12 pages, 7 figures
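The two structural predictors named in this abstract, the k-core index and the sum of the nearest neighbors' degrees, can be illustrated with a minimal sketch. The function names (`build_adjacency`, `core_numbers`, `neighbor_degree_sum`) and the toy edge list are assumptions made for illustration; they are not taken from the paper, and the simple peeling loop favors readability over speed.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Undirected adjacency sets from an edge list (self-loops ignored)."""
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    return dict(adj)


def core_numbers(adj):
    """k-shell index of every node, found by repeatedly peeling the
    node with the smallest remaining degree."""
    degree = {node: len(nbrs) for node, nbrs in adj.items()}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        node = min(remaining, key=degree.__getitem__)
        k = max(k, degree[node])  # the shell index never decreases
        core[node] = k
        remaining.remove(node)
        for nbr in adj[node]:
            if nbr in remaining:
                degree[nbr] -= 1
    return core


def neighbor_degree_sum(adj):
    """Local proxy: sum of the degrees of each node's nearest neighbors."""
    return {node: sum(len(adj[nbr]) for nbr in nbrs)
            for node, nbrs in adj.items()}


# Hypothetical toy network: a triangle (a, b, c) with a tail a-d-e.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("a", "d"), ("d", "e")]
adj = build_adjacency(edges)
core = core_numbers(adj)
local = neighbor_degree_sum(adj)
print(sorted(core.items()))
print(sorted(local.items()))
```

In this toy network the triangle nodes sit in the 2-core while the tail nodes sit in the 1-core. The neighbor-degree sum needs only each node's immediate neighborhood, which is why it can serve as a proxy when the global structure is unavailable.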
Identifying influencers in a social network : the value of real referral data
Individuals influence each other through social interactions and marketers aim to leverage this interpersonal influence to attract new customers. It still remains a challenge to identify those customers in a social network that have the most influence on their social connections. A common approach to the influence maximization problem is to simulate influence cascades through the network based on the existence of links in the network using diffusion models. Our study contributes to the literature by evaluating these principles using real-life referral behaviour data. A new ranking metric, called Referral Rank, is introduced that builds on the game theoretic concept of the Shapley value for assigning each individual in the network a value that reflects the likelihood of referring new customers. We also explore whether these methods can be further improved by looking beyond the one-hop neighbourhood of the influencers. Experiments on a large telecommunication data set and a referral data set demonstrate that using traditional simulation based methods to identify influencers in a social network can lead to suboptimal decisions as the results overestimate actual referral cascades. We also find that looking at the influence of the two-hop neighbours of the customers improves the influence spread and product adoption. Our findings suggest that companies can take two actions to improve their decision support system for identifying influential customers: (1) improve the data by incorporating data that reflects the actual referral behaviour of the customers or (2) extend the method by looking at the influence of the connections in the two-hop neighbourhood of the customers.
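The Shapley value that Referral Rank builds on can be illustrated with a small exact computation. This is a generic sketch of the game-theoretic concept, not the Referral Rank metric itself: the abstract does not give its characteristic function, so the `reach` data and the `customers_reached` function below are hypothetical stand-ins.

```python
from fractions import Fraction
from itertools import permutations


def shapley_values(players, value):
    """Exact Shapley values: each player's marginal contribution,
    averaged over every ordering of the players. Only feasible for
    small player sets, since the number of orderings is n!."""
    players = list(players)
    totals = {p: Fraction(0) for p in players}
    orderings = list(permutations(players))
    for order in orderings:
        coalition = set()
        for player in order:
            before = value(frozenset(coalition))
            coalition.add(player)
            totals[player] += value(frozenset(coalition)) - before
    return {p: total / len(orderings) for p, total in totals.items()}


# Hypothetical data: which prospects each existing customer could refer.
reach = {"ann": {"x", "y"}, "bob": {"y"}, "cat": {"z"}}


def customers_reached(coalition):
    """Assumed characteristic function: distinct prospects reached."""
    reached = set()
    for member in coalition:
        reached |= reach[member]
    return len(reached)


scores = shapley_values(reach, customers_reached)
print(scores)
```

Here the prospect shared by `ann` and `bob` is credited half to each, so overlapping reach is not double counted. The scores sum to the value of the full coalition.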
Human-Centric Cyber Social Computing Model for Hot-Event Detection and Propagation
Microblogging networks have gained popularity in recent years as a platform enabling expressions of human emotions, through which users can conveniently produce content on public events, breaking news, and/or products. Consequently, microblogging networks generate massive amounts of data that carry opinions and mass sentiment on various topics. Herein, microblogging is regarded as a useful platform for detecting and propagating new hot events. It is also a useful channel for identifying high-quality posts, popular topics, key interests, and high-influence users. The presence of noisy data in traditional social media data streams makes it necessary to focus on human-centric computing. This paper proposes a human-centric social computing (HCSC) model for hot-event detection and propagation in microblogging networks. In the proposed HCSC model, all posts and users are preprocessed through hypertext induced topic search (HITS) to determine high-quality subsets of the users, topics, and posts. Then, a latent Dirichlet allocation (LDA)-based multiprototype user topic detection method is used to identify users with high influence in the network. Furthermore, influence maximization is used for the final determination of influential users based on the user subsets. Finally, the users mined by the influence maximization process are output as the influential user sets for specific topics. Experimental results show the superiority of our HCSC model over similar models for hot-event detection and information propagation.
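The HITS preprocessing step mentioned in this abstract can be sketched with the textbook hub/authority iteration. How the paper builds its graph from users and posts is not stated here, so the `links` dictionary (users pointing to the posts they share) and the function name `hits` are assumptions for illustration only.

```python
from math import sqrt


def hits(links, iterations=50):
    """Textbook HITS power iteration on a directed graph.

    `links` maps each source node to the nodes it points to.
    Returns (hub, authority) score dictionaries, each L2-normalized.
    """
    nodes = set(links)
    for targets in links.values():
        nodes.update(targets)
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority score: sum of the hub scores of nodes pointing in.
        auth = {n: 0.0 for n in nodes}
        for src, targets in links.items():
            for dst in targets:
                auth[dst] += hub[src]
        norm = sqrt(sum(v * v for v in auth.values())) or 1.0
        auth = {n: v / norm for n, v in auth.items()}
        # Hub score: sum of the authority scores of nodes pointed to.
        hub = {n: sum(auth[dst] for dst in links.get(n, ())) for n in nodes}
        norm = sqrt(sum(v * v for v in hub.values())) or 1.0
        hub = {n: v / norm for n, v in hub.items()}
    return hub, auth


# Hypothetical toy graph: users point to the posts they reposted.
links = {"user1": ["post1"], "user2": ["post1", "post2"]}
hub, auth = hits(links)
print(hub)
print(auth)
```

In this toy graph `post1` receives the higher authority score because both users point to it, and `user2` receives the higher hub score because it points to more of the well-regarded posts. A high-quality subset could then be taken by keeping the top-scoring nodes.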
Collective Influence of Multiple Spreaders Evaluated by Tracing Real Information Flow in Large-Scale Social Networks
Identifying the most influential spreaders that maximize information flow is
a central question in network theory. Recently, a scalable method called
"Collective Influence (CI)" has been put forward through collective influence
maximization. In contrast to heuristic methods evaluating nodes' significance
separately, the CI method inspects the collective influence of multiple spreaders.
Although CI applies to the influence maximization problem in the percolation
model, it is still important to examine its efficacy in realistic information
spreading. Here, we examine real-world information flow in various social and
scientific platforms including American Physical Society, Facebook, Twitter and
LiveJournal. Since empirical data cannot be directly mapped to ideal
multi-source spreading, we leverage the behavioral patterns of users extracted
from data to construct "virtual" information spreading processes. Our results
demonstrate that the set of spreaders selected by CI can induce a larger scale of
information propagation. Moreover, local measures such as the number of connections
or citations are not necessarily the determining factors of nodes' importance
in realistic information spreading. This result has significance for ranking
scientists in scientific networks like the APS, where the commonly used number
of citations can be a poor indicator of the collective influence of authors in
the community.
Comment: 11 pages, 4 figures
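The CI score discussed in this abstract can be sketched as follows. The sketch assumes the commonly cited definition from the percolation setting, CI_l(i) = (k_i - 1) * sum of (k_j - 1) over nodes j at distance exactly l from i, where k is the degree. The abstract itself does not state the formula, and the function names, the toy graph, and the simple greedy removal loop are illustrative assumptions rather than the authors' implementation.

```python
from collections import deque


def frontier(adj, source, radius):
    """Nodes at shortest-path distance exactly `radius` from `source`."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if dist[node] == radius:
            continue  # do not expand beyond the requested radius
        for nbr in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return [n for n, d in dist.items() if d == radius]


def collective_influence(adj, node, radius=2):
    """Assumed CI score: (k_i - 1) * sum of (k_j - 1) on the ball frontier."""
    k = len(adj[node])
    return (k - 1) * sum(len(adj[j]) - 1 for j in frontier(adj, node, radius))


def top_spreaders(adj, count, radius=2):
    """Greedy selection: pick the highest-CI node, remove it, recompute."""
    adj = {n: set(nbrs) for n, nbrs in adj.items()}  # work on a copy
    chosen = []
    for _ in range(min(count, len(adj))):
        best = max(adj, key=lambda n: collective_influence(adj, n, radius))
        chosen.append(best)
        for nbr in adj.pop(best):
            adj[nbr].discard(best)
    return chosen


# Hypothetical toy graph: hub "h" with neighbors a, b, c; a and b have leaves.
graph = {
    "h": {"a", "b", "c"}, "a": {"h", "a1", "a2"}, "b": {"h", "b1"},
    "c": {"h"}, "a1": {"a"}, "a2": {"a"}, "b1": {"b"},
}
print(collective_influence(graph, "h", radius=1))
print(top_spreaders(graph, 2, radius=1))
```

Nodes "h" and "a" have the same degree here, but "h" scores higher because its neighbors are themselves better connected. This is the sense in which CI looks beyond a node's own connection count.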