13 research outputs found

    Community detection in bipartite signed networks is highly dependent on parameter choice

    Decision-making processes often involve voting. Human interactions with exogenous entities such as legislation or products can be effectively modeled as two-mode (bipartite) signed networks, in which people vote positively, vote negatively, or abstain on the entities. Detecting communities in such networks could help us understand underlying properties, for example ideological camps or consumer preferences. While community detection is an established practice separately for bipartite and signed networks, it remains largely unexplored in the case of bipartite signed networks. In this paper, we systematically evaluate the efficacy of community detection methods on bipartite signed networks using a synthetic benchmark and real-world datasets. Our findings reveal that when no communities are present in the data, these methods often recover spurious communities. When communities are present, the algorithms exhibit promising performance, although their performance is highly sensitive to parameter choice. This indicates that researchers using community detection methods in the context of bipartite signed networks should not take the communities found at face value: it is essential to assess the robustness of parameter choices or perform domain-specific external validation.

    Exploitation of information propagation patterns in social sensing

    Online social media presents a new opportunity for sensing the physical world. The sensors are essentially humans who share information on broadcast social media. Such human sensors impose challenges like influence, bias, polarization, and data overload, unseen in traditional sensor networks. This dissertation addresses the aforementioned challenges by exploiting the propagation or preferential attachment patterns of the human sensors to distill a factual view of the events transpiring in the physical world. Our first contribution explores the correlated errors caused by dependent sources. When people follow others, they are prone to broadcast information with unknown provenance. We show that using an admission control mechanism to select an independent set of sensors improves the quality of reconstruction. The next contribution explores a different kind of correlated error caused by polarization and bias. During events related to conflict or disagreement, people take sides and take a selective or preferential approach when broadcasting information. For example, a source might be less credible when it shares information conforming to its own bias. We present a maximum-likelihood estimation model to reconstruct the factual information in such cases, given that the individual biases of the sources are already known. Our next two contributions relate to modeling polarization and unveiling polarization using maximum-likelihood and matrix factorization based mechanisms. These mechanisms allow us to automate the process of separating polarized content and obtain a more faithful view of the events being sensed. Finally, we design and implement 'SocialTrove', a summarization service that continuously executes in the cloud, as a platform to compute the reconstructions at scale. Our contributions have been integrated with the 'Apollo Social Sensing Toolkit', which builds a pipeline to collect, summarize, and analyze information from Twitter, and serves more than 40 users.

    Proclivity or Popularity? Exploring Agent Heterogeneity in Network Formation

    The Barabasi-Albert model (BA model) is the standard algorithm used to describe the emergent mechanism of a scale-free network. This dissertation argues that the BA model and its variants rarely take agent heterogeneity into account in the analysis of network formation. In social networks, however, people's decisions to connect are strongly affected by the extent of similarity. In this dissertation, the author applies an agent-based modeling (ABM) approach to reassess the Barabasi-Albert model. This study proposes that, in forming social networks, agents are constantly balancing between instrumental and intrinsic preferences. After systematic simulation and subsequent analysis, this study finds that agents' preferences for popularity and proclivity strongly shape various attributes of simulated social networks. Moreover, this analysis of simulated networks investigates potential ways to detect this balance within real-world networks. In particular, the scale parameter of the power-law distribution is found to be sensitive solely to agents' preference for popularity. Finally, this study uses social media data (i.e., the diffusion of different emotions) from Sina Weibo, a Chinese counterpart of Twitter, to validate the findings, and the results suggest that the diffusion of anger is more popularity-driven.
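
    The balance between popularity and proclivity described in this abstract can be illustrated with a small growth simulation. This is only a sketch under stated assumptions, not the dissertation's agent-based model: the mixing weight `alpha`, the one-dimensional `traits` attribute, and the function name `grow_network` are all hypothetical. With `alpha = 1.0` the rule reduces to degree-proportional (BA-style) attachment; with `alpha = 0.0` agents attach purely by similarity.

```python
import random


def grow_network(n_agents, alpha, seed=0):
    """Grow a network by adding one agent (and one edge) at a time.

    alpha (assumed parameter): weight on popularity; 1 - alpha is the
    weight on proclivity (similarity of a hypothetical trait in [0, 1)).
    """
    rng = random.Random(seed)
    traits = [rng.random() for _ in range(n_agents)]  # hypothetical attribute
    degree = [0] * n_agents
    edges = [(0, 1)]  # start from a single connected pair
    degree[0] = degree[1] = 1

    for new in range(2, n_agents):
        existing = range(new)
        total_degree = sum(degree[old] for old in existing)
        similarity = [1.0 - abs(traits[new] - traits[old]) for old in existing]
        total_similarity = sum(similarity)
        # Mix the two normalized attachment probabilities.
        weights = [
            alpha * degree[old] / total_degree
            + (1.0 - alpha) * similarity[old] / total_similarity
            for old in existing
        ]
        target = rng.choices(list(existing), weights=weights, k=1)[0]
        edges.append((new, target))
        degree[new] += 1
        degree[target] += 1
    return edges, degree
```

    Comparing the degree sequences returned for several values of `alpha` gives a rough feel for how the attachment rule shapes the resulting network.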

    Collective attention in online social networks

    Social media is an ever-present tool in modern society, and its widespread usage positions it as a valuable source of insights into society at large. The study of collective attention in particular is one application that benefits from the scale of social media data. In this thesis we will investigate how collective attention manifests on social media and how it can be understood. We approach this challenge from several perspectives across network and data science. We first focus on a period of increased media attention to climate change to see how robust the previously observed polarised structures are under a collective attention event. Our experiments will show that while the level of engagement with the climate change debate increases, there is little disruption to the existing polarised structure in the communication network. Understanding the climate media debate requires addressing a methodological concern about the most effective method for weighting bipartite network projections with respect to the accuracy of community detection. We test seven weighting schemes on constructed networks with known community structure and then use the preferred methodology we identify to study collective attention in the climate change debate on Twitter. Following on from this, we will investigate how collective attention changes over the course of a single event over a longer period, namely the COVID-19 pandemic. We measure how the disruption to in-person social interactions as a consequence of attempts to limit the spread of COVID-19 in England and Wales have affected social interaction patterns as they appear on Twitter. Using a dataset of tweets with location tags, we will see how the spatial attention to locations and collective attention to discussion topics are affected by social distancing and population movement restrictions in different stages of the pandemic. 
Finally, we present a new analysis framework for collective attention events that allows direct comparisons across different time and volume scales, such as those seen in the climate change and COVID-19 experiments. We demonstrate that this approach performs better than traditional approaches that rely on binning the timeseries at certain resolutions and comment on the mechanistic properties highlighted by our new methodology.
Engineering and Physical Sciences Research Council (EPSRC)

    Personalized Expert Recommendation: Models and Algorithms

    Many large-scale information sharing systems, including social media systems, question-answering sites, and rating and reviewing applications, have been growing rapidly, allowing millions of human participants to generate and consume information on an unprecedented scale. To manage the sheer growth of information generation, there is a need to personalize information resources for users: to surface high-quality content and feeds, to provide personally relevant suggestions, and so on. A fundamental task in creating and supporting user-centered personalization systems is to build rich user profiles to aid recommendation for a better user experience. Therefore, in this dissertation research, we propose models and algorithms to facilitate the creation of new crowd-powered personalized information sharing systems. Specifically, we first give a principled framework to enable personalization of resources so that information seekers can be matched with customized knowledgeable users based on their historical actions and contextual information; we then focus on creating rich user models that allow accurate and comprehensive modeling of user profiles for long-tail users, including discovering a user’s known-for profile, opinion bias, and geo-topic profile. In particular, this dissertation research makes two unique contributions. First, we introduce the problem of personalized expert recommendation and propose the first principled framework for addressing this problem. To overcome the sparsity issue, we investigate the use of users’ contextual information that can be exploited to build robust models of personal expertise, study how spatial preference for personally valuable expertise varies across regions, across topics, and across different underlying social communities, and integrate these different forms of preferences into a matrix factorization-based personalized expert recommender.
Second, to support the personalized recommendation of experts, we focus on modeling and inferring user profiles in online information sharing systems. In order to tap the knowledge of the majority of users, we provide frameworks and algorithms to accurately and comprehensively create user models by discovering a user’s known-for profile, opinion bias, and geo-topic profile, each described briefly as follows. (i) We develop a probabilistic model called Bayesian Contextual Poisson Factorization to discover what users are known for by others. Our model takes as input a small fraction of users whose known-for profiles are already known and the vast majority of users for whom we have little (or no) information, learns the implicit relationships between users’ known-for profiles and their contextual signals, and finally predicts known-for profiles for that majority of users. (ii) We explore users’ topic-sensitive opinion bias, propose a lightweight semi-supervised system called “BiasWatch” to semi-automatically infer the opinion bias of long-tail users, and demonstrate how a user’s opinion bias can be exploited to recommend other users with similar opinions in social networks. (iii) We study how a user’s topical profile varies geo-spatially and how we can model a user’s geo-spatial known-for profile as the last step in our dissertation toward the creation of rich user profiles. We propose a multi-layered Bayesian hierarchical user factorization to overcome user heterogeneity and an enhanced model to alleviate the sparsity issue by integrating user contexts into the two-layered hierarchical user model for a better representation of a user’s geo-topic preference by others.

    Graph neural networks for network analysis

    With an increasing number of applications where data can be represented as graphs, graph neural networks (GNNs) are a useful tool to apply deep learning to graph data. Signed and directed networks are important forms of networks that are linked to many real-world problems, such as ranking from pairwise comparisons and angular synchronization. In this report, we propose two spatial GNN methods for node clustering in signed and directed networks, a spectral GNN method for signed directed networks on both node clustering and link prediction, and two GNN methods for specific applications in ranking as well as angular synchronization. The methods are end-to-end in combining embedding generation and prediction without an intermediate step. Experimental results on various data sets, including several synthetic stochastic block models, random graph outlier models, and real-world data sets at different scales, demonstrate that our proposed methods can achieve satisfactory performance for a wide range of noise and sparsity levels. The introduced models also complement existing methods through the possibility of including exogenous information, in the form of node-level features or labels. These contributions not only aid the analysis of data represented by networks, but also form a body of work that presents novel architectures and task-driven loss functions for GNNs to be used in network analysis.

    Partitioning and Scaling Signed Bipartite Graphs for Polarized Political Blogosphere

    The blogosphere plays an increasingly important role as a forum for public debate. In this paper, given a mixed set of blogs debating a set of political issues from opposing camps, we use signed bipartite graphs for modeling debates, and we propose an algorithm for partitioning both the blogs and the issues (i.e., topics, leaders, etc.) comprising the debate into binary opposing camps. Simultaneously, our algorithm scales both the blogs and the underlying issues on a univariate scale. Using this scale, a researcher can identify moderate and extreme blogs within each camp, and polarizing vs. unifying issues. Through performance evaluations we show that our proposed algorithm provides an effective solution to the problem and performs much better than existing baseline algorithms adapted to solve this new problem. In our experiments, we used both real data from the political blogosphere and US Congress records, as well as synthetic data obtained by varying the polarization and degree distribution of the vertices of the graph to show the robustness of our algorithm.
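
    The idea of jointly partitioning and scaling a signed bipartite graph can be sketched with a generic spectral technique. The abstract does not describe the paper's own algorithm, so this is a stand-in rather than a reproduction: it uses power iteration to approximate the leading singular vectors of a signed blog-by-issue matrix, reads the sign of each score as the camp, and reads the magnitude as the position on a univariate scale. The function name `scale_signed_bipartite`, the `stance` matrix layout, and the sign convention are illustrative assumptions.

```python
import math


def _normalize(vec):
    """Scale a vector to unit Euclidean length (left unchanged if all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else vec


def scale_signed_bipartite(stance, iterations=200):
    """Approximate leading singular vectors of a signed biadjacency matrix.

    stance[i][j] is +1, -1, or 0: blog i supports, opposes, or ignores
    issue j (assumed encoding). Returns (blog_scores, issue_scores).
    """
    n_blogs, n_issues = len(stance), len(stance[0])
    # Non-uniform start vector so it is unlikely to be orthogonal
    # to the leading singular direction.
    issue_scores = _normalize([1.0 + 0.1 * j for j in range(n_issues)])
    blog_scores = [0.0] * n_blogs
    for _ in range(iterations):
        blog_scores = _normalize([
            sum(stance[i][j] * issue_scores[j] for j in range(n_issues))
            for i in range(n_blogs)
        ])
        issue_scores = _normalize([
            sum(stance[i][j] * blog_scores[i] for i in range(n_blogs))
            for j in range(n_issues)
        ])
    # The overall sign is arbitrary; orient so the first blog is non-negative.
    if blog_scores[0] < 0:
        blog_scores = [-x for x in blog_scores]
        issue_scores = [-x for x in issue_scores]
    return blog_scores, issue_scores


# Toy data: two blogs in one camp, two in the other; the last blog
# takes no stance on the second issue.
toy_stance = [
    [1, 1, -1],
    [1, 1, -1],
    [-1, -1, 1],
    [-1, 0, 1],
]
blog_scores, issue_scores = scale_signed_bipartite(toy_stance)
```

    On the toy matrix, the sign of each blog score separates the two camps, and the blog with a missing stance receives a smaller magnitude, which is the sense in which a single scale can distinguish moderate from extreme members.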