214 research outputs found

    Network Representation Learning in Social Media

    abstract: The popularity of social media has generated abundant large-scale social networks, which advances research on network analytics. Good representations of nodes in a network can facilitate many network mining tasks. The goal of network representation learning (network embedding) is to learn low-dimensional vector representations of social network nodes that capture certain properties of the networks. With the learned node representations, machine learning and data mining algorithms can be applied to network mining tasks such as link prediction and node classification. Because of its ability to learn good node representations, network representation learning is attracting increasing attention, and various network embedding algorithms have been proposed. Despite the success of these network embedding methods, the majority of them are dedicated to static plain networks, i.e., networks with fixed nodes and links only, whereas in social media, networks can appear in various forms, such as attributed networks, signed networks, dynamic networks and heterogeneous networks. These social networks contain rich information that can alleviate the network sparsity problem and help learn a better network representation, but plain network embedding approaches cannot handle such networks. For example, signed social networks can have both positive and negative links. Recent studies on signed networks show that negative links have added value in addition to positive links for many tasks such as link prediction and node classification. However, the existence of negative links challenges the principles used for plain network embedding. Thus, it is important to study signed network embedding. Furthermore, social networks can be dynamic, where new nodes and links can be introduced at any time. Dynamic networks can reveal the concept drift of a user and require efficiently updating the representation when new links or users are introduced.
However, static network embedding algorithms cannot deal with dynamic networks. Therefore, it is important and challenging to propose novel algorithms for tackling different types of social networks. In this dissertation, we investigate network representation learning in social media. In particular, we study representative social networks, which include attributed networks, signed networks, dynamic networks and document networks. We propose novel frameworks to tackle the challenges of these networks and learn representations that capture not only the network structure but also the unique properties of these social networks. Dissertation/Thesis Doctoral Dissertation Computer Science 201

    Learning Disentangled Representations in Signed Directed Graphs without Social Assumptions

    Signed graphs are complex systems that represent trust relationships or preferences in various domains. Learning node representations in such graphs is crucial for many mining tasks. Although real-world signed relationships can be influenced by multiple latent factors, most existing methods oversimplify the modeling of signed relationships by relying on social theories and treating them as simplistic factors. This limits their expressiveness and their ability to capture the diverse factors that shape these relationships. In this paper, we propose DINES, a novel method for learning disentangled node representations in signed directed graphs without social assumptions. We adopt a disentangled framework that separates each embedding into distinct factors, allowing for capturing multiple latent factors. We also explore lightweight graph convolutions that focus solely on sign and direction, without depending on social theories. Additionally, we propose a decoder that effectively classifies an edge's sign by considering correlations between the factors. To further enhance disentanglement, we jointly train a self-supervised factor discriminator with our encoder and decoder. Through extensive experiments on real-world signed directed graphs, we show that DINES effectively learns disentangled node representations, and significantly outperforms its competitors in the sign prediction task. Comment: 26 pages, 11 figures

    On the Troll-Trust Model for Edge Sign Prediction in Social Networks

    In the problem of edge sign prediction, we are given a directed graph (representing a social network), and our task is to predict the binary labels of the edges (i.e., the positive or negative nature of the social relationships). Many successful heuristics for this problem are based on the troll-trust features, estimating at each node the fraction of outgoing and incoming positive/negative edges. We show that these heuristics can be understood, and rigorously analyzed, as approximators to the Bayes optimal classifier for a simple probabilistic model of the edge labels. We then show that the maximum likelihood estimator for this model approximately corresponds to the predictions of a Label Propagation algorithm run on a transformed version of the original social graph. Extensive experiments on a number of real-world datasets show that this algorithm is competitive against state-of-the-art classifiers in terms of both accuracy and scalability. Finally, we show that troll-trust features can also be used to derive online learning algorithms which have theoretical guarantees even when edges are adversarially labeled. Comment: v5: accepted to AISTATS 201
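
    As a rough illustration of the per-node statistics mentioned in the abstract above, the following sketch computes, for each node, the fraction of its outgoing edges and of its incoming edges that are negative. The function name `negative_edge_fractions`, the `(source, target, sign)` edge format, and the toy graph are assumptions made for this example; they are not taken from the paper, and the paper's exact feature definitions may differ.

```python
from collections import Counter


def negative_edge_fractions(edges):
    """Per-node fraction of negative outgoing and incoming edges.

    `edges` is an iterable of (source, target, sign) triples with sign
    in {+1, -1}. This input format is an assumption of the sketch.
    """
    out_total, out_negative = Counter(), Counter()
    in_total, in_negative = Counter(), Counter()
    for source, target, sign in edges:
        out_total[source] += 1
        in_total[target] += 1
        if sign < 0:
            out_negative[source] += 1
            in_negative[target] += 1
    # Nodes with no outgoing (or incoming) edges are simply absent
    # from the corresponding dictionary.
    out_fraction = {u: out_negative[u] / n for u, n in out_total.items()}
    in_fraction = {v: in_negative[v] / n for v, n in in_total.items()}
    return out_fraction, in_fraction


# Hypothetical toy graph, for illustration only.
toy_edges = [("a", "b", 1), ("a", "c", -1), ("b", "c", -1), ("c", "a", 1)]
out_fraction, in_fraction = negative_edge_fractions(toy_edges)
```

    In this toy graph, node `c` receives only negative edges and node `b` sends only negative edges, which is the kind of signal a sign-prediction heuristic built on such features could use.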

    Graph neural networks for network analysis

    With an increasing number of applications where data can be represented as graphs, graph neural networks (GNNs) are a useful tool to apply deep learning to graph data. Signed and directed networks are important forms of networks that are linked to many real-world problems, such as ranking from pairwise comparisons and angular synchronization. In this report, we propose two spatial GNN methods for node clustering in signed and directed networks, a spectral GNN method for signed directed networks on both node clustering and link prediction, and two GNN methods for specific applications in ranking as well as angular synchronization. The methods are end-to-end in combining embedding generation and prediction without an intermediate step. Experimental results on various data sets, including several synthetic stochastic block models, random graph outlier models, and real-world data sets at different scales, demonstrate that our proposed methods can achieve satisfactory performance for a wide range of noise and sparsity levels. The introduced models also complement existing methods through the possibility of including exogenous information, in the form of node-level features or labels. These contributions not only aid the analysis of data represented by networks, but also form a body of work that presents novel architectures and task-driven loss functions for GNNs to be used in network analysis.

    LEARNING ON GRAPHS: ALGORITHMS FOR CLASSIFICATION AND SEQUENTIAL DECISIONS

    In recent years, networked data have become widespread due to the increasing importance of social networks and other web-related applications. This growing interest is driving researchers to design new algorithms for solving important problems that involve networked data. In this thesis we present a few practical yet principled algorithms for learning and sequential decision-making on graphs. Classification of networked data is an important problem that has recently received a great deal of attention from the machine learning community. This is due to its many important practical applications: computer vision, bioinformatics, spam detection and text categorization, just to cite a few of the more conspicuous examples. We focus our attention on the task called "node classification", often studied in the semi-supervised (transductive) setting. We present two algorithms, motivated by different theoretical frameworks. The first algorithm is studied in the well-known online adversarial setting, within which it enjoys an optimal mistake bound (up to logarithmic factors). The second algorithm is based on a game-theoretic approach, where each node of the network is maximizing its own payoff. The setting corresponds to a Graph Transduction Game in which the graph is a tree. For this special case, we show that the Nash Equilibrium of the game can be reached in linear time. We complement our theoretical findings with an extensive set of experiments using datasets from many different domains. In the second part of the thesis, we present a rapidly emerging theme in the analysis of networked data: signed networks, graphs whose edges carry a label encoding the positive or negative nature of the relationship between the connected nodes.
Social networks and e-commerce offer several examples of signed relationships: Slashdot users can tag other users as friends or foes, Epinions users can rate each other positively or negatively, eBay users develop trust and distrust towards sellers in the network. More generally, two individuals that are related because they rate similar products in a recommendation website may agree or disagree in their ratings. Many heuristics for link classification in social networks are based on a form of social balance summarized by the motto “the enemy of my enemy is my friend”. This is equivalent to saying that the signs on the edges of a social graph tend to be consistent with some two-clustering structure of the nodes, where edges connecting nodes from the same cluster are positive and edges connecting nodes from different clusters are negative. We present algorithms for the batch transductive active learning setting, where the topology of the graph is known in advance and our algorithms can ask for the label of some specific edges during the training phase (before starting with the predictions). These algorithms can achieve different tradeoffs between the number of mistakes during the test phase and the number of labels required during the training phase. We also present an experimental comparison against some state-of-the-art spectral heuristics presented in a previous work, where we show that the simplest of our algorithms is already competitive with the best of these heuristics. In the last chapter we present another way to exploit relational information for sequential predictions: the networks of bandits. Contextual bandits adequately formalize the exploration-exploitation trade-offs arising in several industrially relevant applications, such as online advertising and recommendation systems.
Many practical applications have a strong social component whose integration in the bandit algorithm could lead to a significant performance improvement: for example, since friends often have similar tastes, we may want to serve content to a group of users by taking advantage of an underlying network of social relationships among them. We introduce a novel algorithmic approach to a particular networked bandit problem. More specifically, we run a bandit algorithm on each network node (e.g., user), allowing it to "share" feedback signals with the other nodes by employing the multi-task kernel. We derive the regret analysis of this algorithm and, finally, we report on the results of an experimental comparison between our approach and state-of-the-art techniques, on both artificial and real-world social networks.
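
    The social balance notion described in the abstract above (edge signs consistent with a two-clustering of the nodes) can be checked on a small graph with a breadth-first labelling. This is only a minimal sketch under stated assumptions: the function name `two_clustering`, the undirected `(u, v, sign)` edge format, and the toy graphs are hypothetical, and this is not one of the thesis's algorithms, which address the harder setting where only some edge labels are known.

```python
from collections import defaultdict, deque


def two_clustering(edges):
    """Return a node -> cluster (0 or 1) mapping if every edge sign is
    consistent with some two-clustering, otherwise None.

    `edges` is an iterable of undirected (u, v, sign) triples with sign
    in {+1, -1}; positive edges must join nodes of the same cluster and
    negative edges must join nodes of different clusters.
    """
    adjacency = defaultdict(list)
    for u, v, sign in edges:
        adjacency[u].append((v, sign))
        adjacency[v].append((u, sign))

    cluster = {}
    for start in list(adjacency):
        if start in cluster:
            continue
        cluster[start] = 0  # arbitrary choice for each connected component
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v, sign in adjacency[u]:
                expected = cluster[u] if sign > 0 else 1 - cluster[u]
                if v not in cluster:
                    cluster[v] = expected
                    queue.append(v)
                elif cluster[v] != expected:
                    return None  # an inconsistent cycle was found
    return cluster


# Hypothetical toy graphs, for illustration only.
balanced = [("a", "b", 1), ("b", "c", -1), ("a", "c", -1)]
unbalanced = [("a", "b", -1), ("b", "c", -1), ("a", "c", -1)]
balanced_result = two_clustering(balanced)
unbalanced_result = two_clustering(unbalanced)
```

    The first triangle has two friends sharing a common enemy and admits a two-clustering; the second has three mutual enemies and does not.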

    Electrification of the Norwegian continental shelf: Discursive practices from the key actors How do the key actors perceive electrification of the NCS?

    A ‘renationalisation’ of Norwegian climate policy, shifting from a global to a domestic approach to meet a 55 % emissions reduction target, also centres the debate on Norway's biggest emitter of greenhouse gases: the petroleum industry on the Norwegian continental shelf. Just what exactly does “all emissions cuts to be made at home” mean for the petroleum industry? In response to this question, the industry proposes electrification of the Norwegian continental shelf as its preferred strategy and solution to the problem of increasing emissions. Norway's economic dependency on the petroleum industry also adds to the tension in this respect. As a climate policy topic, it captivates industry actors, politicians, environmental organisations, state bureaucracy, and the public. According to discourse theory, discourses play a significant role in societal power structures. Thus, a perspective on the discursive practices of key industry actors, politicians, policy makers, and environmental organisations can provide valuable insights to nudge the transition towards the necessary measures to meet the emissions reduction target. The thesis pursues three lines of inquiry: 1. Norwegian climate policy (both past and current) and how the electrification of the Norwegian continental shelf arises as a strategy; 2. the discursive practices and storylines of central actors within the field of electrification; and 3. the official climate policies on electrification as a strategy to reach climate targets. With these inquiries, the study aims to give insights into whether and to what extent the electrification of the Norwegian continental shelf is an appropriate measure to reach the 55 % emission cuts target by 2030. We adopt a discourse analysis framework for our study, consisting of 13 key actor interviews and document analysis to detect storylines and discourses on the topic.
The analysis finds many narratives that are categorised and condensed into five main storylines, one of which emerges as dominant. Based on the interviews with representatives of central actors, in addition to document analysis surrounding the topic, the five storylines are: SL1, “Full on electrification”; SL2, “Electrification, yes, but?”; SL3, “Yes, but by other means”; SL4, “Shut it down!”; and SL5, “Forget About Norway!”. The first four storylines focus on reductions in CO2 emissions in Norway, while the fifth focuses on the international mechanisms of purchasing CO2 quotas abroad, instead of making national emission reductions. We therefore find that most actors in our study argue for reducing emissions nationally, instead of using the international mechanism, which is a shift from the early 2000s. The study finds the second storyline, “Electrification, yes, but?”, to be dominant and almost hegemonic. It is supported by the most influential parties in parliament on both sides of the left-right political spectrum, is embedded in the state bureaucracy, and can gain support from the SL1 and SL3 storylines. The SL2 storyline is a sort of middle-ground storyline that seems strategic in its purpose, because of the great flexibility it offers to those who must defend their actions regarding electrification. The thesis finds enabling and constraining aspects in the dominant storyline, as well as discourse coalition and institutionalisation, consistent with certain characteristics of discourses. Furthermore, the study finds the discourse around the electrification of the Norwegian continental shelf to be volatile and highly dynamic, with many actors having changed their position in the last decade.
The thesis concludes that the dominant storyline, although influenced by factors such as the price of CO2 emissions and electricity prices, is also subject to some nuances of greenwashing, legitimising oil and gas activities in the domestic political landscape as a way of securing a “license to operate”.