
    Discriminative Probabilistic Models for Relational Data

    In many supervised learning tasks, the entities to be labeled are related to each other in complex ways and their labels are not independent. For example, in hypertext classification, the labels of linked pages are highly correlated. A standard approach is to classify each entity independently, ignoring the correlations between them. Recently, Probabilistic Relational Models, a relational version of Bayesian networks, were used to define a joint probabilistic model for a collection of related entities. In this paper, we present an alternative framework that builds on (conditional) Markov networks and addresses two limitations of the previous approach. First, undirected models do not impose the acyclicity constraint that hinders representation of many important relational dependencies in directed models. Second, undirected models are well suited for discriminative training, where we optimize the conditional likelihood of the labels given the features, which generally improves classification accuracy. We show how to train these models effectively, and how to use approximate probabilistic inference over the learned model for collective classification of multiple related entities. We provide experimental results on a webpage classification task, showing that accuracy can be significantly improved by modeling relational dependencies. Comment: Appears in Proceedings of the Eighteenth Conference on Uncertainty in Artificial Intelligence (UAI2002)

    Capturing Edge Attributes via Network Embedding

    Network embedding, which aims to learn low-dimensional representations of nodes, has been used for various graph-related tasks including visualization, link prediction and node classification. Most existing embedding methods rely solely on network structure. However, in practice we often have auxiliary information about the nodes and/or their interactions, e.g., content of scientific papers in co-authorship networks, or topics of communication in Twitter mention networks. Here we propose a novel embedding method that uses both network structure and edge attributes to learn better network representations. Our method jointly minimizes the reconstruction error for higher-order node neighborhood, social roles and edge attributes using a deep architecture that can adequately capture highly non-linear interactions. We demonstrate the efficacy of our model over existing state-of-the-art methods on a variety of real-world networks, including collaboration networks and social networks. We also observe that using edge attributes to inform network embedding yields better performance in downstream tasks such as link prediction and node classification.

    A Survey of Heterogeneous Information Network Analysis

    Most real systems consist of a large number of interacting, multi-typed components, yet most contemporary research models them as homogeneous networks, without distinguishing the different types of objects and links in the networks. Recently, more and more researchers have begun to treat these interconnected, multi-typed data as heterogeneous information networks, and to develop structural analysis approaches that leverage the rich semantic meaning of the structural types of objects and links in the networks. Compared to the widely studied homogeneous network, the heterogeneous information network contains richer structure and semantic information, which provides plenty of opportunities as well as many challenges for data mining. In this paper, we provide a survey of heterogeneous information network analysis. We introduce basic concepts of heterogeneous information network analysis, examine its developments on different data mining tasks, discuss some advanced topics, and point out some future research directions. Comment: 45 pages, 12 figures

    Stochastic Block Models with Multiple Continuous Attributes

    The stochastic block model (SBM) is a probabilistic model for community structure in networks. Typically, only the adjacency matrix is used to perform SBM parameter inference. In this paper, we consider circumstances in which nodes have an associated vector of continuous attributes that are also used to learn the node-to-community assignments and corresponding SBM parameters. While this assumption is not realistic for every application, our model assumes that the attributes associated with the nodes in a network's community can be described by a common multivariate Gaussian model. In this augmented, attributed SBM, the objective is to simultaneously learn the SBM connectivity probabilities and the multivariate Gaussian parameters describing each community. While there are recent examples in the literature that combine connectivity and attribute information to inform community detection, our model is the first augmented stochastic block model to handle multiple continuous attributes. This provides the flexibility in biological data to, for example, augment connectivity information with continuous measurements from multiple experimental modalities. Because the lack of labeled network data often makes community detection results difficult to validate, we highlight the usefulness of our model for two network prediction tasks: link prediction and collaborative filtering. As a result of fitting this attributed stochastic block model, one can predict the attribute vector or connectivity patterns for a new node when only the complementary source of information (connectivity or attributes, respectively) is available. We also highlight two biological examples where the attributed stochastic block model provides satisfactory performance in the link prediction and collaborative filtering tasks.

    A Survey of Signed Network Mining in Social Media

    Many real-world relations can be represented by signed networks with positive and negative links, as a result of which signed network analysis has attracted increasing attention from multiple disciplines. With the increasing prevalence of social media networks, signed network analysis has evolved from developing and measuring theories to mining tasks. In this article, we present a review of mining signed networks in the context of social media and discuss some promising research directions and new frontiers. We begin by giving basic concepts and unique properties and principles of signed networks. Then we classify and review tasks of signed network mining with representative algorithms. We also delineate, with formal definitions, some tasks that have not been extensively studied, and propose research directions to expand the field of signed network mining. Comment: 37 pages

    Link Classification and Tie Strength Ranking in Online Social Networks with Exogenous Interaction Networks

    Online social networks (OSNs) have become the main medium for connecting people, sharing knowledge and information, and for communication. The social connections between people using these OSNs are formed as virtual links (e.g., friendship and following connections) that connect people. These links are the heart of today's OSNs as they facilitate all of the activities that the members of a social network can do. However, many of these networks suffer from noisy links, i.e., links that do not reflect a real relationship or links that have a low intensity, that change the structure of the network and prevent accurate analysis of these networks. Hence, a process for assessing and ranking the links in a social network is crucial in order to sustain a healthy and real network. Here, we define link assessment as the process of identifying noisy and non-noisy links in a network. In this paper, we address the problem of link assessment and link ranking in social networks using external interaction networks. In addition to a friendship social network, additional exogenous interaction networks are utilized to make the assessment process more meaningful. We employed machine learning classifiers for assessing and ranking the links in the social network of interest using the data from exogenous interaction networks. The method was tested with two different datasets, each containing the social network of interest, with the ground truth, along with the exogenous interaction networks. The results show that it is possible to effectively assess the links of a social network using only the structure of a single network of the exogenous interaction networks, and also using the structure of the whole set of exogenous interaction networks. The experiments showed that some classifiers do better than others regarding both link classification and link ranking. Comment: preprint for the MSMMUSE post-proceedings

    Predicting risky behavior in social communities

    Predicting risk profiles of individuals in networks (e.g., susceptibility to a particular disease, or likelihood of smoking) is challenging for a variety of reasons. For one, 'local' features (such as an individual's demographic information) may lack sufficient information to make informative predictions; this is especially problematic when predicting 'risk,' as the relevant features may be precisely those that an individual is disinclined to reveal in a survey. Secondly, even if such features are available, they still may miss crucial information, as 'risk' may be a function not just of an individual's features but also those of their friends and social communities. Here, we predict individuals' risk profiles as a function of both their local features and those of their friends. Instead of modeling influence from the social network directly (which proved difficult as friendship links may be sparse and partially observed), we model influence by discovering social communities in the network that may be related to risky behavior. The result is a model that predicts risk as a function of local features, while making up for their deficiencies and accounting for social influence by uncovering community structure in the network. We test our model by predicting risky behavior among adolescents from the Add Health data set, and hometowns among users in a Facebook ego net. Compared to prediction by features alone, our model demonstrates better predictive accuracy when measured as a whole, and in particular when measured as a function of network "richness."

    Representation Learning on Graphs: Methods and Applications

    Machine learning on graphs is an important and ubiquitous task with applications ranging from drug design to friendship recommendation in social networks. The primary challenge in this domain is finding a way to represent, or encode, graph structure so that it can be easily exploited by machine learning models. Traditionally, machine learning approaches relied on user-defined heuristics to extract features encoding structural information about a graph (e.g., degree statistics or kernel functions). However, recent years have seen a surge in approaches that automatically learn to encode graph structure into low-dimensional embeddings, using techniques based on deep learning and nonlinear dimensionality reduction. Here we provide a conceptual review of key advancements in this area of representation learning on graphs, including matrix factorization-based methods, random-walk based algorithms, and graph neural networks. We review methods to embed individual nodes as well as approaches to embed entire (sub)graphs. In doing so, we develop a unified framework to describe these recent approaches, and we highlight a number of important applications and directions for future work. Comment: Published in the IEEE Data Engineering Bulletin, September 2017; version with minor corrections

    DOLORES: Deep Contextualized Knowledge Graph Embeddings

    We introduce a new method, DOLORES, for learning knowledge graph embeddings that effectively captures contextual cues and dependencies among entities and relations. First, we note that short paths on knowledge graphs comprising chains of entities and relations can encode valuable information regarding their contextual usage. We operationalize this notion by representing knowledge graphs not as a collection of triples but as a collection of entity-relation chains, and learn embeddings for entities and relations using deep neural models that capture such contextual usage. In particular, our model is based on Bi-Directional LSTMs and learns deep representations of entities and relations from constructed entity-relation chains. We show that these representations can very easily be incorporated into existing models to significantly advance the state of the art on several knowledge graph prediction tasks like link prediction, triple classification, and missing relation type prediction (in some cases by at least 9.5%). Comment: 10 pages, 6 figures

    Link Prediction in Social Networks: the State-of-the-Art

    In social networks, link prediction, which predicts missing links in current networks and new or dissolving links in future networks, is important for mining and analyzing the evolution of social networks. In the past decade, much work has been done on link prediction in social networks. The goal of this paper is to comprehensively review, analyze and discuss the state of the art of link prediction in social networks. A systematic categorization of link prediction techniques and problems is presented. Then link prediction techniques and problems are analyzed and discussed. Typical applications of link prediction are also addressed. Achievements and roadmaps of some active research groups are introduced. Finally, some future challenges of link prediction in social networks are discussed. Comment: 38 pages, 13 figures, Science China: Information Science, 201