Collective Multi-relational Network Mining
Our world is becoming increasingly interconnected, and the study of networks and graphs is becoming more important than ever. Domains such as biological and pharmaceutical networks, online social networks, the World Wide Web, recommender systems, and scholarly networks are just a few examples that include explicit or implicit network structures. Most networks are formed between different types of nodes and contain different types of links. Leveraging these multi-relational and heterogeneous structures is an important factor in developing better models for these real-world networks. Another important aspect of developing models that make predictions about network entities, such as nodes or links, is the connections between those entities. These connections invalidate the i.i.d. assumptions that most traditional machine learning methods make about the data. Hence, unlike models for non-network data, where predictions about entities are made independently of each other, the inter-connectivity of entities in networks should cause the information inferred about one entity to change the model's belief about other related entities.
In this dissertation, I present models that can effectively leverage the multi-relational nature of networks and collectively make predictions on links and nodes. In both tasks, I empirically show the importance of considering the multi-relational characteristics and collective predictions. In the first part, I present models that make predictions on nodes by leveraging the graph structure and the link generation sequence, and by making collective predictions. I apply the node classification methods to detect social spammers in evolving multi-relational social networks and show their effectiveness in identifying spammers without relying on textual content. In the second part, I present a generalized augmented multi-relational bi-typed network. I then propose a template for link inference models on these networks and show their application in pharmaceutical discoveries and recommender systems. In the third part, I show that my proposed collective link prediction model is an instance of a general graph-based prediction model that relies on a neighborhood graph for predictions. I then propose a framework that can dynamically adapt the neighborhood graph based on the state of variables from intermediate inference results, as well as structural properties of the relations connecting them, to improve the predictive performance of the model.
Link Prediction via Generalized Coupled Tensor Factorisation
This study deals with the missing link prediction problem: the problem of
predicting the existence of missing connections between entities of interest.
We address link prediction using coupled analysis of relational datasets
represented as heterogeneous data, i.e., datasets in the form of matrices and
higher-order tensors. We propose to use an approach based on probabilistic
interpretation of tensor factorisation models, i.e., Generalised Coupled Tensor
Factorisation, which can simultaneously fit a large class of tensor models to
higher-order tensors/matrices with common latent factors using different loss
functions. Numerical experiments demonstrate that joint analysis of data from
multiple sources via coupled factorisation improves link prediction
performance, and that selecting the right loss function and tensor model is
crucial for accurately predicting missing links.
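The idea of fitting several datasets with common latent factors can be illustrated with a much simpler special case than the one the abstract describes. The sketch below assumes two matrices that share their row entities, a squared-error loss, and plain gradient descent; the function names `coupled_factorise` and `coupled_loss`, the learning rate, and the step count are illustrative assumptions, not the Generalised Coupled Tensor Factorisation algorithm itself.

```python
import numpy as np

def coupled_loss(X, Y, A, B, C):
    """Squared-error loss of the coupled model X ~ A B^T, Y ~ A C^T."""
    return 0.5 * (np.sum((A @ B.T - X) ** 2) + np.sum((A @ C.T - Y) ** 2))

def coupled_factorise(X, Y, rank, steps=500, lr=0.01, seed=0):
    """Fit X ~ A B^T and Y ~ A C^T with a shared row factor A.

    Illustrative sketch only: squared error and gradient descent are
    assumptions made for brevity, not the method of the abstract.
    """
    rng = np.random.default_rng(seed)
    A = rng.normal(scale=0.1, size=(X.shape[0], rank))
    B = rng.normal(scale=0.1, size=(X.shape[1], rank))
    C = rng.normal(scale=0.1, size=(Y.shape[1], rank))
    for _ in range(steps):
        Rx = A @ B.T - X              # residual on the first dataset
        Ry = A @ C.T - Y              # residual on the second dataset
        grad_A = Rx @ B + Ry @ C      # A receives gradient from both datasets
        grad_B = Rx.T @ A
        grad_C = Ry.T @ A
        A -= lr * grad_A
        B -= lr * grad_B
        C -= lr * grad_C
    return A, B, C

# Two small rank-1 matrices that share the same row entities.
X = np.outer([1.0, 2.0, 3.0], [1.0, 0.0, 1.0])
Y = np.outer([1.0, 2.0, 3.0], [2.0, 1.0])
A, B, C = coupled_factorise(X, Y, rank=1)
scores = A @ B.T  # reconstructed entries can serve as link scores
```

Because `A` is updated from the residuals of both matrices, information in `Y` influences the reconstruction of `X`, which is the intuition behind using side data to score missing links.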
Conditional network embeddings
Network Embeddings (NEs) map the nodes of a given network into $d$-dimensional Euclidean space $\mathbb{R}^d$. Ideally, this mapping is such that 'similar' nodes are mapped onto nearby points, such that the NE can be used for purposes such as link prediction (if 'similar' means being 'more likely to be connected') or classification (if 'similar' means 'being more likely to have the same label'). In recent years various methods for NE have been introduced, all following a similar strategy: defining a notion of similarity between nodes (typically some distance measure within the network), a distance measure in the embedding space, and a loss function that penalizes large distances for similar nodes and small distances for dissimilar nodes.
A difficulty faced by existing methods is that certain networks are fundamentally hard to embed due to their structural properties: (approximate) multipartiteness, certain degree distributions, assortativity, etc. To overcome this, we introduce a conceptual innovation to the NE literature and propose to create Conditional Network Embeddings (CNEs): embeddings that maximally add information with respect to given structural properties (e.g. node degrees, block densities, etc.). We use a simple Bayesian approach to achieve this, and propose a block stochastic gradient descent algorithm for fitting it efficiently.
We demonstrate that CNEs are superior for link prediction and multi-label classification when compared to state-of-the-art methods, without adding significant mathematical or computational complexity. Finally, we illustrate the potential of CNE for network visualization.
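The generic strategy that the abstract attributes to earlier NE methods (a loss that penalizes large distances for similar nodes and small distances for dissimilar ones) can be made concrete with a toy loss. The sketch below is an assumption-laden illustration: it treats linked pairs as 'similar', uses Euclidean distance, and picks a margin-based penalty. The function name `embedding_loss` and the `margin` parameter are hypothetical, and this is not the Bayesian CNE objective.

```python
import math

def embedding_loss(emb, edges, non_edges, margin=1.0):
    """Toy distance-based embedding loss (illustrative, not CNE).

    emb: dict mapping node -> coordinate tuple
    edges: pairs treated as 'similar' (pulled together)
    non_edges: pairs treated as 'dissimilar' (pushed at least `margin` apart)
    """
    def dist(u, v):
        return math.dist(emb[u], emb[v])

    # Similar pairs are penalized by their squared distance.
    pull = sum(dist(u, v) ** 2 for u, v in edges)
    # Dissimilar pairs are penalized only when closer than the margin.
    push = sum(max(0.0, margin - dist(u, v)) ** 2 for u, v in non_edges)
    return pull + push

emb = {"a": (0.0, 0.0), "b": (0.0, 0.0), "c": (3.0, 4.0)}
good = embedding_loss(emb, edges=[("a", "b")], non_edges=[("a", "c")])
bad = embedding_loss(emb, edges=[("a", "c")], non_edges=[("a", "b")])
```

Here `good` is lower than `bad` because the linked pair sits at the same point while the unlinked pair is far apart; swapping the roles makes both terms contribute.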
Weighted Random Walk Sampling for Multi-Relational Recommendation
In the information-overloaded web, personalized recommender systems are
essential tools to help users find the most relevant information. The most
heavily-used recommendation frameworks assume user interactions that are
characterized by a single relation. However, for many tasks, such as
recommendation in social networks, user-item interactions must be modeled as a
complex network of multiple relations, not only a single relation. Recent
research on multi-relational factorization and hybrid recommender models has
shown that using extended meta-paths to capture additional information about
both users and items in the network can enhance the accuracy of recommendations
in such networks. Most of this work is focused on unweighted heterogeneous
networks, and to apply these techniques, weighted relations must be simplified
into binary ones. However, information associated with weighted edges, such as
user ratings, which may be crucial for recommendation, is lost in such
binarization. In this paper, we explore a random walk sampling method in which
the frequency of edge sampling is a function of edge weight, and apply this
to generate extended meta-paths in weighted heterogeneous networks. With this
sampling technique, we demonstrate improved performance on multiple data sets
both in terms of recommendation accuracy and model generation efficiency.
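A random walk whose edge-sampling frequency depends on edge weight can be sketched in a few lines. The example below assumes the simplest such function, sampling each outgoing edge with probability proportional to its weight; the abstract does not specify the function used, and the name `weighted_random_walk` and the adjacency-dict layout are illustrative assumptions.

```python
import random

def weighted_random_walk(graph, start, length, rng=None):
    """Walk up to `length` steps, choosing each edge in proportion to its weight.

    graph: dict mapping node -> dict of {neighbor: non-negative weight}
    Proportional sampling is an assumption; other weight functions are possible.
    """
    rng = rng or random.Random()
    walk = [start]
    node = start
    for _ in range(length):
        neighbors = graph.get(node)
        if not neighbors:
            break  # dead end: stop the walk early
        targets = list(neighbors)
        weights = [neighbors[t] for t in targets]
        node = rng.choices(targets, weights=weights, k=1)[0]
        walk.append(node)
    return walk

# Toy user-item graph: edge weights stand in for ratings.
graph = {
    "u1": {"i1": 5.0, "i2": 0.0},
    "i1": {"u1": 5.0},
    "i2": {},
}
walk = weighted_random_walk(graph, "u1", 3, random.Random(0))
```

Highly rated items are visited more often than weakly rated ones, so the sampled paths retain the rating information that binarization would discard.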
Predicting Anchor Links between Heterogeneous Social Networks
People usually get involved in multiple social networks to enjoy new services
or to fulfill their needs. Many new social networks try to attract users of
other existing networks to increase the number of their users. Once a user
(called source user) of a social network (called source network) joins a new
social network (called target network), a new inter-network link (called anchor
link) is formed between the source and target networks. In this paper, we
concentrate on predicting the formation of such anchor links between
heterogeneous social networks. Unlike conventional link prediction problems in
which the formation of a link between two existing users within a single
network is predicted, in anchor link prediction, the target user is missing and
will be added to the target network once the anchor link is created. To solve
this problem, we use meta-paths as a powerful tool for utilizing heterogeneous
information in both the source and target networks. To this end, we propose an
effective general meta-path-based approach called Connector and Recursive
Meta-Paths (CRMP). By using those two different categories of meta-paths, we
model different aspects of social factors that may affect a source user to join
the target network, resulting in the formation of a new anchor link. Extensive
experiments on real-world heterogeneous social networks demonstrate the
effectiveness of the proposed method compared with recent methods.
Comment: To be published in "Proceedings of the 2016 IEEE/ACM International
Conference on Advances in Social Networks Analysis and Mining (ASONAM)".