Understanding Coarsening for Embedding Large-Scale Graphs

Authors: Akyildiz Taha Atahan, Aljundi Amro Alabsi, Kaya Kamer
Publication date: 10/09/2020

A significant portion of today's data, e.g., social networks and web connections, can be modeled by graphs. A proper analysis of graphs with Machine Learning (ML) algorithms has the potential to yield far-reaching insights into many areas of research and industry. However, the irregular structure of graph data is an obstacle to running ML tasks on graphs such as link prediction, node classification, and anomaly detection. Graph embedding is a compute-intensive process of representing graphs as a set of vectors in a d-dimensional space, which in turn makes them amenable to ML tasks. Many approaches have been proposed in the literature to improve the performance of graph embedding, e.g., using distributed algorithms, accelerators, and pre-processing techniques. Graph coarsening, which can be considered a pre-processing step, is a structural approximation of a given large graph with a smaller one. As the literature suggests, the cost of embedding decreases significantly when coarsening is employed. In this work, we thoroughly analyze the impact of the coarsening quality on the embedding performance in terms of both speed and accuracy. Our experiments with a state-of-the-art, fast graph embedding tool show that there is an interplay between the coarsening decisions taken and the embedding quality.

Comment: 10 pages, 6 figures, submitted to 2020 IEEE International Conference on Big Data
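
To make the notion of coarsening concrete, here is a minimal sketch of a single coarsening level. The abstract does not say which coarsening scheme the authors use, so the greedy neighbor matching below is an assumption chosen only for illustration, and the function name `coarsen_once` is hypothetical.

```python
from collections import defaultdict


def coarsen_once(edges, num_nodes):
    """Collapse a graph by one level using greedy neighbor matching.

    Illustrative assumption, not the method from the paper: each
    unmatched node is merged with its first still-unmatched neighbor,
    and the pair becomes one super node in the coarse graph.
    """
    # Build an undirected adjacency list.
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    # super_of[u] is the id of the super node that absorbs node u.
    super_of = [-1] * num_nodes
    next_id = 0
    for u in range(num_nodes):
        if super_of[u] != -1:
            continue  # already merged into an earlier super node
        super_of[u] = next_id
        for v in adj[u]:
            if super_of[v] == -1:
                super_of[v] = next_id  # merge u and v
                break
        next_id += 1

    # Keep one coarse edge per pair of distinct super nodes.
    coarse_edges = set()
    for u, v in edges:
        a, b = super_of[u], super_of[v]
        if a != b:
            coarse_edges.add((min(a, b), max(a, b)))
    return super_of, sorted(coarse_edges), next_id


# Usage: a path graph on six nodes collapses to a path on three super nodes.
mapping, coarse, n_super = coarsen_once(
    [(0, 1), (1, 2), (2, 3), (3, 4), (4, 5)], num_nodes=6
)
print(mapping, coarse, n_super)
```

In a pipeline of this kind, an embedding would typically be trained on the smaller graph and then projected back through `mapping`, so the matching decisions made here affect both how much the graph shrinks and how much structure survives.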

arXiv.org e-Print Archive

Sabanci University Research Database

Representation Learning for Attributed Multiplex Heterogeneous Network

Authors: Bhagat Smriti, Bojchevski Aleksandar, Hamilton Will, Huang Xiao, Kingma Diederik P, Lin Zhouhan, Mikolov Tomas, Tang Lei, Taskar Ben, Thomas, Yang Cheng, Yang Zhilin, Zhang Hongming
Publication venue: Association for Computing Machinery (ACM)
Publication date: 20/05/2019

Network embedding (or graph embedding) has been widely used in many real-world applications. However, existing methods mainly focus on networks with single-typed nodes/edges and cannot scale well to handle large networks. Many real-world networks consist of billions of nodes and edges of multiple types, and each node is associated with different attributes. In this paper, we formalize the problem of embedding learning for the Attributed Multiplex Heterogeneous Network and propose a unified framework to address this problem. The framework supports both transductive and inductive learning. We also give a theoretical analysis of the proposed framework, showing its connection with previous works and proving its better expressiveness. We conduct systematic evaluations of the proposed framework on four different genres of challenging datasets: Amazon, YouTube, Twitter, and Alibaba. Experimental results demonstrate that with the learned embeddings from the proposed framework, we can achieve statistically significant improvements (e.g., 5.99-28.23% lift by F1 scores; p<<0.01, t-test) over previous state-of-the-art methods for link prediction. The framework has also been successfully deployed on the recommendation system of a worldwide leading e-commerce company, Alibaba Group. Results of the offline A/B tests on product recommendation further confirm the effectiveness and efficiency of the framework in practice.

Comment: Accepted to KDD 2019. Website: https://sites.google.com/view/gatn

arXiv.org e-Print Archive

Crossref