Learning Embeddings for Graphs and Other High Dimensional Data

Abstract

An immense amount of data is now produced every day, and extracting knowledge from it proves fruitful for many scientific purposes. Machine learning algorithms are a means to that end, and they have grown from a nascent research field into ubiquitous methods running in the background of many applications we use daily. Low dimensionality of the data is highly conducive to efficient machine learning. Real-world data, however, is seldom low-dimensional; on the contrary, it can be starkly high-dimensional. Graph-structured data, such as protein-protein interaction networks and social networks, exemplifies such high-dimensional data, and machine learning techniques in their traditional form cannot easily be applied to it. The focus of this report is thus to explore algorithms that aim to generate representation vectors that best encode the structural information of the vertices of a graph. These vectors can in turn be passed on to downstream machine learning algorithms to classify nodes or predict links among them. The study begins by introducing dimensionality reduction techniques for data residing in geometric spaces, followed by two techniques for embedding the vertices of graphs into low-dimensional spaces.
