
    Learning Interpretable Features of Graphs and Time Series Data

    Graphs and time series are two of the most ubiquitous data representations of modern times. Representation learning on real-world graphs and time-series data is a key component of downstream supervised and unsupervised machine learning tasks such as classification, clustering, and visualization. Because of the inherent high dimensionality of such data, representation learning, i.e., low-dimensional vector-based embedding of graphs and time-series data, is very challenging. Learning interpretable features makes the roles of the features transparent and facilitates downstream analytics tasks, in addition to maximizing the performance of downstream machine learning models. In this thesis, we leverage tensor (multidimensional array) decomposition to generate interpretable, low-dimensional feature spaces for graphs and time-series data from three domains: social networks, neuroscience, and heliophysics. We present theoretical models and empirical results on node embedding of social networks, biomarker embedding on fMRI-based brain networks, and prediction and visualization of multivariate time-series-based flaring and non-flaring solar events.

    Graph Convolutional Matrix Completion

    We consider matrix completion for recommender systems from the point of view of link prediction on graphs. Interaction data such as movie ratings can be represented by a bipartite user-item graph with labeled edges denoting observed ratings. Building on recent progress in deep learning on graph-structured data, we propose a graph auto-encoder framework based on differentiable message passing on the bipartite interaction graph. Our model shows competitive performance on standard collaborative filtering benchmarks. In settings where complementary feature information or structured data such as a social network is available, our framework outperforms recent state-of-the-art methods. Comment: 9 pages, 3 figures, updated with additional experimental evaluation

    DeepWalk: Online Learning of Social Representations

    We present DeepWalk, a novel approach for learning latent representations of vertices in a network. These latent representations encode social relations in a continuous vector space, which is easily exploited by statistical models. DeepWalk generalizes recent advancements in language modeling and unsupervised feature learning (or deep learning) from sequences of words to graphs. DeepWalk uses local information obtained from truncated random walks to learn latent representations by treating walks as the equivalent of sentences. We demonstrate DeepWalk's latent representations on several multi-label network classification tasks for social networks such as BlogCatalog, Flickr, and YouTube. Our results show that DeepWalk outperforms challenging baselines which are allowed a global view of the network, especially in the presence of missing information. DeepWalk's representations can provide $F_1$ scores up to 10% higher than competing methods when labeled data is sparse. In some experiments, DeepWalk's representations are able to outperform all baseline methods while using 60% less training data. DeepWalk is also scalable. It is an online learning algorithm which builds useful incremental results, and is trivially parallelizable. These qualities make it suitable for a broad class of real-world applications such as network classification and anomaly detection. Comment: 10 pages, 5 figures, 4 tables
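The idea of treating truncated random walks as sentences can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `truncated_random_walks`, its parameters, and the toy graph are all hypothetical, and the step that would feed the walks to a word-embedding model is left out.

```python
import random


def truncated_random_walks(adjacency, walks_per_node, walk_length, seed=0):
    """Generate fixed-length random walks that start from every vertex.

    adjacency: dict mapping each vertex to a list of its neighbours
    (a hypothetical input format chosen for this sketch).
    """
    rng = random.Random(seed)
    nodes = list(adjacency)
    walks = []
    for _ in range(walks_per_node):
        rng.shuffle(nodes)  # fresh random order of start vertices on each pass
        for start in nodes:
            walk = [start]
            while len(walk) < walk_length:
                neighbours = adjacency[walk[-1]]
                if not neighbours:  # dead end: stop this walk early
                    break
                walk.append(rng.choice(neighbours))
            walks.append(walk)
    return walks


# Toy undirected graph, stored as symmetric adjacency lists.
toy_graph = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
}
walks = truncated_random_walks(toy_graph, walks_per_node=2, walk_length=5)
```

Each walk is a list of vertex identifiers, so it can play the role of a sentence whose "words" are vertices. Vertices that co-occur within a short window of a walk would then be pushed towards nearby embeddings by whichever language model is trained on the walks.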

    New Deep Neural Networks for Unsupervised Feature Learning on Graph Data

    Graph data, such as social networks and biological networks, are ubiquitous in the real world. A fundamental task in analyzing graph data is to learn node features that benefit downstream tasks such as node classification and community detection. Inspired by the powerful feature learning capability of deep neural networks on various tasks, it is important and necessary to explore deep neural networks for feature learning on graphs. Unlike regular image and sequence data, graph data encode complicated relational information between different nodes, which challenges classical deep neural networks. Moreover, in real-world applications, node labels are usually not available, which makes feature learning on graphs more difficult. To address these challenges, this thesis focuses on designing new deep neural networks that effectively exploit relational information for unsupervised feature learning on graph data. First, to address the sparseness of the relational information, I propose a new proximity generative adversarial network which can discover the underlying relational information for learning better node representations. Meanwhile, a new self-paced network embedding method is designed to address the imbalance of the relational information when learning node representations. Additionally, to deal with rich attributes associated with nodes, I develop a new deep neural network that captures relational information in both topological structure and node attributes to enhance network embedding. Furthermore, to preserve the relational information in the hidden layers of deep neural networks, I develop a novel graph convolutional neural network (GCN) based on conditional random fields, which is the first algorithm to apply this kind of graphical model to graph neural networks in an unsupervised manner.

    Learning with Graphs using Kernels from Propagated Information

    Traditional machine learning approaches are designed to learn from independent vector-valued data points. The assumption that instances are independent, however, is not always true. On the contrary, there are numerous domains where data points are cross-linked, for example social networks, where persons are linked by friendship relations. These relations among data points make traditional machine learning difficult and often insufficient. Furthermore, data points themselves can have complex structure, for example molecules or proteins constructed from various bindings of different atoms. Networked and structured data are naturally represented by graphs, and for learning we aim to exploit their structure to improve upon non-graph-based methods. However, graphs encountered in real-world applications often come with rich additional information. This naturally implies many challenges for representation and learning: node information is likely to be incomplete, leading to partially labeled graphs; information can be aggregated from multiple sources and can therefore be uncertain; and additional information on nodes and edges can be derived from complex sensor measurements, thus being naturally continuous. Learning with graphs is an active research area, but gaps remain. Learning with structured data, which models structural similarities of graphs, mostly assumes fully labeled graphs of reasonable size with discrete and certain node and edge information. Learning with networked data, which naturally deals with missing information and huge graphs, mostly assumes homophily and neglects structural similarity. To close these gaps, we present a novel paradigm for learning with graphs that exploits the intermediate results of iterative information propagation schemes on graphs.
    Originally developed for within-network relational and semi-supervised learning, these propagation schemes have two desirable properties: they capture structural information and they can naturally adapt to the aforementioned issues of real-world graph data. Additionally, information propagation can be efficiently realized by random walks, leading to fast, flexible, and scalable feature and kernel computations. Further, by considering intermediate random walk distributions, we can model structural similarity for learning with structured and networked data. We develop several approaches based on this paradigm. In particular, we introduce propagation kernels for learning on the graph level, and coinciding walk kernels and Markov logic sets for learning on the node level. Finally, we present two application domains where kernels from propagated information successfully tackle real-world problems.
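A rough sketch of what "intermediate results of iterative information propagation" could look like is given below. It assumes a simple diffusion scheme in which every node replaces its label distribution with the average of its neighbours' distributions (one random-walk step) and keeps every intermediate state. The function `propagate`, the input format, and the toy graph are hypothetical. The thesis's actual propagation kernels are not reproduced here.

```python
def propagate(adjacency, initial, steps):
    """Return the label distributions after 0, 1, ..., `steps` propagation steps.

    adjacency: dict mapping each node to a list of neighbours.
    initial:   dict mapping each node to a list of label probabilities.
    Both formats are assumptions made for this sketch.
    """
    history = [initial]
    current = initial
    for _ in range(steps):
        updated = {}
        for node, neighbours in adjacency.items():
            if not neighbours:  # isolated node keeps its distribution
                updated[node] = list(current[node])
                continue
            num_labels = len(current[node])
            # One random-walk step: average the neighbours' distributions.
            updated[node] = [
                sum(current[n][k] for n in neighbours) / len(neighbours)
                for k in range(num_labels)
            ]
        history.append(updated)
        current = updated
    return history


# Path graph a - b - c with two labels; b starts out uncertain.
path_graph = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
start = {"a": [1.0, 0.0], "b": [0.5, 0.5], "c": [0.0, 1.0]}
history = propagate(path_graph, start, steps=3)
```

Because each update averages probability vectors, every intermediate state is again a set of valid distributions. Comparing those states across nodes or across graphs is one way such intermediate distributions might be turned into features for a kernel.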

    A Brief Survey of Deep Learning Approaches for Learning Analytics on MOOCs

    Massive Open Online Course (MOOC) systems have become prevalent in recent years and have drawn increasing attention, among other reasons due to the impact of the coronavirus pandemic. However, the dropout rate of MOOCs is known to be higher than that of conventional offline courses. Researchers have applied a wide range of methods to explore the reasons behind learner attrition or lack of interest, so that timely interventions can be made. The recent success of neural networks has revolutionised many Learning Analytics (LA) tasks. More recently, the associated deep learning techniques have increasingly been deployed to address the dropout prediction problem. This survey gives a timely and succinct overview of deep learning techniques for learning analytics on MOOCs. We mainly analyse the trends in feature processing and in model design for dropout prediction. Moreover, recent incremental improvements over existing deep learning techniques and the commonly used public data sets are presented. Finally, the paper proposes three future research directions in the field: knowledge graphs with learning analytics, comprehensive social network analysis, and composite behavioural analysis.

    Forecasting the Missing Links in Heterogeneous Social Networks

    Social network analysis has gained attention from many researchers in recent years because of its wide application in capturing social interactions. One of the aims of social network analysis is to recover missing links between users, which may exist in the future but have not yet appeared due to incomplete data. The prediction of hidden or missing links in criminal networks is also a significant problem. Criminal data collected from these networks tend to be incomplete and inconsistent, which is reflected in the network structure as missing nodes and links. Many supervised machine learning algorithms have been applied to this detection task. However, supervised machine learning algorithms require large datasets to train a link prediction model that achieves optimum results. In this research, we use a Facebook dataset to solve the problem of link prediction in a network. The two machine learning classifiers applied are Logistic Regression (LR) and K-Nearest Neighbour (KNN), where KNN has higher accuracy than LR. In this article, we propose an algorithm, Graph Sample Aggregator with Low Reciprocity (GraphSALR), for generating node embeddings in larger graphs, which uses node feature information.