6 research outputs found

    Topological based classification of paper domains using graph convolutional networks

    The main approaches for node classification in graphs are information propagation and the association of the class of the node with external information. State-of-the-art methods merge these approaches through Graph Convolutional Networks (GCN). We here use the association of topological features of the nodes with their class to predict this class. Moreover, combining topological information with information propagation improves classification accuracy on the standard CiteSeer and Cora paper classification tasks. Topological features and information propagation produce results almost as good as text-based classification, without any textual or content information. We propose to represent the topology and information propagation through a GCN with the classes of neighboring training nodes as input and the class of the current node as output. Such a formalism outperforms state-of-the-art methods.
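The idea of feeding a GCN the classes of neighboring training nodes can be illustrated with a minimal sketch. It builds only the input features (the class distribution among each node's labeled training neighbors), not the network itself. The function name `neighbor_label_features`, the dictionary-based graph format, and the normalization are assumptions made for illustration; the abstract does not specify them.

```python
def neighbor_label_features(adjacency, train_labels, num_classes):
    """Class distribution among each node's labeled training neighbors.

    adjacency:    dict mapping node -> iterable of neighbor nodes (assumed format)
    train_labels: dict mapping training node -> class index
    A node's own label is never used, so the feature does not leak its class.
    """
    feats = {}
    for node, nbrs in adjacency.items():
        counts = [0.0] * num_classes
        for n in nbrs:
            if n in train_labels:
                counts[train_labels[n]] += 1.0
        total = sum(counts)
        # Nodes with no labeled neighbor keep an all-zero vector.
        feats[node] = [c / total for c in counts] if total else counts
    return feats


# Toy graph: a triangle 0-1-2 with a pendant node 3 attached to node 2.
adjacency = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
train_labels = {0: 0, 1: 0, 3: 1}  # node 2 is the unlabeled test node
features = neighbor_label_features(adjacency, train_labels, num_classes=2)
```

In this toy graph, node 2 sees two neighbors of class 0 and one of class 1, so its feature vector leans toward class 0 without any content information.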

    A combined network and machine learning approaches for product market forecasting

    Sustainable financial markets play an important role in the functioning of human society. Still, the detection and prediction of risk in financial markets remain challenging and draw much attention from the scientific community. Here we develop a new approach that combines network theory and machine learning to study the structure and operations of financial product markets. Our network links are based on the similarity of firms' products and are constructed using the Securities and Exchange Commission (SEC) filings of US-listed firms. We find that several features in our network can serve as good precursors of financial market risks. We then combine the network topology and machine learning methods to predict both successful and failed firms. We find that the forecasts made using our method are much better than those of other well-known regression techniques. The framework presented here not only facilitates the prediction of financial markets but also provides insight and demonstrates the power of combining network theory and machine learning.

    Regional based query in graph active learning

    Graph convolutional networks (GCN) have emerged as the leading method to classify nodes in networks, and have reached the highest accuracy in multiple node classification tasks. In the absence of available tagged samples, active learning methods have been developed to obtain the highest accuracy using the minimal number of queries to an oracle. The current best active learning methods use the sample class uncertainty as the selection criterion. However, in graph-based classification, the class of each node is often related to the class of its neighbors. As such, the uncertainty in the class of a node's neighbors may be a more appropriate selection criterion. We here propose two such criteria, one extending the classical uncertainty measure, and the other extending the PageRank algorithm. We show that the latter is optimal when the fraction of tagged nodes is low, and when this fraction grows to one over the average degree, the regional uncertainty performs better than all existing methods. While we have tested these methods on graphs, such methods can be extended to any classification problem where a distance metric can be defined between the input samples. All the code used can be accessed at: https://github.com/louzounlab/graph-al All the datasets used can be accessed at: https://github.com/louzounlab/DataSets
    Comment: 9 pages, 7 figure
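A regional uncertainty criterion of the kind described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes uncertainty is measured by Shannon entropy and that a node's region is the node plus its direct neighbors, averaged with equal weight. The names `regional_uncertainty` and `pick_query` are hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def regional_uncertainty(adjacency, probs):
    """Average the entropy of each node with that of its direct neighbors.

    adjacency: dict mapping node -> list of neighbor nodes (assumed format)
    probs:     dict mapping node -> list of predicted class probabilities
    """
    scores = {}
    for node, neighbors in adjacency.items():
        region = [node] + list(neighbors)
        scores[node] = sum(entropy(probs[n]) for n in region) / len(region)
    return scores


def pick_query(adjacency, probs, labeled):
    """Return the unlabeled node with the highest regional uncertainty."""
    scores = regional_uncertainty(adjacency, probs)
    candidates = [n for n in scores if n not in labeled]
    return max(candidates, key=scores.get)


# Toy path graph 0 - 1 - 2 - 3 with made-up model predictions.
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
probs = {0: [1.0, 0.0], 1: [0.9, 0.1], 2: [0.5, 0.5], 3: [0.5, 0.5]}
scores = regional_uncertainty(adjacency, probs)
query = pick_query(adjacency, probs, labeled={0})
```

Nodes 2 and 3 are equally uncertain on their own, but node 3 sits in the more uncertain region, so the regional score breaks the tie that plain per-node entropy would leave open.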

    Topological based classification using graph convolutional networks

    In colored graphs, node classes are often associated with either their neighbors' classes or with information not incorporated in the graph associated with each node. We here propose that node classes are also associated with topological features of the nodes. We use this association to improve graph machine learning in general and, specifically, Graph Convolutional Networks (GCN). First, we show that even in the absence of any external information on nodes, a good accuracy can be obtained on the prediction of the node class using either topological features, or using the neighbors' classes as an input to a GCN. This accuracy is slightly lower than the one that can be obtained using content-based GCN. Second, we show that explicitly adding the topology as an input to the GCN does not improve the accuracy when combined with external information on nodes. However, adding to the GCN a second adjacency matrix with edges between distant nodes with similar topology does significantly improve its accuracy, leading to results better than all state-of-the-art methods in multiple datasets.
    Comment: arXiv admin note: text overlap with arXiv:1904.0778
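The second adjacency matrix, linking nodes with similar topology, can be sketched roughly as below. The choice of features (degree and triangle count), the Euclidean distance, the `max_dist` threshold, and the function names are all assumptions made for illustration; the abstract does not say which topological features or similarity rule were used. As a simplification, "distant" is taken here to mean only "not directly connected".

```python
from itertools import combinations


def topo_features(adjacency):
    """Per-node feature vector: (degree, number of triangles through the node)."""
    feats = {}
    for node, nbrs in adjacency.items():
        nbrs = set(nbrs)
        triangles = sum(1 for u, v in combinations(nbrs, 2) if v in adjacency[u])
        feats[node] = (len(nbrs), triangles)
    return feats


def similarity_edges(adjacency, max_dist=0.0):
    """Link pairs of non-adjacent nodes whose feature vectors are close."""
    feats = topo_features(adjacency)
    edges = set()
    for u, v in combinations(sorted(adjacency), 2):
        if v in adjacency[u]:
            continue  # skip pairs already linked in the original graph
        dist = sum((a - b) ** 2 for a, b in zip(feats[u], feats[v])) ** 0.5
        if dist <= max_dist:
            edges.add((u, v))
    return edges


# Two triangles (0, 1, 2) and (3, 4, 5) joined by the bridge 2 - 3.
adjacency = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}
extra_edges = similarity_edges(adjacency)
```

The resulting edge set could then be turned into a second adjacency matrix and passed to the network alongside the original one, so that structurally similar nodes in different parts of the graph exchange information.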

    Exposing individual differences through network topology

    Social animals, including humans, have a broad range of personality traits, which can be used to predict individual behavioral responses and decisions. Current methods to quantify individual personality traits in humans rely on self-report questionnaires, which require time and effort to collect and depend on active cooperation. However, personality differences naturally manifest in social interactions such as online social networks. Here, we demonstrate that the topology of an online social network can be used to characterize the personality traits of its members. We analyzed the directed social graph formed by the users of the LiveJournal (LJ) blogging platform. Individual users' personality traits, inferred from their self-reported domains of interest (DOIs), were associated with their network measures. Empirical clustering of DOIs by topological similarity exposed two main self-emergent DOI groups that were in alignment with the personality meta-traits plasticity and stability. Closeness, a global topological measure of network centrality, was significantly higher for bloggers associated with plasticity (vs. stability). A local network motif (a triad of 3 connected bloggers) that correlated with closeness also separated the personality meta-traits. Finally, topology-based classification of DOIs (without analyzing the content of the blogs) attained > 70% accuracy (average AUC of the test set). These results indicate that personality traits are evident and detectable in network topology. This has serious implications for user privacy. However, if used responsibly, network identification of personality traits could aid in early identification of health-related risks at the population level.
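Closeness on a directed graph can be computed with a breadth-first search, as in the minimal sketch below. It assumes one common definition (number of reachable nodes divided by the sum of shortest-path distances to them, following outgoing edges); the abstract does not state which variant or edge direction was used, and the function name and graph format are illustrative.

```python
from collections import deque


def closeness(successors, source):
    """Closeness of `source`: reachable nodes / sum of shortest-path distances.

    successors: dict mapping node -> iterable of nodes it links to (assumed format)
    Returns 0.0 when no other node is reachable.
    """
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in successors.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0


# Toy directed graph: a -> b, a -> c, b -> c, c -> a.
successors = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
closeness_by_user = {user: closeness(successors, user) for user in successors}
```

Here user `a` reaches both other users in one step and therefore has the highest closeness of the three.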

    Social Science Guided Feature Engineering: A Novel Approach to Signed Link Analysis

    Many real-world relations can be represented by signed networks with positive links (e.g., friendships and trust) and negative links (e.g., foes and distrust). Link prediction helps advance tasks in social network analysis such as recommendation systems. Most existing work on link analysis focuses on unsigned social networks. The existence of negative links piques research interest in investigating whether properties and principles of signed networks differ from those of unsigned networks, and mandates dedicated efforts on link analysis for signed social networks. Recent findings suggest that properties of signed networks substantially differ from those of unsigned networks and that negative links can be of significant help in signed link analysis in complementary ways. In this article, we center our discussion on a challenging problem of signed link analysis. Signed link analysis faces the problem of data sparsity, i.e., only a small percentage of signed links are given. This problem can get even worse when negative links are much sparser than positive ones, as users are more inclined toward positive than negative dispositions. We investigate how we can take advantage of other sources of information for signed link analysis. This research is mainly guided by three social science theories: Emotional Information, Diffusion of Innovations, and Individual Personality. Guided by these, we extract three categories of related features and leverage them for signed link analysis. Experiments show the significance of the features gleaned from social theories for signed link prediction and for addressing the data sparsity challenge.
    Comment: This work is published at ACM Transactions on Intelligent Systems and Technology (ACM TIST), 201