593 research outputs found

    Embedding Graphs under Centrality Constraints for Network Visualization

    Visual rendering of graphs is a key task in the mapping of complex network data. Although most graph drawing algorithms emphasize aesthetic appeal, certain applications such as travel-time maps place more importance on visualization of structural network properties. The present paper advocates two graph embedding approaches with centrality considerations to comply with node hierarchy. The problem is formulated first as one of constrained multi-dimensional scaling (MDS), and it is solved via block coordinate descent iterations with successive approximations and guaranteed convergence to a KKT point. In addition, a regularization term enforcing graph smoothness is incorporated with the goal of reducing edge crossings. A second approach leverages the locally-linear embedding (LLE) algorithm, which assumes that the graph encodes data sampled from a low-dimensional manifold. Closed-form solutions to the resulting centrality-constrained optimization problems are determined, yielding meaningful embeddings. Experimental results demonstrate the efficacy of both approaches, especially for visualizing large networks on the order of thousands of nodes. Comment: Submitted to IEEE Transactions on Visualization and Computer Graphics.

    One-class classifiers based on entropic spanning graphs

    One-class classifiers offer valuable tools to assess the presence of outliers in data. In this paper, we propose a design methodology for one-class classifiers based on entropic spanning graphs. Our approach can also process non-numeric data by means of an embedding procedure. The spanning graph is learned on the embedded input data, and the resulting partition of vertices defines the classifier. The final partition is derived by exploiting a criterion based on mutual information minimization. Here, we compute the mutual information by using a convenient formulation provided in terms of the $\alpha$-Jensen difference. Once training is completed, a graph-based fuzzy model is constructed in order to associate a confidence level with the classifier decision. The fuzzification process is based only on topological information of the vertices of the entropic spanning graph. As such, the proposed one-class classifier is also suitable for data characterized by complex geometric structures. We provide experiments on well-known benchmarks containing both feature vectors and labeled graphs. In addition, we apply the method to the protein solubility recognition problem by considering several representations for the input samples. Experimental results demonstrate the effectiveness and versatility of the proposed method with respect to other state-of-the-art approaches. Comment: Extended and revised version of the paper "One-Class Classification Through Mutual Information Minimization" presented at the 2016 IEEE IJCNN, Vancouver, Canada.
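The $\alpha$-Jensen difference mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the common textbook form (the Rényi entropy of a mixture minus the weighted mean of the individual Rényi entropies) and applies it to small discrete distributions. The abstract does not give the exact formulation, and the paper works with spanning graphs rather than discrete histograms, so the function names, the mixing weight `beta`, and the inputs are illustrative assumptions only.

```python
import math


def renyi_entropy(p, alpha):
    """Renyi entropy of order alpha (alpha > 0, alpha != 1) of a discrete distribution."""
    # Zero-probability entries contribute nothing to the sum, so they are skipped.
    return math.log(sum(x ** alpha for x in p if x > 0)) / (1.0 - alpha)


def alpha_jensen_difference(p, q, alpha=0.5, beta=0.5):
    """Entropy of the beta-mixture minus the beta-weighted mean of the entropies.

    Assumed (hypothetical) form for illustration; not taken from the paper.
    """
    mixture = [beta * a + (1.0 - beta) * b for a, b in zip(p, q)]
    mean_entropy = beta * renyi_entropy(p, alpha) + (1.0 - beta) * renyi_entropy(q, alpha)
    return renyi_entropy(mixture, alpha) - mean_entropy


# Identical distributions give a difference of zero; disjoint ones give a positive value.
same = alpha_jensen_difference([0.2, 0.8], [0.2, 0.8])
disjoint = alpha_jensen_difference([1.0, 0.0], [0.0, 1.0])
```

Under these assumptions the difference acts as a dissimilarity score between two distributions: it vanishes when they coincide and grows as they separate.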

    How Many Pairwise Preferences Do We Need to Rank A Graph Consistently?

    We consider the problem of optimal recovery of the true ranking of $n$ items from a randomly chosen subset of their pairwise preferences. It is well known that, without any further assumption, one requires a sample size of $\Omega(n^2)$ for this purpose. We analyze the problem with the additional structure of a relational graph $G([n],E)$ over the $n$ items, together with an assumption of locality: neighboring items are similar in their rankings. Noting the preferential nature of the data, we choose to embed not the graph but its strong product, to capture the pairwise node relationships. Furthermore, unlike existing literature that uses Laplacian embedding for graph-based learning problems, we use a richer class of graph embeddings, orthonormal representations, that includes the (normalized) Laplacian as a special case. Our proposed algorithm, Pref-Rank, predicts the underlying ranking using an SVM-based approach over the chosen embedding of the product graph, and is the first to provide statistical consistency on two ranking losses, Kendall's tau and Spearman's footrule, with a required sample complexity of $O(n^2 \chi(\bar{G}))^{\frac{2}{3}}$ pairs, $\chi(\bar{G})$ being the chromatic number of the complement graph $\bar{G}$. Clearly, our sample complexity is smaller for dense graphs, with $\chi(\bar{G})$ characterizing the degree of node connectivity, which is also intuitive given the locality assumption: for example, $O(n^{\frac{4}{3}})$ for a union of $k$-cliques, or $O(n^{\frac{5}{3}})$ for random and power-law graphs, a quantity much smaller than the fundamental limit of $\Omega(n^2)$ for large $n$. This, for the first time, relates ranking complexity to structural properties of the graph.
We also report experimental evaluations on different synthetic and real datasets, where our algorithm is shown to outperform the state-of-the-art methods. Comment: In Thirty-Third AAAI Conference on Artificial Intelligence, 2019.
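The two ranking losses named in the abstract above, Kendall's tau and Spearman's footrule, can be sketched in their standard unnormalized distance forms. The paper may normalize them differently. The function names and the toy rankings below are illustrative assumptions, not taken from the paper.

```python
from itertools import combinations


def kendall_tau_distance(rank_a, rank_b):
    """Count the item pairs that the two rankings order differently."""
    return sum(
        1
        for i, j in combinations(list(rank_a), 2)
        # A negative product means the pair (i, j) is ordered oppositely.
        if (rank_a[i] - rank_a[j]) * (rank_b[i] - rank_b[j]) < 0
    )


def spearman_footrule(rank_a, rank_b):
    """Sum of absolute differences between each item's two rank positions."""
    return sum(abs(rank_a[item] - rank_b[item]) for item in rank_a)


# Hypothetical rankings: each dict maps an item to its rank position (1 = best).
true_rank = {"a": 1, "b": 2, "c": 3, "d": 4}
predicted = {"a": 2, "b": 1, "c": 3, "d": 4}

tau = kendall_tau_distance(true_rank, predicted)
footrule = spearman_footrule(true_rank, predicted)
```

Both losses are zero only when the predicted ranking matches the true one. Kendall's tau penalizes each discordant pair equally, while the footrule penalizes how far each item is displaced.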

    Advances in Learning and Understanding with Graphs through Machine Learning

    Graphs have increasingly become a crucial way of representing large, complex and disparate datasets from a range of domains, including many scientific disciplines. Graphs are particularly useful at capturing complex relationships or interdependencies within or even between datasets, and enable unique insights which are not possible with other data formats. Over recent years, significant improvements in the ability of machine learning approaches to automatically learn from and identify patterns in datasets have been made. However, due to the unique nature of graphs, and the data they are used to represent, employing machine learning with graphs has thus far proved challenging. A review of relevant literature has revealed that key challenges include issues arising with macro-scale graph learning, interpretability of machine learned representations and a failure to incorporate the temporal dimension present in many datasets. Thus, the work and contributions presented in this thesis primarily investigate how modern machine learning techniques can be adapted to tackle key graph mining tasks, with a particular focus on optimal macro-level representation, interpretability and incorporating temporal dynamics into the learning process. The majority of methods employed are novel approaches centered around attempting to use artificial neural networks in order to learn from graph datasets. Firstly, by devising a novel graph fingerprint technique, it is demonstrated that this can successfully be applied to two different tasks, namely graph comparison and classification, whilst out-performing established baselines. Secondly, it is shown that a mapping can be found between certain topological features and graph embeddings. This, for perhaps the first time, suggests that it is possible that machines are learning something analogous to human knowledge acquisition, thus bringing interpretability to the graph embedding process.
Thirdly, in exploring two new models for incorporating temporal information into the graph learning process, it is found that including such information is crucial to predictive performance in certain key tasks, such as link prediction, where state-of-the-art baselines are out-performed. The overall contribution of this work is to provide greater insight into and explanation of the ways in which machine learning with respect to graphs is emerging as a crucial set of techniques for understanding complex datasets. This is important as these techniques can potentially be applied to a broad range of scientific disciplines. The thesis concludes with an assessment of limitations and recommendations for future research.

    Multi-view Graph Embedding with Hub Detection for Brain Network Analysis

    Multi-view graph embedding has become a widely studied problem in the area of graph learning. Most existing works on multi-view graph embedding aim to find a shared common node embedding across all the views of the graph by combining the different views in a specific way. Hub detection, another essential topic in graph mining, has also drawn extensive attention in recent years, especially in the context of brain network analysis. Both graph embedding and hub detection relate to the node clustering structure of graphs. The multi-view graph embedding usually implies the node clustering structure of the graph based on the multiple views, while the hubs are the boundary-spanning nodes across different node clusters in the graph and thus may influence the clustering structure of the graph. However, none of the existing works on multi-view graph embedding considers the hubs when learning the multi-view embeddings. In this paper, we propose to incorporate the hub detection task into the multi-view graph embedding framework so that the two tasks can benefit each other. Specifically, we propose an auto-weighted framework of Multi-view Graph Embedding with Hub Detection (MVGE-HD) for brain network analysis. The MVGE-HD framework learns a unified graph embedding across all the views while reducing the potential influence of the hubs on blurring the boundaries between node clusters in the graph, thus leading to a clear and discriminative node clustering structure for the graph. We apply MVGE-HD to two real multi-view brain network datasets (i.e., HIV and Bipolar). The experimental results demonstrate the superior performance of the proposed framework in brain network analysis for clinical investigation and application.