Embedding Graphs under Centrality Constraints for Network Visualization
Visual rendering of graphs is a key task in the mapping of complex network
data. Although most graph drawing algorithms emphasize aesthetic appeal,
certain applications such as travel-time maps place more importance on
visualization of structural network properties. The present paper advocates two
graph embedding approaches with centrality considerations to comply with node
hierarchy. The problem is formulated first as one of constrained
multi-dimensional scaling (MDS), and it is solved via block coordinate descent
iterations with successive approximations and guaranteed convergence to a KKT
point. In addition, a regularization term enforcing graph smoothness is
incorporated with the goal of reducing edge crossings. A second approach
leverages the locally-linear embedding (LLE) algorithm which assumes that the
graph encodes data sampled from a low-dimensional manifold. Closed-form
solutions to the resulting centrality-constrained optimization problems are
determined yielding meaningful embeddings. Experimental results demonstrate the
efficacy of both approaches, especially for visualizing large networks on the
order of thousands of nodes.
Comment: Submitted to IEEE Transactions on Visualization and Computer Graphics
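The constrained-MDS idea in the abstract above can be illustrated with a small sketch. It is not the paper's block coordinate descent algorithm: it substitutes plain projected gradient descent on the raw MDS stress and assumes the centrality constraint takes the form of a fixed target radius per node (more central nodes closer to the origin). The function name, the radii values, and the step size are all hypothetical.

```python
import numpy as np

def centrality_constrained_mds(D, radii, dim=2, steps=200, lr=0.05, seed=0):
    """Projected gradient descent on MDS stress with ||x_i|| fixed to radii[i].

    D     : (n, n) matrix of target pairwise distances
    radii : (n,) target distance of each node from the origin (assumed > 0)
    """
    rng = np.random.default_rng(seed)
    n = D.shape[0]
    X = rng.normal(size=(n, dim))
    for _ in range(steps):
        diff = X[:, None, :] - X[None, :, :]          # pairwise differences
        dist = np.linalg.norm(diff, axis=2)           # current distances
        coef = (dist - D) / np.maximum(dist, 1e-12)   # guard against 0-division
        grad = 2.0 * (coef[:, :, None] * diff).sum(axis=1)
        X = X - lr * grad / n                         # gradient step on stress
        # Projection step: rescale every point to its prescribed radius.
        norms = np.linalg.norm(X, axis=1, keepdims=True)
        X = radii[:, None] * X / np.maximum(norms, 1e-12)
    return X

# Toy input: shortest-path distances on a 5-node path graph.
idx = np.arange(5)
D = np.abs(idx[:, None] - idx[None, :]).astype(float)
# Hypothetical radii: the middle node is treated as the most central.
radii = np.array([1.0, 0.5, 0.1, 0.5, 1.0])
X = centrality_constrained_mds(D, radii)
```

Every row of `X` lies exactly on its prescribed circle, while the gradient steps try to keep pairwise distances close to `D` within that constraint.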
One-class classifiers based on entropic spanning graphs
One-class classifiers offer valuable tools to assess the presence of outliers
in data. In this paper, we propose a design methodology for one-class
classifiers based on entropic spanning graphs. Our approach can also process
non-numeric data by means of an embedding
procedure. The spanning graph is learned on the embedded input data and the
resulting partition of vertices defines the classifier. The final partition is
derived by exploiting a criterion based on mutual information minimization.
Here, we compute the mutual information by using a convenient formulation
provided in terms of the $\alpha$-Jensen difference. Once training is
completed, in order to associate a confidence level with the classifier
decision, a graph-based fuzzy model is constructed. The fuzzification process
is based only on topological information of the vertices of the entropic
spanning graph. As such, the proposed one-class classifier is also suitable for
data characterized by complex geometric structures. We provide experiments on
well-known benchmarks containing both feature vectors and labeled graphs. In
addition, we apply the method to the protein solubility recognition problem by
considering several representations for the input samples. Experimental results
demonstrate the effectiveness and versatility of the proposed method with
respect to other state-of-the-art approaches.
Comment: Extended and revised version of the paper "One-Class Classification
Through Mutual Information Minimization" presented at the 2016 IEEE IJCNN,
Vancouver, Canada
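The abstract above does not spell out how entropy is obtained from a spanning graph. A common choice, assumed here, is the minimum-spanning-tree estimator of the Rényi $\alpha$-entropy, in which the sum of MST edge lengths raised to the power $\gamma = d(1-\alpha)$ grows like $n^{\alpha}$ times a density-dependent term. The sketch below computes that estimate up to an additive constant that depends only on $d$ and $\alpha$; the function names are hypothetical, and the classifier and fuzzy model from the paper are not reproduced.

```python
import numpy as np

def mst_edge_lengths(X):
    """Edge lengths of the Euclidean minimum spanning tree (Prim's algorithm)."""
    n = X.shape[0]
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    in_tree = np.zeros(n, dtype=bool)
    in_tree[0] = True
    best = dist[0].copy()            # cheapest known link from the tree to each node
    lengths = []
    for _ in range(n - 1):
        candidates = np.where(in_tree, np.inf, best)
        j = int(np.argmin(candidates))
        lengths.append(candidates[j])
        in_tree[j] = True
        best = np.minimum(best, dist[j])
    return np.array(lengths)

def renyi_entropy_mst(X, alpha=0.5):
    """Renyi alpha-entropy estimate from the MST, up to an additive constant."""
    n, d = X.shape
    gamma = d * (1.0 - alpha)        # edge exponent tied to alpha and dimension
    total = np.sum(mst_edge_lengths(X) ** gamma)
    return np.log(total / n ** alpha) / (1.0 - alpha)

points = np.random.default_rng(0).normal(size=(50, 2))
h_small = renyi_entropy_mst(points)
h_large = renyi_entropy_mst(10.0 * points)   # same shape, ten times more spread out
```

Scaling the data by a factor $s$ shifts the estimate by $d \log s$, which matches how differential entropy behaves under rescaling and gives a quick sanity check.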
How Many Pairwise Preferences Do We Need to Rank A Graph Consistently?
We consider the problem of optimal recovery of the true ranking of $n$ items from
a randomly chosen subset of their pairwise preferences. It is well known that
without any further assumption, one requires a sample size of $\Omega(n^2)$ for
the purpose. We analyze the problem with the additional structure of a relational
graph $G([n],E)$ over the $n$ items, together with an assumption of
\emph{locality}: neighboring items are similar in their rankings. Noting the
preferential nature of the data, we choose to embed not the graph but its
\emph{strong product}, to capture the pairwise node relationships. Furthermore,
unlike existing literature that uses Laplacian embedding for graph-based
learning problems, we use a richer class of graph
embeddings, \emph{orthonormal representations}, that includes the (normalized)
Laplacian as a special case. Our proposed algorithm, \emph{Pref-Rank},
predicts the underlying ranking using an SVM-based approach over the chosen
embedding of the product graph, and is the first to provide \emph{statistical
consistency} on two ranking losses, \emph{Kendall's tau} and \emph{Spearman's
footrule}, with a required sample complexity of $O(n^2 \chi(\bar{G}))^{2/3}$
pairs, $\chi(\bar{G})$ being the \emph{chromatic
number} of the complement graph $\bar{G}$. Clearly, our sample complexity is
smaller for dense graphs, with $\chi(\bar{G})$ characterizing the degree of node
connectivity, which is also intuitive due to the locality assumption, e.g.,
$O(n^{4/3})$ for a union of $k$-cliques, or $O(n^{5/3})$ for random
and power law graphs, a quantity much smaller than the fundamental limit
of $\Omega(n^2)$ for large $n$. This, for the first time, relates ranking
complexity to structural properties of the graph. We also report experimental
evaluations on different synthetic and real datasets, where our algorithm is
shown to outperform the state-of-the-art methods.
Comment: In Thirty-Third AAAI Conference on Artificial Intelligence, 2019
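The two ranking losses named in the abstract above are easy to state in code. The sketch below uses the standard unnormalized definitions (number of discordant pairs for Kendall's tau, sum of absolute position differences for Spearman's footrule); the paper may normalize them differently, and the function names and toy rankings are illustrative only.

```python
from itertools import combinations

def kendall_tau_distance(rank_a, rank_b):
    """Number of item pairs that the two rankings order differently."""
    return sum(
        1
        for i, j in combinations(list(rank_a), 2)
        if (rank_a[i] - rank_a[j]) * (rank_b[i] - rank_b[j]) < 0
    )

def spearman_footrule(rank_a, rank_b):
    """Sum over items of the absolute difference in rank position."""
    return sum(abs(rank_a[i] - rank_b[i]) for i in rank_a)

# Rankings map each item to its position (0 = top).
true_rank = {"a": 0, "b": 1, "c": 2, "d": 3}
predicted = {"a": 1, "b": 0, "c": 2, "d": 3}   # items a and b swapped

kendall = kendall_tau_distance(true_rank, predicted)   # 1 discordant pair
footrule = spearman_footrule(true_rank, predicted)     # |0-1| + |1-0| = 2
```

The two losses are closely related: for any pair of rankings the footrule lies between the Kendall tau distance and twice that distance, as the toy values (1 and 2) illustrate.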
Advances in Learning and Understanding with Graphs through Machine Learning
Graphs have increasingly become a crucial way of representing large, complex and disparate datasets from a range of domains, including many scientific disciplines. Graphs are particularly useful at capturing complex relationships or interdependencies within or even between datasets, and enable unique insights which are not possible with other data formats. Over recent years, significant improvements in the ability of machine learning approaches to automatically learn from and identify patterns in datasets have been made.
However, due to the unique nature of graphs and the data they are used to represent, employing machine learning with graphs has thus far proved challenging. A review of relevant literature has revealed that key challenges include issues arising with macro-scale graph learning, interpretability of machine learned representations and a failure to incorporate the temporal dimension present in many datasets. Thus, the work and contributions presented in this thesis primarily investigate how modern machine learning techniques can be adapted to tackle key graph mining tasks, with a particular focus on optimal macro-level representation, interpretability and incorporating temporal dynamics into the learning process. The majority of methods employed are novel approaches centered around using artificial neural networks to learn from graph datasets.
Firstly, a novel graph fingerprint technique is devised and shown to apply successfully to two different tasks, graph comparison and classification, whilst outperforming established baselines. Secondly, it is shown that a mapping can be found between certain topological features and graph embeddings. This, for perhaps the first time, suggests that it is possible that machines are learning something analogous to human knowledge acquisition, thus bringing interpretability to the graph embedding process. Thirdly, in exploring two new models for incorporating temporal information into the graph learning process, it is found that including such information is crucial to predictive performance in certain key tasks, such as link prediction, where state-of-the-art baselines are outperformed.
The overall contribution of this work is to provide greater insight into and explanation of the ways in which machine learning with respect to graphs is emerging as a crucial set of techniques for understanding complex datasets. This is important as these techniques can potentially be applied to a broad range of scientific disciplines. The thesis concludes with an assessment of limitations and recommendations for future research.
Multi-view Graph Embedding with Hub Detection for Brain Network Analysis
Multi-view graph embedding has become a widely studied problem in the area of
graph learning. Most of the existing works on multi-view graph embedding aim to
find a shared common node embedding across all the views of the graph by
combining the different views in a specific way. Hub detection, another
essential topic in graph mining, has also drawn extensive attention in recent
years, especially in the context of brain network analysis. Both the graph
embedding and hub detection relate to the node clustering structure of graphs.
The multi-view graph embedding usually implies the node clustering structure of
the graph based on the multiple views, while the hubs are the boundary-spanning
nodes across different node clusters in the graph and thus may potentially
influence the clustering structure of the graph. However, none of the existing
works in multi-view graph embedding consider the hubs when learning the
multi-view embeddings. In this paper, we propose to incorporate the hub
detection task into the multi-view graph embedding framework so that the two
tasks could benefit each other. Specifically, we propose an auto-weighted
framework of Multi-view Graph Embedding with Hub Detection (MVGE-HD) for brain
network analysis. The MVGE-HD framework learns a unified graph embedding across
all the views while reducing the potential influence of the hubs on blurring
the boundaries between node clusters in the graph, thus leading to a clear and
discriminative node clustering structure for the graph. We apply MVGE-HD to two
real multi-view brain network datasets (i.e., HIV and Bipolar). The
experimental results demonstrate the superior performance of the proposed
framework in brain network analysis for clinical investigation and application.