1,156 research outputs found
Computing Vertex Centrality Measures in Massive Real Networks with a Neural Learning Model
Vertex centrality measures are a multi-purpose analysis tool, commonly used
in many application environments to retrieve information and unveil knowledge
from graph and network structural properties. However, the algorithms for
such metrics are expensive in terms of computational resources when used in
real-time applications or on massive real-world networks. Thus, approximation
techniques have been developed and used to compute the measures in such
scenarios. In this paper, we demonstrate and analyze the use of neural network
learning algorithms to tackle this task and compare their performance in terms
of solution quality and computation time with other techniques from the
literature. Our work offers several contributions. We highlight both the pros
and cons of approximating centralities through neural learning. By empirical
means and statistics, we then show that the regression model generated with a
feedforward neural network trained by the Levenberg-Marquardt algorithm is not
only the best option considering computational resources, but also achieves the
best solution quality for relevant applications and large-scale networks.
Keywords: Vertex Centrality Measures, Neural Networks, Complex Network Models,
Machine Learning, Regression Model
Comment: 8 pages, 5 tables, 2 figures, version accepted at IJCNN 2018. arXiv
admin note: text overlap with arXiv:1810.1176
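
To make the cost argument in this abstract concrete, here is a minimal sketch of one exact vertex centrality, closeness, computed with a breadth-first search from every vertex. It is not the paper's neural model; it only illustrates the kind of exact computation (roughly one full traversal per vertex) that approximation techniques try to avoid. The function name and the toy graph are assumptions made for illustration.

```python
from collections import deque


def closeness_centrality(adj):
    """Exact closeness centrality for an unweighted graph.

    adj maps each vertex to an iterable of its neighbours.
    One BFS is run per vertex, which is what makes exact
    computation costly on very large graphs.
    """
    scores = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total = sum(dist.values())
        # (reachable vertices) / (sum of distances); 0.0 for isolated vertices
        scores[source] = (len(dist) - 1) / total if total > 0 else 0.0
    return scores


# Hypothetical toy graph: the path 0 - 1 - 2 - 3
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
scores = closeness_centrality(path)
```

On the toy path, the two inner vertices score higher than the two end vertices, as expected for closeness.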
Advances in Learning and Understanding with Graphs through Machine Learning
Graphs have increasingly become a crucial way of representing large, complex and disparate datasets from a range of domains, including many scientific disciplines. Graphs are particularly useful at capturing complex relationships or interdependencies within or even between datasets, and enable unique insights which are not possible with other data formats. Over recent years, significant improvements in the ability of machine learning approaches to automatically learn from and identify patterns in datasets have been made.
However, due to the unique nature of graphs, and the data they are used to represent, employing machine learning with graphs has thus far proved challenging. A review of relevant literature has revealed that key challenges include issues arising with macro-scale graph learning, interpretability of machine learned representations and a failure to incorporate the temporal dimension present in many datasets. Thus, the work and contributions presented in this thesis primarily investigate how modern machine learning techniques can be adapted to tackle key graph mining tasks, with a particular focus on optimal macro-level representation, interpretability and incorporating temporal dynamics into the learning process. The majority of methods employed are novel approaches centered around attempting to use artificial neural networks in order to learn from graph datasets.
Firstly, a novel graph fingerprint technique is devised and shown to apply successfully to two different tasks, graph comparison and classification, whilst out-performing established baselines. Secondly, it is shown that a mapping can be found between certain topological features and graph embeddings. This, for perhaps the first time, suggests that it is possible that machines are learning something analogous to human knowledge acquisition, thus bringing interpretability to the graph embedding process. Thirdly, in exploring two new models for incorporating temporal information into the graph learning process, it is found that including such information is crucial to predictive performance in certain key tasks, such as link prediction, where state-of-the-art baselines are out-performed.
The overall contribution of this work is to provide greater insight into and explanation of the ways in which machine learning with respect to graphs is emerging as a crucial set of techniques for understanding complex datasets. This is important as these techniques can potentially be applied to a broad range of scientific disciplines. The thesis concludes with an assessment of limitations and recommendations for future research.
Laplacian Mixture Modeling for Network Analysis and Unsupervised Learning on Graphs
Laplacian mixture models identify overlapping regions of influence in
unlabeled graph and network data in a scalable and computationally efficient
way, yielding useful low-dimensional representations. By combining Laplacian
eigenspace and finite mixture modeling methods, they provide probabilistic or
fuzzy dimensionality reductions or domain decompositions for a variety of input
data types, including mixture distributions, feature vectors, and graphs or
networks. Provably optimal recovery using the algorithm is shown analytically
for a nontrivial class of cluster graphs. Heuristic approximations for scalable
high-performance implementations are described and empirically tested.
Connections to PageRank and community detection in network analysis demonstrate
the wide applicability of this approach. The origins of fuzzy spectral methods,
beginning with generalized heat or diffusion equations in physics, are reviewed
and summarized. Comparisons to other dimensionality reduction and clustering
methods for challenging unsupervised machine learning problems are also
discussed.
Comment: 13 figures, 35 references
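
As background for the Laplacian eigenspace mentioned in this abstract, the following sketch builds the combinatorial graph Laplacian L = D - A from an adjacency matrix. It is only the basic building block, not the mixture-modeling algorithm itself, and the function name and toy graph are assumptions made for illustration.

```python
def graph_laplacian(adj_matrix):
    """Return L = D - A for a symmetric 0/1 adjacency matrix (list of lists)."""
    n = len(adj_matrix)
    degrees = [sum(row) for row in adj_matrix]
    return [
        [(degrees[i] if i == j else 0) - adj_matrix[i][j] for j in range(n)]
        for i in range(n)
    ]


# Hypothetical toy graph: a triangle 0-1-2 with a pendant vertex 3 attached to 2
A = [
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
]
L = graph_laplacian(A)

# Every row of L sums to zero, so the constant vector is an eigenvector
# with eigenvalue 0; spectral methods work with the remaining eigenvectors.
row_sums = [sum(row) for row in L]
```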
Fine-grained Search Space Classification for Hard Enumeration Variants of Subset Problems
We propose a simple, powerful, and flexible machine learning framework for
(i) reducing the search space of computationally difficult enumeration variants
of subset problems and (ii) augmenting existing state-of-the-art solvers with
informative cues arising from the input distribution. We instantiate our
framework for the problem of listing all maximum cliques in a graph, a central
problem in network analysis, data mining, and computational biology. We
demonstrate the practicality of our approach on real-world networks with
millions of vertices and edges by not only retaining all optimal solutions, but
also aggressively pruning the input instance size, resulting in several-fold
speedups of state-of-the-art algorithms. Finally, we explore the limits of
scalability and robustness of our proposed framework, suggesting that
supervised learning is viable for tackling NP-hard problems in practice.
Comment: AAAI 201
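
The idea of shrinking the instance without losing any optimal solution can be sketched as follows. The enumeration is the classic Bron-Kerbosch recursion (without pivoting), and the pruning step is a simple degree filter used here as a deterministic stand-in for the learned classifier described in the abstract; it is not the authors' method. All function names and the toy graph are assumptions made for illustration.

```python
def maximum_cliques(adj):
    """List all maximum cliques. adj maps each vertex to a set of neighbours."""
    maximal = []

    def expand(r, p, x):
        # r: current clique, p: candidates, x: already-explored vertices
        if not p and not x:
            maximal.append(r)
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(frozenset(), set(adj), set())
    best = max(len(c) for c in maximal)
    return sorted(sorted(c) for c in maximal if len(c) == best)


def prune_low_degree(adj, k):
    """Drop vertices that cannot belong to any clique of size k (degree < k - 1)."""
    keep = {v for v, nbrs in adj.items() if len(nbrs) >= k - 1}
    return {v: adj[v] & keep for v in keep}


# Hypothetical toy graph: two triangles sharing the edge 1-2, plus a pendant vertex 4
graph = {0: {1, 2}, 1: {0, 2, 3}, 2: {0, 1, 3}, 3: {1, 2, 4}, 4: {3}}

cliques_full = maximum_cliques(graph)
# Assume a clique of size 3 is already known, so low-degree vertices can go
pruned = prune_low_degree(graph, 3)
cliques_pruned = maximum_cliques(pruned)
```

Here the pruned graph has one vertex fewer, yet both maximum cliques survive, which is the property the abstract emphasizes.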
Enabling Massive Deep Neural Networks with the GraphBLAS
Deep Neural Networks (DNNs) have emerged as a core tool for machine learning.
The computations performed during DNN training and inference are dominated by
operations on the weight matrices describing the DNN. As DNNs incorporate more
stages and more nodes per stage, these weight matrices may be required to be
sparse because of memory limitations. The GraphBLAS.org math library standard
was developed to provide high performance manipulation of sparse weight
matrices and input/output vectors. For sufficiently sparse matrices, a sparse
matrix library requires significantly less memory than the corresponding dense
matrix implementation. This paper provides a brief description of the
mathematics underlying the GraphBLAS. In addition, the equations of a typical
DNN are rewritten in a form designed to use the GraphBLAS. An implementation of
the DNN is given using a preliminary GraphBLAS C library. The performance of
the GraphBLAS implementation is measured relative to a standard dense linear
algebra library implementation. For various sizes of DNN weight matrices, it is
shown that the GraphBLAS sparse implementation outperforms a BLAS dense
implementation as the weight matrix becomes sparser.
Comment: 10 pages, 7 figures, to appear in the 2017 IEEE High Performance
Extreme Computing (HPEC) conference
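
The memory argument in this abstract can be illustrated with a minimal sketch of one DNN layer, y' = ReLU(W y + b), in which only the nonzero weights are stored. This uses plain Python dictionaries and does not call the GraphBLAS C library or reproduce its API; the function name, data layout, and values are assumptions made for illustration.

```python
def sparse_layer(weights, bias, inputs):
    """Apply one ReLU layer using sparse storage.

    weights: {row: {col: value}} holding only nonzero entries
    bias:    {row: value}
    inputs:  {index: value} holding only nonzero activations
    Returns the nonzero outputs as {row: value}.
    """
    outputs = {}
    for row in set(weights) | set(bias):
        total = bias.get(row, 0.0)
        for col, w in weights.get(row, {}).items():
            total += w * inputs.get(col, 0.0)
        if total > 0.0:  # ReLU: non-positive values are simply not stored
            outputs[row] = total
    return outputs


# Hypothetical 2 x 3 weight matrix with three nonzero entries
weights = {0: {0: 1.0, 2: 2.0}, 1: {1: -3.0}}
bias = {0: -1.0, 1: 0.5}
inputs = {0: 2.0, 1: 1.0, 2: 0.5}

activations = sparse_layer(weights, bias, inputs)
```

The work done is proportional to the number of stored nonzeros rather than to the full matrix size, which is the reason a sparse representation pays off as the weight matrix becomes sparser.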