6 research outputs found

    Investigating Extensions to Random Walk Based Graph Embedding

    Full text link
    Graph embedding has recently gained momentum in the research community, in particular after the introduction of random walk and neural network based approaches. However, most of the embedding approaches focus on representing the local neighborhood of nodes and fail to capture the global graph structure, i.e. to retain the relations to distant nodes. To counter that problem, we propose a novel extension to random walk based graph embedding, which removes a percentage of least frequent nodes from the walks at different levels. By this removal, we simulate farther distant nodes to reside in the close neighborhood of a node and hence explicitly represent their connection. Besides the common evaluation tasks for graph embeddings, such as node classification and link prediction, we evaluate and compare our approach against related methods on shortest path approximation. The results indicate, that extensions to random walk based methods (including our own) improve the predictive performance only slightly - if at all

    Greedy PIG: Adaptive Integrated Gradients

    Full text link
    Deep learning has become the standard approach for most machine learning tasks. While its impact is undeniable, interpreting the predictions of deep learning models from a human perspective remains a challenge. In contrast to model training, model interpretability is harder to quantify and pose as an explicit optimization problem. Inspired by the AUC softmax information curve (AUC SIC) metric for evaluating feature attribution methods, we propose a unified discrete optimization framework for feature attribution and feature selection based on subset selection. This leads to a natural adaptive generalization of the path integrated gradients (PIG) method for feature attribution, which we call Greedy PIG. We demonstrate the success of Greedy PIG on a wide variety of tasks, including image feature attribution, graph compression/explanation, and post-hoc feature selection on tabular data. Our results show that introducing adaptivity is a powerful and versatile method for making attribution methods more powerful

    Exploring Graph Traversal Algorithms for Knowledge Graphs

    Get PDF

    Extracting Multi-objective Multigraph Features for the Shortest Path Cost Prediction: Statistics-based or Learning-based?

    Get PDF
    Efficient airport airside ground movement (AAGM) is key to successful operations of urban air mobility. Recent studies have introduced the use of multi-objective multigraphs (MOMGs) as the conceptual prototype to formulate AAGM. Swift calculation of the shortest path costs is crucial for the algorithmic heuristic search on MOMGs, however, previous work chiefly focused on single-objective simple graphs (SOSGs), treated cost enquires as search problems, and failed to keep a low level of computational time and storage complexity. This paper concentrates on the conceptual prototype MOMG, and investigates its node feature extraction, which lays the foundation for efficient prediction of shortest path costs. Two extraction methods are implemented and compared: a statistics-based method that summarises 22 node physical patterns from graph theory principles, and a learning-based method that employs node embedding technique to encode graph structures into a discriminative vector space. The former method can effectively evaluate the node physical patterns and reveals their individual importance for distance prediction, while the latter provides novel practices on processing multigraphs for node embedding algorithms that can merely handle SOSGs. Three regression models are applied to predict the shortest path costs to demonstrate the performance of each. Our experiments on randomly generated benchmark MOMGs show that (i) the statistics-based method underperforms on characterising small distance values due to severe overestimation, (ii) a subset of essential physical patterns can achieve comparable or slightly better prediction accuracy than that based on a complete set of patterns, and (iii) the learning-based method consistently outperforms the statistics-based method, while maintaining a competitive level of computational complexity

    Extracting Multi-objective Multigraph Features for the Shortest Path Cost Prediction: Statistics-based or Learning-based?

    Get PDF
    Efficient airport airside ground movement (AAGM) is key to successful operations of urban air mobility. Recent studies have introduced the use of multi-objective multigraphs (MOMGs) as the conceptual prototype to formulate AAGM. Swift calculation of the shortest path costs is crucial for the algorithmic heuristic search on MOMGs, however, previous work chiefly focused on single-objective simple graphs (SOSGs), treated cost enquires as search problems, and failed to keep a low level of computational time and storage complexity. This paper concentrates on the conceptual prototype MOMG, and investigates its node feature extraction, which lays the foundation for efficient prediction of shortest path costs. Two extraction methods are implemented and compared: a statistics-based method that summarises 22 node physical patterns from graph theory principles, and a learning-based method that employs node embedding technique to encode graph structures into a discriminative vector space. The former method can effectively evaluate the node physical patterns and reveals their individual importance for distance prediction, while the latter provides novel practices on processing multigraphs for node embedding algorithms that can merely handle SOSGs. Three regression models are applied to predict the shortest path costs to demonstrate the performance of each. Our experiments on randomly generated benchmark MOMGs show that (i) the statistics-based method underperforms on characterising small distance values due to severe overestimation, (ii) a subset of essential physical patterns can achieve comparable or slightly better prediction accuracy than that based on a complete set of patterns, and (iii) the learning-based method consistently outperforms the statistics-based method, while maintaining a competitive level of computational complexity
    corecore