9 research outputs found

    Cascade-aware partitioning of large graph databases

    Graph partitioning is an essential task for scalable data management and analysis. The current partitioning methods utilize the structure of the graph, and the query log if available. Some queries performed on the database may trigger further operations. For example, the query workload of a social network application may contain re-sharing operations in the form of cascades. It is beneficial to include the potential cascades in the graph partitioning objectives. In this paper, we introduce the problem of cascade-aware graph partitioning that aims to minimize the overall cost of communication among parts/servers during cascade processes. We develop a randomized solution that estimates the underlying cascades, and use it as an input for partitioning of large-scale graphs. Experiments on 17 real social networks demonstrate the effectiveness of the proposed solution in terms of the partitioning objectives.
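    The objective described in this abstract can be illustrated with a small sketch that counts how many cascade steps cross part boundaries under a given vertex-to-part assignment. The cascade traces, the `part_of` mappings, the function names, and the unit cost per crossing are all assumptions made for illustration; the paper's own cost model and its randomized estimation procedure are not reproduced here.

```python
from collections import Counter

def edge_usage(cascades):
    """Count how often each directed edge (u, v) carries a cascade step."""
    usage = Counter()
    for cascade in cascades:
        for u, v in cascade:
            usage[(u, v)] += 1
    return usage

def communication_cost(usage, part_of):
    """Sum the usage counts of edges whose endpoints sit in different parts."""
    return sum(count for (u, v), count in usage.items()
               if part_of[u] != part_of[v])

# Hypothetical re-share cascades, each a list of (sharer, resharer) steps.
cascades = [
    [("a", "b"), ("b", "c")],
    [("a", "b"), ("a", "d")],
    [("c", "d")],
]
usage = edge_usage(cascades)

# Two balanced two-way assignments (hypothetical).
split_ab = {"a": 0, "b": 1, "c": 1, "d": 0}  # separates the busy edge a->b
keep_ab = {"a": 0, "b": 0, "c": 1, "d": 1}   # keeps a->b inside one part

cost_split = communication_cost(usage, split_ab)
cost_keep = communication_cost(usage, keep_ab)
```

    Both assignments place two vertices on each part, yet the one that keeps the most frequently used edge inside a single part incurs fewer cross-part cascade steps, which is the kind of difference a cascade-aware objective is meant to capture.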

    Scalable Graph Convolutional Network Training on Distributed-Memory Systems

    Graph Convolutional Networks (GCNs) are extensively utilized for deep learning on graphs. The large data sizes of graphs and their vertex features make scalable training algorithms and distributed memory systems necessary. Since the convolution operation on graphs induces irregular memory access patterns, designing a memory- and communication-efficient parallel algorithm for GCN training poses unique challenges. We propose a highly parallel training algorithm that scales to large processor counts. In our solution, the large adjacency and vertex-feature matrices are partitioned among processors. We exploit the vertex-partitioning of the graph to use non-blocking point-to-point communication operations between processors for better scalability. To further minimize the parallelization overheads, we introduce a sparse matrix partitioning scheme based on a hypergraph partitioning model for full-batch training. We also propose a novel stochastic hypergraph model to encode the expected communication volume in mini-batch training. We show the merits of the hypergraph model, previously unexplored for GCN training, over the standard graph partitioning model, which does not accurately encode the communication costs. Experiments performed on real-world graph datasets demonstrate that the proposed algorithms achieve considerable speedups over alternative solutions. The reductions in communication costs become even more pronounced as the number of processors grows. The performance benefits are preserved in deeper GCNs having more layers as well as on billion-scale graphs. Comment: To appear in PVLDB'2
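    The claim that a graph model does not accurately encode communication costs can be illustrated with a toy calculation. The sketch assumes that each vertex's feature row is owned by one processor and is sent once to every other processor holding at least one of its neighbours (the usual connectivity-minus-one count), and that a graph model read as a volume estimate charges one transfer per cut edge per direction. Both readings, the function names, and the toy graph are assumptions for illustration, not the paper's formulation.

```python
def graph_model_estimate(edges, part_of):
    """Charge one transfer per direction for every undirected cut edge."""
    cut = sum(1 for u, v in edges if part_of[u] != part_of[v])
    return 2 * cut

def communication_volume(edges, part_of):
    """Each vertex's row is sent once to every remote part that contains
    at least one of its neighbours (connectivity minus one)."""
    remote_parts = {v: set() for v in part_of}
    for u, v in edges:
        if part_of[u] != part_of[v]:
            remote_parts[u].add(part_of[v])  # v's owner needs u's row
            remote_parts[v].add(part_of[u])  # u's owner needs v's row
    return sum(len(parts) for parts in remote_parts.values())

# Toy star graph: hub h on processor 0, three leaves on processor 1.
edges = [("h", "x"), ("h", "y"), ("h", "z")]
part_of = {"h": 0, "x": 1, "y": 1, "z": 1}

estimate = graph_model_estimate(edges, part_of)
volume = communication_volume(edges, part_of)
```

    In this toy case the edge-based estimate is 6 while only 4 rows actually move, because the hub's row is sent to processor 1 once rather than once per leaf.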

    Partitioning sparse deep neural networks for scalable training and inference

    The state-of-the-art deep neural networks (DNNs) have significant computational and data management requirements. The sizes of both training data and models continue to increase. Sparsification and pruning methods have been shown to be effective in removing a large fraction of connections in DNNs. The resulting sparse networks present unique challenges to further improve the computational efficiency of training and inference in deep learning. Both the feedforward (inference) and backpropagation steps in the stochastic gradient descent (SGD) algorithm for training sparse DNNs involve consecutive sparse matrix-vector multiplications (SpMVs). We first introduce a distributed-memory parallel SpMV-based solution for the SGD algorithm to improve its scalability. The parallelization approach is based on row-wise partitioning of weight matrices that represent neuron connections between consecutive layers. We then propose a novel hypergraph model for partitioning weight matrices to reduce the total communication volume and ensure computational load-balance among processors. Experiments performed on sparse DNNs demonstrate that the proposed solution is highly efficient and scalable. The proposed matrix partitioning scheme further improves the performance of our solution significantly.
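    The row-wise parallelization of consecutive SpMVs can be sketched as follows. The example runs in a single process and only mimics the division of work: no messages are exchanged, rows are assigned cyclically (`i % num_procs`) rather than by a hypergraph partitioner, and ReLU is assumed as the activation. The weights, names, and storage format are hypothetical.

```python
def spmv(rows, x):
    """y[i] = sum_j W[i][j] * x[j] for a sparse matrix stored as {i: {j: w}}."""
    return {i: sum(w * x[j] for j, w in cols.items())
            for i, cols in rows.items()}

def partitioned_spmv(rows, x, num_procs):
    """Each simulated processor multiplies only the rows it owns."""
    y = {}
    for p in range(num_procs):
        local_rows = {i: cols for i, cols in rows.items()
                      if i % num_procs == p}  # assumed cyclic row ownership
        y.update(spmv(local_rows, x))
    return y

def relu(vec):
    return {i: max(0.0, v) for i, v in vec.items()}

def feedforward(layers, x, multiply):
    """Apply one SpMV plus activation per layer, in order."""
    for rows in layers:
        x = relu(multiply(rows, x))
    return x

# Hypothetical sparse weight matrices for a 3 -> 3 -> 2 network.
W1 = {0: {0: 1.0, 2: 2.0}, 1: {1: 3.0}, 2: {0: -1.0, 1: 1.0}}
W2 = {0: {1: 0.5}, 1: {0: 1.0, 2: 1.0}}
x = {0: 1.0, 1: 2.0, 2: 3.0}

out_serial = feedforward([W1, W2], x, spmv)
out_parallel = feedforward(
    [W1, W2], x, lambda rows, vec: partitioned_spmv(rows, vec, 2))
```

    Because each output entry depends on a single row, splitting the rows among processors leaves the result unchanged; what a partitioner influences is which input entries each processor must fetch from the others and how evenly the nonzeros are spread.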

    Temporal cascade model for analyzing spread in evolving networks

    Current approaches for modeling propagation in networks (e.g., of diseases, computer viruses, rumors) cannot adequately capture temporal properties such as order/duration of evolving connections or dynamic likelihoods of propagation along connections. Temporal models on evolving networks are crucial in applications that need to analyze dynamic spread. For example, a disease-spreading virus has varying transmissibility based on interactions between individuals occurring with different frequency, proximity, and venue population density. Similarly, propagation of information having a limited active period, such as rumors, depends on the temporal dynamics of social interactions. To capture such behaviors, we first develop the Temporal Independent Cascade (T-IC) model with a spread function that efficiently utilizes a hypergraph-based sampling strategy and dynamic propagation probabilities. We prove this function to be submodular, with guarantees of approximation quality. This enables scalable analysis on highly granular temporal networks where other models struggle, such as when the spread across connections exhibits arbitrary temporally evolving patterns. We then introduce the notion of ‘reverse spread’ using the proposed T-IC processes, and develop novel solutions to identify both sentinel/detector nodes and highly susceptible nodes. Extensive analysis on real-world datasets shows that the proposed approach significantly outperforms the alternatives in modeling both if and how spread occurs, by considering evolving network topology alongside granular contact/interaction information. Our approach has numerous applications, such as virus/rumor/influence tracking. Utilizing T-IC, we explore vital challenges of monitoring the impact of various intervention strategies over real spatio-temporal contact networks where we show our approach to be highly effective.
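    To make the role of contact order and a limited active period concrete, the following is a generic time-respecting independent cascade simulation. It is not the T-IC model's definition: the contact tuple layout, the strict "infected before the contact" rule, the `active_period` cutoff, and the choice to infect seeds at time 0 are all assumptions for illustration.

```python
import random

def temporal_cascade(contacts, seeds, rng, active_period=None):
    """Simulate one cascade over time-stamped contacts.

    contacts: iterable of (time, u, v, prob), meaning u may infect v at
    `time` with probability `prob`. Seeds are infected at time 0.
    Returns {node: infection_time}.
    """
    infected_at = {s: 0 for s in seeds}
    for time, u, v, prob in sorted(contacts):
        if u not in infected_at or v in infected_at:
            continue
        if infected_at[u] >= time:
            continue  # u must already be infected before the contact
        if active_period is not None and time - infected_at[u] > active_period:
            continue  # u is no longer spreading
        if rng.random() < prob:
            infected_at[v] = time
    return infected_at

# Hypothetical contacts; probabilities of 1.0 and 0.0 keep the run deterministic.
contacts = [
    (1, "a", "b", 1.0),
    (2, "b", "c", 1.0),
    (1, "c", "d", 1.0),  # happens before c is infected, so it cannot spread
    (3, "c", "d", 0.0),  # happens after, but never transmits
]
rng = random.Random(0)
spread = temporal_cascade(contacts, ["a"], rng)
short_lived = temporal_cascade(contacts, ["a"], rng, active_period=0.5)
```

    A static model that ignores timestamps would treat the edge from `c` to `d` as usable; here it is not, because the only contact with a nonzero probability occurs before `c` is infected.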

    Locality-aware and load-balanced static task scheduling for MapReduce

    Task scheduling for MapReduce jobs has been an active area of research with the objective of decreasing the amount of data transferred during the shuffle phase via exploiting data locality. In the literature, generally only the scheduling of reduce tasks is considered with the assumption that scheduling of map tasks is already determined by the input data placement. However, in cloud or HPC deployments of MapReduce, the input data is located in remote storage and scheduling map tasks gains importance. Here, we propose models for simultaneous scheduling of map and reduce tasks in order to improve data locality and balance the processors’ loads in both map and reduce phases. Our approach is based on graph and hypergraph models which correctly encode the interactions between map and reduce tasks. Partitions produced by these models are decoded to schedule map and reduce tasks. A two-constraint formulation utilized in these models enables balancing processors’ loads in both map and reduce phases. The partitioning objective in the hypergraph models correctly encapsulates the minimization of data transfer when a local combine step is performed prior to shuffle, whereas the partitioning objective in the graph models achieves the same feat when a local combine is not performed. We show the validity of our scheduling on the MapReduce parallelizations of two important kernel operations – sparse matrix–vector multiplication (SpMV) and generalized sparse matrix–matrix multiplication (SpGEMM) – that are widely encountered in big data analytics and scientific computations. Compared to random scheduling, our models lead to tremendous savings in data transfer by reducing data traffic from several hundreds of megabytes to just a few megabytes in the shuffle phase, consequently leading to speedups of up to 2.6x and 4.2x for SpMV and SpGEMM, respectively. Research Council of Turkey (TUBITAK

    Foresight Plus: serverless spatio-temporal traffic forecasting

    Building a real-time spatio-temporal forecasting system is a challenging problem with many practical applications such as traffic and road network management. Most forecasting research focuses on achieving (often marginal) improvements in evaluation metrics such as MAE/MAPE on static benchmark datasets, with less attention paid to building practical pipelines which achieve timely and accurate forecasts when the network is under heavy load. Transport authorities also need to leverage dynamic data sources such as roadworks and vehicle-level flow data, while also supporting ad-hoc inference workloads at low cost. Our cloud-based forecasting solution Foresight, developed in collaboration with Transport for the West Midlands (TfWM), is able to ingest, aggregate and process streamed traffic data, enhanced with dynamic vehicle-level flow and urban event information, to produce regularly scheduled forecasts with high accuracy. In this work, we extend Foresight with several novel enhancements, into a new system which we term Foresight Plus. New features include an efficient method for extending the forecasting scale, enabling predictions further into the future. We also augment the inference architecture with a new, fully serverless design which offers a more cost-effective solution and which seamlessly handles sporadic inference workloads over multiple forecasting scales. We observe that Graph Neural Network (GNN) forecasting models are robust to extensions of the forecasting scale, achieving consistent performance up to 48 hours ahead. This is in contrast to the 1-hour forecasting periods popularly considered in this context. Further, our serverless inference solution is shown to be more cost-effective than provisioned alternatives in corresponding use-cases. We identify the optimal memory configuration of serverless resources to achieve an attractive cost-to-performance ratio.

    Real-time spatio-temporal forecasting with dynamic urban event and vehicle-level flow information

    Building a real-time spatio-temporal forecasting system is a challenging problem which has many practical applications such as traffic and road network management. Most forecasting research focuses on the average quality of predictive models, with much less attention paid to building a practical pipeline and achieving timely and accurate forecasts when the network is under heavy load. Additionally, transport authorities face the issue of how to effectively leverage various dynamic data sources, such as urban events (e.g., scheduled roadworks on the road network, cultural events) and vehicle-level flow data. In this paper, we investigate the practical challenges of real-time forecasting, and present Foresight, a cloud-based system for spatio-temporal forecasting developed in collaboration with Transport for the West Midlands (TfWM). Foresight can ingest, aggregate and process streamed traffic data to produce road network forecasts continuously. We adapt spatio-temporal machine learning methods to incorporate dynamic urban events and vehicle-level flow data, and experimentally evaluate a variety of predictive models in our setting. We employ a data-driven approach to identify peak times in the network, and provide insights on how the performance of forecasting solutions varies for these times when accurate forecasts are most important. We observe that incorporating roadworks into a Graph Neural Network (GNN) model can provide up to a 29.1% performance improvement (MAPE) at a 60-minute forecasting horizon. Further, modelling traffic propagation using vehicle-level flow data in order to support graph-based learning can yield performance gains of 8.8% (MAE) at peak times.
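    For readers unfamiliar with the two metrics quoted in this abstract, the sketch below computes MAE and MAPE and a relative improvement between two forecasts. The traffic values are invented, and treating the reported percentages as relative error reductions is an assumption; the abstract does not say how the gains were calculated.

```python
def mae(actual, predicted):
    """Mean absolute error, in the units of the data."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def mape(actual, predicted):
    """Mean absolute percentage error; undefined when an actual value is 0."""
    return 100.0 * sum(abs(a - p) / abs(a)
                       for a, p in zip(actual, predicted)) / len(actual)

def relative_improvement(baseline_error, new_error):
    """Percentage reduction of an error relative to a baseline (assumed reading)."""
    return 100.0 * (baseline_error - new_error) / baseline_error

# Invented flow counts and two hypothetical forecasts.
actual = [100.0, 200.0, 50.0]
baseline = [110.0, 180.0, 60.0]
improved = [105.0, 190.0, 55.0]

mae_gain = relative_improvement(mae(actual, baseline), mae(actual, improved))
mape_gain = relative_improvement(mape(actual, baseline), mape(actual, improved))
```

    MAE weights every unit of error equally, while MAPE scales each error by the observed value, so the same forecast change can produce different gains under the two metrics on real data.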